ESSAYS ON BEHAVIORAL ECONOMICS
by
Francesco Gabriele
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ECONOMICS)
May 2025
Copyright 2025 Francesco Gabriele
To Anna,
and to dad, mom and Marco
Acknowledgements
Life can only be understood backwards, but it must be lived forwards.
(Soren Kierkegaard, 1813 - 1855)
I have spent six incredible years during my doctoral studies at USC. This life-changing experience
would never have been possible without the help and love of many people I have met in Los Angeles and
those who have supported me from afar.
I am truly indebted to my advisors. I want to express my heartfelt thanks to Paulina Oliva, my
main advisor, for the guidance and dedication she has invested in my research. To Robert Metcalfe, for
his feedback, mentorship and valuable perspectives. To Giorgio Coricelli, for believing in my potential
and for welcoming me to Los Angeles. And to Davide Proserpio, who has opened my eyes and helped
me navigate the field of Marketing and the business school market. I will always be grateful to all
the members of my dissertation committee for their wisdom, encouragement and trust throughout my
academic journey at USC.
At the USC Department of Economics I have become a researcher in dialogue with Timothy Armstrong,
Vittorio Bassi, Daniel Bennett, Antonio Bento, Caroline Betts, Fanny Camara, Juan Carrillo, Thomas
Chaney, Dong Woo Hahm, Matthew Kahn, Yilmaz Kocer, Pablo Kurlat, Yu-Wei Hsieh, Jonathan Libgober,
Monica Morlacco, Jeff Nugent, Hashem Pesaran, Brijesh Pinto, Simon Quach, Geert Ridder, Guofu Tan
and Jeff Weaver. They have nurtured me academically, discussing my ideas and helping me develop a
critical spirit of research and curiosity.
Outside Kaprelian Hall I had great conversations with faculty at USC Marshall Business School. I
am indebted to Kristin Diehl, Anthony Dukes, Shantanu Dutta, Nikhil Malik, Dina Mayzlin, Yanhao Wei,
Sha Yang, Alex Miller, Poet Larsen and Ignacio Riveros, Daniel Sokol and Angela Zhou for many helpful
discussions and for inadvertently pushing me towards quantitative marketing research.
In graduate school I met many extraordinary scholars who provided me with invaluable comments
and feedback for my job market paper. I am very grateful to Susan Athey, Alessandro Bonatti, Tommaso
Bondi, David Byrne, Stefano Colombo, Giovanni Compiani, Dante Donati, Jean-Pierre Dubé, Chiara
Farronato, Michele Fioretti, Andrey Fradkin, Pedro Gardete, Matthew Gentzkow, Ginger Zhe Jin, Brett
Hollenbeck, Leonardo Madio, Andrea Mantovani, Noriaki Matsushima, Olivia Natan, Esteban Rossi-Hansberg, Carlo Reggiani, Andrew Rhodes, Michelangelo Rossi, Stephan Seiler, Benjamin Shiller, Adam
Smith, Steven Tadelis, Francesco Trebbi, Giovanni Ursino and Miguel Villas-Boas.
Warm thanks to all the participants I interacted with at seminars at UTDT, Nova SBE, HKUST, Telecom
Business School, EARIE Summer School, USC, USC Marshall, USF, CalState Long Beach, Università
Bocconi, Università Cattolica, Compass LexEcon London and at conferences: the 16th Paris Conference
on Digital Economics, 2024 EWMES, 2024 SOCAE, 2024 ACLEC, 2024 AFE, 51st EARIE, 13th IBEO,
22nd ZEW Conference on the Economics of Information, 7th Doctoral Workshop on The Economics of
Digitization, 22nd IIOC, 2024 Imperial College PhD Conference, 2024 WoPA, 16th TSE Digital Economics
Conference, 2023 SOCAE, 2023 Milan PhD Economics Workshop and 2023 TADC.
I would like to thank Lodovico Pizzati, for having me as TA for the Economics of European Integration
Maymester class at KU Leuven, and Luca Vittorio Colombo, for having me as a summer visiting PhD
candidate at Università Cattolica. I also have to thank Gabriele Favaró, without whose help I would
never have graduated, or found a job. I also express my gratitude for the support I have received from
staff at USC Department of Economics, especially to Annie Le, Young Miller, Irma Alfaro and Alexander
Karnazes, for student assistance from OIS Office of International Services and for technical support from
USC Stevens Center for Innovation, as well as for funding support from Dornsife PhD Academy and USC
Marshall Initiative on Digital Competition.
And lastly, my immense gratitude goes to the Marketing faculty at HKUST Business School, especially
Jiewen Hong, Kristiaan Helsen, Ralf Van Der Lans, Mengze Shi, Song Lin, Amy Dalton, Tianyu Han and
Sang Kyu Park, for taking a chance on me and allowing me to realize my dream in Hong Kong.
I could not have made it this far without all the friends I met at USC Department of Economics,
especially Marco and Alessandro, my classmates, Daniel, Zhan, Wesley, Chang, Qitong, Zheng, Jack, Tri
and Kota. And all the PhDs who have welcomed me and have kept me company in these years: Andreas,
Eunjee, Jingbo, Zhen, Rachel, Yejia, Grigory, Sam, Karim, Islamul, Ruozi, Rajat, Dario, Nicolas, Clement,
Amy, Monira, Juan, Jingyi, Yi-Ju, Fatou and Bardia, as well as Matt, Margarita, Minji, Wenda, Jin, Paul,
Mychaela, Vanshika, Juliana, Santiago, Pierre, Aruj, Julian, Alison, Saloni, Ignacio, Juan, Hamzah, Joshua,
Sankalp, Matteo, Patricia and Thalis.
Together with all the friends Los Angeles gifted me: Domenico, Gianlorenzo, Giovanni, Andrea,
Pietro and Joaquín, the amazing fellowship of Bentley, and then Tommaso and Wendy, Letizia and Alex,
Luca and Mambo, Vinnie and Stefano, Marco Sozzi, il Maestro, Francesco, Simone and Vanessa, Pietro
and Rachele, Faro and Anna, Akira and Mio, Manu, Steven, Calvin, Fati, Vicky and Lorenzo, Matteo
and Michelle, Brittany, Marco and Elisa, Maria Giulia and Marco, Angel, Marcela, Camilla, Costanza,
Nicole, Chiara, Elisa, Ludovica, Catherine, Aya, Adeline and Simon, Sam and Penny, Mika, Kelsey and
Emma, Guido, Maurizio and Mauro, Nancy, Adele and Franz, Mario and Rae, Anna and Giorgio, Marco
and Mike, Teresa, Marianna, David and Aloisa, Patsy and Brett. And the people I have found in San
Francisco: Cedomir, Maria and Teresa, Martina, Maria Teresa, Elisabetta and Tommaso, with Giovanni,
Maria and Marta, Francesco and Laura, Federico and Carlo, Victor and Kate, Elisabetta and Filippo.
I am full of gratitude to the many friends who patiently have not given up on me, even after all these
years spent so far away. Thanks Pigi, Stringo and Marta, Giacomo and Maddalena, Vincenzo and Chiara,
Stew and Chiara, Simone and Anna, Michele and Teresa, Michela, Marilù, Depo, Pietro and Ruk, Elisa,
Giovanni and Tommaso, Michele and Doge, and Marta.
I am grateful to all my beloved family. A heartfelt thanks to my uncles, my aunts, my many cousins
and my grandparents, nonna Marisa and nonno Bruno, and nonna Annalisa and nonno Enrico, for still
guiding me with their example. And to my new family, thanks to Valerio and Giovanna, Margherita,
Francesco and Tommaso, nonni Lina and Antonio, for always making me feel at home. The unwavering
support of my family in pursuing my dream, even if they did not fully understand it or did not want to
see me so far away from home, has brightened the most difficult moments of this journey. I have always
been ambitious, perhaps selfish and stubborn, but my greatest achievement is to make you proud of the
man I have become and what I am building with my own hands. Thank you dad, thank you mom, thank you Marco:
I love you!
Finally, the person who deserves the most always comes last. Thank you Anna, a whole life
awaits us, and I cannot wait to live it with you. I love you with all my heart!
Table of Contents
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: The Welfare Effects of Behavior-based Price Discrimination in E-commerce
1.1 Introduction
1.2 Partner Company and Data
1.3 Field Experiment
1.3.1 Descriptive Evidence
1.3.2 Causal Inference
1.4 Demand: Elasticity to Exogenous Treatments
1.4.1 Reduced-form Analysis
1.4.2 Demand Model
Utility Maximization Problem
1.4.3 Identification and Likelihood Function
1.4.4 Estimation of the Demand Model
Elasticity of Demand to Treatments
1.5 Supply: Endogenous Pricing Function
1.5.1 Policy Model
Condition of Optimal Treatment
Optimal Treatment Policy
1.5.2 Estimation of the Policy Model
1.5.3 Counterfactual Analysis
Alternative Policies
Policy Evaluation
1.5.4 Behavior-based Price Discrimination
1.6 Welfare Analysis and Fairness
1.6.1 Aggregate Welfare Analysis
1.6.2 Welfare Analysis of Behavior-based Price Discrimination
1.7 Conclusion
Chapter 2: Regret, Relief, Envy and Gloating: An Experimental Investigation of Counterfactual Emotions in Children
2.1 Introduction
2.2 Experiment
2.3 Reduced-form Analysis
2.3.1 Linear Mixed-effect Model of Emotion Rating
2.3.2 Logistic Mixed-effect Model of Choice Adaptation
2.4 Discrete-choice Demand Model
2.4.1 Demand Model
2.4.2 Estimation Strategy and Instrumental Variables
2.4.3 Demand Estimation
2.5 Discussion and Conclusion
Chapter 3: From Non-deterministic Preferences to Reduced Value-based Neuronal Activity: Stochastic Choices in Human Aging
3.1 Introduction
3.2 Stochastic Preferences
3.3 Neuroeconomics of Stochastic Choices
3.3.1 Bounded rationality
3.3.2 Noisy Neuronal Activity Encoding Value Signal
3.4 Dopamine Decrease in Human Aging
3.5 Evidences of Age-related Changes in Choice Behavior
3.5.1 Learning During Lifetime
3.5.2 Intertemporal Choices
3.5.3 Age-related Risk Propensity
3.6 Conclusion
References
Appendix A: Appendix to Chapter 1
A.1 Technical Appendix
A.1.1 Proof of Proposition 1
A.1.2 Derivation of log-Likelihood Functions
A.1.3 Derivation of Demand Elasticity
A.1.4 Proof of Proposition 2
A.2 Data Description
A.2.1 Description of the Variables
A.2.2 Summary Statistics
A.3 Omitted Tables
A.4 Omitted Figures
A.5 Machine Learning and Algorithms
A.5.1 Ensemble Machine Learning Methods
A.5.2 Ensemble Learning: Bagging, Random Forest, Adaptive Boosting, Gradient Boosting and Neural Network
A.5.3 Estimation of Primitives of the Policy Model
Appendix B: Appendix to Chapter 2
B.1 Omitted Tables
B.2 Omitted Figures
Appendix C: Appendix to Chapter 3
C.1 Critical Review
List of Tables
1.1 Estimates of demand parameters
1.2 Estimates of treatment elasticity
1.3 Variables in the vector of covariates X_i
1.4 Estimates of counterfactual simulations of expected profit
1.5 Estimates of counterfactual simulations of expected profit for new and returning users
A.1 Summary statistics of main variables
A.2 Correlation matrix for continuous variables
A.3 Summary statistics of technology controls
A.4 Summary statistics of time controls
A.5 Summary statistics of location controls
A.6 Likelihood of conversion with new user dummy and purchase history dummy
A.7 Gross order with new user dummy and purchase history dummy
A.8 Likelihood of conversion with discount percentage and minimum purchase threshold
A.9 Gross order with discount percentage and minimum purchase threshold
A.10 Estimates of treatment elasticity
A.11 Performance of classification ensemble methods
A.12 Performance of regression ensemble methods
A.13 Estimates of counterfactual simulations of conversion rate
A.14 Estimates of counterfactual simulations of conversion rate for new and returning users
B.1 Descriptive statistics of experiment participants
B.2 Upward comparison: Regret and envy
B.3 Downward comparison: Relief and gloating
B.4 Choice adaptation
B.5 Estimates of demand model parameters
B.6 Estimates of demand model parameters. Downward comparison: Relief and gloating
B.7 Estimates of demand model parameters. Upward comparison: Regret and envy
C.1 Summary of critical review
List of Figures
1.1 Gross order distribution
1.2 Gross order distribution per treatment
1.3 Average treatment effect by discount and minimum purchase threshold
1.4 Aggregate producer surplus
1.5 Aggregate consumer surplus
1.6 Welfare analysis of behavior-based price discrimination
2.1 Four conditions of the Wheels of Fortune task
A.1 Screenshots of McDonald’s app (Source: Taunton (2023))
A.2 Homepage of Alpha website
A.3 Treatment per control, semi-treatment and full-treatment group
A.4 Example of conversion funnel experience
A.5 Number of converted sessions by gross order per semi-treatment discount percentage
A.6 Number of converted sessions by gross order per full-treatment minimum purchase threshold
A.7 Graphical intuition of constrained utility maximization problem
A.8 Graphical analysis of optimal policy marginal variation
A.9 Distributions of optimal policy function b() per treatment
A.10 Treatment distributions of profit and indirect utility per policy
A.11 Treatment distributions of profit and indirect utility per policy for new users
A.12 Treatment distributions of profit and indirect utility per policy for returning users
A.13 Welfare analysis for new users
A.14 Welfare analysis for returning users
B.1 Propensity of risk by age and by gender
B.2 Propensity of risk by age and by comparison with partial feedback
B.3 Propensity of risk by age and by comparison with complete feedback
B.4 Emotional evaluation by age and by gender
B.5 Emotional evaluation by age and by comparison with partial feedback
B.6 Emotional evaluation by age and by comparison with complete feedback
B.7 Propensity of risk by age and by gender conditional on comparisons with partial and complete feedback
B.8 Emotional evaluation by age and by gender conditional on comparisons with partial and complete feedback
Abstract
This dissertation consists of three essays and explores different topics in Behavioral Economics. Each
chapter presents a research perspective on economic behavior in a different context. The first covers topics
at the intersection of field experiments, empirical industrial organization and quantitative marketing, the
second combines subjects of experimental economics and development of human emotions, while the
third reviews both decision theory and neuroeconomics.
The first chapter focuses on how information technology enables digital platforms to use consumers’
data, and how such information asymmetries shape marketing strategies and economic outcomes. In
particular, the first essay seeks to understand the role of personalization in pricing strategies and the
welfare consequences of e-commerce retailers’ power to price discriminate based on users’ past online
purchase behavior. The chapter develops
a structural model of consumer demand and a pricing policy model to quantify the welfare effects of
behavior-based price discrimination (BBPD). Using data from a randomized controlled experiment on a
cosmetics e-commerce site, the structural estimation reveals that demand is elastic to price-discount treatments.
With nonparametric estimation via machine learning, the counterfactual analysis tests different pricing
algorithms and shows that personalized price discrimination increases e-commerce profit by 24% and
consumer surplus by 4%, relative to uniform pricing. This result suggests that exploiting past purchase
history is profitable for the monopolistic e-commerce retailer: BBPD complements targeted discounts by generating an additional 11% gain in producer surplus without harming loyal customers. The analysis
contributes to the current public policy debate about pricing strategies in digital markets as the welfare
analysis has implications for privacy policy.
The second chapter studies the development of counterfactual emotions in childhood. Counterfactual
thinking involves the imagination of hypothetical alternatives to reality to compare experienced reality
with counterfactual scenarios. Renewed interest in behavioral research has focused on the development
of human counterfactual emotions that allow for such a fundamental cognitive process. Despite the use
of simple paradigms in human development studies, a coherent picture of the evolution of these emotions in childhood has yet to emerge. The chapter provides an original contribution by using laboratory
experimental data to estimate a discrete-choice demand model where children consider counterfactual
scenarios. The analysis measures the effects of counterfactual emotions in a risky decision-making task
and controls for age within different cohorts of children. Results are consistent with previous findings.
Counterfactual emotions such as regret, relief, envy and gloating tend to emerge around 6 years of age.
The incidence of counterfactual emotions is salient especially for emotions such as regret and envy, suggesting that children are more sensitive to negative than to positive emotions. The novelty of this
essay consists in applying a microeconometric demand framework to a behavioral research question in
order to provide estimates which corroborate previous findings.
The third chapter investigates the hypothesis that randomness in the decision-making of the elderly results
from two different, yet connected, effects. The essay argues that, on the one hand, the observed differences
in choice behavior between the young and the elderly are caused by the evolution of underlying preferences.
On the other hand, variation in decision-making for older individuals is a consequence of the
intrinsic stochasticity of neuronal activity and age-related deterioration of the dopaminergic system. The
first effect results from evolving preferences and stochastic utility functions. The chapter discusses the
random utility framework which accounts for non-deterministic preferences changing during lifetime.
The second factor derives from the decay of brain circuits encoding decision-specific value signals. The
essay presents the drift-diffusion model which considers increasing stochasticity of neuronal activity
as a biological consequence of dopamine decay in old individuals. By reviewing recent results in the
neuroscientific literature, the chapter supports the thesis that both components play a role in elderly
decision-making. The implications of this novel perspective are notably relevant in terms of welfare
evaluation. Indeed, the stochastic nature of underlying preferences cannot necessarily be deemed welfare
decreasing. Conversely, the comparison of decision values is affected by an error term that represents
computational noise. Thus, the worsening of neuronal circuits supporting decision values is strictly
welfare decreasing.
Chapter 1
The Welfare Effects of Behavior-based Price
Discrimination in E-commerce
1.1 Introduction
“In the future, what you pay will be determined by where you live and who you are. It’s unfair, but that doesn’t
mean it’s not going to happen” (Vernon Keenan, "On the Web, Price Tags Blur",
The Washington Post, September 2000).
With the growing digitization of market transactions, the development of customer-recognition
technologies has enabled digital companies to collect thorough data about their users. As digital information has immense economic potential, user tracking is now prevalent in many industries
ranging from online retailers to tech apps.1 Behavioral information underpins targeted marketing strategies such as personalized pricing. The ability of online platforms to identify customers allows them to
offer different prices to individuals based on observable characteristics.2 Personalized pricing strategies
based on past purchase history are referred to as behavior-based price discrimination (BBPD).
A new era in the regulation of personal data privacy and deceptive conduct of digital companies began
in 2012 with the FTC passage of the E-STOP Act.3 Lately, mounting concerns and public debate over the
commercial use of customers’ data led to the introduction of the California Consumer Privacy Act in the US,
and the EU adoption of the General Data Protection Regulation in 2018. Although these regulations aim
at preserving consumer privacy by enhancing the data protection obligations for digital businesses, price
discrimination practices are not entirely prevented. These policies indeed require clients to explicitly give
their consent before receiving personalized content, which limits access to users’ information and hinders
the implementation of price discrimination for e-commerce retailers. However, while setting different
prices based on consumer personal data violates these obligations, offering personalized promotions or
discounts does not constitute an infringement (Baik & Larson, 2023).
1 As reported by Colombo (2016), examples of companies operating in different industries which price discriminate across
consumers on the basis of their online behavior are Amazon, eBay, Wal-Mart, Netflix, and Uber. Pazgal and Soberman
(2008) and Esteves (2014) also discuss various examples of industries in which companies collect, process and utilize users’ data for
profiling and price discrimination purposes.
2 Price discrimination refers to firms’ power to price different market segments based on directly observable consumer
characteristics. In a recent episode, for instance, McDonald’s was accused of price profiling. Taunton (2023) reports that
the fast food chain was criticized for price-gouging after some customers were offered menus at higher prices through the
McDonald’s app. The comparison of two screenshots reveals that the same meal deal was offered at $14 and at $17. In addition,
the combo of McFlurry desserts had two different prices: $7 or $8.50 (Figure A.1). It was also reported that a McDonald’s
spokesperson replied that app offers differ between users, based on a variety of factors. The spokesperson expressly
claimed: “Due to the personalization of our app, not all customers will see the same deals, and as an example a deal may be offered to
encourage use of the app on the customer’s next visit”.
3 The Ensuring Shoppers Transparency in Online Pricing (E-STOP) Act of 2012 directs the Federal Trade Commission to
promulgate rules requiring an Internet merchant with annual gross revenue of more than $1 million to disclose to each consumer,
prior to the final purchase of any good or service, the use of a price-altering computer program, defined as one that: (i) accesses
a consumer’s personal information; and (ii) uses such information to alter the merchant’s selling price of a good or service.
Further disagreement concerns to what extent personalized marketing strategies resemble data-based
price discrimination and, consequently, what the welfare implications of such practices are. While
third-degree price discrimination has been shown to benefit the monopolistic producer, its consequences
for consumer surplus are still a matter of debate (e.g., Bergemann et al.,
2015; Kallus & Zhou, 2021; De Cornière et al., 2023). Different fields of research have studied personalized
pricing. A large body of theoretical work has examined how behavior-based price discrimination affects
profit and welfare in competitive markets (e.g., Esteves, 2010; Chen & Pearcy, 2010; Choe et al., 2018;
Esteves et al., 2022).4 By contrast, a more limited applied literature has attempted to empirically test such
theoretical predictions using controlled experiments (e.g., Rossi et al., 1996; Mahmood, 2014; Brokesova
et al., 2014).5 As discussed by Mahmood (2014), although very sophisticated models have studied price
discrimination under different market hypotheses, a clear understanding of the mechanisms driving
pricing strategies in real-world settings is still lacking. This paper investigates personalized pricing
strategies by studying the welfare implications of behavior-based price discrimination in e-commerce.6
The main contribution of this work is to develop an empirical framework to assess the welfare
effects of online price discrimination. This framework illustrates how e-commerce platforms leverage
users’ information to personalize pricing, and can explain why some pricing algorithms fail to improve
profit. However, quantifying the welfare consequences of behavior-based price discrimination requires
information on the producer’s pricing strategy and on consumers’ demand, which digital
companies rarely disclose. Even with access to observational data, the endogeneity of firms’ management guidelines,
marketing decisions and privacy policy poses a serious obstacle to estimating demand and supply
4Esteves (2009) surveys the economics of behavior-based price discrimination, while Fudenberg and Villas-Boas (2012)
offer a general discussion of theoretical models and Belleflamme and Peitz (2015) provide a textbook presentation of BBPD.
Theoretical works date back to Villas-Boas (1999) and Fudenberg and Tirole (2000), and address the economic implications
of price discrimination for firms’ profit and for welfare. Theoretical results on personalized pricing based on consumers’
history suggest that the dominant strategy of competitive firms is to offer discounts with the twofold purpose of poaching
competitors’ customers and extracting profits from loyal clients.
5A canonical example is Rossi et al. (1996), which presents a novel empirical method using a panel dataset of purchases
to investigate the value of consumer-level information. The paper estimates a Bayesian hierarchical choice model to derive
optimal targeted coupons. By comparing the net gain in revenues under different information sets, it assesses the potential
of purchase history for targeted discounts. Mahmood (2014) and Brokesova et al. (2014) instead take an
experimental approach with a focus on the profitability of loyalty rewards. These papers represent the first studies of behavioral
price discrimination adopting controlled laboratory experiments.
6Caillaud and De Nijs (2014) defines behavior-based price discrimination as “the practice of offering different prices to different
customers according to their past purchase history”. In the related literature BBPD has also been referred to as customer relationship
management based pricing (e.g., Shin & Sudhir, 2010), pricing with customer recognition (e.g., Fudenberg & Villas-Boas, 2006),
or one-to-one pricing (e.g., Rossi et al., 1996). An example of this practice is illustrated by Streitfeld (2000) reporting how in
2000 Amazon was accused of selling the same DVDs at different prices based on customers’ purchase histories. By deleting the
online cookies identifying them as regular clients, customers could observe the DVDs’ price fall from $26.24 to $22.74.
functions. Using large-scale randomized controlled trials (RCTs) to evaluate BBPD effectiveness can
overcome these challenges (e.g., Gordon et al., 2019; Gosnell et al., 2020). This research was developed in
collaboration with a partner referred to as Alpha, a tech company that sells customer segmentation
technology to digital businesses.7 The partner carried out a testing phase on a beauty products e-commerce site via RCT. Specifically, it randomized two different types of treatments: unconditional
discounts on purchases and discounts conditional on a minimum purchase threshold. The experimental
treatments prove salient to e-commerce buyers. This paper estimates a structural demand model and
uses a pricing policy model to investigate the optimal e-commerce pricing strategy. Both models take
advantage of the exogenous variation in pricing treatments induced by the RCT to estimate the demand
primitives and the treatments’ causal effects on demand. By simulating different pricing algorithms, the
counterfactual analysis shows that uniform pricing is suboptimal. In comparison, behavioral price
discrimination increases welfare on both sides of the market, as elastic users benefit from price discounts
while targeted treatments stimulate demand and, consequently, profits.
The estimated demand model uses a discrete-continuous choice framework (Lee & Allenby, 2014).
Users enter the e-commerce platform, randomly receive a discount treatment, and choose how much to
spend, or not to purchase at all. The demand model features utility-maximizing consumers, who allocate
their exogenous endowment between the e-commerce aggregate purchase and the outside option (Kim et
al., 2023). Demand heterogeneity is accounted for by adding an idiosyncratic shock to the quasi-linear
utility function. The identification strategy leverages concavity assumptions to define a sufficient
condition of optimality, which yields a lower and upper bound on the support of the error distribution for
the empirical demand. Since the support thresholds are nonlinear in the demand parameters, variation in
the distribution of e-commerce sales and randomized treatments identifies preferences. The estimation of
demand primitives reveals that discounts are salient to customers (e.g., Esteves & Reggiani, 2014; Hitsch
et al., 2021). Indeed, the model shows that consumers who do not benefit from the discount are more responsive
to price changes of e-commerce goods. Furthermore, demand elasticity increases in the magnitude of
discounts, and depends on the interaction between discount percentages and minimum purchase levels.
On the supply side, a pricing policy models how the monopolistic policy-maker develops its optimal
pricing strategy (Miller & Hosanagar, 2020). The problem of the e-commerce firm is to select, conditional
on individual-specific information, the combination of discount and minimum purchase threshold to
assign to each customer in order to maximize expected profit. Each pricing treatment has a causal
impact on both the probability of conversion (extensive margin) and the expected spending (intensive
margin). A treatment is defined as optimal if its causal effect on expected revenue dominates its
expected cost. Among the treatments satisfying this condition, the policy-maker assigns to each
user the treatment that maximizes the product of the extensive and intensive margins, weighted
by the treatment. For a finite set of treatments, this assignment criterion produces a personalized treatment
algorithm. Analytically, heterogeneous average treatment effects on conversion rate and online spending
are estimated via supervised machine learning. The counterfactual analysis introduces alternative pricing
policies, whose assignment rules differ from that of the optimal treatment policy. Policy evaluation compares
7Alpha is a fictitious name of the company in this research partnership. Alpha is a tech company based in Silicon Valley with
offices in Europe, Asia, South America and Oceania. It was reported that Alpha generates an estimated annual revenue of $9M.
counterfactual policies through a consistent estimator of profit (e.g., Hitsch et al., 2023; Smith et al., 2023).
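As an illustration of the assignment criterion described for the pricing policy, the following Python sketch picks, for one user, the treatment with the largest expected net revenue, computed as conversion probability times expected spending times the share of the price retained after the discount. This is only one possible reading of "weighted by the treatment". The response functions `conv_prob` and `exp_spend`, the user dictionaries and all numbers are hypothetical stand-ins for the machine-learning estimates, not the implementation used in this research.

```python
from typing import Callable, Dict, Iterable, Tuple

Treatment = Tuple[float, float]  # (discount share d_t, minimum purchase threshold MPT_t)
User = Dict[str, float]          # hypothetical user covariates


def expected_net_revenue(user: User, t: Treatment,
                         conv_prob: Callable, exp_spend: Callable) -> float:
    """Conversion probability x expected gross order x share of the price kept."""
    d, _mpt = t
    return conv_prob(user, t) * exp_spend(user, t) * (1.0 - d)


def assign_treatment(user: User, treatments: Iterable[Treatment],
                     conv_prob: Callable, exp_spend: Callable,
                     control: Treatment = (0.0, 0.0)) -> Treatment:
    """Return the treatment with the highest expected net revenue.

    The control (no discount) is kept unless a treatment strictly beats it,
    so a discount is offered only when its revenue effect covers its cost.
    """
    best = control
    best_value = expected_net_revenue(user, control, conv_prob, exp_spend)
    for t in treatments:
        value = expected_net_revenue(user, t, conv_prob, exp_spend)
        if value > best_value:
            best, best_value = t, value
    return best


# Hypothetical response functions (assumptions, not estimates from the data).
def conv_prob(user: User, t: Treatment) -> float:
    d, _mpt = t
    return min(1.0, user["base_conv"] + user["discount_sensitivity"] * d)


def exp_spend(user: User, t: Treatment) -> float:
    return user["avg_order"]


grid = [(0.10, 0.0), (0.20, 0.0)]
elastic_user = {"base_conv": 0.05, "discount_sensitivity": 0.50, "avg_order": 40.0}
inelastic_user = {"base_conv": 0.10, "discount_sensitivity": 0.05, "avg_order": 40.0}

print(assign_treatment(elastic_user, grid, conv_prob, exp_spend))
print(assign_treatment(inelastic_user, grid, conv_prob, exp_spend))
```

In this toy setting the elastic user is assigned the largest discount, while the inelastic user is left at the undiscounted price, which is the kind of heterogeneity a personalized treatment algorithm exploits.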
Information about users’ past purchase history helps to enhance the efficiency of discount targeting
algorithms. The behavioral analysis decomposes demand between new users, individuals who have never
visited the e-commerce site before, and returning users, who have already browsed the platform (Chu et
al., 2019). Returning users are more profitable than new users: the optimal pricing policy increases the
expected profit of returning users by 43%, whereas new users exhibit a 29% increase. Nevertheless, new
users’ demand appears to be elastic at both the extensive and intensive margins, while returning users
are very inelastic at the intensive margin. Such heterogeneity between the two demand segments can be
exploited by the policy-maker. By targeting only the causal impact on the extensive margin, a conversion-targeted policy generates an additional 11% profit gain compared to the aggregate optimal counterfactual
for returning users’ market share (e.g., Shiller, 2020). Thus, an e-commerce firm with monopolistic access to
users’ information has incentives to discriminate its pricing policy based on customers’ history.
To conduct the welfare analysis, the demand of online users is combined with the pricing policy of
the e-commerce firm, which makes it possible to quantify producer and consumer surplus. Simulated demands
under counterfactual policies are associated with different distributions of both profit and indirect utility.
Relative to uniform pricing, the optimal treatment policy increases e-commerce profit by 24.16%. This
algorithm accounts for demand heterogeneity by assigning to each user the treatment with the greatest
causal impact on both the extensive and intensive margin. Consumers also benefit from targeted price
discounts as, at the aggregate level, consumer surplus rises by 4.15%. While users do not benefit from any
discount under uniform pricing, users who receive the treatment access the e-commerce goods’ basket at
a discounted price, contributing to the overall increase in consumer surplus (Gehrig et al., 2012; Carroni,
2018; Esteves & Shuai, 2023). If the e-commerce firm could observe users’ purchasing history, it would be
profit-increasing for the monopolist to integrate this information within its pricing algorithm. Unlike
aggregate price discrimination, which is achieved through the optimal treatment policy, behavior-based
price discrimination involves implementing different treatment policies for different segments of past
purchasing history. First-time users would still be assigned to the optimal treatment policy, which
remains the profit-maximizing policy and generates an 18.30% increase in producer surplus. Conversely,
returning users would be allocated to the conversion-targeted policy, which further drives an increase in
e-commerce profitability by 11.21% (+28.50%), relative to the optimal treatment policy (+25.63%).
Since the consumer surplus of returning users increases more (+4.89%) than that of new users (+3.65%),
loyal customers benefit more from the optimal treatment policy. Moreover, behavior-based price discrimination would not harm them. Switching from the optimal treatment policy to the conversion-targeted policy
neither decreases nor increases returning consumers’ welfare. Thus, by complementing personalized
discounts with users’ history information, behavioral price discrimination increases total welfare, particularly for users with past purchase experience (e.g., Colombo, 2018). In this sense, BBPD effectively
serves as a market segmentation mechanism: the e-commerce firm takes advantage of differences in demand elasticity revealed by past purchase information to assign history-specific pricing algorithms to different
consumer segments. From a theoretical perspective, behavior-based price discrimination corresponds
to third-degree price discrimination (e.g., Cowan, 2012; Bergemann et al., 2022). However, different consumer groups are not charged different prices, but rather are assigned to different pricing algorithms.
Within the same segment and for the same, corresponding pricing policy, two users with different covariates might receive different discount treatments. This mechanism is thus closer to imperfect first-degree
price discrimination, which, at least theoretically, converges to perfect price discrimination as the set of
covariates increases and the treatment grid expands (e.g., Shiller et al., 2013).8
This paper contributes to the development of a recent empirical literature focusing on targeted pricing
in digital markets. It expands the research on personalized marketing by investigating how past purchase records affect the pricing strategy of an e-commerce firm with monopolistic access to users’ information.9
Recent empirical works on personalization have developed new quantitative research methods through
advances in machine learning (ML). Supervised ML has contributed to a growing literature on targeting
implementation and pricing policy evaluation. This study contributes to an emerging marketing literature which combines ML techniques with randomized field experiments. Simester et al. (2020) clarify
the advantage of using “randomized-by-action” instead of “randomized-by-policy” experiments. Since
a randomized-by-action design allows the evaluation of new policies without additional experiments, an
off-policy approach is preferable for evaluating alternative policies that are not observable in the data.
Random assignment of policy treatments combined with off-policy profit evaluation has also been employed in recent publications (e.g., Yoganarasimhan et al., 2020; Liu, 2023; Hitsch et al., 2023; Dubé &
Misra, 2023).
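To make the off-policy idea concrete, the sketch below evaluates a counterfactual policy on data from a randomized-by-action experiment using inverse-propensity weighting, one standard estimator in this literature. Whether this is the exact estimator adopted in this research is not stated here, so it should be read as an assumption. The session records, the `policy` functions and the propensities are made-up toy values.

```python
from typing import Callable, Dict, List


def ipw_profit(sessions: List[dict],
               policy: Callable[[str], str],
               propensity: Dict[str, float]) -> float:
    """Inverse-propensity-weighted estimate of mean profit per session.

    A session contributes only when the treatment it was randomly assigned
    coincides with the treatment the candidate policy would have chosen;
    its profit is then scaled up by 1 / P(assigned treatment).
    """
    total = 0.0
    for s in sessions:
        if policy(s["segment"]) == s["treatment"]:
            total += s["profit"] / propensity[s["treatment"]]
    return total / len(sessions)


# Toy experiment: control "C" and semi-treatment "ST", each assigned with probability 0.5.
propensity = {"C": 0.5, "ST": 0.5}
sessions = [
    {"segment": "new", "treatment": "C", "profit": 0.0},
    {"segment": "new", "treatment": "ST", "profit": 2.0},
    {"segment": "returning", "treatment": "C", "profit": 4.0},
    {"segment": "returning", "treatment": "ST", "profit": 3.0},
]


def uniform_policy(segment: str) -> str:
    return "C"  # nobody receives a discount


def history_policy(segment: str) -> str:
    return "ST" if segment == "new" else "C"  # discount new users only


print(ipw_profit(sessions, uniform_policy, propensity))
print(ipw_profit(sessions, history_policy, propensity))
```

Because the assignment probabilities are fixed by the experimenter, no additional experiment is needed to compare the two policies: the same randomized sessions are simply reweighted.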
Notably, Dubé and Misra (2023) is the first paper to investigate the welfare implications of personalized
pricing through scalable machine learning tools. Using data from a pricing experiment, they estimate
a demand model and build a targeted pricing policy to quantify expected profits with personalized
prices, relative to a uniform pricing benchmark. This paper is complementary to Dubé and Misra
(2023) in the welfare analysis, although there are considerable differences. First, since only new, prospective
customers are included in their samples, Dubé and Misra (2023) study the welfare implications of
personalized pricing, but do not explore how consumer purchase information affects pricing strategies.
This paper contributes by specifically studying behavior-based price discrimination. Second, while
the controlled experiments in Dubé and Misra (2023) randomize the monthly fee for an employer-employee
matching online platform, this paper leverages variation in percentage discounts on e-commerce checkouts, providing a more general framework for digital retailers. Furthermore, the focus on business-to-client transactions supports the external validity of this work, compared to Dubé and Misra (2023),
which concerns only business-to-business decisions. Lastly, both papers measure consumer welfare
by simulating pricing policies with machine learning. However, Dubé and Misra (2023) estimate a logit
demand model using a weighted likelihood bootstrap algorithm and then calibrate the personalized
pricing policy. Instead, this paper first develops a structural model of consumer demand which accounts
for baskets of goods, as is typical of e-commerce transactions, rather than single purchases. Pricing policies
8As argued in Dubé and Misra (2023), “Statistical uncertainty typically limits the segmentation to an imperfect form of targetability.
The approximation is also typically closer under unit demand since personalization typically cannot target a different price to each
inframarginal unit purchased by a consumer”.
9A summary by Arora et al. (2008) presents the main research challenges and the literature gaps related to both firms and
customers behavior in personalization contexts. More recently, through a comprehensive review of personalized marketing
literature, Chandra et al. (2022) defines personalization as “a strategy to gain a competitive advantage, encompassing learning,
matching, and delivering products and services to customers”. The authors argue that personalization per se aims at improving
customer satisfaction, thereby inducing loyalty. In this sense, personalized marketing can improve customer welfare.
are simulated through ML and policy evaluation adopts a consistent estimator of expected profit. At the
current stage of research, this paper is the first to explore behavior-based price discrimination by combining
field experimental data, theoretical models and ML estimation methods. The results have important
implications for data privacy policy and the regulation of digital retail markets.
This paper is structured as follows. Section 1.2 presents the partner company and data. Section
1.3 outlines the field experiment and discusses the empirical analysis. Sections 1.4 and 1.5 develop the
demand model and the pricing policy model, respectively. Section 1.6 implements the welfare analysis
and examines the welfare consequences of behavior-based price discrimination. Section 1.7 concludes
the paper.
1.2 Partner Company and Data
Alpha is an international tech company based in Silicon Valley which implements market segmentation
technology to improve e-commerce performance. Its patented software analyzes real-time consumer
behavior to deliver personalized content with the goal of optimizing targeted business outcomes (Figure A.2). The businesses of Alpha’s customers vary considerably, ranging from electronics companies and car
manufacturers to fashion websites. This paper focuses on the case study between Alpha and a cosmetics
brand. Alpha ran several randomized controlled trials on its client’s e-commerce site, and access to the RCT
dataset was provided under a confidentiality agreement.10
Field experiment. In the experiment each session is randomly assigned to the control or the treatment
group. The RCT treatment t is a promotional pop-up that appears on the screen of the user’s device, with
two components: a percentage discount $d_t$ on the gross order and a minimum purchase threshold $MPT_t$.
The customer gets the discount on the e-commerce order only if the amount spent is at least equal to the
minimum threshold. Combinations of $d_t$ and $MPT_t$ define the control and two treatment groups: control C
if $d_t = 0$ and $MPT_t = 0$ (Figure A.3a), semi-treatment ST if $d_t > 0$ and $MPT_t = 0$ (Figure A.3b) and
full-treatment FT if $d_t > 0$ and $MPT_t > 0$ (Figure A.3c). Control approximates uniform pricing, while
semi- and full-treatment resemble linear and nonlinear pricing schemes. By construction, discounts $d_t$
and purchase levels $MPT_t$ are orthogonal and discrete.11 The propensity scores of control, semi-treatment
and full-treatment are 15%, 50% and 35%, respectively.
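The grouping rule above can be sketched in a few lines of Python. The grid values are those reported in footnote 11; the function name `treatment_group` and the label "undefined" for cells with a threshold but no discount (a combination this section does not describe) are assumptions made for illustration.

```python
from collections import Counter

DISCOUNTS = (0, 5, 10, 15, 20)     # percentage discount (footnote 11)
THRESHOLDS = (0, 15, 25, 35, 45)   # minimum purchase threshold (footnote 11)


def treatment_group(discount: float, threshold: float) -> str:
    """Map a (discount, threshold) pair to control, semi- or full-treatment."""
    if discount == 0 and threshold == 0:
        return "C"    # control: no discount, no threshold
    if discount > 0 and threshold == 0:
        return "ST"   # semi-treatment: unconditional discount
    if discount > 0 and threshold > 0:
        return "FT"   # full-treatment: discount conditional on a minimum purchase
    return "undefined"  # threshold without discount: not described in the text


# Count how many cells of the 5x5 grid fall into each group.
cells = Counter(treatment_group(d, m) for d in DISCOUNTS for m in THRESHOLDS)
print(cells)
```

Note that the number of grid cells per group is unrelated to the propensity scores, which are set separately by the experimenter.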
Collaborators at Alpha disclosed relevant information regarding the RCT experimental features. First,
the size of $d_t$ depends on the product category. For beauty items whose price varies from €10 to €40, the
discount percentage was capped at 20% to contain the cost of the experiment. Second, the magnitude
of $MPT_t$ is associated with the distribution of cart values when the customer reaches the cart page
for the first time, and also with the gross order $q_i$. Once the treatment is received, it is possible, but
unlikely, that the minimum purchase is lower than the cart value. This information is important to
10Reciprocal collaboration was signed by means of mutual nondisclosure agreement (MNDA), which considers confidential
information all the information regarding Alpha’s identity, products, services, customers, business, finances, and IT infrastructure. In the MNDA, purposes to use Alpha confidential information are stated as potential mutual business opportunities.
11The treatment grid consists of a 5×5 matrix such that the percentage discount $d_t$ ∈ {0, 5, 10, 15, 20} and the minimum
purchase threshold $MPT_t$ ∈ {0, 15, 25, 35, 45}.
ensure that the demand constraint imposed by $MPT_t$ is actually binding when the user learns about it.
Moreover, the probabilities of assignment to each treatment group are specified before the experiment,
thereby providing experimental control for weighting estimators in the context of causal inference.
Finally, to control for potential learning opportunities and strategic behavior of returning users, RCT
sampling excludes users who have already taken part in the experiment.
Conversion funnel. The shopping experience of each user, from browsing the e-commerce site to
ultimately converting the session into a purchase, can be divided into several steps. The online session
is the unit of analysis in a conversion funnel of three stages.12 Each session begins as user i logs in to the site.
The identification of different users relies on the IP address of any device connected to the e-commerce
computer network. As the session begins, treatment t is randomly assigned to user i. By the time user i
browses to the cart page for the first time, both the discount $d_t$ and the purchase threshold $MPT_t$ are received in
the form of a promotional pop-up. The discount is automatically added to the order summary page. Upon
reaching the check-out page, user i pays $(1 - d_t)\,q_i$ if the gross order $q_i$ is equal to or greater than $MPT_t$;
otherwise user i gets no discount. Screenshots of the conversion funnel from a representative session are
presented in the Online Appendix (Figure A.4).
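The check-out rule can be illustrated with a short Python sketch. The function name `amount_paid` and the example orders are hypothetical; the discount is entered in percentage points, as in the treatment grid.

```python
def amount_paid(gross_order: float, discount_pct: float, threshold: float) -> float:
    """Amount paid at check-out under a (discount, minimum purchase) treatment.

    The discount applies only when the gross order reaches the minimum
    purchase threshold; a threshold of 0 makes the discount unconditional.
    """
    if discount_pct > 0 and gross_order >= threshold:
        return gross_order * (1 - discount_pct / 100)
    return gross_order


# Full-treatment with a 10% discount and a EUR 35 threshold.
print(amount_paid(40.0, 10, 35))  # order above the threshold: discount applies
print(amount_paid(30.0, 10, 35))  # order below the threshold: full price
# Semi-treatment: same discount, no threshold.
print(amount_paid(30.0, 10, 0))
```

The kink at the threshold is what makes the full-treatment a nonlinear pricing scheme: a user just below the threshold may pay less by adding items to the cart.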
Data. The representative sample counts approximately 5 million sessions that occurred over 13 months,
between January 1st, 2022 and January 31st, 2023. Since the RCT was conducted online, sessions took place
in many countries with an 80% predominance of European countries such as France, Italy, Spain, Portugal,
United Kingdom, Germany and the Netherlands. For each observation, variables about characteristics
of user, treatment, session, history, technology, time and location are available. Data analysis focuses on
both the probability of purchase (extensive margin) and the gross order (intensive margin). On average,
the attainment rate of cart page is 30% and conversion rate is 6%. Conditional on converting the session,
the average gross order is €45.42 with an average of 5 products and 29 minutes per session. Users with at
least one previous visit (returning users) are 26%, new users 74%. Summary statistics between treatment
groups and control groups are balanced, as shown in Table A.1. Section A.2 presents the dataset in detail.
1.3 Field Experiment
Behavior-based price discrimination is the ability of the firm to offer different prices to consumers with
different past purchase histories (Fudenberg & Villas-Boas, 2006). Because of its information advantage
about customers, the monopolistic firm is able to set different prices to increase profit. The implicit
rationale of BBPD lies in the assumption that users with different purchase histories actually exhibit
different demand elasticities. This chapter investigates this hypothesis through reduced-form analysis
(Section 1.3.1) and causal inference (Section 1.3.2).
12Marketing literature (e.g., Seiler & Yao, 2017) usually divides the conversion funnel into four levels: awareness, interest, desire,
and action. Given the descriptive purpose of this framework and the dataset characteristics, the three steps leading to purchase
are properly described by awareness, interest and action. In this setting, user i attains the first stage once the cart page is visited
for the first time. Then, the second stage is reached as user i selects a gross order above the minimum purchase threshold
$MPT_t$. At last, by converting the session, user i reaches the last level.
1.3.1 Descriptive Evidence
In the representative sample, there are about three new users (74%) for every returning user (26%). However,
at the end of the conversion funnel, this ratio becomes roughly one to one: among customers who converted the
session, 54% are new users and 46% are loyal users. Furthermore, new users spend on average €42 per
session, compared to the €49 average expense of loyal customers. In relative terms, returning individuals
appear to be more willing to purchase, and to spend more, than most new users (Figure 1.1).
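The shift from a three-to-one to a roughly one-to-one ratio implies very different conversion rates for the two segments. The following sketch backs them out with Bayes' rule, under the assumption that the 6% average conversion rate reported in the data description applies to the same sample as the segment shares; the function name is illustrative.

```python
def segment_conversion_rate(overall_rate: float,
                            share_of_buyers: float,
                            share_of_sessions: float) -> float:
    """Bayes' rule: P(convert | segment) = P(convert) * P(segment | convert) / P(segment)."""
    return overall_rate * share_of_buyers / share_of_sessions


overall = 0.06  # average conversion rate (assumed to refer to the same sample)
new_rate = segment_conversion_rate(overall, share_of_buyers=0.54, share_of_sessions=0.74)
returning_rate = segment_conversion_rate(overall, share_of_buyers=0.46, share_of_sessions=0.26)
ratio = returning_rate / new_rate

print(f"new users: {new_rate:.3f}, returning users: {returning_rate:.3f}, ratio: {ratio:.2f}")
```

Under these assumptions new users convert in roughly 4.4% of sessions and returning users in roughly 10.6%, so a returning user is about 2.4 times as likely to buy.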
[Figure: histogram of the number of sessions (vertical axis) by gross order in EUR (horizontal axis, 0 to 100), with series for aggregate demand, new users and returning users; marked means: 45.4 (aggregate), 42.1 (new users), 49.3 (returning users).]
Figure 1.1: Gross order distribution
Note: The histogram displays the number of converted sessions by gross order. Demand counts 271,918 converted sessions.
The plot includes only sessions which were converted with a gross order below €100. The black histogram depicts the aggregate
demand, which has 261,157 sessions with a €45.4 mean. Of the aggregate demand, 46% are returning users (red plot, €49.3
mean), while 54% are new users (blue plot, €42.1 mean). The plotted densities are computed using a Gaussian kernel smoother.
The following regression tests the association between purchase histories and conversion probability:
$Y_i = \beta_0 + x_i\beta + \gamma N_i + \delta H_i + \varepsilon_i$, (1.3.1)
where i indexes the individual (and session), $N_i$ is the new user dummy, $H_i$ is the purchase history
dummy and $Y_i$ is the conversion dummy.13 Table A.6 displays the estimated coefficients for control, semi-treatment and full-treatment groups. Each column includes additional controls in $x_i$, which account for
the characteristics of user, session, history, technology, time and location. For returning users with no
past purchases, the conversion likelihood is around 10%. Relative to this group, the estimates indicate
that customers with purchase history have between 60% and 92% higher odds of buying. Conversely,
new users are associated with a reduction in the purchase probability between 10% and 13%. The statistically
significant correlation highlights demand heterogeneity among these groups at the extensive margin.
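To give a sense of magnitudes, the sketch below converts a change in odds into a change in probability, starting from the roughly 10% conversion likelihood of returning users without past purchases. It assumes that the reported 60% and 92% figures are multiplicative changes in odds, as the wording suggests; the function name is illustrative.

```python
def shift_probability(baseline_prob: float, odds_increase: float) -> float:
    """Probability implied by raising the baseline odds by `odds_increase` (0.60 = +60%)."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * (1 + odds_increase)
    return new_odds / (1 + new_odds)


baseline = 0.10  # returning users with no past purchases
low = shift_probability(baseline, 0.60)
high = shift_probability(baseline, 0.92)
print(f"implied conversion probability: {low:.3f} to {high:.3f}")
```

Under this reading, customers with a purchase history would convert in roughly 15% to 18% of sessions, against 10% for the reference group.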
Is heterogeneity of purchase histories also correlated with the intensive margin of demand? This linear specification examines whether purchase decisions are associated, on average, with a higher gross order $q_i$:
$\log(q_i) = \beta_0 + x_i\beta + \gamma N_i + \delta H_i + \varepsilon_i$, (1.3.2)
13 The new user dummy $N_i = 1$ if the user has no history, namely it is the first session on the e-commerce website. For a returning user,
$N_i = 0$; the purchase history dummy $H_i = 0$ if the user has never purchased before, and $H_i = 1$ if the user has converted a session at least once.
where $x_i$ is as in (1.3.1). Coefficients in Table A.7 should be interpreted in terms of percentage variation
of gross order. For both control and treatment, how much is spent on the e-commerce site is correlated with
the demand segments defined by new and returning users. In particular, new users are associated
with a decrease in order magnitude between €2.2 and €3.1 (3-5% of the aggregate average) compared to
returning customers. Figure 1.2 compares the spending distributions of new and loyal users, for both semi- and full-treatment. On the one hand, new and returning users show a similar, higher spending pattern
relative to the control group when they receive unconstrained discounts (Figure 1.2a). On the other
hand, new customers show a greater propensity to spend than loyal clients in the case of conditional
discounts (Figure 1.2b). On the whole, new and returning customers seem to differ in terms of the elasticity
of aggregate demand to the discount $d_t$ and the minimum purchase level $MPT_t$.
1.3.2 Causal Inference
RCTs are experiments where the assignment mechanism is independent of individual characteristics,
and the experimenter controls the propensity score, namely the probability of being assigned to
the treatment. By the time users in the treatment group browse to the cart page for the first time
during the session, they receive a promotional pop-up that appears on their device screen. The treatment
varies along two dimensions, the percentage discount $d_t$ on e-commerce spending and the minimum
purchase threshold $MPT_t$. Since RCTs have the potential to establish a cause-effect relationship between
the treatment and the targeted outcome, this analysis aims at assessing the impact of discount treatments
on the extensive and intensive margins of demand.14
[Figure: two histograms of the number of sessions (vertical axis) by gross order q (horizontal axis, 0 to 100), each with series for control, returning users and new users. Panel (a): Semi-Treatment. Panel (b): Full-Treatment.]
Figure 1.2: Gross order distribution per treatment
Note: The panels show the distributions of converted sessions for semi-treatment and full-treatment, for returning and new
users. Sessions with gross order greater than €100 are not depicted. The black histogram shows the control group demand (39,552
sessions with €40.00 mean and €17.36 st.dev.). (a) Semi-treatment counts 134,274 sessions: 47% of returning users (red plot,
€39.88 mean) and 53% of new users (blue plot, €37.75 mean). (b) Full-treatment has 87,331 sessions: 43% of loyal users (red
plot, €40.40 mean) and 57% of new users (blue plot, €38.02 mean). The plotted densities are computed using a Gaussian kernel
smoother. Figures A.5 and A.6 provide further graphical intuition.
14For a comprehensive review of field experiments and randomized controlled trials, refer to Athey and Imbens (2017).
For a presentation of program evaluation, ITT, ATE and LATE, see Angrist and Imbens (1995) or Rubin (2005).
The potential of RCTs as a causal inference instrument relies on the randomization of treatment assignment. Differences in the probability of conversion between groups can be causally attributed, on average,
to the assignment of treatments. The average treatment effect (ATE) measures the average causal effect of
treatment and is estimated in the following specifications:
$Y_i = \beta_0 + \tau T_i + \varepsilon_i$, (1.3.3)
$C_i = \beta_0 + \tau T_i + \varepsilon_i$, (1.3.4)
where $Y_i$ is the conversion dummy, $C_i$ is the compliance dummy (i.e., $C_i = \mathbb{1}\{q_i \ge MPT_t\}$) and $T_i$
is the treatment assignment dummy. Figures 1.3a and 1.3b represent the ATE estimates of Equations
1.3.3 and 1.3.4 respectively, at the aggregate level and for new and returning customers. On the one
hand, Figure 1.3a compares levels of the discount percentage $d_t$ with the probability of purchase. Since
the ATE estimates exhibit an upward pattern, higher discounts increase the conversion rate. Conditional on
purchase history, loyal users are more responsive to discounts, with a conversion probability between
16 and 26%. The greater the discount $d_t$, the greater the difference in ATEs between new and returning
clients. On the other hand, Figure 1.3b relates the minimum spending level $MPT_t$ required to benefit from the discount
to the compliance rate, namely the probability that the user’s aggregate order is greater than $MPT_t$.
Bulk discounts involve a compliance cost that needs to be borne in order to benefit from the price reduction.
At the aggregate level, the greater the demand constraint $MPT_t$ to access the discount, the lower the
compliance rate. Moreover, loyal users and new users exhibit different patterns. As new users spend
less on average, their demand $q_i$ is also less likely to exceed the treatment minimum purchase
threshold.
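With a single binary treatment dummy, the OLS slope in a regression of the outcome on the dummy equals the difference in mean outcomes between treated and control sessions. The sketch below computes that difference and an unequal-variance standard error on a made-up sample of eight sessions; the data and the function name are purely illustrative and are not drawn from the experiment.

```python
from math import sqrt
from statistics import mean, variance


def ate_difference_in_means(outcomes, treated):
    """ATE estimate as the treated-minus-control difference in mean outcomes.

    Returns the estimate and its standard error, computed from the
    sample variances of the two groups.
    """
    y1 = [y for y, t in zip(outcomes, treated) if t == 1]
    y0 = [y for y, t in zip(outcomes, treated) if t == 0]
    ate = mean(y1) - mean(y0)
    se = sqrt(variance(y1) / len(y1) + variance(y0) / len(y0))
    return ate, se


# Toy conversion dummies: 1 of 4 control sessions and 2 of 4 treated sessions convert.
outcomes = [0, 0, 0, 1, 0, 1, 1, 0]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
ate, se = ate_difference_in_means(outcomes, treated)
print(f"ATE = {ate:.2f}, s.e. = {se:.3f}")
```

The same function applies to the compliance dummy; estimating it separately for new and returning users gives segment-level effects of the kind discussed above.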
[Figure: two line plots with series for all users, new users and returning users. Panel (a): ATE on Conversion Probability, plotted against Discount Percentage (5 to 20). Panel (b): ATE on Compliance Probability, plotted against Minimum Purchase Threshold (15 to 45).]
Figure 1.3: Average treatment effect by discount and minimum purchase threshold
Note: Panel (a) reports the estimates of ATE with standard deviation for conversion probability. Panel (b) displays the estimates
of ATE with standard deviation for compliance probability. Black line refers to aggregate demand, blue line to new users and
red line to returning users.
Overall, new users and returning users respond differently to RCT treatments. On average, users
react to discounts by raising their conversion rate, and respond to bulk discounts by reducing their
compliance percentage. Individual characteristics linked to the past history of purchases reveal demand
heterogeneity at both the extensive and intensive margins. Discounts, which affect the extensive
margin of demand, are more salient to loyal users, while spending constraints induced by $MPT_t$, which affect the
intensive margin, prove pivotal for new users. The following chapter introduces the demand model
in which a representative user solves the constrained utility maximization problem with exogenous
treatment. The model is then identified to estimate demand primitives and derive demand elasticity.
1.4 Demand: Elasticity to Exogenous Treatments
Price treatments would be profitable if, as the discount increases or as the minimum purchase constraint decreases, the conversion rate or the e-commerce spending, or both, increased. This chapter first
explores this trade-off between $d_t$ and $MPT_t$ in stimulating demand by providing evidence of correlation between treatment components and demand elasticity at both the extensive and intensive margin
(Section 1.4.1). The analysis proceeds by developing a structural demand model (Section 1.4.2), presenting the identification argument (Section 1.4.3), and estimating the demand parameters and elasticity
(Section 1.4.4).
1.4.1 Reduced-form Analysis
Extensive margin. Does raising the discount necessarily increase the conversion rate? The following logit regression examines the correlation between the discount percentage $d_t$, the minimum purchase threshold $\mathrm{MPT}_t$, and the likelihood of conversion:
\[ Y_i = \beta_0 + x_i \beta_x + \beta_d d_t + \beta_{MPT}\,\mathrm{MPT}_t + \varepsilon_{it}, \tag{1.4.1} \]
where i and t index the individual session and the associated treatment respectively, $x_i$ is as in (1.3.1), and $Y_i$ is the conversion dummy. Since the exogenous treatment consists of either a discount (semi-treatment) or a conditional discount (full-treatment), offering a higher discount percentage does not necessarily imply a greater marginal impact on the conversion rate, as user demand could be inelastic overall or could respond to a decrease in the minimum threshold instead. The estimates show that, on average, a unit increase in the discount is positively associated with the conversion probability for semi-treatment (2%) and full-treatment (3%). Results also indicate that there is no correlation between the conversion rate and the magnitude of minimum purchase thresholds (Table A.8).
Intensive margin. The following linear specification investigates the correlation between e-commerce spending, measured by gross order $q_i$, and both the discount $d_t$ and the purchase threshold $\mathrm{MPT}_t$:
\[ \log(q_i) = \beta_0 + x_i \beta_x + \beta_d d_t + \beta_{MPT}\,\mathrm{MPT}_t + \varepsilon_{it}, \tag{1.4.2} \]
where $x_i$ is as in (1.3.1) and coefficients are interpreted in terms of percentage variation in expenditure. Estimates support the hypothesis that both $d_t$ and $\mathrm{MPT}_t$ are uncorrelated with gross order $q_i$ (Table A.9).
1.4.2 Demand Model
This section extends the general framework developed by Lee and Allenby (2014) to accommodate the dataset of the field experiment run by the partner company on the e-commerce website of one of its clients. The demand model considers the purchase decision of a representative user i, with the goal of estimating the primitives of the demand for e-commerce goods that maximize their utility (Dubé, 2019). The combination $(d_t, \mathrm{MPT}_t)$ specifies the randomized treatment of the RCTs. Both the econometrician and user i observe the exogenous treatment $(d_t, \mathrm{MPT}_t)$. In their choice problem, user i considers good low l and good high h, together with an outside good x which is the numeraire of the economy. Good l is a bundle of e-commerce products defined by the L-vector of products $\sum_{l=1}^{L} p_l z_l$ such that the demand of l, $q_i^l$, is lower than $\mathrm{MPT}_t$ ($\sum_{l=1}^{L} p_l z_l < \mathrm{MPT}_t$). Good h is instead a bundle of e-commerce items defined by the H-vector of products $\sum_{h=1}^{H} p_h z_h$ such that the demand of h, $q_i^h$, is at least equal to $\mathrm{MPT}_t$ ($\sum_{h=1}^{H} p_h z_h \geq \mathrm{MPT}_t$). The price of good l is normalized to 1, while the unit price of good h is $(1 - d_t)$. Preferences are represented by a quasi-linear functional form of direct utility.$^{15}$ This assumption guarantees that the utility function is concave in both goods' demand $q_i^l$ and $q_i^h$, defined over a non-negative domain.$^{16}$ As in models of consumer choice, two further assumptions are introduced: linearity and additive separability. Thus, the utility function is linear in the outside good x and additively separable in the interior goods l and h. Linearity ensures that the value of the outside good x does not satiate, while additive separability simplifies the integration of the likelihood function. The quasi-linear additively separable direct utility is
\[ U(q_i^l, q_i^h, q_i^0) = \frac{\psi_l e^{\varepsilon_l}}{\gamma_l}\ln(\gamma_l q_i^l + 1) + \frac{\psi_h e^{\varepsilon_h}}{\gamma_h}\ln(\gamma_h q_i^h + 1) + \psi_0 q_i^0, \tag{1.4.3} \]
where $q_i^l$, $q_i^h$ and $q_i^0$ represent the quantities of good l, good h and the outside good purchased by user i.$^{17}$ Parameters $\psi_l$ and $\psi_h$ correspond to the baseline marginal utility, that is, the marginal utility at zero consumption of goods l and h, respectively. Therefore, a higher baseline marginal utility $\psi_{l(h)}$ implies a lower likelihood of a corner solution for interior good l (h). The baseline marginal utility of the outside good is captured by $\psi_0$. Additionally, parameters $\gamma_l$ and $\gamma_h$ are satiation parameters. Flexibility in the degree of satiation allows for a greater satiation effect in the consumption of l (h) at higher values of $\gamma_{l(h)}$. Error terms $\varepsilon_l$ and $\varepsilon_h$ are latent idiosyncratic shocks, which are known to user i and unobserved by the econometrician. Good-specific heterogeneity is introduced through normally distributed errors, that is, $\varepsilon_l \sim N(\mu_l, \sigma_l^2)$ and $\varepsilon_h \sim N(\mu_h, \sigma_h^2)$. Errors $\varepsilon_l$ and $\varepsilon_h$ are assumed independent and identically distributed.
User i solves their utility maximization problem subject to a set of constraints. The first corresponds to the non-negativity constraints imposed on purchased quantities. The demand of the outside good $q_i^0$ is assumed strictly positive, implying that the indirect utility of user i is non-negative at the corner solution.
15 Quasi-linear utility functions are linear in one argument, generally the numeraire. Quasi-linear preferences can be represented by the utility function $u(x_1, x_2, \ldots, x_n) = x_1 + v(x_2, \ldots, x_n)$, where $x_1$ is the numeraire commodity and $v$ is strictly concave. In (1.4.3), $v$ is logarithmic, which is increasing and concave.
16 In the model, marginal utility is thus continuous, positive and decreasing in both good l and good h, i.e., $\frac{\partial U}{\partial q_i^{l(h)}} = \frac{\psi_{l(h)} e^{\varepsilon_{l(h)}}}{\gamma_{l(h)} q_i^{l(h)} + 1} > 0$ and $\frac{\partial^2 U}{\partial q_i^{l(h)} \partial q_i^{l(h)}} = -\frac{\psi_{l(h)} \gamma_{l(h)} e^{\varepsilon_{l(h)}}}{(\gamma_{l(h)} q_i^{l(h)} + 1)^2} < 0$.
17 The quasi-linear additively separable utility specification in (1.4.3) corresponds to a linear expenditure system form of the generalized constant elasticity of substitution (CES) utility function. Hanemann (1984) introduces the CES utility functional form $U(x) = \sum_{k=1}^{K} \frac{\psi_k}{\gamma_k \alpha_k}\left[(\gamma_k x_k + 1)^{\alpha_k} - 1\right]$, which converges to (1.4.3) as $\alpha_k$ approaches zero. Bhat (2008) shows that, for a given $\psi_k$, a subutility function profile based on a combination of $\gamma_k$ and $\alpha_k$ values can be closely approximated by a subutility function based exclusively on $\gamma_k$ or $\alpha_k$ values.
The demand of good l (h) is non-negative to allow for corner solutions in case of no purchase by user i:
\[ q_i^0 > 0, \quad q_i^l \geq 0, \quad q_i^h \geq 0. \tag{1.4.4} \]
By definition of interior good low l and good high h, the treatment constraint reflects how the demand of good l (h) is defined with respect to the exogenous minimum purchase threshold $\mathrm{MPT}_t$:
\[ q_i^l < \mathrm{MPT}_t, \quad q_i^h \geq \mathrm{MPT}_t. \tag{1.4.5} \]
Because of the demand constraint imposed by $\mathrm{MPT}_t$, interior goods l and h are perfect substitutes in case of no corner solution. A strictly positive demand $q_i^l$ implies no demand of good h and vice versa:
\[ q_i^l \cdot q_i^h = 0. \tag{1.4.6} \]
A further constraint comes from the assumption of discreteness of the interior goods' demand. Purchased quantities $q_i^l$ and $q_i^h$ belong to the set of natural numbers $\mathbb{N}_0$ capped at Q, where $Q = E_i - 1$. Such upper bound guarantees non-negative demand for the outside option:
\[ q_i^l, q_i^h \in \{0, 1, 2, \ldots, Q\}. \tag{1.4.7} \]
The last constraint specifies the linear combination of the purchased quantities of good l, good h and the numeraire of user i, weighted by their unit prices. The budget constraint imposes that the exogenous endowment $E_i$ is allocated between the outside good, good l with unit price normalized to 1, and good h priced at the discounted unit price $(1 - d_t)$:$^{18}$
\[ q_i^l + (1 - d_t)\, q_i^h + q_i^0 \leq E_i. \tag{1.4.8} \]
Ultimately, user i takes two decisions. First, they decide whether to convert the session or not. If they do not, then $q_i^l = 0$, $q_i^h = 0$ and trivially $q_i^0 = E_i$. Otherwise, if they buy, they select either $q_i^l < \mathrm{MPT}_t$ or $q_i^h \geq \mathrm{MPT}_t$. Then, $q_i^l > 0$ and $q_i^0 = E_i - q_i^l$, or $q_i^h > 0$ and $q_i^0 = E_i - (1 - d_t)\, q_i^h$, given treatment $(d_t, \mathrm{MPT}_t)$. Thus, three demand equilibria arise: No Conversion (the corner solution, i.e., $q_i^l = 0$, $q_i^h = 0$, $q_i^0 = E_i$), Low Conversion (i.e., $q_i^l > 0$, $q_i^h = 0$, $q_i^0 = E_i - q_i^l$) and High Conversion (i.e., $q_i^l = 0$, $q_i^h \geq \mathrm{MPT}_t$, $q_i^0 = E_i - (1 - d_t)\, q_i^h$). Figure A.7 graphically depicts the three equilibria conditioned on treatments. The next section presents the utility maximization problem, derives the sufficient conditions of the identification strategy, and estimates demand primitives and elasticity.
Utility Maximization Problem
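The utility maximization problem of user i is represented by a constrained optimization linear programming problem. The quasi-linear direct utility in (1.4.3) defines the objective function, while (1.4.4), (1.4.5), (1.4.6), (1.4.7) and (1.4.8) include both equality and inequality constraints. The purchase decision of user i is therefore formulated as
\[
\begin{aligned}
\max_{(q_i^l,\, q_i^h,\, q_i^0)} \quad & U(q_i^l, q_i^h, q_i^0) = \frac{\psi_l e^{\varepsilon_l}}{\gamma_l}\ln(\gamma_l q_i^l + 1) + \frac{\psi_h e^{\varepsilon_h}}{\gamma_h}\ln(\gamma_h q_i^h + 1) + \psi_0 q_i^0 \\
\text{s.t.} \quad & q_i^l + (1 - d_t)\, q_i^h + q_i^0 \leq E_i, \\
& q_i^l, q_i^h \in \{0, 1, 2, \ldots, Q\}, \\
& q_i^l < \mathrm{MPT}_t, \quad q_i^h \geq \mathrm{MPT}_t, \\
& q_i^l \cdot q_i^h = 0, \\
& q_i^0 > 0.
\end{aligned}
\tag{1.4.9}
\]
18 Because the Marshallian demand of $(x_2, \ldots, x_n)$ for quasi-linear utility functional forms does not depend on the endowment $E_i$, it is not subject to wealth effects; thus no further assumptions are imposed on the exogenous endowment $E_i$.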
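A demand equilibrium is a triplet $\{q_i^l, q_i^h, q_i^0\}^*$ which solves the utility maximization problem (1.4.9). It depends on the exogenous treatment $(d_t, \mathrm{MPT}_t)$, the parameter set $\{\psi_0, \psi_l, \gamma_l, \psi_h, \gamma_h\}$ and the idiosyncratic shock realizations $\varepsilon_l$ and $\varepsilon_h$. At any triplet $\{q_i^l, q_i^h, q_i^0\}^*$, the budget constraint in (1.4.8) is binding because the marginal utility $\psi_0$ of the outside good is strictly positive. By substitution, the constrained optimization problem described in (1.4.9) can be equivalently formulated as an unconstrained linear programming problem as follows
\[
\begin{aligned}
\max_{(q_i^l,\, q_i^h)} \quad & U^*(q_i^l, q_i^h) = \frac{\psi_l e^{\varepsilon_l}}{\gamma_l}\ln(\gamma_l q_i^l + 1) + \frac{\psi_h e^{\varepsilon_h}}{\gamma_h}\ln(\gamma_h q_i^h + 1) + \psi_0\left[E_i - q_i^l - (1 - d_t)\, q_i^h\right] \\
\text{s.t.} \quad & q_i^l, q_i^h \in \{0, 1, 2, \ldots\}, \\
& q_i^l < \mathrm{MPT}_t, \quad q_i^h \geq \mathrm{MPT}_t, \\
& q_i^l \cdot q_i^h = 0, \\
& E_i - q_i^l - (1 - d_t)\, q_i^h \geq 0.
\end{aligned}
\tag{1.4.10}
\]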
Despite their equivalence, the crucial difference between (1.4.9) and (1.4.10) lies in the degrees of freedom of the linear program. The objective function $U$ in (1.4.9) is an unconstrained utility function of goods l and h and the outside good, while $U^*$ in (1.4.10) is a constrained utility function of goods l and h satisfying the budget constraint. The objective function $U^*$ preserves increasing satiation and concavity, which correspond to diminishing marginal utility:
\[ \frac{\partial U^*(q_i^l, q_i^h)}{\partial q_i^{l(h)}} = \frac{\psi_{l(h)} e^{\varepsilon_{l(h)}}}{\gamma_{l(h)} q_i^{l(h)} + 1} - \psi_0\left(1 - d_t \mathbb{1}\{q_i^{l(h)} \geq \mathrm{MPT}_t\}\right) > 0, \tag{1.4.11a} \]
\[ \frac{\partial^2 U^*(q_i^l, q_i^h)}{\partial q_i^{l(h)} \partial q_i^{l(h)}} = -\frac{\psi_{l(h)} \gamma_{l(h)} e^{\varepsilon_{l(h)}}}{(\gamma_{l(h)} q_i^{l(h)} + 1)^2} < 0. \tag{1.4.11b} \]
Given the initial assumptions of additive separability of goods l and h and linearity of the outside good, both (1.4.11a) and (1.4.11b) depend only on the demand of the interior good. This implies that the sufficient conditions to rationalize the observed demand equilibrium $\{q_i^l, q_i^h\}^*$ evaluate each realized purchase of interior goods independently. In addition, the binding mutual exclusivity constraint in (1.4.6) further simplifies the comparison condition and, consequently, the likelihood function can be derived analytically.
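The discrete problem in (1.4.10) can be illustrated with a minimal Python sketch that enumerates the feasible grid and returns the utility-maximizing bundle. The function names, the labels psi, gamma and eps for baseline marginal utility, satiation and the idiosyncratic shock, and all numerical values are assumptions made for this example; it is not the estimation code used in this chapter.

```python
import math


def interior_utility(q, psi, gamma, eps):
    """Log subutility of an interior good: psi * exp(eps) / gamma * ln(gamma * q + 1)."""
    return psi * math.exp(eps) / gamma * math.log(gamma * q + 1)


def solve_demand(low, high, psi_0, d, mpt, endowment):
    """Enumerate the feasible grid and return the best (q_l, q_h, q_0).

    low and high are hypothetical (psi, gamma, eps) tuples for goods l and h.
    Good l is bought at unit price 1 and must stay below mpt;
    good h is bought at unit price (1 - d) and must reach mpt.
    """
    q_max = endowment - 1  # cap that keeps the outside good strictly positive
    # Mutual exclusivity: at most one interior good is purchased.
    grid = [(0, 0)]
    grid += [(q, 0) for q in range(1, min(mpt, q_max + 1))]
    grid += [(0, q) for q in range(max(mpt, 1), q_max + 1)]

    best = None
    for q_l, q_h in grid:
        q_0 = endowment - q_l - (1 - d) * q_h  # binding budget constraint
        if q_0 <= 0:
            continue
        value = (interior_utility(q_l, *low)
                 + interior_utility(q_h, *high)
                 + psi_0 * q_0)
        if best is None or value > best[0]:
            best = (value, q_l, q_h, q_0)
    return best[1:]


# Three illustrative parameter sets, one per equilibrium type.
print(solve_demand((0.5, 1.0, 0.0), (0.5, 1.0, 0.0), 1.0, 0.1, 25, 101))   # No Conversion
print(solve_demand((5.0, 1.0, 0.0), (0.5, 1.0, 0.0), 1.0, 0.1, 25, 101))   # Low Conversion
print(solve_demand((0.5, 1.0, 0.0), (20.0, 1.0, 0.0), 1.0, 0.5, 25, 101))  # High Conversion
```

Enumeration is viable here because mutual exclusivity reduces the grid to at most Q + 1 candidate bundles, so no numerical optimizer is needed for the illustration.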
1.4.3 Identification and Likelihood Function
The identification argument leverages a random utility framework for rationalizing the demand predicted by the constrained utility maximization problem. The introduction of an idiosyncratic shock allows for demand heterogeneity which first order conditions would fail to accommodate. Variation in purchased quantities could potentially be driven by heterogeneity in consumers' preferences, unobservable characteristics of products and exogenous changes in market factors. The error terms included in (1.4.3) account for any unobserved covariates which would affect the demand of goods l and h. The identification strategy takes advantage of the concavity of the utility function $U^*$, which entails the existence of a global maximum at the realized demand $\{q_i^{l*}, q_i^{h*}\}$. Since the utility function $U^*$ is characterized by a unique global maximum in the proximity of the empirical distribution of purchased quantities, concavity guarantees that optimality can be verified by comparing the indirect utility at the realized demand with the counterfactual indirect utility at neighboring discrete grid points.
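Thus, the sufficient condition for the observed demand $\{q_i^{l*}, q_i^{h*}\}$ to solve (1.4.10) derives from the concavity of the utility function $U^*$ at each quantity of any demand equilibrium. By considering a neighboring grid $\Delta \in \{-n, n\}$ of $\{q_i^{l*}, q_i^{h*}\}$ with $n \in \mathbb{N}_0$, the sufficient condition for optimality is
\[ U^*(q_i^{m*}, q_i^{-m*}) > \max\left\{ U^*(q_i^{m*} + \Delta, q_i^{-m*}) \;\middle|\; (q_i^{m*} + \Delta, q_i^{-m*}) \in \mathcal{B} \right\}_{\Delta \in \{-n, n\}} \quad \forall m \in \{l, h\}, \]
where
\[ \mathcal{B} = \left\{ (q_i^m, q_i^{-m}) \;\middle|\; E_i - q_i^m - (1 - d_t)\, q_i^{-m} \geq 0, \;\; q_i^m, q_i^{-m} \in \{0, 1, 2, \ldots, Q\}, \;\; q_i^m \cdot q_i^{-m} = 0 \right\}. \tag{1.4.12} \]
Optimality conditions yield a lower bound $\underline{\varepsilon}_m$ and an upper bound $\overline{\varepsilon}_m$ on the support of the distribution of error $\varepsilon_m$, where $m \in \{l, h\}$, which are consistent with the realized interior goods' demand $\{q_i^{l*}, q_i^{h*}\}$ within the neighboring $(-n, n)$ interval.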
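Proposition 1. Demand equilibrium $\{q_i^{l*}, q_i^{h*}\}$ is defined in the interval $(\underline{\varepsilon}_m, \overline{\varepsilon}_m)$ of the support of the error distribution $\forall m \in \{l, h\}$, where:
\[ \underline{\varepsilon}_m = \ln\left[\psi_0\left(1 - d_t \mathbb{1}\{q_i^{m*} \geq \mathrm{MPT}_t\}\right)\Delta\right] - \ln\left[\frac{\psi_m}{\gamma_m}\ln\frac{\gamma_m q_i^{m*} + 1}{\gamma_m (q_i^{m*} - \Delta) + 1}\right], \tag{1.4.13a} \]
\[ \overline{\varepsilon}_m = \ln\left[\psi_0\left(1 - d_t \mathbb{1}\{q_i^{m*} \geq \mathrm{MPT}_t\}\right)\Delta\right] - \ln\left[\frac{\psi_m}{\gamma_m}\ln\frac{\gamma_m (q_i^{m*} + \Delta) + 1}{\gamma_m q_i^{m*} + 1}\right]. \tag{1.4.13b} \]
Proof. Refer to Section A.1.1 in the Online Appendix.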
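The bounds in Proposition 1 can be evaluated numerically. The following minimal sketch assumes the log utility with a multiplicative shock exp(eps), labels the baseline marginal utility, satiation and numeraire parameters psi, gamma and psi_0, and treats the unit price paid at the observed quantity as an input; the function name, argument names and default values are hypothetical.

```python
import math


def error_bounds(q_star, psi, gamma, psi_0, price, delta=1, q_max=100):
    """Return (lower, upper) bounds on the shock eps that rationalize q_star.

    price is the unit price paid at q_star: 1 for good l, (1 - d) for good h.
    """
    cost = math.log(psi_0 * price * delta)  # log utility cost of delta extra units

    def log_gain(q_from, q_to):
        # Log of the deterministic utility gain from moving q_from -> q_to.
        ratio = (gamma * q_to + 1) / (gamma * q_from + 1)
        return math.log(psi / gamma * math.log(ratio))

    # No lower neighbor below zero: the lower bound is minus infinity.
    lower = -math.inf if q_star - delta < 0 else cost - log_gain(q_star - delta, q_star)
    # No feasible upper neighbor beyond the cap: the upper bound is plus infinity.
    upper = math.inf if q_star + delta > q_max else cost - log_gain(q_star, q_star + delta)
    return lower, upper


lower, upper = error_bounds(q_star=4, psi=5.0, gamma=1.0, psi_0=1.0, price=1.0)
print(lower, upper)  # an interval around zero
```

With these illustrative values the interval contains zero, which agrees with the fact that a quantity of 4 maximizes 5 ln(q + 1) - q over the integers when the shock is zero.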
Parameter $\psi_m$ captures the marginal utility at zero consumption of good m. Its estimate depends on the share of corner solutions in the empirical demand distribution. The lower bound $\underline{\varepsilon}_m$ goes to $-\infty$ for $q_i^{m*} = 0$, whereas for non-zero demand it is inversely proportional to $\psi_m$. This results in a smaller value of $\psi_m$ for a higher proportion of corner solutions: the smaller $\psi_m$ gets, the greater the upper bound $\overline{\varepsilon}_m$ becomes, in order to account for a larger share of zeros. Likewise, the satiation parameter $\gamma_m$ implies a weaker preference, or lower satiation, for good m as $\gamma_m$ decreases. For larger fractions of corner solutions, parameter $\gamma_m$ decreases to allow for a greater upper bound $\overline{\varepsilon}_m$. The relation between the number of no purchases in the empirical distribution and the marginal changes of the preference parameters is corroborated by comparative statics.$^{19}$ The empirical distribution of purchased quantities contributes to the identification of $\gamma_m$. While variation in $\psi_m$ would shift both $\underline{\varepsilon}_m$ and $\overline{\varepsilon}_m$ without modifying the error interval, different magnitudes in the demand entail different likelihoods given a positive estimate for the satiation parameter $\gamma_m$. Within the strictly positive demand distribution, analytical comparative
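19 $\left.\frac{\partial \overline{\varepsilon}_m}{\partial \psi_m}\right|_{q_i^{m*} = 0} = -\frac{1}{\psi_m} < 0$ and $\left.\frac{\partial \overline{\varepsilon}_m}{\partial \gamma_m}\right|_{q_i^{m*} = 0} = -\frac{\Delta}{\gamma_m \Delta + 1}\,\frac{1}{\ln(\gamma_m \Delta + 1)} < 0$.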
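statics show that both thresholds $\underline{\varepsilon}_m$ and $\overline{\varepsilon}_m$ depend on $q_i^{m*}$, but not on $\psi_m$.$^{20}$ However, satiation expands the likelihood interval of error $\varepsilon_m$ in a non-trivial manner, since the sign of $\partial \underline{\varepsilon}_m / \partial \gamma_m$ and $\partial \overline{\varepsilon}_m / \partial \gamma_m$ is not uniquely defined.$^{21}$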
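Given the assumption of independent and identically normally distributed errors $\varepsilon_l$ and $\varepsilon_h$, the likelihood of the empirical demand distribution is computed by integrating the joint density of both errors over the intervals specified in (1.4.13a) and (1.4.13b). For demand parameters $\theta \in \Theta$, where $\Theta = \{\psi_0, \psi_l, \psi_h, \gamma_l, \gamma_h, \mu_l, \sigma_l, \mu_h, \sigma_h\}$, the multinomial likelihood function $L(\Theta)$ is defined as
\[ L(\Theta \mid q_i^{m*}, q_i^{-m*}, d_t, \mathrm{MPT}_t) = \prod_{i=1}^{I} \left[\int_{\underline{\varepsilon}_m}^{\overline{\varepsilon}_m} \frac{1}{\sigma_m}\,\phi\!\left(\frac{\varepsilon_m - \mu_m}{\sigma_m}\right) d\varepsilon_m\right]^{\mathbb{1}_m} \cdot \left[\int_{\underline{\varepsilon}_{-m}}^{\overline{\varepsilon}_{-m}} \frac{1}{\sigma_{-m}}\,\phi\!\left(\frac{\varepsilon_{-m} - \mu_{-m}}{\sigma_{-m}}\right) d\varepsilon_{-m}\right]^{\mathbb{1}_{-m}} \quad \forall m \in \{l, h\}, \tag{1.4.14} \]
where $\mathbb{1}_m$ is the indicator function $\mathbb{1}\{q_i^{m*} \geq 0\}$ incorporating (1.4.6). In case $q_i^{m*} = 0$ the lower bound $\underline{\varepsilon}_m$ converges to $-\infty$. Similarly, the upper bound $\overline{\varepsilon}_m$ approaches $+\infty$ if the upper neighboring grid point $q_i^{m*} + \Delta$ violates the budget constraint, that is, $q_i^{m*} = Q$. In Section A.1.2 treatment-specific log-likelihood functions are derived.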
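Because the errors are normal, each integral over an error interval reduces to a difference of normal distribution functions. A minimal sketch of that building block, using only the Python standard library, is given below; the function names and the sample values for the mean and standard deviation are assumptions chosen for illustration, and the sketch covers a single good rather than the full likelihood in (1.4.14).

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function (accepts +/- infinity)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_probability(lower, upper, mu, sigma):
    """Pr(lower < eps < upper) for eps ~ N(mu, sigma^2)."""
    return normal_cdf((upper - mu) / sigma) - normal_cdf((lower - mu) / sigma)


def log_likelihood(intervals, mu, sigma):
    """Sum of log interval probabilities over observed purchases of one good."""
    return sum(math.log(interval_probability(lo, hi, mu, sigma))
               for lo, hi in intervals)


# A corner solution has an unbounded lower limit; a capped purchase has an
# unbounded upper limit. Both are handled by passing infinities.
print(interval_probability(-math.inf, 0.4, mu=0.4, sigma=1.7))
print(log_likelihood([(-0.11, 0.09), (-math.inf, 0.30)], mu=0.0, sigma=1.0))
```

Summing logs rather than multiplying probabilities keeps the computation numerically stable when the number of sessions is large.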
1.4.4 Estimation of the Demand Model
Demand parameters $\Theta$ are estimated through maximum likelihood estimation (MLE). The goal of maximum likelihood estimation is to find the values of the demand primitives that maximize the log-likelihood function over the parameter space $\Theta$. Therefore, the treatment-specific MLE is
\[ \hat{\theta}_t = \arg\max_{\theta \in \Theta}\; \ell_t\left(\theta \mid q_i^l, q_i^h, d_t, \mathrm{MPT}_t\right) \quad \forall t \in \{C, ST, FT\}, \tag{1.4.15} \]
where $\ell_t$ is the log-likelihood function for each combination of $(d_t, \mathrm{MPT}_t)$. In addition, the parameters of aggregate demand are estimated as follows
\[ \hat{\theta} = \arg\max_{\theta \in \Theta}\; \sum_{t=1}^{T} \Pr(t = \bar{t})\; \ell_t\left(\theta \mid q_i^l, q_i^h, d_t, \mathrm{MPT}_t\right), \tag{1.4.16} \]
where $\bar{t} \in \mathcal{T} = \{C, ST, FT\}$. The identification argument requires assigning a neighboring grid $\Delta \in \{-n, n\}$ with $n \in \mathbb{N}_0$. By default, n is equal to 1.$^{22}$ Table 1.1 reports the estimates of (1.4.15) for control, semi-treatment and full-treatment, and the results of (1.4.16) for aggregate demand. The aggregate estimates provide evidence that pricing treatments impact the preference primitives of consumers.
Users derive more utility from one unit of either good l or good h than from one unit of the numeraire ($\psi_{l(h)} > \psi_0$). At the aggregate level, the marginal utility of good l is almost twice the baseline utility of consuming one unit of the outside good, given the same normalized unit price, while the marginal utility of purchasing one unit of good h at the discounted unit price $(1 - d_t)$ is four times greater. Furthermore,
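20 $\frac{\partial \underline{\varepsilon}_m}{\partial q_i^{m*}} = \frac{\gamma_m^2 \Delta}{[\gamma_m (q_i^{m*} - \Delta) + 1][\gamma_m q_i^{m*} + 1]}\,\frac{1}{\ln\frac{\gamma_m q_i^{m*} + 1}{\gamma_m (q_i^{m*} - \Delta) + 1}} > 0$ and $\frac{\partial \overline{\varepsilon}_m}{\partial q_i^{m*}} = \frac{\gamma_m^2 \Delta}{[\gamma_m q_i^{m*} + 1][\gamma_m (q_i^{m*} + \Delta) + 1]}\,\frac{1}{\ln\frac{\gamma_m (q_i^{m*} + \Delta) + 1}{\gamma_m q_i^{m*} + 1}} > 0$.
21 $\frac{\partial \underline{\varepsilon}_m}{\partial \gamma_m} = \frac{1}{\gamma_m} - \frac{\Delta}{[\gamma_m (q_i^{m*} - \Delta) + 1][\gamma_m q_i^{m*} + 1]}\,\frac{1}{\ln\frac{\gamma_m q_i^{m*} + 1}{\gamma_m (q_i^{m*} - \Delta) + 1}}$ and $\frac{\partial \overline{\varepsilon}_m}{\partial \gamma_m} = \frac{1}{\gamma_m} - \frac{\Delta}{[\gamma_m q_i^{m*} + 1][\gamma_m (q_i^{m*} + \Delta) + 1]}\,\frac{1}{\ln\frac{\gamma_m (q_i^{m*} + \Delta) + 1}{\gamma_m q_i^{m*} + 1}}$.
22 Since the choice of n = 1 could be deemed arbitrary, a sensitivity analysis is performed in which n is set equal to 5 and 10 to evaluate the sensitivity of counterfactual estimates. Results show that the estimates with n = 1 are robust across different neighboring grid magnitudes.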
Table 1.1: Estimates of demand parameters
Parameter     Control          Semi-Treatment   Full-Treatment   Aggregate Demand
$\psi_0$      0.483            0.451            0.623            0.563
              [0.396,0.515]    [0.331,0.458]    [0.559,0.681]    [0.491,0.616]
$\psi_l$      1.419            -                1.300            1.045
              [1.392,1.511]    -                [1.215,1.333]    [0.973,1.093]
$\gamma_l$    0.537            -                0.726            0.959
              [0.495,0.534]    -                [1.333,0.736]    [0.937,0.976]
$\mu_l$       0.453            -                0.302            0.045
              [0.443,0.480]    -                [0.281,0.321]    [0.018,0.056]
$\sigma_l$    1.876            -                1.738            1.111
              [1.785,1.903]    -                [1.708,1.840]    [1.060,1.186]
$\psi_h$      -                1.440            1.077            2.196
              -                [1.363,1.480]    [1.031,1.152]    [2.119,2.247]
$\gamma_h$    -                0.510            0.928            0.636
              -                [0.486,0.525]    [0.902,0.942]    [0.622,0.659]
$\mu_h$       -                0.477            0.077            0.392
              -                [0.462,0.500]    [0.055,0.093]    [0.363,0.403]
$\sigma_h$    -                1.834            1.203            1.683
              -                [1.770,1.901]    [1.145,1.259]    [1.606,1.732]
Note: The results of the structural estimation of $\Theta$ for control group, semi-treatment group, full-treatment group and aggregate demand. Parameters $\psi_0$, $\psi_l$ and $\psi_h$ represent the marginal utilities of the numeraire and of goods l and h, $\gamma_l$ and $\gamma_h$ correspond to the satiation parameter for each interior good, and $\mu_{l(h)}$ and $\sigma_{l(h)}$ are the mean and the standard deviation of the normally distributed error $\varepsilon_{l(h)}$, which accounts for good-specific unobserved heterogeneity. 95% bootstrap confidence intervals are in square brackets (501 repetitions). The dataset includes sessions in the control group and only sessions where the treatment was received (1,800,446 observations).
the good l exhibits a higher satiation rate than the discounted good h ($\gamma_l > \gamma_h$). This results from the percentage difference in unit prices of the two interior goods. Because a unit of good l costs more than a unit of good h, $\gamma_l$ implies weaker preference, or greater satiation, for good l with respect to good h. Table 1.1 also shows the estimates of the mean and standard deviation of good-specific heterogeneity. The mean of the idiosyncratic demand shocks' distribution for good l is lower than that for good h ($\mu_l < \mu_h$).
Elasticity of Demand to Treatments
The estimation of demand primitives allows computing demand elasticity. Own-price and cross-price elasticities are derived in Section A.1.3. Table 1.2 reports the price elasticity $\eta_{l(h)}$ and the cross-price elasticity $\eta_0$, where estimates measure the percentage change in demand associated with a 10% change in unit price. Endowment $E_i$ is set to €101 to guarantee that the demand of the outside good is always positive. Aggregate results show that the demand of discounted good h is less elastic than the demand of full-price good l. A 10% reduction in the unit price increases the demand of good l by 23% and the demand of good h by 12%. These estimates support the hypothesis of diminishing responsiveness of demand to pricing treatments at the intensive margin. Ceteris paribus, a further decrease in price would affect the demand for good h, which already has a discounted unit price $(1 - d_t)$, by 47% less than the demand for good l, which has a full unit price. The cross-price elasticity is not statistically different from zero (all confidence intervals in Table 1.2 include zero), implying that a 10% discount has no impact on the demand for the numeraire. Changes in the price of interior goods would not affect the demand for the outside good. Table A.10 presents the estimates for different levels of price change. Demand elasticity increases as the magnitude of discounts rises. Overall, a discount percentage between 5 and 20% increases the demand of good l by between 21 and 26% and the demand of good h by between 12 and 13%.
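As a quick arithmetic check on the aggregate own-price elasticities reported in Table 1.2 (-2.307 for good l and -1.235 for good h), the implied responses to a 10% price reduction and their relative gap can be worked out as follows. The symbols $\eta_l$ and $\eta_h$ are labels chosen here for the two elasticities.

```latex
\begin{aligned}
\%\Delta q^{l} &\approx \eta_l \times (-10\%) = (-2.307)(-10\%) \approx 23.07\%, \\
\%\Delta q^{h} &\approx \eta_h \times (-10\%) = (-1.235)(-10\%) \approx 12.35\%, \\
1 - \frac{\eta_h}{\eta_l} &= 1 - \frac{1.235}{2.307} \approx 0.46 .
\end{aligned}
```

The gap computed from the rounded table entries is about 46%, close to the 47% stated in the text; the small difference presumably reflects rounding of the reported estimates.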
Table 1.2: Estimates of treatment elasticity
                 Control            Semi-Treatment     Full-Treatment                        Aggregate Demand
                 Low (l)            High (h)           Low (l)            High (h)           Low (l)            High (h)
$\eta_{l(h)}$    -1.312             -1.272             -1.639             -2.434             -2.307             -1.235
                 [-1.468,-1.221]    [-1.381,-1.147]    [-1.791,-1.633]    [-2.536,-2.384]    [-2.407,-2.186]    [-1.305,-1.116]
$\eta_0$         0.022              0.023              0.015              0.012              0.011              0.018
                 [-0.047,0.204]     [-0.178,0.028]     [-0.046,0.108]     [-0.043,0.092]     [-0.113,0.097]     [-0.029,0.178]
Note: Own-price and cross-price elasticities derived in Section A.1.3. Discount is set to 10% and endowment to €101. 95% bootstrap confidence intervals are in square brackets (501 repetitions). The dataset includes sessions in the control group and only sessions where the treatment was received with a gross order equal to or lower than €100.
In conclusion, the demand analysis documents that pricing treatments are salient to users. On average, a 10% discount leads to a demand increase between 12 and 23%. Customers who pay the non-discounted price exhibit a more elastic demand than users who comply with $\mathrm{MPT}_t$ and benefit from $d_t$. Furthermore, demand responsiveness differs depending on the interaction between discount and minimum purchase constraints. Hence, the heterogeneity of demand elasticity plays a crucial role in designing how pricing treatments should be assigned to different groups of customers. To address the motives behind personalization and price discrimination, the next chapter studies the pricing problem of a monopolistic e-commerce firm and investigates how personalized treatment algorithms impact profit.
1.5 Supply: Endogenous Pricing Function
This chapter explores how the e-commerce firm can take advantage of the heterogeneity in users' demand elasticity to maximize its profit, and how customers respond to endogenous pricing functions. This framework allows estimating the causal impact of promotion treatments on expected profit, and comparing the profitability of different pricing algorithms. A policy model empirically simulates personalized pricing algorithms (Section 1.5.1) and exploits variation in RCT treatments and users' characteristics to estimate expected profit (Section 1.5.2). The counterfactual analysis compares the profit-maximizing pricing policy with alternative policies (Section 1.5.3). Lastly, behavior-based price discrimination is introduced within this supply framework (Section 1.5.4).
1.5.1 Policy Model
This section models the statistical decision problem of optimal pricing policy for an e-commerce company in the spirit of Berger (2013), and extends the work on personalized discount targeting by Miller and Hosanagar (2020). The model fits the characteristics of RCT treatments, namely the interaction between the discount percentage $d_t$ and the purchase threshold $\mathrm{MPT}_t$ on e-commerce spending. By making the treatment an endogenous variable of the model, the analysis defines the condition of optimal treatment and focuses on the profit-maximizing assignment rule.
The model considers a monopolistic policy-maker firm F and a continuous mass of users, indexed by i. As user i accesses the e-commerce website, firm F costlessly learns the vector of observable covariates $X_i$. It has to select the treatment $T_i$ in the treatment space $\mathcal{T}$. Any treatment $T_i$ is an unordered pair of discount percentage $d_t$ and minimum purchase threshold $\mathrm{MPT}_t$ of gross order $Q_i$. Thus, $T_i = \{d_t, \mathrm{MPT}_t\}$ defines the treatment combination that user i is assigned to. Given $T_i$, user i decides whether to purchase or not, through the binary variable $Y_i \in \{0, 1\}$. If $Y_i = 1$, user i converts the session and selects the gross order $Q_i$, which is defined as $Q_i = \sum_{j=1}^{J} p_j q_{ij}$ and is currency-measurable. Firm F observes $y_i$ and $q_i$, namely the realizations of the stochastic variables $Y_i$ and $Q_i$. Marginal costs c are assumed to be zero. The problem of firm F is to select a priori the treatment $T_i^* \in \mathcal{T}$ which maximizes expected profit. The conditional response functions are defined as
\[ \pi_t(x) = \Pr(Y_i = 1 \mid X_i = x, T_i = t), \tag{1.5.1} \]
\[ q_t(x) = E[Q_i \mid Y_i = 1, X_i = x, T_i = t], \tag{1.5.2} \]
\[ \kappa_t(x) = \Pr(Q_i \geq \mathrm{MPT}_t \mid Y_i = 1, X_i = x, T_i = t). \tag{1.5.3} \]
Equation 1.5.1 describes the probability that, conditional on characteristics x and given treatment t, user i converts the session. Equation 1.5.2 specifies the expected gross order $Q_i$ as a random variable, conditional on user i's conversion, and Equation 1.5.3 defines the conditional probability that expenditure $Q_i$ is equal to or greater than threshold $\mathrm{MPT}_t$. Given observables x, the expected profit from user i under treatment t is
\[ E\left[\Pi(t \mid x)\right] = \pi_t(x)\,\kappa_t(x)\left[q_t(x)(1 - d_t) - c\right] + \pi_t(x)\left[1 - \kappa_t(x)\right]\left[q_t(x) - c\right]. \tag{1.5.4} \]
Equation 1.5.4 captures two mutually exclusive events. Either user i converts the session with probability $\pi_t(x)$ and $q_t(x)$ is greater than or equal to $\mathrm{MPT}_t$ with probability $\kappa_t(x)$, in which case they benefit from the discount $d_t$ and pay $q_t(x)(1 - d_t)$; or user i purchases, selects $q_t(x)$ below $\mathrm{MPT}_t$, and pays $q_t(x)$. For each user i, firm F selects the optimal treatment $t \in \mathcal{T}$ which maximizes expected profit. The following assignment rule describes such treatment decision
\[ T^*(x) = \arg\max_{t \in \mathcal{T}} \left[ \sum_{k=1}^{\dim(X)} \pi_t(x_k)\left[\kappa_t(x_k)\, q_t(x_k)(1 - d_t) + (1 - \kappa_t(x_k))\, q_t(x_k)\right] \right]. \tag{1.5.5} \]
Despite its conciseness, Equation 1.5.5 explains the logic of the counterfactual analysis. Among all the possible pairs $T_i$ in the space $\mathcal{T}$, firm F must select, a priori, the treatment $T^*$ which maximizes the expected profit from user i. The theoretical challenge addressed by this model therefore consists in a counterfactual comparison, in terms of expected profit, between the hypothetical optimal scenario and different unobservable scenarios. The next section derives the condition for the profit-maximizing treatment, estimates the optimal assignment rule, and implements policy evaluation for counterfactual analysis.
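The expected profit in (1.5.4) and the selection of the profit-maximizing treatment can be sketched in a few lines of Python. The function names, argument names and all response values below are hypothetical and chosen only to make the arithmetic visible; the sketch evaluates a single covariate profile and takes the conditional responses as given rather than estimating them.

```python
def expected_profit(conv, comply, order, d, c=0.0):
    """Expected profit per session, following the two branches of (1.5.4).

    conv   : probability of conversion
    comply : probability that the order reaches the threshold, given conversion
    order  : expected gross order, given conversion
    d      : discount rate; c : marginal cost (zero in the text)
    """
    discounted = conv * comply * (order * (1.0 - d) - c)   # order reaches the threshold
    full_price = conv * (1.0 - comply) * (order - c)       # order stays below it
    return discounted + full_price


def best_treatment(responses):
    """Return the treatment label with the highest expected profit.

    responses maps a label to (conv, comply, order, d); values are hypothetical.
    """
    return max(responses, key=lambda t: expected_profit(*responses[t]))


responses = {
    "C": (0.030, 0.0, 50.0, 0.00),   # control: no discount
    "ST": (0.036, 1.0, 52.0, 0.10),  # semi-treatment: discount on any order
    "FT": (0.034, 0.4, 60.0, 0.10),  # full-treatment: discount above the threshold
}
print({t: round(expected_profit(*r), 4) for t, r in responses.items()})
print(best_treatment(responses))
```

In this made-up example the conditional discount wins because it raises the expected order while only 40% of converting users actually receive the discount.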
Condition of Optimal Treatment
Unordered pairs $T_i = \{d_t, \mathrm{MPT}_t\}$ encompass different pricing schemes depending on the magnitude of $d_t$ and $\mathrm{MPT}_t$, as described in Section 1.2. If $T_i \in \{0\}^2$, user i receives no treatment and the control group applies, namely t = C. It defines a uniform pricing policy and represents the benchmark case for any other policy intervention. Instead, a linear pricing policy imposes a linear relationship between price and demand, namely $T_i \in \mathbb{R}_+ \times \{0\}$. In this case, semi-treatment t = ST occurs when user i receives discount $d_t$ independently of the realization of gross order $Q_i$. Conversely, non-linear pricing arises if $T_i \in \mathbb{R}_+ \times \mathbb{R}_+$, that is, when the minimum purchase threshold $\mathrm{MPT}_t$ imposes a necessary condition on user i's demand $Q_i$ in order to benefit from the discount $d_t$. This case defines full-treatment t = FT.$^{23}$ Conditional average treatment effects (CATEs) on the conversion rate $\pi_t(x)$ and on the gross order $q_t(x)$ are respectively
\[ \Delta\pi_t(x) = \pi_t(x) - \pi_C(x), \tag{1.5.6a} \]
\[ \Delta q_t(x) = q_t(x) - q_C(x). \tag{1.5.6b} \]
Equations 1.5.6a and 1.5.6b quantify the average causal impact of treatment t on the extensive and intensive margin of user i's demand, relative to the counterfactual control case. As the control group defines the benchmark of CATEs, the condition for optimal treatment follows the same intuition, namely
\[ E\left[\Pi \mid X_i = x, T_i = t\right] > E\left[\Pi \mid X_i = x, T_i = C\right]. \tag{1.5.7} \]
Inequality 1.5.7 imposes that firm F must select the treatment $T_i = t$ for customer i with covariates $X_i = x$ as long as the associated expected profit is strictly greater than the expected profit in the counterfactual scenario of treatment t = C. CATEs in (1.5.6a) and (1.5.6b) allow (1.5.7) to be rewritten as follows
\[ \underbrace{\Delta\pi_t(x)\,\Delta q_t(x) + \pi_C(x)\,\Delta q_t(x) + q_C(x)\,\Delta\pi_t(x)}_{\text{Causal effect of treatment on expected revenue}} > \underbrace{\pi_t(x)\,\kappa_t(x)\,q_t(x)\,d_t}_{\text{Expected cost of treatment}}. \tag{1.5.8} \]
The condition for optimal treatment implies that the causal effect of treatment t on expected revenue has to dominate the expected cost of treatment t. With a positive product of CATEs, the first term raises the left-hand side of (1.5.8), and the second and third terms have concordant signs. With a negative product of CATEs, the first term reduces the left-hand side of (1.5.8), and the second and third terms have opposite signs. The cost of treatment t is unequivocally positive, while the signs and the magnitudes of the three terms quantifying the causal impacts of treatment t on conversion probability and e-commerce spending are the object of estimation.
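The step from (1.5.7) to (1.5.8) can be spelled out as follows under the zero-marginal-cost assumption stated above. The symbols $\pi_t$, $q_t$ and $\kappa_t$ are used here as labels for the conversion probability, the expected gross order and the compliance probability of (1.5.1) to (1.5.3) evaluated at x, with subscript C denoting the control group; this is a sketch of the algebra, not a substitute for the proof in the appendix.

```latex
\begin{aligned}
E[\Pi \mid x, t] &= \pi_t \kappa_t q_t (1 - d_t) + \pi_t (1 - \kappa_t) q_t
                  = \pi_t q_t - \pi_t \kappa_t q_t d_t, \\
E[\Pi \mid x, C] &= \pi_C q_C \qquad (d_C = 0), \\
\pi_t q_t - \pi_C q_C &= (\pi_C + \Delta\pi_t)(q_C + \Delta q_t) - \pi_C q_C
                  = \Delta\pi_t \Delta q_t + \pi_C \Delta q_t + q_C \Delta\pi_t .
\end{aligned}
```

Hence the expected profit under treatment t exceeds the expected profit under control exactly when the three CATE terms on the last line exceed the expected discount cost $\pi_t \kappa_t q_t d_t$.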
Optimal Treatment Policy
The optimal pricing function incorporates the economic intuition of (1.5.8) in the following proposition.
Proposition 2. Optimal treatment policy is defined as
\[ \tau_t(x) = \Delta\pi_t(x)\,\Delta q_t(x) + \pi_C(x)\,\Delta q_t(x) + q_C(x)\,\Delta\pi_t(x) - \pi_t(x)\,\kappa_t(x)\,q_t(x)\,d_t. \tag{1.5.9} \]
Proof. Refer to Section A.1.4 in the Online Appendix.
23 Lastly, $T_i \in \{0\} \times \mathbb{R}_+$ is trivial and not considered in the discussion.
The optimal treatment policy applies the necessary condition of optimal treatment. For any treatment t such that $\tau_t(x) > 0$, it is strictly profitable to assign treatment t to user i. Conversely, if $\tau_t(x) \leq 0$ for all treatments t, firm F maximizes its profit by assigning user i to the control group. As both CATEs converge to zero, the optimal policy approaches zero. Thus, for pricing treatments with no causal impact on either the conversion likelihood or the e-commerce spending, the optimal treatment policy recommends no discounts at all. Instead, for elastic responses at the extensive margin or at the intensive margin (or both), variation in $\Delta\pi_t(x)$ and $\Delta q_t(x)$ is driven by $\pi_t(x)$ and $\pi_C(x)$, and by $q_t(x)$ and $q_C(x)$, respectively (Figure A.8). This results in a comparative, non-trivial behavior of $\tau_t(x)$, affected by both the marginal expected gain of treatment on revenue and the expected cost of treatment. Finally, the assignment rule in (1.5.5) is rearranged in terms of the optimal pricing function $\tau_t(x)$ as follows
\[ T^*(x) = \begin{cases} \arg\max_{t \in \mathcal{T} \setminus \{C\}} \tau_t(x) & \text{if } \tau_t(x) > 0 \text{ for some } t \in \mathcal{T} \setminus \{C\}, \\ C & \text{if } \tau_t(x) \leq 0 \;\; \forall t \in \mathcal{T} \setminus \{C\}. \end{cases} \tag{1.5.10} \]
The algorithm defined by Equation 1.5.10 compares the values of all the strictly positive policies $\tau_t(x)$ among pairs $t \in \mathcal{T} \setminus \{C\}$, and selects the $T^*$ which maximizes $\tau_t(x)$. If none of the treatments satisfies this condition, then $T^* = C$ is the expected-profit-maximizing treatment for user i with covariates x. Given a finite set of treatments $\mathcal{T}$, the assignment criterion in (1.5.10) simulates a personalized pricing algorithm.
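The assignment criterion in (1.5.10) can be sketched as a small Python routine: compute the incremental value of each candidate treatment over control and fall back to control when no value is strictly positive. Function names, argument names and the response values are hypothetical; the conditional responses are taken as given inputs rather than estimated.

```python
def policy_value(conv_t, order_t, comply_t, d_t, conv_c, order_c):
    """Incremental expected profit of treatment t over control, as in (1.5.9)."""
    d_conv = conv_t - conv_c      # CATE on conversion probability
    d_order = order_t - order_c   # CATE on expected gross order
    revenue_gain = d_conv * d_order + conv_c * d_order + order_c * d_conv
    discount_cost = conv_t * comply_t * order_t * d_t
    return revenue_gain - discount_cost


def assign_treatment(candidates, control):
    """Return the label with the largest strictly positive value, else 'C'.

    candidates maps a label to (conv_t, order_t, comply_t, d_t) and must be
    non-empty; control is (conv_c, order_c). All numbers are hypothetical.
    """
    values = {t: policy_value(*resp, *control) for t, resp in candidates.items()}
    best = max(values, key=values.get)
    return best if values[best] > 0 else "C"


control = (0.030, 50.0)
candidates = {
    "ST": (0.036, 52.0, 1.0, 0.10),
    "FT": (0.034, 60.0, 0.4, 0.10),
}
print(assign_treatment(candidates, control))                          # a profitable treatment exists
print(assign_treatment({"ST": (0.030, 50.0, 1.0, 0.10)}, control))    # no response: keep control
```

The second call illustrates the fallback: a discount that moves neither conversion nor spending has a negative value equal to its expected cost, so the user stays in the control group.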
1.5.2 Estimation of the Policy Model
The estimation methodology follows three steps. The first step presents the data collected from the online field experiment. The second step estimates the conditional responses and CATEs for the extensive and intensive margin of demand. The primitives of the model $\Omega = \{\pi_t(x), q_t(x), \kappa_t(x), \Delta\pi_t(x), \Delta q_t(x)\}$ for $t \in \mathcal{T}$ are estimated by matching the demand predictions of users with equal characteristics and different treatments. In the third step, the optimal assignment rule for a subset of covariates x is simulated, given the estimated primitives $\hat{\Omega}$.
Experiment. In the experimental phase, firm F runs an RCT. User i is randomly assigned to one of the three treatment groups $\mathcal{T} = \{C, ST, FT\}$. The realization of each pair $T_i = \{d_t, \mathrm{MPT}_t\}$ depends on exogenously specified probabilities that are known to the econometrician (Section 1.2). The representative sample is defined by the set $\mathcal{D} = \{Y_i, C_i, Q_i, T_i, X_i\}_{i=1}^{I}$. It reports the purchase decision $Y_i$, the $\mathrm{MPT}_t$ compliance response $C_i$, the realized gross order $Q_i$, the assigned treatment $T_i$, and the vector of covariates $X_i$ of the I users in the experiment. Table 1.3 summarizes the variables retained in the vector $X_i$ from a broader set of individual characteristics, while Table A.1 provides further information on the experimental dataset $\mathcal{D}$.
Estimation. The estimation procedure applies the method presented in Chernozhukov et al. (2018) to
evaluate the primitives of the model via cross-validation. It implements ensemble learning by defining an
arbitrary supervised machine learning (ML) algorithm and leveraging repeated splits of the experimental
data sample to construct ensemble estimates across different subsamples. For each iteration of the sample
21
Table 1.3: Variables in the vector of covariates Xi
Aggregate
Variables Observations Mean St.Dev.
Time to session (minute) 5,087,685 12.96 17.64
Number of pages 5,087,685 14.31 15.95
Attained cart (0/1) 5,087,685 0.29 0.45
Time to cart (minute) 1,456,608 9.83 11.90
New user (0/1) 5,087,685 0.74 0.48
Purchase history (0/1) 5,087,685 0.03 0.17
Number of sessions 5,087,685 1.77 3.36
Number of prior visits 1,923,779 8.89 22.12
Number of prior purchases 1,923,779 0.18 1.25
Number of prior abandoned carts 638,710 3.78 8.51
Note: Covariates of RCT dataset used in Algorithm 6 to estimate the primitives in classification and regression ensemble ML.
split, the data set is randomly divided into calibration (or training) sample and validation (or testing)
sample. The supervised learning algorithm is trained on the calibration subsample and employed for
out-of-sample predictions on the validation subsample. Furthermore, the obtained predictions are used
in a constructed regression on the observed dependent variables of the validation split. Finally, the
estimated coefficients of both predictions and regressions are averaged and recombined to improve the
predictive performance of the ensemble estimates.
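The repeated-split procedure described above can be sketched as follows. This is a minimal illustration, not the dissertation's Algorithm 6: the learner is a deliberately simple stand-in (a within-group mean rather than a machine learning model), the splits are plain random splits rather than stratified ones, and all names and the toy data are hypothetical.

```python
import random
from statistics import mean


def fit_mean_by_group(rows):
    """Stand-in learner: predict the training mean of y within each treatment group."""
    by_group = {}
    for t, y in rows:
        by_group.setdefault(t, []).append(y)
    overall = mean(y for _, y in rows)
    group_means = {t: mean(ys) for t, ys in by_group.items()}
    # Fall back to the overall mean for a group absent from the training split.
    return lambda t: group_means.get(t, overall)


def repeated_split_predictions(rows, n_splits=100, train_share=0.8, seed=0):
    """Average out-of-sample predictions over repeated random train/validation splits."""
    rng = random.Random(seed)
    sums = [0.0] * len(rows)
    counts = [0] * len(rows)
    n_train = int(train_share * len(rows))
    for _ in range(n_splits):
        idx = list(range(len(rows)))
        rng.shuffle(idx)
        train, valid = idx[:n_train], idx[n_train:]
        predict = fit_mean_by_group([rows[i] for i in train])
        for i in valid:  # predictions are made only on held-out observations
            sums[i] += predict(rows[i][0])
            counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy data: (treatment label, conversion outcome); values are illustrative only.
rows = [("control", 0), ("control", 0), ("control", 1), ("control", 0), ("control", 0),
        ("discount", 1), ("discount", 0), ("discount", 1), ("discount", 1), ("discount", 0)]
preds = repeated_split_predictions(rows)
```

Each observation's ensemble prediction averages only the splits in which it was held out, so no prediction is evaluated on data used to fit it.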
The implementation of estimation uses nonparametric supervised machine learning algorithms. The
goal of supervised learning is to learn a function that maps inputs (i.e., independent variables) into a
labeled output (i.e., dependent variable). In general, this class of models has three main advantages. First,
the lack of assumptions regarding the underlying function increases the predictive power of the algorithm.
Second, the flexibility of the model allows a large number of functional forms to fit the underlying data
generating process. Third, predictive performances are improved with respect to constrained, simpler
parametric algorithms. Furthermore, the analysis leverages ensemble methods to improve the robustness
of statistics by combining the predictions of different estimators built with multiple learning algorithms.
Two families of ensemble methods are considered: averaging methods and boosting methods. Averaging
methods construct several estimators independently and then average their predictions. In contrast, boosting
methods build base estimators sequentially, and the weighted combination of estimators aims at reducing the
bias. In particular, the bagging method (BAG) and random forest (RF) are considered for averaging methods,
while adaptive boosting (AB), gradient boosting (GBDT) and neural network (NN) are evaluated for
boosting methods.24 For each method, supervised learning is trained on the dataset for classification
with the conversion decision Yi as the labeled target, and for regression with the gross order Qi as the prediction
output. Section A.5.1 presents a detailed discussion of ensemble ML methods.
The comparison of predictive performance between BAG, RF, AB, GBDT, and NN depends on the distributions of the predicted conversion decision and the predicted gross order. For classification methods, the accuracy score is computed, while for regression methods the coefficient of determination $R^2$ is calculated. The analysis is performed for all the treatment groups, and it confirms that GradientBoostingClassifier and GradientBoostingRegressor are the best-performing prediction models for classification and regression, respectively. Indeed, the GBDT classifier correctly predicts 81% of the conversion decisions in the validation sample, whereas the GBDT regressor explains 40% of the variation in spending in the testing sample (Tables A.11 and A.12). Gradient boosting decision trees are used to build ensemble predictions which aggregate the estimators of 100 GBDTs, calibrated on 80% of the training sample of the experimental dataset, stratified by treatment conditions. The outcomes are estimates of the conditional response and treatment effect functions for all treatments. Finally, the estimates of the primitives of the policy model are combined to estimate discrete realizations of the targeting policy function and to extrapolate its distribution for each treatment (Figure A.9). Pseudocode of the GBDT estimation procedure is illustrated in the Online Appendix (Algorithm 6).
24 Surveys such as Sagi and Rokach (2018) and Mienye and Sun (2022) present a comprehensive review of traditional, novel and state-of-the-art ensemble methods together with applications and examples. For a more technical discussion of ensemble deep learning, Yang et al. (2023) provides a valid reference.
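The two performance metrics mentioned above can be written out explicitly. The sketch below uses the textbook definitions of accuracy and of the coefficient of determination; the function names and the toy vectors are hypothetical and are not taken from the dissertation's code.

```python
def accuracy(y_true, y_pred):
    """Share of observations whose predicted class equals the observed class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


def r_squared(q_true, q_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(q_true) != len(q_pred) or not q_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_q = sum(q_true) / len(q_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(q_true, q_pred))
    ss_tot = sum((a - mean_q) ** 2 for a in q_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only: conversion decisions (0/1) and gross orders.
acc = accuracy([1, 0, 0, 1, 0], [1, 0, 1, 1, 0])        # 4 of 5 correct
r2 = r_squared([10, 20, 30, 40], [12, 18, 33, 37])      # 1 - 26/500
```

Accuracy applies to the classification target (the conversion decision), while $R^2$ applies to the regression target (the gross order), which is why the two model families are compared on different scales.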
Simulation. In the pricing policy model, firm F selects the treatment which maximizes
expected profit according to the optimal assignment rule. Given the estimated primitives, (1.5.10) becomes
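$$\hat{t}^{*}(x) = \begin{cases} \arg\max_{t \in \{\text{semi},\, \text{full}\}} \hat{\tau}_t(x) & \text{if } \hat{\tau}_t(x) > 0 \text{ for some } t, \\ \text{control} & \text{if } \hat{\tau}_t(x) \le 0 \ \ \forall t \in \{\text{semi},\, \text{full}\}, \end{cases} \tag{1.5.11}$$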
where the targeting policy function is replaced by its estimates for each discount treatment, and the treatment set comprises the three experimental groups. The algorithm works by comparing the targeting policy function among treatment groups by varying realizations of dt and MPTt. Since the realization probabilities of dt and MPTt are exogenously determined by the experimental design, the algorithm normalizes the comparison of treatment policies between groups given an exogenous variation within groups. Equation 1.5.11 replicates the optimal assignment rule for covariates x and allows simulating the distributions of conversion probability and expected profit under the targeting policy function.25
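The assignment rule can be illustrated with a short sketch. It assumes that, for a given user, an estimated targeting score is available for each discount treatment; the treatment labels, the function name, and the numbers are hypothetical.

```python
def assign_treatment(scores, control="control"):
    """Pick the treatment with the highest estimated targeting score.

    `scores` maps each discount treatment to its estimated score for one user.
    If no score is strictly positive, the user is kept in the control group.
    """
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else control


# Illustrative scores for three hypothetical users.
user_a = assign_treatment({"semi": 0.4, "full": 1.2})    # best score is positive
user_b = assign_treatment({"semi": -0.1, "full": 0.0})   # no strictly positive score
user_c = assign_treatment({"semi": 0.3, "full": -2.0})
```

Applying the function to every user in a held-out sample yields the simulated assignment under the policy, which is the input needed for the counterfactual comparisons that follow.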
1.5.3 Counterfactual Analysis
Alternative Policies
In order to compare the impact on expected profit of optimal treatment pricing policy in (1.5.9)
through (1.5.11), alternative treatment policy functions are considered. These functions are defined
according to a different necessary condition relative to (1.5.7). First, non-treatment policy function is
() = −1, (1.5.12)
and represents the non-treatment, non-personalized benchmark of policy evaluation. It corresponds
to a uniform pricing scheme because all users are assigned to the control group, irrespective of their
characteristics. Regardless of the causal effects of discount treatments on users’ demand, firm F commits
itself not to price discriminate and offers no discounts. Second, the assignment criterion of non-targeted
25Simulation subsample size is set 20% of the whole dataset, such as the validation sample of ML cross-validation.
treatment policy averages, for each treatment, the causal effect among all users, and only assigns the
pricing treatment with the greatest average impact on aggregate demand. Such a treatment function is
non-targeted since it disregards the heterogeneity of demand elasticity to different discounts among users.
Thus, the non-targeted treatment policy is as follows
() = E[()]E[()] − E[()]E[()] − E[()]E[()]E[()]
, (1.5.13)
where the function is treatment-specific, but not user-specific, due to the introduction of the expectation
operator. Indeed, all the terms varying in x are replaced by their expectations, for any treatment Ti.
Moreover, alternative policies could specifically target the extensive and the intensive margin of
users’ demand. The conversion-targeted policy and the revenue-targeted policy focus on the causal impact on
conversion probability and e-commerce spending, respectively, and correspond to
() = Δ(), (1.5.14)
() = Δ(). (1.5.15)
These are two treatment-specific, user-specific policies where the targeting conditions are defined by
the CATEs on conversion and on spending, respectively. Equations 1.5.14 and 1.5.15 assess the effectiveness of treatment Ti,
conditional on observables x, based on the extent to which the treatment marginally influences the
probability of conversion or the magnitude of expenditure. Differences in treatment elasticity at the
extensive and intensive margin are captured by the conversion-targeted and the revenue-targeted policy,
respectively, while the complementary effect is not accounted for, similarly to the expected cost in (1.5.8).
Finally, the conversion-revenue treatment effect policy
() =
1{Δ()>0}Δ()
1{Δ()>0}Δ()
() (1.5.16)
combines (1.5.14) and (1.5.15), weighted by the probability of complying with MPTt as in (1.5.3).26 In this
case, the assignment condition of Equation 1.5.16 imposes that both CATEs must be strictly positive,
that is, the considered treatment has to increase the conversion rate and to shift the expected
spending distribution rightward for user i. The conversion-revenue treatment effect function targets a limited, very
responsive set of users that exhibit uplifting marginal effects on both demand margins.
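The three CATE-based targeting scores described above can be sketched in a few lines. This is one reading of the text under stated assumptions: the conversion-targeted and revenue-targeted scores are taken to be the respective CATEs, and the conversion-revenue score is taken to be the product of the two positive CATEs weighted by the compliance probability, and zero otherwise. Function and argument names are hypothetical.

```python
def conversion_targeted(delta_conversion, delta_revenue, compliance=1.0):
    """Score equal to the CATE on conversion probability (extensive margin)."""
    return delta_conversion


def revenue_targeted(delta_conversion, delta_revenue, compliance=1.0):
    """Score equal to the CATE on gross order (intensive margin)."""
    return delta_revenue


def conversion_revenue(delta_conversion, delta_revenue, compliance=1.0):
    """Positive only if both CATEs are strictly positive; weighted by compliance."""
    if delta_conversion > 0 and delta_revenue > 0:
        return delta_conversion * delta_revenue * compliance
    return 0.0


# Illustrative CATEs for two hypothetical users.
both_positive = conversion_revenue(0.02, 5.0, compliance=0.5)
one_negative = conversion_revenue(-0.01, 5.0, compliance=1.0)
```

Because the combined score is zero whenever either CATE is non-positive, it selects a subset of the users that the two single-margin policies would target, which matches the description of a more restrictive policy.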
Policy Evaluation
The set $\{y_i, c_i, q_i, t_i, x_i\}_{i=1}^{I}$ represents the realized RCT sample. It collects the observed conversion
decision yi and compliance response ci, the realized gross order qi, the assigned treatment ti, and the
vector of covariates xi of the I users in the field experiment. The goal of evaluating different pricing
policies relates to the problem of counterfactual policy evaluation in the literature on machine learning. If
a user randomly received a certain treatment in the RCT, while a targeting policy would have assigned
them to a different treatment, the counterfactual response is not observable. Since the realized sample
features controlled propensity scores of the treatment groups, the Horvitz–Thompson estimator with
known sampling probabilities provides unbiased counterfactual estimation (Horvitz & Thompson, 1952;
Huber, 2014). The policy evaluation takes advantage of the inverse probability weighting (IPW)
estimator, whose weighting reduces the bias of unweighted estimators, yielding a consistent estimate of
the profitability of the proposed policy.
26 Trivially, () = 1 for = and = .
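The logic of inverse probability weighting with known assignment probabilities can be sketched as follows. This is the textbook Horvitz–Thompson form for the value of a policy, offered as an illustration only; it is not a transcription of the estimators in (1.5.17) and (1.5.18), and all names and numbers are hypothetical.

```python
def ipw_policy_value(outcomes, assigned, recommended, propensity):
    """Horvitz-Thompson estimate of the mean outcome under a targeting policy.

    outcomes    : observed outcome of each user (e.g. conversion 0/1 or profit)
    assigned    : treatment each user actually received in the RCT
    recommended : treatment the policy would assign to each user
    propensity  : known probability of each treatment in the experiment
    """
    total = 0.0
    for y, t, r in zip(outcomes, assigned, recommended):
        if t == r:  # only users whose random assignment matches the policy contribute
            total += y / propensity[t]
    return total / len(outcomes)


# Illustrative data: six users, two arms assigned with equal probability.
propensity = {"control": 0.5, "discount": 0.5}
outcomes = [1, 0, 1, 0, 0, 1]
assigned = ["control", "discount", "discount", "control", "discount", "control"]
recommended = ["control", "control", "discount", "discount", "discount", "discount"]
value = ipw_policy_value(outcomes, assigned, recommended, propensity)
```

Users whose randomized treatment differs from the policy's recommendation contribute zero, and the remaining users are reweighted by the inverse of their assignment probability so that the estimate is unbiased for the policy's mean outcome.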
Policy evaluation compares the performance, in terms of conversion rate and expected profit, of
alternative policies with different targeting functions for the same sample. Thus, the inverse probability
weighting estimator of the probability of conversion is
b
() =
1
"X
=1
1{=}
1{b()≤0}
Pr( = )
+
1
dim( ) − 1
dim(
X
)−1
=1
1{=}
1{b()>0}
Pr( = )
!#, (1.5.17)
and the IPW estimator of expected profit is
b
() =
1
"X
=1
1{=}
1{b()≤0}
Pr( = )
() +
1
dim( ) − 1
dim(
X
)−1
=1
1{=}
1{b()>0}
Pr( = )
()
1 −
!#.
(1.5.18)
Table 1.4 summarizes the estimates of (1.5.18) at the aggregate level for each policy. The IPW estimator
has the advantage of allowing counterfactual simulations that control for propensity scores and vary
the dimension of the treatment set. Therefore, Table 1.4 considers a reduced treatment space in the first
and second columns. In the third column, the profit estimates of alternative pricing policies associated
with the combined treatment space are reported. To make these estimates more
intuitive, the comparison between alternative policies is framed in terms of percentage variation, defined as
$\Delta\hat{\Pi} = \frac{\hat{\Pi}_{\text{policy}} - \hat{\Pi}_{\text{uniform}}}{\hat{\Pi}_{\text{uniform}}}\%$, where the non-treatment, uniform policy stands for the reference point.
Table 1.5 considers differences in policy evaluation ascribable to demand heterogeneity between new
and loyal customers. It reproduces the same counterfactual analysis of Table 1.4 conditional on sessions
of new users and returning users, respectively. In the Online Appendix, Tables A.13 and A.14 replicate
the same exercise for (1.5.17). The evaluation discussion focuses on the expected percentage change in
profit, with the uniform pricing policy as the benchmark case.
All the policies result in an increase in expected profit, relative to the uniform pricing policy. First,
the non-targeted policy does not account for demand heterogeneity in the response of users, but allows
the assignment criterion to vary within the treatment set. The estimate indicates that non-targeted
discounts increase profit on average by 32%. The reduction in the unit price induced by the non-personalized discount treatment stimulates demand, and consequently raises the expected profit.
Consistent with the estimates of elasticity in Section 1.4.4, this result confirms that price discounts, at
the aggregate level, are salient to customers. However, although the non-targeted policy increases profitability
compared to uniform pricing, it is unable to capture the causal impact of personalized treatment
assignment on profit.
The conversion-targeted policy and the revenue-targeted policy focus on demand elasticity at the
extensive and intensive margin, respectively, namely whether or not a certain discount has a positive impact on
a specific user’s conversion rate or e-commerce spending. The estimated effects of the treatment on
the probability of purchase and on the size of the gross order identify those customers characterized by
sufficiently elastic demand, who would be more likely to purchase or would purchase more, given a price
reduction. Comparing the two target-specific policies highlights how personalized discounts affect the
Table 1.4: Estimates of counterfactual simulations of expected profit

                        Semi-Treatment         Full-Treatment         Combined Treatments
Policy                  Profit     Change      Profit     Change      Profit     Change
Uniform                 0.0030     –           0.0013     –           0.0016     –
Non-targeted            0.0011     -67.93%     0.0013     0.00%       0.0021     +31.77%
                        (0.466)                (0.687)                (1.934)
Conversion-targeted     0.0029     +0.321%     0.0014     +8.10%      0.0021     +28.00%
                        (1.156)                (0.671)                (1.813)
Revenue-targeted        0.0035     +15.21%     0.0014     +11.38%     0.0023     +38.98%
                        (1.212)                (0.677)                (1.876)
Conversion-revenue      0.0035     +15.21%     0.0014     +11.42%     0.0023     +39.01%
                        (1.212)                (0.679)                (1.876)
Optimal treatment       0.0036     +18.63%     0.0014     +12.61%     0.0023     +39.63%
                        (1.218)                (0.679)                (1.881)

Note: Results of the simulation analysis of expected profit under the alternative policies and the optimal treatment policy. "Profit" is the estimated expected profit from (1.5.18) and "Change" is its percentage variation, $\Delta\hat{\Pi} = \frac{\hat{\Pi}_{\text{policy}} - \hat{\Pi}_{\text{uniform}}}{\hat{\Pi}_{\text{uniform}}}\%$, where the non-treatment, uniform policy stands for the benchmark case. The first and second columns consider the reduced, two-element treatment spaces of the semi-treatment and the full-treatment, respectively. The third column simulates the pricing algorithm with the combined, three-element treatment space. The estimates of expected profit for the counterfactual policies and the optimal treatment policy use dataset sessions where the treatment was received. The testing dataset is equal to 20%. The semi-treatment testing dataset features 19,908 observations, the full-treatment dataset has 36,748 sessions, and the combined treatments testing dataset counts 56,656 data points. Standard errors in parentheses.
expected profit through the extensive and the intensive margin, respectively. At the aggregate level, the
revenue-targeted policy increases profit by 39%, about 11 percentage points more than the conversion-targeted
case (28%), suggesting that, on average, users are more responsive at the intensive margin than at
the extensive margin, for different price treatments. The algorithm targeting the CATE on revenue is
more effective at increasing expected profit than one that prioritizes the causal impact on the conversion
rate. Considering the absolute magnitude of margins corroborates this finding, as marginal variations in
the CATE on conversion, rather than in the CATE on revenue, are less effective at substantially increasing the expected profit.
Furthermore, the conversion-revenue treatment effect policy combines the two previous policies in
a more restrictive way. It targets a user only if the pricing treatment positively impacts both conversion
probability and expected revenue relative to the control counterfactual. Estimates show that such a pricing
policy has the same return as the revenue-oriented policy (+39%), indicating that positive CATEs at
the extensive margin are strongly associated with positive CATEs at the intensive margin. Finally, the
optimal treatment policy is the profit-maximizing pricing policy (+40%). It accounts for heterogeneity
of treatment effects at the extensive and intensive margins, combined with the
baseline response variation of the control group for discrete levels of treatment.
Unlike all alternative policies, the optimal treatment function also takes into consideration the expected
cost of treatment. The optimal treatment condition specifies a user-specific treatment
policy under the condition that the causal impact of treatment t on the expected revenue dominates the
expected cost of treatment t. Therefore, the results show that accounting for individual-specific demand
heterogeneity, evaluating treatment-specific variation in the response of extensive and intensive margins
and controlling for policy-specific expected costs, leads to the most effective personalized policy.
Reduced-form analysis and causal inference estimates in Section 1.3 support the hypothesis that
demand heterogeneity among consumers is associated with different purchase histories. Loyal customers
exhibit a higher propensity to purchase and to purchase more, relative to new users. Moreover, estimates
of the average treatment effect highlight how new and returning users respond differently to RCT treatments.
On average, the demand of returning individuals at the extensive margin is more elastic to different levels of
discount compared to new clients. In contrast, variation in the minimum purchase threshold mainly
affects the response of first-time users at the intensive margin, while returning users are less elastic to this
spending constraint. Information on users’ history is fundamental to assess the effectiveness of treatments
and, consequently, the causal impact on targeted outcomes to improve personalization algorithms.
Table 1.5 presents the estimates of counterfactual simulation on expected profit for new and returning
users. Compared to new users, loyal individuals are more responsive to treatments on average. The optimal
treatment policy raises the profit from returning users by 44%, while new users exhibit a 30% increase. The
aggregate demand result of the equivalent policy (40%) falls between these two estimates, but is closer to
44% (returning users) than to 30% (new users). This is consistent with the fact that returning users
weigh more in determining profit, as they account for 46% of e-commerce demand despite representing
only 26% of the entire sample. Policy estimates for the new users segment are in line with the corresponding
estimates at the aggregate level in Table 1.4. This comes as no surprise, as these results are derived from
policy simulation based on 74% of the dataset.
Remarkably, the optimal treatment policy fails to maximize the expected profit for the group of
returning users. Conversion-targeted policy leads to a 48% increase in profit of returning users, which
features an additional 11% profit gain compared to the optimal treatment counterfactual. Because the
segment of returning users reveals a more responsive extensive margin, the conversion-targeted policy
performs better in addressing the heterogeneity of these customers. It uniquely targets the treatment
effect on conversion rate and does not account for the intensive margin effect. This last finding is crucial
because it corroborates the intuition that it is profitable for the e-commerce firm to adopt different
pricing policies depending on whether a user belongs to a certain segment, which is defined according
to observable, purchase history-specific characteristics. In the next section, this intuition is formalized
through the definition of behavior-based price discrimination.
1.5.4 Behavior-based Price Discrimination
Definition 1. For any treatment policy , behavior-based price discrimination arises if
(| ) = (| ) and (| ) ≠ (| - ).
Definition 1 specifies the necessary condition for BBPD to occur in the pricing policy model. Behavior-based price discrimination takes place when firm F adopts different treatment policies, depending on
the history of users’ past purchases. In this specific application, behavior-based demand segmentation
considers whether the user is new to the e-commerce site or has browsed it before. Based
on the results in Table 1.5, all new users are assigned to the optimal treatment policy, while returning
customers are assigned to the conversion-targeted policy. It is incentive compatible for the e-commerce company to
behavior-based price discriminate under the following condition.
Proposition 3. Behavior-based price discrimination is profitable for firm F only if there exist two treatment policies
Table 1.5: Estimates of counterfactual simulations of expected profit for new and returning users

New users
                        Semi-Treatment         Full-Treatment         Combined Treatments
Policy                  Profit     Change      Profit     Change      Profit     Change
Uniform                 0.0113     –           0.0057     –           0.0069     –
Non-targeted            0.0113     0.00%       0.0057     0.00%       0.0069     0.00%
                        (1.5504)               (1.1980)               (2.3773)
Conversion-targeted     0.0103     -8.20%      0.0065     +13.17%     0.0079     +15.03%
                        (1.3008)               (1.1941)               (2.1248)
Revenue-targeted        0.0121     +7.86%      0.0063     +9.32%      0.0088     +28.45%
                        (1.4779)               (1.1797)               (2.3059)
Conversion-revenue      0.0122     +8.01%      0.0065     +13.10%     0.0086     +25.54%
                        (1.4787)               (1.1942)               (2.3084)
Optimal treatment       0.0126     +11.62%     0.0064     +10.73%     0.0089     +29.83%
                        (1.4961)               (1.1883)               (2.3176)

Returning users
                        Semi-Treatment         Full-Treatment         Combined Treatments
Policy                  Profit     Change      Profit     Change      Profit     Change
Uniform                 0.0228     –           0.0113     –           0.0134     –
Non-targeted            0.0086     -62.46%     0.0113     0.00%       0.0184     +37.22%
                        (2.1011)               (3.2380)               (6.2570)
Conversion-targeted     0.0288     +26.47%     0.012      +6.09%      0.0198     +48.33%
                        (3.9209)               (3.1911)               (6.0011)
Revenue-targeted        0.0268     +17.43%     0.0123     +9.21%      0.0191     +42.44%
                        (3.9530)               (3.2057)               (6.1085)
Conversion-revenue      0.0288     +26.44%     0.0123     +9.28%      0.0196     +46.17%
                        (3.9210)               (3.2057)               (6.1085)
Optimal treatment       0.0277     +21.75%     0.0125     +11.13%     0.0192     +43.81%
                        (3.9724)               (3.2117)               (6.1167)

Note: Estimates of the counterfactual analysis of the alternative policies and the optimal treatment policy, conditional on new users’ sessions and on returning users’ sessions. "Profit" is the estimated expected profit from (1.5.18) and "Change" is its percentage variation, $\Delta\hat{\Pi} = \frac{\hat{\Pi}_{\text{policy}} - \hat{\Pi}_{\text{uniform}}}{\hat{\Pi}_{\text{uniform}}}\%$, where the non-treatment, uniform policy stands for the benchmark case. The first and second columns consider the reduced, two-element treatment spaces of the semi-treatment and the full-treatment, respectively. The third column simulates the pricing algorithm described in (1.5.11) and implemented by Algorithm 6, where the three-element treatment space combines both treatments. The first panel presents the simulated expected profit of new users for the counterfactual policies and the optimal treatment policy. The second panel shows the corresponding results for returning users. The simulation dataset includes only sessions where the treatment was received, and the testing dataset is set to 20%. For new users, the semi-treatment testing dataset features 5,517 observations, the full-treatment dataset has 8,353 observations, and the combined treatments testing dataset counts 13,870 sessions. For returning users, the semi-treatment testing dataset features 3,146 observations, the full-treatment dataset has 5,073 observations, and the combined treatments testing dataset counts 8,219 sessions. Standard errors in parentheses.
and such that
Π (| ) > Π (| ) and Π (| - ) > Π (| - ).
Proposition 3 provides the sufficient condition for behavior-based price discrimination to be incentive
compatible. Indeed, BBPD generates an additional 11% increase in profit from the loyal users’ segment,
compared to the optimal treatment policy that maximizes expected profit at the aggregate level. Moreover,
behavior-based price discrimination complements personalized pricing. With respect to the benchmark
case of uniform policy, 89% of the extra profit derives from the personalization of treatments, while 11%
derives from behavior-based discrimination in treatment policies.
Since it is able to observe user history, the e-commerce firm leverages its informational advantage and
exploits the heterogeneity of demand elasticity between these population segments in order to increase
profit. Counterfactual analysis shows that discriminating between treatment policies is profitable.
The last section combines the estimates of
demand primitives and the simulated pricing treatment algorithms of policy evaluation, with the aim
of evaluating the welfare implications on consumer and producer surplus under different treatment
policies.
1.6 Welfare Analysis and Fairness
Does the optimal treatment policy increase producer surplus relative to uniform pricing at the aggregate level? Moreover, how does behavior-based price discrimination interact with personalized pricing in
terms of welfare implications? And which segment of users benefits most from BBPD? By combining
the demand estimates in Section 1.4.4 and the causal estimates of counterfactual analysis in Section 1.5.2,
this last section addresses these research questions.
1.6.1 Aggregate Welfare Analysis
Given simulated demand and estimates of pricing policy derived in Section 1.5.2 for the combined treatment
space, producer surplus (PS) for a given policy function corresponds to firm profit:
() =
X
=1
Π
(), ℎ
(), ()
=
X
=1
(1 − 1{()>0})(
() +
ℎ
()). (1.6.1)
Figure 1.4a depicts the aggregate profit distributions of the uniform and optimal treatment policies for aggregate demand below €101. The red curve dominates the blue one, indicating that the optimal treatment policy
has a causal impact on both the extensive and the intensive margin of demand. Intuitively, the optimal treatment
algorithm targets those customers whose demand is sufficiently elastic at the margin so that the increase in revenue is strictly greater than the cost incurred by the firm to discount its prices. At the
aggregate level, the difference between the blue and red profit distributions reflects how the causal effect
on revenues net of expected cost varies. The mass of users characterized by a demand that is elastic to discounts
is concentrated at the peak of the profit distribution, which translates into greater gains in profits.
Consequently, personalized pricing increases the producer surplus compared to uniform pricing. In
Figure 1.4b profits are aggregated at the spending level, and the bars measure producer surplus. Since
the blue area is completely covered by the red one, the producer surplus with uniform prices is lower
than the producer surplus with customized prices. Uniform and optimal treatment producer welfare
are €165,897.5 and €205,978.5, respectively. Thus, the optimal treatment increases producer surplus by
24.16%, relative to uniform pricing:
$$\Delta PS = \frac{PS_{\text{optimal}} - PS_{\text{uniform}}}{PS_{\text{uniform}}}\% = +24.16\%.$$
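The percentage change can be verified directly from the two producer surplus figures reported in the text. The helper name below is hypothetical; the inputs are the values stated above.

```python
def pct_change(new, base):
    """Percentage variation of `new` relative to the benchmark `base`."""
    return 100.0 * (new - base) / base


# Producer surplus under uniform pricing and under the optimal treatment policy (EUR).
ps_uniform = 165_897.5
ps_optimal = 205_978.5
delta_ps = round(pct_change(ps_optimal, ps_uniform), 2)
print(delta_ps)  # 24.16
```

The same helper applies to any of the benchmark comparisons in this section, since they all express a policy's outcome relative to the uniform pricing counterfactual.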
Figure 1.4: Aggregate producer surplus
(a) Number of converted sessions by profit [x-axis: Profit (EUR); y-axis: Number of Sessions; series: Uniform, Optimal Treatment]
(b) Producer surplus by profit [x-axis: Profit (EUR); y-axis: Producer Surplus (EUR); series: Uniform, Optimal Treatment]
Note: Panel (a) shows the distribution of aggregate profit. Panel (b) displays the distribution of aggregate producer surplus by
profit levels. Sessions with gross order greater than €100 are not depicted. Blue plot refers to uniform policy (5,859 sessions
with €28.38 mean), red plot to optimal treatment policy (5,859 sessions with €35.21 mean). Figures A.10a and A.10b provide
additional details.
Consumer surplus (CS) is defined in (1.4.10), given estimates of aggregate demand parameters in
Table 1.1 and simulated demand and estimates of pricing policy derived in Section 1.5.2. Consumer
surplus is the sum of consumers’ indirect utility for a given policy function:
() =
X
=1
∗
(), ℎ
(),()
=
X
=1
ln(
() + 1) + ℎ
ℎ
ℎ
ln(ℎ
ℎ
() + 1) + 0
− (1 − 1{()>0})(
() +
ℎ
())
.
(1.6.2)
Figure 1.5a compares the indirect utility distribution under the uniform policy with that under the optimal treatment
policy. For non-treatment pricing, the blue plot illustrates the welfare generated from the demand of
only good l and the outside option x. By combining the demand of both good l and good h, the red curve
features two peaks, depicting how consumer surplus is generated with personalized pricing. The optimal
treatment policy shifts the indirect utility distribution rightward. While uniform pricing sets a unit
price for the e-commerce good, personalized pricing expands the price menu through targeted discounts
and bulk discounts. Some individuals access the interior good at a discounted price, whereas others
are incentivized to marginally increase their spending to benefit from the price reduction. Whether the
optimal treatment policy also increases consumer welfare relative to the uniform policy depends on
how the simulated demands translate into indirect utility estimates. The bars of Figure 1.5b stack the
consumer surplus for different levels of indirect utility. Graphically, the comparison between the red and
blue cumulative distributions measures how much personalized treatments change consumer welfare.
With consumer surplus equal to €3,898,264 under uniform pricing and €4,060,091.5 under the optimal treatment,
the optimal treatment increases consumer welfare by 4.15%:
$$\Delta CS = \frac{CS_{\text{optimal}} - CS_{\text{uniform}}}{CS_{\text{uniform}}}\% = +4.15\%.$$
Figure 1.5: Aggregate consumer surplus
(a) Number of converted sessions by indirect utility [x-axis: Indirect Utility (EUR); y-axis: Number of Sessions; series: Uniform, Optimal Treatment]
(b) Consumer surplus by indirect utility [x-axis: Indirect Utility (EUR); y-axis: Consumer Surplus (EUR); series: Uniform, Optimal Treatment]
Note: Panel (a) shows the distribution of aggregate indirect utility. Panel (b) displays the distribution of aggregate consumer
surplus by indirect utility levels. Sessions with gross order greater than €100 are not depicted. Blue plot refers to uniform policy
(74,830 sessions with €69.57 mean), red plot to optimal treatment policy (74,830 sessions with €80.13 mean). Figures A.10c and
A.10d provide additional details.
1.6.2 Welfare Analysis of Behavior-based Price Discrimination
According to Definition 1, behavior-based price discrimination consists of adopting the same policy
for all returning users (i.e., the conversion-targeted policy), while new individuals are assigned to a different
pricing policy (i.e., the optimal treatment policy).
For existing customers, the conversion-targeted policy maximizes the expected profit estimator, as
also corroborated by the profit distributions in Figure 1.6a, where the purple distribution corresponds
to the deviation from the aggregate optimal treatment policy that defines BBPD. Figure 1.6b reports that
the producer surplus associated with the uniform and optimal treatment policies is €29,553 and €37,127.5, respectively, while the conversion-targeted policy generates €41,290 of producer surplus.27 Therefore,
the optimal treatment policy increases producer surplus by 25.63% compared to the uniform scenario,
while behavior-based price discrimination raises profit by an additional 11.21%,
$$\Delta\widetilde{PS} = \frac{PS_{\text{conversion-targeted}} - PS_{\text{optimal}}}{PS_{\text{optimal}}}\% = +11.21\%. \tag{1.6.3}$$
27 In the Online Appendix, Figures A.11a and A.11b replicate the same analysis of producer surplus for new users. Producer
surplus generated by uniform and optimal treatment policies is €44,935 and €53,160, resulting in an 18.30% increase in producer
surplus.
As far as consumer surplus is concerned, the optimal treatment policy increases consumer surplus of
returning users (4.89%) more than the average, aggregate consumer change (4.15%).28 The indirect utility
distributions for both the optimal treatment and the conversion-targeted policy are almost identical.
The economic intuition is that, unlike uniform policy, both policies provide access to good l. Moreover,
Figure 1.6: Welfare analysis of behavior-based price discrimination
(a) Number of converted sessions by profit [x-axis: Profit (EUR); y-axis: Number of Sessions]
(b) Producer surplus by profit [x-axis: Profit (EUR); y-axis: Producer Surplus (EUR)]
(c) Number of converted sessions by indirect utility [x-axis: Indirect Utility (EUR); y-axis: Number of Sessions]
(d) Consumer surplus by indirect utility [x-axis: Indirect Utility (EUR); y-axis: Consumer Surplus (EUR)]
Series in all panels: Uniform, Optimal Treatment, BBPD.
Note: Panel (a) shows the distribution of profit of returning users. Panel (b) displays the distribution of producer surplus of
returning users by profit levels. Panel (c) illustrates the distribution of indirect utility of returning users. Panel (d) reports the
distribution of consumer surplus of returning users by indirect utility levels. For returning users, 10,991 sessions with gross
order below €100 are depicted. Blue plot refers to uniform policy, red plot to optimal treatment policy and purple plot to
conversion-targeted policy. Figures A.12a and A.12b provide additional details for producer surplus. Figures A.12c and A.12d
offer further graphical intuition for consumer surplus.
28 In the Online Appendix, Figures A.11c and A.11d reproduce the same analysis of consumer surplus for new users.
Consumer surplus generated by uniform and optimal treatment policies is €993,651.5 and €1,029,956.5, resulting in a 3.65%
increase in consumer surplus.
the targeting mechanisms of these two policies differ in how they account for the expected cost of treatment,
which is not part of the utility function and is therefore not reflected in distinct indirect utility distributions.
Figures 1.6c and 1.6d in fact underline that consumer surplus is not influenced by the e-commerce retailer
switching from the optimal treatment to the conversion-targeted policy, as follows
$$\Delta g = \frac{\ldots - \ldots}{\ldots}\,\% = +0.01\%. \qquad (1.6.4)$$
1.7 Conclusion
This paper investigates the welfare effects of behavior-based price discrimination. By developing a
pricing model that allows for counterfactual simulations, it shows that behavior-based personalized
discounts lead to a welfare improvement for both the e-commerce retailer and its customers.
Methodologically, the empirical analysis exploits field experimental data from a cosmetics e-commerce
website. Compared with observational data, the main advantage of using an RCT lies in the causal
estimation of heterogeneous treatment effects. By leveraging the exogenous treatment variation induced
by randomized trials, the structural demand model addresses demand heterogeneity and estimates consumers’
elasticity. Consequently, randomly assigned treatments are endogenized according to a pricing policy
model. The supply model develops an optimal pricing algorithm, which is first trained via supervised
machine learning and then used to simulate alternative pricing policies. Finally, counterfactual policy
analysis shows how consumer welfare and monopolist profitability vary under different price discrimination algorithms. Personalized price discrimination increases e-commerce profit by 24% and consumer
surplus by 4%, compared to a uniform price benchmark. Furthermore, past purchase history is informative for the monopolistic firm, since implementing behavior-based price discrimination improves
producer welfare by an additional 11%. While consumer surplus rises with personalized pricing, price
discrimination based on consumer purchase histories does not seem to harm loyal clients.
This paper argues that a digital company with monopolistic access to users’ online history adopts
behavior-based price discrimination as a demand segregation mechanism. To improve the efficiency
of personalized pricing algorithms, the e-commerce retailer assigns history-specific pricing policies to different
consumer segments. The results show that committing to a unique pricing policy is suboptimal for
a monopolist to whom users’ purchase records are available. In the case of first-degree imperfect price discrimination, the set of users’ information maps into discount treatments under an exogenously defined rule.
Instead, third-degree behavior-based price discrimination improves profit by inverting this relationship:
different market segments, defined according to purchase history, are assigned to different pricing policies,
which compute the profit-maximizing treatment for each individual given user covariates.
The policy implications are significant. The public debate concerning the fairness of tailored pricing
policies fails to consider to what extent consumers benefit from personalized discounts, in addition to
producer welfare consequences. Moreover, limiting the information that e-commerce retailers have
access to for personalized pricing purposes could be detrimental to consumer welfare. Compared to
uniform prices, both sides of the market benefit from behavior-based price discrimination, as it increases
total welfare, particularly for loyal customers. In conclusion, this paper extends the research on behavior-based
price discrimination. By presenting a novel definition of behavior-based price discrimination and
contributing to the empirical literature, this paper provides an original perspective on the current public
policy debate regarding e-commerce pricing strategies.
Chapter 2
Regret, Relief, Envy and Gloating: An
Experimental Investigation of
Counterfactual Emotions in Children
2.1 Introduction
In everyday decisions, people consider how the outcomes of their choices would have changed if an
event or an action in the past had turned out differently. We often find ourselves conjecturing about how
our lives would have gone if we had won the lottery, passed that university exam, or accepted another job
offer. For instance, imagine a student who, after a difficult meeting with his advisor, thinks
that if he had worked harder the meeting would have gone better. This student speculates about how the
meeting would have been different, and compares a hypothetical alternative to the experienced reality.
The human ability to reflect on states of the world and compare them to unrealized possibilities is
called counterfactual thinking. Being capable of imagining alternatives to real situations and comparing
reality to possible alternatives is a cognitive advantage which improves decision-making (Smallman &
Summerville, 2018), causal reasoning (Spellman & Gilbert, 2014), motivation (Dyczewski & Markman,
2012), and the capacity to learn from past mistakes (D. S. Weisberg & Gopnik, 2013). Counterfactual thoughts
are mental representations of alternatives to past events, states or actions which arise in counterfactual
thinking.
Related literature has identified two classes of counterfactuals. On the one hand, downward counterfactual thoughts are counterfactuals in which individuals consider alternatives worse than reality. On the
other hand, upward counterfactual thoughts are counterfactuals in which agents reflect on possibilities
better than the actual situation (Markman et al., 1993). Counterfactual thoughts induce emotional responses
to past actions or events depending on the nature of the counterfactuals. These are called counterfactual
emotions: relief is related to downward comparisons, while regret refers to upward comparisons. Intuitively, people experience regret when their actions lead to a state of the world which is less desirable
than what it could have been, while relief is the complement of regret, arising when the realized reality is more desirable than
what it could have been. Researchers have also identified other counterfactual emotions. For example,
counterfactual thinking may mediate emotional responses such as envy and gloating when counterfactual thoughts also consider how others would have fared in both real and imagined scenarios. In this
case, people experience envy when their actions lead to a state of the world in which the other agent is
relatively better off, whereas they gloat when their actions lead to a scenario in which they are better off
than other individuals.
In the last decade, research has primarily investigated the developmental trajectories of counterfactual
emotions in children (e.g., Beck & Riggs, 2014; Rafetseder & Perner, 2014; Guerini et al., 2020) and it has
addressed the relation between counterfactual thinking and counterfactual emotions such as relief and
regret (e.g., D. P. Weisberg & Beck, 2010, 2012; O’Connor et al., 2012; O’Connor et al., 2014; McCormack
& Feeney, 2015; McCormack et al., 2016) and envy and gloating (e.g., Shamay-Tsoory et al., 2007; Dvash
et al., 2010; Dijk, 2017; Santamaría-García et al., 2017). Thus far, a coherent picture of the development
of these counterfactual emotional responses has yet to be delineated. In particular, researchers have
attempted to pinpoint the age at which children first experience these emotions. Despite a recent surge
of scholarly interest in the development of counterfactual thinking, considerable disagreement still
surrounds the age at which counterfactual emotions arise in childhood.
In papers focusing on the development of counterfactual emotions in human decision-making, children are usually presented with a choice between two boxes to win stickers or a similar prize.
After one course of action is taken, pupils are asked to judge the outcome by taking the counterfactual
alternative into consideration. Supported by results indicating that counterfactual emotions are cognitively mediated by counterfactual reasoning (Guttentag & Ferrell, 2004; Coricelli et al., 2005), the majority
of these developmental studies rely on the assumption that children who do not reason counterfactually
do not experience regret and relief, or envy and gloating.
For instance, Van Duijvenvoorde et al. (2014) investigate the age differences in the counterfactual emotions of relief and regret. Four groups of children, ranging from 5 to 13 years of age, and a group
of young adults performed a choice task in which they experienced a regret situation (i.e., the chosen
option was worse than the alternative), a relief situation (i.e., the chosen option was better than the alternative) and a baseline situation where the chosen option was equal to the alternative. Results reveal that
both regret and relief emerged in childhood: children showed relief starting at
the age of 5, while 7-year-old pupils exhibited regret. In McCormack et al. (2016) the development of the
counterfactual emotions of regret and relief was examined in two task-driven experiments in which
children chose between two gambles with different risks. In terms of experimental design, this
work presents a methodological improvement over previous research as it addresses some questions
regarding the nature of participants’ choices and how emotion ratings are used to measure regret and
relief. Findings show that children as young as 6 or 7 years experienced counterfactual emotions in the
context of risky decision making. Similarly, in Guerini et al. (2020) children between the ages of 3 and 10 years
completed a Wheels of Fortune task in which they chose between two gambles of differing risk. Unlike
in most developmental experiments, children were in a position to feel responsible for the outcome they
obtained. The experimental design thus allowed a comparison of emotional responses between regret and
elation trials, and between relief and disappointment trials, depending on the information that children
received after making their choices. Children were shown to report negative emotional evaluations from
around 6 years of age, regardless of whether they compared their outcome to that missed by chance (partial feedback) or by choice (complete feedback). Such results corroborate the hypothesis that negative
counterfactual emotions like disappointment and regret emerge at around 6 years of age.
This paper presents an original analysis by combining the methodological improvements
in the developmental literature, a brand new experimental data set, and a parametric method of demand
estimation. It examines the onset of counterfactual emotions such as regret, relief, envy, and gloating in
childhood. To this end, the analysis implements an econometric estimation model to quantify the
impact of counterfactual emotions on decision-making and their evolution across age-specific groups.
The final goal is to uphold a number of previous findings by means of a different estimation approach.
The research leverages data collected in an experiment in which children from 3 to 11 years of age choose
between two gambles with different levels of risk under different conditions of information treatment
and comparison. The design of the experiment makes it possible to induce a specific emotion in each trial. Agents
who observe only the outcomes of their selected gamble would experience relief or regret, depending
on the magnitude of the results of that gamble. In trials with complete feedback, by contrast, children learn
about the outcomes of both gambles. Emotions of gloating and envy may then be induced according
to the magnitude of both gambles’ results. At the end of each trial, participants evaluate how they feel
by comparing the result obtained with the counterfactual result. This assessment consists of an explicit
measurement of counterfactual emotions.
The rest of the paper proceeds as follows. Section 2.2 discusses the experimental methodology and
data, while Section 2.3 presents the econometric analysis. Section 2.4 introduces the theoretical demand
model and reports the results of demand estimation. Section 2.5 concludes.
2.2 Experiment
Experimental Material. The experimental design follows that of Guerini et al. (2020). Experimenters
employ a few items to conduct an experiment resembling Wheels of Fortune tasks in which participants
choose between two gambles of different risk. On each trial, two transparent plastic boxes are placed
on the table in front of the participant. Each box has two sections and payoffs are represented by stacks
of cardboard tokens. A different number of tokens (or, equivalently, a stack of different height) is
inserted in each of the four sections of the two boxes. Sections of each box represent the potential outcomes of
specific gambles. The allocation of cardboard tokens between boxes varies, but on each trial it always
allows the boxes to be differentiated in terms of riskiness of the gamble. For instance, consider the case
in which the right box features three tokens in the left section and five tokens in the right section, and the
left box features zero tokens in the left section and eight tokens in the right section. In this case, the right
box represents a safe gamble, while the left box represents a risky gamble, since its outcomes are more widely spread. Over the trials, the potential
outcomes of both safe and risky gambles change, as well as expected values (Figure 2.1).
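As an illustration of how the two example boxes can be compared, the following minimal sketch computes the expected value and the variance of each gamble, assuming (as stated in the text) that the two outcomes of a gamble are equally likely. The function names are hypothetical, and using outcome variance as the measure of riskiness is an assumption made here for illustration, not a definition taken from the experiment.

```python
from statistics import mean, pvariance


def summarize_gamble(outcomes):
    """Return (expected value, variance) of a gamble with equally likely outcomes."""
    return mean(outcomes), pvariance(outcomes)


def riskier(gamble_a, gamble_b):
    """Return the gamble with the larger outcome variance.

    Assumption: riskiness is proxied by variance; ties return gamble_a.
    """
    _, var_a = summarize_gamble(gamble_a)
    _, var_b = summarize_gamble(gamble_b)
    return gamble_b if var_b > var_a else gamble_a


# Token counts from the example in the text, identified by content rather than position.
box_3_5 = (3, 5)
box_0_8 = (0, 8)

ev_35, var_35 = summarize_gamble(box_3_5)
ev_08, var_08 = summarize_gamble(box_0_8)
print(ev_35, var_35)  # expected value and variance of the (3, 5) box
print(ev_08, var_08)  # expected value and variance of the (0, 8) box
print(riskier(box_3_5, box_0_8))
```

In this example both boxes have the same expected value of four tokens, so they differ only in the spread of their outcomes: the box with zero and eight tokens is the riskier one.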
Outcomes of each gamble are equally likely. To guarantee this, a tablet is placed in front of
the transparent boxes. The screen shows two bisected circles, with each circle aligned with one of the
boxes and the two halves aligned with the two sections of each box. Each circle also features a centered
Figure 2.1: Four conditions of the Wheels of Fortune task
Note: On each trial, potential winnings were represented with cardboard tokens in boxes aligned with bisected wheels displayed
on a tablet. Children chose by touching one of the wheels on the tablet screen. A black box appeared around the chosen wheel.
Left panels represent partial feedback trials, in which only the outcome of the chosen wheel was revealed. This gives the
conditions for regret when an upward comparison is made between the obtained outcome and the unobtained outcome (the
other outcome on the same wheel, upper panel), or relief when a downward comparison is made (lower panel). Right panels
depict complete feedback trials, in which the outcomes of both the chosen and unchosen wheels were revealed. This provides
the conditions for gloating when a downward comparison is made between the obtained outcome and the unobtained outcome
(the outcome of the other wheel), or envy when an upward comparison is made. Figure by Guerini et al. (2020).
spinning arrow, replicating a wheel of fortune design. Each child makes a choice between the safe and
the risky gamble by tapping on the respective circle beneath the preferred box. The choice is marked by a
square appearing around the selected circle, and the outcome of each gamble is determined by the arrow
spinning and pointing to either the left or the right side of both boxes. Ties are not allowed, either between
the two sections of a box or between sections of different boxes. This feature is relevant as
participants are asked to evaluate their emotions at the end of each trial. They are provided with a five-point
pictorial Likert scale and they rate their emotions by selecting one of the five faces ranging from ‘very
sad’ to ‘very happy’.
Procedure. Researchers verify that all participants know how to use the rating scale appropriately.
Children are required to select the ‘very happy’, ‘happy’, ‘neither happy nor sad’, ‘sad’ and ‘very sad’
faces in a random order. Children who respond incorrectly are given an explanation of the faces and
what they represent by the experimenter, and then they are asked to point to each face again. As no child
fails to respond correctly after the explanation, all the tested subjects take part in the experiment. Each
participant is given an identification number.
Each child completes two sessions of five trials each, with one practice trial and four test trials. Thus,
the final sample consists of 194 participants and eight test trials each, for a total number of observations equal
to 1,552. In the first session the child is alone in the room with the experimenter, whereas in the second
session the child is paired with another participant who enters the room at the beginning of the
session. Importantly, in both sessions the child is the agent of choice: they choose one
of the two gambles and receive the outcome of the chosen gamble. The first session is characterized by
partial feedback: the child observes and receives the outcome of the chosen gamble only. The
second session is characterized by total feedback: the child observes the outcomes of both the chosen
and the not chosen gamble, and they are instructed that they receive the outcome of the chosen gamble,
whereas the paired participant obtains the outcome of the not chosen gamble. Furthermore, trials can
be described in terms of the comparison between the obtained and the not obtained outcome. An upward
comparison is a trial in which the outcome of the chosen gamble is lower than the not obtained outcome,
while a downward comparison is a trial in which the outcome of the chosen gamble is higher than the
not obtained outcome. The notion of not obtained outcome depends on the feedback each participant
is given. In partial feedback trials, the child observes the outcome of the chosen gamble only, so the
not obtained outcome refers to the potential outcome not selected by the spinning arrow in the chosen
gamble. In total feedback trials, the child observes the outcome of both the chosen and the not
chosen gamble. In this case, the not obtained outcome refers to the outcome selected by the spinning
arrow in the not chosen gamble, and received by the paired peer.
Specifying comparison trials and combining them with different feedback makes it possible to define the emotions that this study aims to investigate. When the child is given limited information, the emotional
response depends on the comparison between obtained and not obtained outcomes within the chosen
gamble. From the participants’ standpoint, the nature of the comparison, i.e., upward or downward, is stochastic
as the two outcomes are equally likely and the spinning arrow randomly selects the outcome. Participants would then perceive the nature of the comparison as determined by chance. Partial feedback trials are
associated with the emotions of relief and regret, for downward and upward comparisons, respectively.
Instead, when the child is given complete information, the emotional response relies on the comparison
between the obtained outcome of the chosen gamble and the outcome of the not chosen gamble. Because
the child knows that the not obtained outcome is given to the matched peer, the nature of the comparison
is affected by how children evaluate this further dynamic. Participants would then perceive the
nature of the comparison as dependent on their choices. Thus, total feedback trials are associated with the emotions
of gloating and envy, for downward and upward comparisons, respectively.
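The mapping from feedback condition and comparison direction to the induced emotion can be summarized in a short sketch. The function and label names are hypothetical; the label "complete" stands for what the text calls either complete or total feedback.

```python
# Emotion induced by each (feedback, comparison) pair, as described in the text.
EMOTIONS = {
    ("partial", "downward"): "relief",
    ("partial", "upward"): "regret",
    ("complete", "downward"): "gloating",
    ("complete", "upward"): "envy",
}


def classify_trial(feedback, obtained, not_obtained):
    """Return (comparison, emotion) for one trial.

    feedback: "partial" or "complete" (hypothetical labels).
    obtained: tokens won by the child.
    not_obtained: the other outcome of the chosen gamble (partial feedback)
        or the outcome of the not chosen gamble (complete feedback).
    """
    if obtained == not_obtained:
        # The design rules out ties, so a tie signals a data problem.
        raise ValueError("ties between outcomes are not allowed")
    # Upward comparison: the child obtained less than the alternative.
    comparison = "upward" if obtained < not_obtained else "downward"
    return comparison, EMOTIONS[(feedback, comparison)]


print(classify_trial("partial", 3, 5))   # ('upward', 'regret')
print(classify_trial("complete", 8, 3))  # ('downward', 'gloating')
```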
Sample. The experiment involves 194 subjects. Participants are children aged 3 to 11 years (median =
7 years; mean = 7.26 years; s.d. = 2.65 years). There are 77 males (median = 8 years; mean = 7.41
years, s.d. = 2.68 years) and 117 females (median = 7 years; mean = 7.16 years, s.d. = 2.63 years) of
middle economic status, Italian nationality and Caucasian ethnicity. Children are pupils of five different
kindergartens and elementary schools in the area of Trento, the most populated city of the Trentino-Alto
Adige region in Northern Italy (Table B.1). Schoolchildren take part in the experiment after their parents
have signed an informed consent form. The experiment is conducted between December 2012 and June 2013, and
six researchers collect the data. In each session participants are paired with a peer and tested in a quiet
room of their school. Each session lasts 20 minutes. The study received approval from the ethical
committee of the University of Trento.
2.3 Reduced-form Analysis
Descriptive Analysis. In each of the two practice trials and the eight test trials, all the 194 participants
were asked to choose one of the two gambles and to rate their emotions. They chose between the risky
and safe gamble, and a dummy variable was defined accordingly: 1 for the risky gamble, 0 for the safe
gamble. Preliminary investigations suggest that children’s risk preferences do not depend on gender, but
they evolve with age. The proportion of risky choices of male participants is on average equal to 0.47
(s.d. = 0.09), while for female participants it is equal to 0.46 (s.d. = 0.08). Risk propensity exhibits
an unsteady but decreasing trend with age for both male and female participants. Both patterns are
characterized by a large variation in the first four years under consideration (3 to 6 years old), whereas
children aged 6 years and older display more stable behavior in terms of propensity for risky choices
(Figure 1.1).
The proportion of risky choices is also investigated over age, conditional on both partial and total feedback.
In partial feedback, the propensity to risk in downward comparisons (relief) is on average equal to 0.45
(s.d. = 0.10), while in upward comparisons (regret) it is equal to 0.49 (s.d. = 0.05). In total feedback
instead, the propensity to risk in downward comparisons (gloating) is on average equal to 0.45 (s.d. =
0.16), while in upward comparisons (envy) it is equal to 0.44 (s.d. = 0.11). No pattern has a steady
linear trend. Nevertheless, it is interesting to notice that patterns related to positive emotions (relief and
gloating) begin to decrease between the age of 6 and 7 years. It is indeed possible to observe moments of
transition for both relief and gloating: in the first phase, from 3 to 6 years old, children are more risk seeking
than the average, while in the second phase, from 7 to 11 years old, they appear to be more risk averse than before.
By contrast, patterns related to negative emotions (regret and envy) start to rise between the age of
6 and 7 years. In this case, the regret pattern tends to fall for younger children and then rises around the
age of 6 and 7 years, while the propensity for risky choices in the envy case shows a very unstable,
upward direction (Figure B.2 and Figure B.3). At the end of each trial participants rated their emotions
through a five-point Likert scale. A discrete variable of emotion evaluation was coded from -2 for ‘very
sad’ to +2 for ‘very happy’. Preliminary descriptions suggest that children’s emotional evaluations do not
depend on gender and remain steady over the years. The magnitude of emotional evaluations of male
participants is on average equal to 0.39 (s.d. = 0.25), while for female participants it is equal to 0.49 (s.d. =
0.14). Emotional responses exhibit steady, mean-centered trends over time, suggesting no specific effect
of age on emotional ratings for both male and female participants (Figure B.4).
Emotional evaluation is analyzed over age conditional on both partial and total feedback. In partial
feedback, the emotional evaluation in downward comparisons (relief) is on average equal to 1.31 (s.d.
= 0.43), while in upward comparisons (regret) it is equal to -0.49 (s.d. = 0.80). The relief pattern presents an
upward trend from 3 to 6 years, followed by a stabilization around the average after the seventh year of
age. Conversely, the regret pattern reveals a downward trajectory in the first three years, followed by a
stabilization around -1. Considering the averages of emotional evaluation for both relief and regret, each pattern
crosses its respective average between the fifth and sixth year. In total feedback, the emotional evaluation in
downward comparisons (gloating) is on average equal to 1.51 (s.d. = 0.29), while in upward comparisons
(envy) it is equal to -0.52 (s.d. = 0.19). Both patterns exhibit a quite stable path around their averages, with
a drop displayed by the trend of envy after the seventh year. It is interesting to note that the magnitude
of emotional evaluation is not symmetric. Gloating evaluations range between 1 and 2 with an average
of 1.51, whereas envy evaluations fluctuate around an average of -0.52 (Figure B.5 and Figure B.6).
A further descriptive analysis depicts the trends of propensity for risky choices and emotional evaluations over age, conditional on both gender and task conditions. As far as the proportion of risky
choices is concerned, both partial and complete downward comparison cases seem to present a common
decreasing trend over the years. These patterns are not linear and apparently not gender specific, but it
can be observed that in both scenarios children older than 5 years show more risk-averse behavior
than younger participants. Conversely, both partial and complete upward comparison cases exhibit a
stochastic and unpredictable pattern. Male and female participants alike display unstable
risk aversion over the years, suggesting that task conditions in upward comparisons do not affect willingness to take risks in the experimental trials (Figure B.7). Regarding emotional evaluation, both partial and
complete downward comparison cases show a common increasing trend over the years. In particular,
male participants appear to raise their ratings, with a sharp increase in the first years and a subsequent
stabilization of the pattern after the age of 6 and 7 years. Older female participants also report higher emotional
responses; however, their trend is more mean centered compared to the behavior of their male
peers. On the contrary, partial and complete upward comparison cases offer two different tendencies.
In the regret condition, both male and female children decrease their emotional evaluations when they are 6
years old and older. Specifically, the effect of age on emotional evaluations appears clearly, as younger
participants reduce their ratings, while older participants maintain roughly the same average in emotional
ratings. In the envy condition, levels of emotional responses stay stable over age in both genders (Figure B.8).
2.3.1 Linear Mixed-effect Model of Emotion Rating
Findings of these preliminary analyses suggest that pupils exhibit a greater sensitivity in emotion
evaluations from around the age of 6 years. This sensitivity also plays a crucial role in children’s experience
of negative counterfactual emotions in upward comparisons. Previous papers also demonstrate that
children who experience these negative emotions are more likely to change their choice when they deal
with the same decision again (e.g., O’Connor et al., 2014; McCormack et al., 2019).
A linear mixed-effect model is thus implemented in the econometric analysis. Its purpose is to
estimate the impact of experimental feedback, gender, child age and risky choice on emotion evaluation
in both upward and downward comparison trials. In order to do so, the data set of test trials is divided
into two subsets of upward and downward comparisons to study regret and envy jointly, and relief and
gloating jointly. In each model the dependent variable is Evaluation, which is coded from -2 when the child selected
‘very sad’ to 2 when the child selected ‘very happy’. Explanatory variables are Feedback, coded as a binary
variable equal to -1 for partial feedback and 1 for complete feedback, Gender, coded as a binary variable
equal to -1 for female and 1 for male, and Age, which is mean centered. The interactions of these three
predictors are also included in the model specification. Since risky choices are observed to decrease with
age and because different choices lead to different trial outcomes, the control variable Choice is added at
the trial level and is coded as a binary variable equal to -1 for a safe choice and 1 for a risky choice.
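The coding scheme just described can be sketched as follows. The input field names and category labels are hypothetical, and centering Age on the mean of the rows passed to the function is an assumption, since the text does not specify whether the mean is taken over participants or over observations.

```python
from statistics import mean

# Effect codes taken from the text; the dictionary keys are hypothetical labels.
FEEDBACK_CODES = {"partial": -1, "complete": 1}
GENDER_CODES = {"female": -1, "male": 1}
CHOICE_CODES = {"safe": -1, "risky": 1}


def encode_trials(trials):
    """Return regression-ready rows from raw trial records.

    trials: list of dicts with keys "feedback", "gender", "age", "choice".
    Age is mean centered (assumption: centered on the mean of these rows).
    """
    mean_age = mean([t["age"] for t in trials])
    encoded = []
    for t in trials:
        encoded.append({
            "Feedback": FEEDBACK_CODES[t["feedback"]],
            "Gender": GENDER_CODES[t["gender"]],
            "Age": t["age"] - mean_age,
            "Choice": CHOICE_CODES[t["choice"]],
        })
    return encoded


# Made-up records, for illustration only.
raw = [
    {"feedback": "partial", "gender": "female", "age": 5, "choice": "safe"},
    {"feedback": "complete", "gender": "male", "age": 7, "choice": "risky"},
    {"feedback": "complete", "gender": "female", "age": 9, "choice": "safe"},
]
for row in encode_trials(raw):
    print(row)
```

With -1/1 coding, the coefficient on each binary predictor measures half the difference between the two categories, and the intercept refers to a child of average age averaged over the categories.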
Table B.2 and Table B.3 display model estimates for upward and downward comparisons, respectively.
In each table, the first column presents the estimates of an OLS linear regression model, while the second and
the third columns show the estimates of a linear mixed-effect model with a random effect, and with a random
effect and random slopes, respectively. Column (I) contains the estimates of a standard linear regression
model. These are the coefficients of the explanatory variables, namely the variables that are expected
to have a statistical effect on the dependent variable. More broadly, explanatory variables are also
called fixed effects. This econometric analysis aims at drawing conclusions about how Feedback, Gender,
Age, Choice, and their interactions statistically affect Evaluation. All of these are the so-called fixed
effects. By contrast, random effects are usually grouping factors which need to be controlled for. A random
effect model helps to control for unobserved heterogeneity when the heterogeneity is constant over
time and not correlated with the explanatory variables. In addition, it is worth noting that the data for
a random effect are a limited sample of all the possible realizations of the variable under analysis. Although
the sample of children considered in the experiment is limited, the goal is to generalize results to the
whole population. In this research, the variable identifying participants, coded by the subject’s identification
number, was included as a random effect. Column (II) presents the estimates of a linear mixed-effect
model with this random effect. A linear mixed-effect model with a random effect only fits
random intercepts: it allows the intercept to vary for each level of
the random effect, but keeps the slope constant among them. To improve the statistical analysis, a model that includes
both random intercepts and random slopes is also considered, with the variable Feedback defining the random
slope. Column (III) shows the estimates of this random-intercept and random-slope
model.
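For concreteness, the specification described above can be written out as follows. This is a sketch under stated assumptions: the notation (child index i, trial index t, coefficients β, random terms u) does not appear in the original text, and the distributional assumptions on the random terms are the standard ones for linear mixed-effect models rather than details reported here.

```latex
\begin{aligned}
\text{Evaluation}_{it} ={}& \beta_0
  + \beta_1\,\text{Feedback}_{it}
  + \beta_2\,\text{Gender}_{i}
  + \beta_3\,\text{Age}_{i} \\
 &+ \beta_4\,\text{Feedback}_{it}\,\text{Gender}_{i}
  + \beta_5\,\text{Feedback}_{it}\,\text{Age}_{i}
  + \beta_6\,\text{Gender}_{i}\,\text{Age}_{i} \\
 &+ \beta_7\,\text{Feedback}_{it}\,\text{Gender}_{i}\,\text{Age}_{i}
  + \beta_8\,\text{Choice}_{it} \\
 &+ u_{0i} + u_{1i}\,\text{Feedback}_{it} + \varepsilon_{it}
\end{aligned}
```

Under this notation, column (I) corresponds to setting both random terms to zero, column (II) keeps only the child-specific random intercept u_{0i}, and column (III) adds the child-specific random slope u_{1i} on Feedback.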
Upward Comparison: Regret and Envy. Table B.2 contains the estimates of the three models for upward
comparison trials, where the not obtained outcome was better than the obtained outcome. The analysis
focuses on the results for the linear mixed-effect model with random effect and random slopes in column
(III). Age and Choice show statistically significant effects. The economic insight is that emotion evaluations
become more negative as children grow up, and children feel worse when they make risky choices
rather than safe choices. This result is plausible, as children obtained fewer tokens when they made the
risky choice in upward comparison trials. Neither Feedback nor Gender is statistically significant. The
two-way interaction between Feedback and Age is statistically significant, while the other two
two-way interactions and the only three-way interaction are not statistically significant.
Furthermore, two simple metrics, for the experience of envy and regret respectively, are computed
to define the age at which participants first report negative counterfactual emotions after learning that
their obtained outcomes are worse than what they might have had. For the sample with complete and
partial feedback in upward comparison, participants’ emotional evaluations are entered into one-sample
t-tests to determine if participants’ ratings are less than 0 (‘neither happy nor sad’). Three, four and five
years old participants do not report feeling sad in upward comparison trails with complete feedback (
ps > 0.01). Whereas, children who are 6 years old and older children exhibit feeling negative emotions
in upward comparison (ps < 0.01 with marginal effect of 6 years old participants ps = 0.008). The same
analysis is performed for the experience of negative emotions with partial feedback. Again, participants
who are 7 years old or older show negative ratings in upward comparisons (ps < 0.01).
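As a minimal sketch of the age-of-onset test described above, the one-sample t statistic against 0 can be computed as follows. The ratings are hypothetical and are assumed to be coded from -2 to 2 with 0 as ‘neither happy nor sad’; the function name is illustrative, and neither comes from the experimental data.

```python
import math
import statistics

def one_sample_t(ratings, mu0=0.0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(ratings)
    mean = statistics.fmean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (n - 1 denominator)
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical ratings for one age group in upward comparison trials
ratings_age_7 = [-1, -2, 0, -1, -1, -2, 0, -1]
t_stat, df = one_sample_t(ratings_age_7)
print(f"t({df}) = {t_stat:.3f}")
```

A negative t statistic that is large in absolute value, compared against the chosen significance threshold, would indicate ratings below the neutral point for that age group.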
Downward Comparison: Relief and Gloating. Similarly, Table B.3 reports the estimates of the three
models for downward comparison trials, where the outcome not obtained was worse than the obtained
outcome. The analysis focuses on the results for the linear mixed-effect model with random effect and
random slopes in column (III). In this case, Feedback, Age and Choice present statistically significant effects.
The statistical finding is that children feel better when they have complete feedback compared to partial
feedback, and children’s emotion ratings become more positive with age. Also, pupils feel better when
they make risky choices rather than safe choices. Again, this finding is expected since children obtained
more tokens when they made the risky choice in downward comparison trials. Two two-way
interactions, between Feedback and Age and between Gender and Age, are statistically significant,
while the other two-way interactions and the only three-way interaction are not statistically significant.
In addition, the same two metrics, for gloating and relief, are computed to define the age at which
participants first report positive counterfactual emotions after learning that their obtained outcomes
are better than what they might have had. For the sample with complete and partial feedback in
downward comparison, children’s emotional evaluations are entered into one-sample t-tests to determine
if participants’ ratings are greater than 0. Three- and four-year-old participants do not report feeling happy
in downward comparison trials with complete feedback (ps > 0.01), whereas participants aged 5
and older report positive emotions in downward comparison (ps < 0.01). The same
analysis is performed for the experience of positive emotions with partial feedback. In this case all the
children, regardless of their age, show positive ratings in downward comparisons (ps < 0.01).
2.3.2 Logistic Mixed-effect Model of Choice Adaptation
A logistic mixed-effect model is also implemented. The purpose of this analysis is to investigate
the effect of emotion evaluations on choice adaptation in the next trial, namely whether positive and
negative ratings have an impact on the choice of children to select a safe or risky gamble if a safe or
risky gamble is chosen in the previous trial. In this model the dependent variable is Shift, which is
coded as a binary variable equal to 0 for repeated behavior (safe-safe or risky-risky) and 1 for shifted
behavior (safe-risky or risky-safe). Explanatory variables are Evaluation, five-point scale mean centered,
Feedback, coded -1 for partial feedback and 1 for complete feedback, Gender, coded -1 for female and
1 for male, Age, mean centered, and Choice, coded -1 for safe choice and 1 for risky choice. All these
explanatory variables are entered in a full-factorial design including all interactions. Moreover, the
participant variable, coded by subject identification number, is added as a random effect, and the
explanatory variables Evaluation, Feedback, Choice and their interactions define the random slopes of the
logistic mixed-effect model.
Table B.4 reports the model estimates as odds ratios, namely the exponential function of the regression
coefficients, with 95% confidence intervals. At the 5% significance level, Evaluation, Feedback, and
Age are statistically significant, together with the interaction terms Evaluation × Feedback, Feedback ×
Choice, and Age × Choice. None of the three- and four-variable interaction terms shows statistically
significant effects, thus they are not reported. The econometric interpretation of Table B.4 is the following. For
Evaluation, subjects reporting high emotion ratings are more likely to change their choices in the next trial
compared to those reporting low evaluations. In addition, this effect interacts with feedback (Evaluation
× Feedback). This interaction seems to suggest that after negative emotions, children are more likely to
shift their choice in the next trial if they receive complete feedback rather than partial feedback. For
Feedback, in trials with complete feedback children are more likely to change their choice in the next trial
than in trials with partial feedback. Moreover, this effect interacts with
choice (Feedback × Choice), suggesting that in trials with complete feedback children are more likely to
modify their choice in the next trial after selecting a risky choice rather than a safe choice, compared to
trials with partial feedback. For Age, older children are more likely to shift their choice in the next trial than
younger children. Furthermore, this effect interacts with choice at the current trial (Age × Choice), entailing
that older children are more likely to modify their choice in the next trial after selecting a risky choice rather
than a safe choice, compared to younger children. Instead, younger participants are more likely to repeat
their choice in the next trial after selecting a safe choice than a risky choice.
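Since Table B.4 reports odds ratios rather than raw coefficients, the conversion can be sketched as below. The coefficient and standard error are hypothetical values chosen for illustration, not estimates from Table B.4, and the Wald-type interval is an assumption about how the 95% confidence intervals could be built.

```python
import math

def odds_ratio_with_ci(coef, se, z=1.96):
    """Convert a logit coefficient and its standard error into an odds ratio
    with a Wald-type confidence interval (z = 1.96 for roughly 95%)."""
    point = math.exp(coef)
    lower = math.exp(coef - z * se)
    upper = math.exp(coef + z * se)
    return point, (lower, upper)

# Hypothetical coefficient and standard error for one regressor
odds, (ci_low, ci_high) = odds_ratio_with_ci(0.40, 0.15)
print(f"odds ratio = {odds:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

An odds ratio above 1 with an interval that excludes 1 corresponds to a regressor that raises the odds of shifting choice in the next trial at the 5% level.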
2.4 Discrete-choice Demand Model
In his seminal paper, S. T. Berry (1994) develops an estimation method in which individual utility
functional specification is framed within a discrete-choice demand framework. S. T. Berry (1994) presents
a demand estimation procedure which inverts the market share equation in order to compute the implied
mean level of utility for each good in each market, and allows for estimation by means of instrumental
variables. This section adopts the same estimation technique in a model which aims at estimating utility
parameters and at investigating counterfactual emotions’ development in children using experimental
data. It first discusses the model and its primitives, then addresses the estimation strategy and clarifies
the use of specific instrumental variables. Finally, the results of demand estimation are presented and
discussed.
2.4.1 Demand Model
Each trial of the experiment is characterized by either complete or partial feedback, together with
upward or downward comparison. Four markets are considered, labeled by $m = 1, \dots, M$, where
$M = 4$. The model assumes that a market is defined as the set of all the trials with the same feedback
and comparison, so that $N_m$ is the total market size. Therefore, it defines the Relief market (582 trials),
the Regret market (388 trials), the Gloating market (451 trials), and the Envy market (519 trials). In each
experimental trial the participant faces a discrete choice between two gambles, labeled by $j = 0, \dots, J$,
where $j = 0$ indicates the outside option and $J = 2$. Since each pair of gambles consists of a safe gamble
and a risky gamble, the child’s discrete choice can be seen as a decision-making process between the safe and
the risky gamble, thus $j \in \{0, s, r\}$.
The model assumes that each gamble has two salient features which are market and gamble specific,
namely they depend on the feedback and comparison treatments: expected value and gamble span.
Expected value is defined as
$$EV_{jm} = \frac{y_{jm} + y'_{jm}}{2}, \tag{2.4.1}$$
where $y_{jm}$ and $y'_{jm}$ represent the equally likely outcomes of the preferred gamble: the first is the randomly
obtained outcome, while the second is the outcome not obtained. Instead, gamble span measures the
dispersion of the two outcomes of each gamble and is defined as
$$S_{jm} = (y_{jm} - y'_{jm})^2. \tag{2.4.2}$$
It is worth noting that for two gambles with the same expected value, the safe gamble has a lower span
compared to the risky gamble. Conversely, for two gambles with the same span, the safe gamble has
a greater expected value compared to the risky gamble. Furthermore, the following index is defined,
which summarizes the observable characteristics of each gamble by means of expected value and gamble
span:
$$x_{jm} = \frac{EV_{jm}}{S_{jm}}, \tag{2.4.3}$$
where it can be noticed that $\partial x_{jm} / \partial EV_{jm} > 0$ and $\partial x_{jm} / \partial S_{jm} < 0$, namely $x_{jm}$ increases as expected value increases
or as gamble span decreases.
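A short sketch may help fix ideas on Equations 2.4.1 to 2.4.3. The token outcomes below are hypothetical and are not taken from the experimental design; the function name is likewise an illustrative assumption.

```python
def gamble_index(obtained, not_obtained):
    """Return (expected value, span, index) for a gamble with two equally
    likely outcomes, where index = expected value / span."""
    expected_value = (obtained + not_obtained) / 2
    span = (obtained - not_obtained) ** 2
    if span == 0:
        raise ValueError("index is undefined for a gamble with zero span")
    return expected_value, span, expected_value / span

# Hypothetical pair of gambles with the same expected value
safe = gamble_index(4, 2)   # outcomes close together
risky = gamble_index(6, 0)  # outcomes far apart
print(safe, risky)
```

With equal expected values, the safe gamble has the lower span and therefore the higher index, which matches the sign of the two partial derivatives stated above.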
The other crucial variable in each trial is the emotional rating $e_{jm}$. Because it is collected after the
participant observes the trial feedback and comparison, the evaluation $e_{jm}$ is both market and gamble
specific. The economic intuition is that children know the feedback of each trial because they observe
whether or not a peer is matched with them, but they do not know whether the realization of the chosen
gamble will be greater than the reference term, which depends on the feedback treatment. Therefore,
agents take into account the expectation of their emotion ratings in the valuation of each choice, before
the realization of the comparison treatment. After the comparison treatment is revealed, at the end of
each trial, pupils are asked to rate their emotions on a five-point Likert scale. A key assumption in this
model is that agents reveal their evaluations truthfully, that is, children have no incentive to report
untruthful ratings $e_{jm}$.
In a discrete choice demand model the primitives are the characteristics of the gamble and the preferences
of the agent. The utility of agent $i$ for gamble $j$ in market $m$ depends on both $x_{jm}$, which summarizes the
features of gamble $j$, and $e_{jm}$, which represents the emotion ratings. The experimental design allows
the experimenter to observe part of the gamble characteristics and the individual decision in each trial.
The utility of agent $i$ for gamble $j$ in market $m$ is defined as follows
$$u_{ijm} = \beta x_{jm} + \alpha e_{jm} + \xi_{jm} + \varepsilon_{ijm}, \tag{2.4.4}$$
where $x_{jm}$ and $e_{jm}$ are the observable characteristics of gamble $j$ in market $m$, whereas $\xi_{jm}$ and $\varepsilon_{ijm}$ are
the structural errors, which are both observed by agent $i$ but are not observed by the experimenter. Error
$\xi_{jm}$ can be interpreted as the mean of valuations among agents of gamble $j$’s unobserved characteristics,
while error $\varepsilon_{ijm}$ describes the distribution of unobserved agent-specific preferences. The term $\xi_{jm}$, which
represents the unobserved characteristics of each gamble $j$, might be correlated with both
observed regressors $x_{jm}$ and $e_{jm}$: this is exactly the cause of endogeneity in the estimation problem.
Finally, $\beta$ and $\alpha$ are the parameters object of estimation.
Of particular importance, the mean utility value of gamble $j$ represents the level of utility of gamble
$j$ that is common across all agents:
$$\delta_{jm} = \beta x_{jm} + \alpha e_{jm} + \xi_{jm}. \tag{2.4.5}$$
The mean utility level of the outside option $j = 0$ is normalized to zero, so that $\delta_{0m} = 0$ and thus $\xi_{0m} = 0$
$\forall m$. Furthermore, Equation 2.4.4 can be rewritten as
$$u_{ijm} = \delta_{jm} + \varepsilon_{ijm}, \tag{2.4.6}$$
which entails that the additive term $\varepsilon_{ijm}$ determines the variation in agent $i$’s preferences. A further
common assumption in the discrete choice literature consists in imposing that the error term $\varepsilon_{ijm}$ is independently
and identically distributed across agents and gambles according to a Generalized
Extreme Value distribution Type-I (TIEV), namely
$$\varepsilon_{ijm} \sim \text{TIEV}. \tag{2.4.7}$$
By defining the choice indicator as follows
$$d_{ijm} = \begin{cases} 1 & \text{if agent } i \text{ chooses gamble } j \\ 0 & \text{otherwise,} \end{cases} \tag{2.4.8}$$
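and given these assumptions, the agent-specific choice probability for gamble $j$ in market $m$ assumes the
canonical multinomial logit form as follows
$$P(d_{ijm} = 1 \mid \beta, \alpha, x_{j'm}, \xi_{j'm},\; j' = 1, \dots, J,\; m = 1, \dots, M) = \frac{\exp(\delta_{jm})}{\sum_{j'=0}^{J} \exp(\delta_{j'm})} = \frac{\exp(\delta_{jm})}{1 + \sum_{j'=1}^{J} \exp(\delta_{j'm})}. \tag{2.4.9}$$
At the aggregate level, observed market shares $\hat{s}_{jm}$ are imposed to be equal to $\tilde{s}_{jm}$, which are the predicted
market share functions, given values of parameters $\beta$ and $\alpha$, and the unobserved errors $(\xi_{1m}, \dots, \xi_{Jm})$:
$$\tilde{s}_{jm} = \frac{1}{N_m} \Big[ N_m \cdot P(d_{ijm} = 1 \mid \beta, \alpha, x_{j'm}, \xi_{j'm},\; j' = 1, \dots, J,\; m = 1, \dots, M) \Big] = \frac{\exp(\delta_{jm})}{1 + \sum_{j'=1}^{J} \exp(\delta_{j'm})}, \tag{2.4.10}$$
and
$$\hat{s}_{jm} \equiv \tilde{s}_{jm}(\delta_{0m}, \delta_{1m}, \dots, \delta_{Jm}) \equiv \tilde{s}_{jm}(\beta, \alpha, \xi_{1m}, \dots, \xi_{Jm}) \quad \forall j = 1, \dots, J. \tag{2.4.11}$$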
At this stage, the econometric model and the parameters provide the predicted market shares based
on Equation 2.4.10, while Equation 2.4.11 defines the estimation principle according to which the estimated
parameters $\beta$ and $\alpha$ rationalize the observed market shares by equating them to the predicted market shares. However, due
to the endogeneity problem generated by the unobserved error term $\xi_{jm}$, which is positively correlated
with the observed characteristics of gamble $j$, the estimation procedure cannot be performed by means
of nonlinear least squares. The reason is that, in order to compute the predicted market shares, the experimenter
needs to observe the error term $\xi_{jm}$, which is not observable by definition.
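The multinomial logit form of the predicted shares can be illustrated with a brief sketch. The mean utilities below are hypothetical, and the function name is an assumption introduced only for this example; the outside option has mean utility normalized to zero, so it contributes the 1 in the denominator.

```python
import math

def logit_shares(mean_utilities):
    """Return (outside share, list of inside shares) under the logit model,
    with the outside option's mean utility normalized to 0."""
    denominator = 1.0 + sum(math.exp(d) for d in mean_utilities)
    inside = [math.exp(d) / denominator for d in mean_utilities]
    return 1.0 / denominator, inside

# Hypothetical mean utilities of the safe and the risky gamble in one market
share_outside, (share_safe, share_risky) = logit_shares([1.2, 1.0])
print(share_outside, share_safe, share_risky)
```

The shares sum to one by construction, and the log ratio of any inside share to the outside share returns the corresponding mean utility.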
2.4.2 Estimation Strategy and Instrumental Variables
In order to overcome the issue of endogeneity and to obtain consistent estimates of the preference
parameters, the estimation follows the IV-based approach of S. T. Berry (1994). The
common assumption is that there exists a set of relevant and valid instrumental variables $z_{jm}$. In this case,
consistent estimates of the parameters are obtained by imposing a standard GMM-IV moment restriction,
namely
$$\frac{1}{J} \sum_{j=1}^{J} \xi_{jm} z_{jm} = \frac{1}{J} \sum_{j=1}^{J} (\delta_{jm} - \beta x_{jm} - \alpha e_{jm})\, z_{jm} = 0 \quad \forall m = 1, \dots, M, \tag{2.4.12}$$
which converges (as $J \to \infty$) to zero at the true values $\bar{\beta}$ and $\bar{\alpha}$. Given the limited data set in which
$J = 2$, the parameters $\beta$ and $\alpha$ are estimated by minimizing the sample moment conditions for each
market
$$g_{J,m}(\beta, \alpha) = \frac{1}{J} \sum_{j=1}^{J} (\delta_{jm} - \beta x_{jm} - \alpha e_{jm})\, z_{jm} \quad \forall m = 1, \dots, M, \tag{2.4.13}$$
which is used to estimate $(\beta, \alpha)$ by minimizing the quadratic norm in the sample moment functions:
$$\min_{(\beta, \alpha)} Q_{J,m}(\beta, \alpha) = [g_{J,m}(\beta, \alpha)]'\, W\, [g_{J,m}(\beta, \alpha)] \quad \forall m = 1, \dots, M, \tag{2.4.14}$$
where $W$ is a weighting matrix, which is assumed to be the identity matrix.
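In particular, the procedure implements a two-step estimation strategy. In the first step, it equates the
observed market shares $\hat{s}_{jm}$ to the model predicted market shares $\tilde{s}_{jm}(\delta_{0m}, \delta_{1m}, \dots, \delta_{Jm})$ for all gambles $j$
and markets $m$ under the normalization of the outside option $\delta_{0m} = 0$. This defines a system of nonlinear
equations in the unknowns $(\delta_{1m}, \dots, \delta_{Jm})$, specifically with $j \in \{0, s, r\}$:
$$\begin{cases} \hat{s}_{0m} = \tilde{s}_{0m}(\delta_{0m}, \delta_{sm}, \delta_{rm}) \\ \hat{s}_{sm} = \tilde{s}_{sm}(\delta_{0m}, \delta_{sm}, \delta_{rm}) \\ \hat{s}_{rm} = \tilde{s}_{rm}(\delta_{0m}, \delta_{sm}, \delta_{rm}), \end{cases} \tag{2.4.15}$$
which defines the following system:
$$\begin{cases} \hat{s}_{0m} = \dfrac{1}{1 + \exp(\delta_{sm}) + \exp(\delta_{rm})} \\[2ex] \hat{s}_{sm} = \dfrac{\exp(\delta_{sm})}{1 + \exp(\delta_{sm}) + \exp(\delta_{rm})} \\[2ex] \hat{s}_{rm} = \dfrac{\exp(\delta_{rm})}{1 + \exp(\delta_{sm}) + \exp(\delta_{rm})}. \end{cases} \tag{2.4.16}$$
In order to solve for $(\delta_{sm}, \delta_{rm})$, it is required to invert this system of equations by computing the
unknowns as a function of the observed market shares $(\hat{s}_{sm}, \hat{s}_{rm})$. The outcome of this first step is thus
a set of estimates of mean utility level, namely
$(\hat{\delta}_{sm} = \delta_{sm}(\hat{s}_{sm}, \hat{s}_{rm}),\; \hat{\delta}_{rm} = \delta_{rm}(\hat{s}_{sm}, \hat{s}_{rm}))$.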
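For the logit system above, the inversion has a closed form: dividing each inside share by the outside share gives the exponential of the mean utility, so each mean utility is the log of that share ratio. The sketch below uses hypothetical shares, not the shares of Table B.5, and assumes a strictly positive outside share, without which the logarithm is undefined.

```python
import math

def invert_shares(share_safe, share_risky):
    """Recover the mean utilities of the safe and risky gamble from observed
    shares, with the outside option's mean utility normalized to 0."""
    share_outside = 1.0 - share_safe - share_risky
    if share_outside <= 0:
        raise ValueError("the inversion needs a strictly positive outside share")
    delta_safe = math.log(share_safe / share_outside)
    delta_risky = math.log(share_risky / share_outside)
    return delta_safe, delta_risky

# Hypothetical shares for one market; the implied outside share is 0.10
delta_safe, delta_risky = invert_shares(0.50, 0.40)
print(delta_safe, delta_risky)
```

Plugging the recovered mean utilities back into the logit share formula reproduces the input shares, which is a convenient check on the first step.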
In the second step, because of the endogeneity which arises between unobserved errors $\xi_{jm}$ and
emotion ratings $e_{jm}$, a suitable instrument for evaluations is required. In discrete choice models which
consider exogenous characteristics, observable characteristics of other products in the same market (in
this case, the gamble not chosen) define a set of suitable instruments. The intuition why these variables
specify appropriate instruments is twofold. Evaluations $e_{jm}$ of agent $i$ depend on both the expected
realization of chosen and not chosen gambles, so emotion ratings should be affected also by the
observed characteristics of the gamble not chosen. However, as agent $i$ needs to select his preferred
gamble, observed characteristics of the gamble not chosen should not affect agent $i$’s valuation of the
chosen gamble. Thus, a set of suitable instruments is constructed as:
$$z_{jm} = x_{j'm} = \frac{EV_{j'm}}{S_{j'm}} = \frac{y_{j'm} + y'_{j'm}}{2\,(y_{j'm} - y'_{j'm})^2}, \tag{2.4.17}$$
where $j'$ represents the gamble not chosen when gamble $j$ is chosen. As $J = 2$, $x_{rm}$ is the suitable
instrument in any trial in which the safe gamble is chosen, whereas $x_{sm}$ is the suitable instrument
in any trial in which the risky gamble is chosen. Then, the analysis defines the standard GMM-IV
moment restrictions and calculates the sample moment conditions using the estimated $(\hat{\delta}_{sm}, \hat{\delta}_{rm})$ according
to Equation 2.4.14.
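The second step can be sketched as follows, under assumptions that go beyond what the text states. The gamble index is treated as exogenous and used as its own instrument, while the index of the gamble not chosen instruments the emotion rating, which gives two moment conditions for the two parameters. With an identity weighting matrix and a just-identified system, the quadratic form is minimized at zero, so the minimizer solves a 2-by-2 linear system. All numbers and names are hypothetical.

```python
def gmm_iv_two_gambles(delta, x, e, z):
    """Solve (1/J) * sum_j (delta_j - beta*x_j - alpha*e_j) * w_j = 0 for
    w in {x, z}. Returns (beta, alpha). Assumes x instruments itself and
    z instruments the emotion rating e."""
    J = len(delta)
    # Write the two moment conditions as A @ (beta, alpha) = b
    a11 = sum(x[j] * x[j] for j in range(J)) / J
    a12 = sum(e[j] * x[j] for j in range(J)) / J
    a21 = sum(x[j] * z[j] for j in range(J)) / J
    a22 = sum(e[j] * z[j] for j in range(J)) / J
    b1 = sum(delta[j] * x[j] for j in range(J)) / J
    b2 = sum(delta[j] * z[j] for j in range(J)) / J
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("moment system is singular; instruments are not informative")
    # Cramer's rule for the 2x2 system
    beta = (b1 * a22 - a12 * b2) / det
    alpha = (a11 * b2 - a21 * b1) / det
    return beta, alpha

# Hypothetical market: index x, rating e, instrument z (index of the other gamble)
x = [0.75, 0.25]
e = [1.0, -0.5]
z = [0.25, 0.75]
delta = [3.0 * x[j] - 2.0 * e[j] for j in range(2)]  # built with beta=3, alpha=-2
beta_hat, alpha_hat = gmm_iv_two_gambles(delta, x, e, z)
print(beta_hat, alpha_hat)
```

Because the hypothetical mean utilities are generated without any unobserved error, the routine returns the parameters used to build them; with real data the recovered values would also reflect the structural error.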
2.4.3 Demand Estimation
Results are collected in Table B.5. The parameter estimation procedure is implemented for each of the
four previously defined markets. Some markets have more trials than others: the envy market has about
35% more observations than the regret market, and the relief market has about 30% more trials than the gloating market.
It is assumed that these differences do not affect the outcomes of the estimation algorithm. All markets are
characterized by even market shares between safe and risky gambles. Indeed, $\hat{s}_{sm}$ and $\hat{s}_{rm}$ range between
0.49 and 0.54, and 0.45 and 0.50, respectively. Average values of emotional evaluations are alike in the
two downward comparison markets (i.e., relief and gloating) and in the two upward comparison markets
(i.e., regret and envy). Average magnitudes of the gamble-specific characteristic index $x_{jm}$ and its instrument
$x_{j'm}$ depend on the experimental design of gambles in each market. The first step of the estimation strategy derives
the estimates of the mean utility level of the two gamble types in each of the four markets. In the second step
the two parameters object of estimation are computed for each market.
Estimates are similar in the relief and gloating markets. In both downward comparisons, the estimates of
the coefficient of observable gamble characteristics ($\beta$) are equal, likewise the estimates of sensitivity to
emotional ratings ($\alpha$). This could support the hypothesis that feedback treatment does not play a role
in children’s decision-making process in downward comparisons. Specifically, both personal valuation of
gamble features and sensitivity to emotional evaluations appear to have the same weight in both markets.
As far as estimates in the regret and envy markets are concerned, it is interesting to notice that the estimated $\beta$ of
envy is three times greater than the estimated $\beta$ of regret. A within upward-treatment comparison indicates
that when children receive complete feedback they value the observable characteristics of the preferred gamble
three times more than when they receive partial feedback. Relative to the downward comparison
(3.32), the estimate of $\beta$ in partial feedback upward comparison is almost twice as great (5.10) and the estimate
in complete feedback upward comparison is four times as great (14.51). This result seems to suggest that
children, irrespective of their age, value gamble features more (twice and four times, respectively) in
upward comparisons than in downward comparisons. Something analogous can be inferred
from the estimates of $\alpha$ in the regret and envy markets. Sensitivity to emotional ratings for regret is about
25% greater than the one for envy, but in terms of magnitude with respect to downward comparison
markets, $\alpha$ of regret is almost three times greater than $\alpha$ of relief and $\alpha$ of envy is twice as great as $\alpha$
of gloating. Such a finding could uphold the economic intuition that participants were more sensitive to
negative emotions compared to positive emotions, at the aggregate level.
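The multiples quoted for the coefficient of gamble characteristics can be recomputed directly from the three estimates reported in the text (3.32, 5.10 and 14.51). The dictionary keys and variable names below are illustrative assumptions; the values are the ones quoted above, and the text rounds the resulting ratios loosely.

```python
# Estimates of the gamble-characteristics coefficient quoted in the text
beta_hat = {"downward": 3.32, "regret": 5.10, "envy": 14.51}

ratios = {
    "regret / downward": beta_hat["regret"] / beta_hat["downward"],
    "envy / downward": beta_hat["envy"] / beta_hat["downward"],
    "envy / regret": beta_hat["envy"] / beta_hat["regret"],
}
for label, value in ratios.items():
    print(f"{label}: {value:.2f}")
```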
In order to focus on the possible development of children’s parameters over age, the same estimation
procedure is performed conditional on age groups. Specifically, there are three age groups, namely 3 to
5 years old, 6 to 8 years old, and 9 to 11 years old children. The aim is to estimate both the coefficient of
observable gamble characteristics ($\beta$) and the sensitivity to emotional ratings ($\alpha$) over age groups. Table B.6
and Table B.7 collect the estimation results, which are discussed as follows. Estimates in both the
relief and gloating markets are similar in magnitude to the estimates of Table B.5. In both
downward comparisons, the estimates of the coefficient of gamble characteristics ($\beta$) are centered around
the estimated mean, namely 3.32 and 3.32, respectively. In the relief market, estimates of $\beta$ show a sharp
fall, with the younger group displaying a coefficient of gamble characteristics (8.42) three times greater
than the middle-age group (2.93) and almost twice as great as the older group (4.85). Conversely, in the gloating
market estimates of $\beta$ exhibit a steady upward trend, that is 3.74, 4.80 and 6.54, but still centered around
the mean estimate. Estimates of sensitivity to emotional ratings ($\alpha$) are characterized by a similar behavior.
Indeed, in the relief market the younger group shows a sensitivity parameter to emotional evaluations (15.60)
which is 50% greater than that of both 6 to 8 years old (9.94) and 9 to 11 years old children (10.49). An equivalent
drop, although of minor intensity, is also observed in the gloating market, in which $\alpha$ of younger participants
is equal to 11.94, while older children exhibit a sensitivity to emotional ratings equal to 8.72 and 9.87,
respectively. These results indicate that younger children are characterized by a higher level of sensitivity
to emotional ratings in downward comparison, which then decreases over childhood and remains
stable for the middle-age group and older group. Furthermore, it also seems that age groups are not
severely affected by the feedback treatments in downward comparisons in terms of both evaluation of
gamble characteristics and sensitivity to emotional ratings.
As far as the upward comparisons are concerned, the estimates of $\beta$ in partial feedback upward
comparison display an analogous net decrease compared to partial feedback downward comparison. In
this case, the evaluation of gamble characteristics of young children (23.31) is four times greater
than that of the middle-age group (5.07) and almost seven times greater than that of the older group (-3.06). Similarly to
what is observed in the gloating market, results on the development of $\beta$ in the envy market show a sharp growth, with
the younger group displaying a coefficient of gamble characteristics equal to 3.33 followed by a net
increase for the middle-age group (19.14) and the older group (17.34). Interestingly, estimates of sensitivity
to emotional ratings ($\alpha$) display a path which diverges from the estimates in downward comparisons,
with the main difference that in this case parameter $\alpha$ is negative. Thus, in the regret market the younger group
shows a sensitivity parameter to emotional evaluations (59.86) which is, in absolute terms, four times
greater than that of 6 to 8 years old (-13.05) and three times greater than that of 9 to 11 years old children (-23.13). It is
worth noticing that an equivalent drop, although of minor intensity, is also reported in the envy market,
in which $\alpha$ of younger participants is equal to -26.98, while middle-age children exhibit a sensitivity to
emotional ratings equal to -16.96, almost half with respect to 3 to 5 years old participants. Again, these
findings could support the idea that younger children are characterized by a higher level, in absolute terms,
of sensitivity to emotional ratings also in upward comparison, which then decreases over
childhood and remains stable for the middle-age group and older group. Moreover, it appears that
feedback treatments in upward comparisons condition age-group-specific estimates, both evaluation of
gamble characteristics and sensitivity to emotional ratings. Indeed, specific parameters greatly differ
within age group and among feedback treatments, suggesting there is an effect of the nature of feedback
in children’s decision-making process.
2.5 Discussion and Conclusion
In this work the development of counterfactual emotions such as regret, relief, envy and gloating in
childhood is at the center of a new empirical method of analysis. The goal of this research is to quantify
the effect of these emotions in the human decision-making process with the final purpose of corroborating
a number of early findings by means of a different estimation method. In particular, the analysis focuses
on an experimental data set collecting the choices of children between 3 and 11 years of age who perform
the Wheels of Fortune tasks with different risk. The design of the experiment allows for two types of trial:
trials with partial feedback, in which participants observe the outcomes of the preferred gamble only, and trials with
complete feedback, in which they observe the outcomes of both gambles. Counterfactual emotions are measured
as pupils need to compare their obtained outcome to an opposite outcome in order to rate how they feel.
Linear mixed-effects models analyze both upward and downward comparisons conditional on feedback treatment. In upward comparisons results show that emotion evaluations become more negative as
the children get older, and pupils feel worse when they take risky choices rather than safe choices. Furthermore, children appear to experience regret from 7 years of age (with partial feedback) and envy from
6 years of age (with complete feedback). In downward comparison, instead, findings reveal that children
feel better when they observe complete feedback, emotional ratings improve with age, and pupils report
better evaluations when they make risky choices rather than safe choices. Moreover, participants begin to
report gloating from 5 years of age (with complete feedback), while all children, irrespective of their age,
exhibit feeling relief (with partial feedback). In addition, a logistic mixed-effect model studies the effect
of emotion evaluations on choice adaptation. Statistical results reveal that, as they grow older,
children are more likely to shift their choice in the next trial, and older participants are also more likely
to modify their choice in the next trial after selecting a risky choice rather than a safe choice.
The second part of the research develops a micro-founded framework which rationalizes the decision-making process of children. It discusses the assumptions of the model and the economic intuitions behind
the implementation of the demand estimation technique by S. T. Berry (1994). Such a parametric estimation
method makes it possible to estimate the impact of counterfactual emotions on decision-making and their evolution
between age groups. Because of its IV-based estimation approach, the main advantage of the S. T. Berry (1994)
method is that it addresses the problem of endogeneity. Thus, a proper set of instrumental variables
guarantees consistent estimates of the parameters object of estimation, namely the coefficient of the index of observable
gamble characteristics ($\beta$, used as a control variable) and the sensitivity to emotional ratings ($\alpha$), which
is the objective of the estimation endeavor.
At the aggregate level, estimates suggest that children are more sensitive to negative emotions compared to positive emotions. Remarkably, the effects of feedback treatment have the same
relative magnitude in both types of comparison. In downward comparisons, the emotional sensitivity
of relief (partial feedback) is about 20% greater than the corresponding estimate of gloating (complete
feedback). Similarly, in upward comparisons, the effect of emotional ratings of regret is also 20% greater
than the estimated value of envy. The specific estimates of age groups reveal further insights. The focus
is mainly on estimates between the young age group (3 to 5 years) and the middle age group (6 to 8 years),
as I expect additional evidence in terms of development of counterfactual emotions. Results show that
younger participants give more weight to emotional ratings in their evaluation process irrespective of the nature
of comparison and feedback. Indeed, results reveal a relevant fall of emotional sensitivity equal to -40%
in the relief market and -25% in the gloating market, and a very sharp decrease of -105% and -40% in the regret
and envy markets, respectively. These findings suggest that children of the first two age groups evaluate
emotions unevenly. This could be explained by the development of counterfactual emotions in an age
range between 5 and 7 years. It could be argued that such a common drop results from the inability
of the youngest children to think counterfactually, while both middle age and old age groups develop counterfactual
reasoning which entails a consistent level of emotional sensitivity. Estimates in upward comparison
highlight how younger participants overvalue emotion ratings as they have not yet developed the brain
activity associated with the experience of regret and, in a social setting, of envy. Likewise, estimation
findings in downward comparison support the intuition that gloating emerges around 5 years of age, as the 3
to 5 years group exhibits a greater parameter of emotional sensitivity compared to the other two groups.
As far as relief is concerned, no evidence is found by means of one-sample t-tests regarding the age at
which relief emerges. However, it appears that younger participants overrate the emotion of relief, as children
of the middle and old group display the same sensitivity estimate, which is 50% lower than that of the younger
age group.
These results suggest that counterfactual emotions play a crucial role in children’s choice behavior
starting from around 6 years of age. More importantly, negative emotions such as regret and envy have a
greater effect on participants’ utility compared to positive emotions such as relief and gloating. Nevertheless,
this work has some flaws. The last part discusses three main limitations and directions for
future research. First, the demand model assumes that children report truthful emotional
ratings and that such evaluations are a direct measure of counterfactual emotions. It could be argued that
this assumption is questionable because it does not consider the possibility of strategic agents. Participants,
specifically older children in social contexts involving envy and gloating, might consider that disclosing
their emotions to the experimenter is not optimal, and they would report different emotional ratings.
One possible way to address this limitation could be to extend the theoretical model by assuming two types
of agents, naive and strategic, depending on whether or not they report accurate evaluations. Second,
S. T. Berry (1994) defines a methodology for estimating discrete-choice demand models using aggregate
data in order to measure agent-independent demand parameters. An improvement of the model could
be the hypothesis of a utility function with coefficients that vary across agents. This class of models
is called random coefficients logit models and is the main topic of S. Berry et al. (1995). Finally, the
experimental data set has some limitations both in terms of sample size and decision-making
framework. Specifically, observations by age group appear to be unevenly distributed and in some cases
the sample of a specific age is too limited to infer statistically significant outcomes. Furthermore, the
experiment was not specifically designed to implement such a demand estimation method, and thus the
results are not beyond question. Increasing the number of trials of each experimental session as well
as collecting further demographic control variables could improve the quality of the statistical analysis.
These are just a few ideas to provide better understanding of the incidence of counterfactual emotions
in human decision-making and to cast additional light on this topic of experimental and behavioral
economics.
Chapter 3
From Non-deterministic Preferences to
Reduced Value-based Neuronal Activity:
Stochastic Choices in Human Aging
3.1 Introduction
Decision theory postulates exogenous and deterministic preferences, entailing observable consistency
in the behavior of economic agents. However, evidence of stochastic choices calls for a refinement of
economic theory. We present two prominent and deeply connected models which, although belonging
to different fields, both investigate randomness in choices. In Microeconomics, the random utility model (RUM)
posits that choices are random because of evolving underlying preferences of economic agents. Instead,
in Neuroeconomics, the drift-diffusion model (DDM) proposes that signal noise arises in the comparison of
computed decision values of two options during a dynamic accumulation process. As behavioral predictions from the drift-diffusion paradigm corroborate insights of the theoretical framework, computational
neuroeconomic models provide a neurobiological foundation for the random utility general framework.
This essay extends the focus on stochastic preferences and the noisy processing of decision values to the
biological implications of human aging. As individuals age, neuronal circuits undergo substantial
decline. It is plausible that decay of brain systems contributes to alterations in economic behavior over the
adult lifespan. Nevertheless, it is also reasonable to deem that preferences evolve over the lifetime as social,
environmental and biological changes have the ability to affect personal tastes. This research discusses
the hypothesis that randomness of observed behavior in the elderly depends on two different sources which
cannot be easily disentangled. On the one hand, underlying preferences are subject to evolution over
time as specified by RUM. On the other hand, since aging affects those brain regions encoding decision-specific values, older adults exhibit a higher degree of bounded rationality due to a noisier signal in
the decision process accounted for by DDM. In support of this argument, this chapter reviews and provides
evidence from neuropsychological and neuroimaging studies related to the aging process in humans. Such a
hypothesis has remarkable implications regarding the evaluation of choice behavior. Random utility
models assume idiosyncratic shocks to the underlying preferences, thus stochastic variations cannot
necessarily be deemed welfare decreasing. Conversely, in drift-diffusion models, the comparison
between computed decision values contains an error term, which represents computational noise and does
not reflect changes in underlying preferences. Therefore, deterioration in the neuronal circuits supporting
decision values needs to be considered strictly welfare decreasing.
This essay first relates both RUM and DDM insights to empirical results from neuroscience with
a focus on human aging. The study offers a novel perspective which has the potential to shed light
on unresolved issues concerning aging and to inform the design of interventions directed at
older people. The final purpose of this research is twofold. First, it argues that neuroscientific findings
may contribute to the improvement of decision-making models. Second, it suggests that an integrated approach
between decision theory and neuroscience may help address future research questions.
3.2 Stochastic Preferences
Preferences lie at the basis of classical microeconomic theory. The objectives of decision makers
are summarized in their preference relation, the primitive characteristic of the economic agent. In
general, preferences are assumed to be rational, namely the preference relation possesses the properties of
completeness and transitivity. Rationality is a necessary condition for representing preference relations
by means of a utility function. A utility function assigns a numerical value to each alternative in the choice set, ordering
these elements according to the decision maker's preferences. By imposing the rationality axiom on preferences,
microeconomic theory analyzes the consequences of preference ordering on choice behavior through the
paradigm of utility maximization. Most economic models assume that decision makers behave as if they
were maximizing a latent utility function and that choices are the outcome of this mechanism (Arrow, 1959;
Samuelson, 1983).
The nexus between preferences and choices consists in imposing that choices are generated by
individual-specific latent preferences. While preferences are a primitive, hidden feature of the economic
agent, choices are observable. This intuition is at the basis of the weak axiom of revealed preferences, a
key condition which reflects the expectation that observed choices display a certain amount of consistency,
analogously to the rationality assumption on preferences (Samuelson, 1938). The theory of revealed preferences
asserts that observed (or revealed) choices are consistent with the maximization of an individual-specific
utility function as long as choices satisfy basic consistency conditions. This theory also assumes
exogenous and deterministic preferences and, in turn, consistent and steady revealed choices. Nevertheless,
the randomness of choice behavior observed in experimental economics has motivated the development of new
models capable of explaining stochastic decisions (Kahneman & Tversky, 1984; Rogers, 1994; Trostel &
Taylor, 2001). Random utility models are devised to study stochastic choice behavior within a utility
maximization framework (Becker, DeGroot, & Marschak, 1963; Block, 1974). In RUM, economic agents
maximize their utility function given rational preferences and, due to the evolving nature of preferences
over time, choices prove to be non-deterministic and at odds with the predictions of classical theory (D. McFadden
et al., 1973; D. L. McFadden, 2006; Luce, 2012). Random utility models of discrete choice are the
cornerstone of empirical research, as they provide an adaptive framework for relating empirical evidence
to stochastic choice behavior (D. McFadden & Train, 2000; D. McFadden, 2001). The analysis focuses on
the probability $P_i$ that an individual chooses alternative $i$ from a discrete choice set of $N$ items, indexed
$i = 1, \dots, N$. RUM postulates the existence of a vector of random variables $\mathbf{u}$, with elements
$u_i$, such that
$$P_i = \Pr\left( u_i > u_j, \ \forall j \neq i \right).$$
Conditions on $P_i$ determine whether observed behavior is consistent with the principle of utility maximization or, similarly, observed choices can be represented by vector $\mathbf{u}$ (Falmagne, 1978; D. L. McFadden,
2006). Among models of stochastic choice, random preference models are of particular relevance.
These assume that a choice is represented by a preference relation stochastically drawn from a set
consistent with some axioms (Gul & Pesendorfer, 2006). Under this interpretation, in each period
the agents first observe their preference realization, then compare the alternatives simultaneously and
select those that maximize utility. In this sense, the model explains stochastic choices by
allowing preferences to change over time across different realizations of $\mathbf{u}$.
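The choice probability defined above can be approximated by simulation. The sketch below is only illustrative and is not taken from the cited literature: the function name `rum_choice_probabilities`, the additive Gaussian form of the utility shocks, and the numerical values of the mean utilities are all assumptions made for the example. In each simulated period a new realization of the utility vector is drawn and the alternative with the highest realized utility is chosen.

```python
import random


def rum_choice_probabilities(mean_utilities, sigma=1.0, n_draws=5000, seed=0):
    """Monte Carlo estimate of Pr(u_i > u_j for all j != i) for each alternative i.

    Assumption: each utility is a fixed mean plus an independent Gaussian shock.
    """
    rng = random.Random(seed)
    counts = [0] * len(mean_utilities)
    for _ in range(n_draws):
        # One realization of the random utility vector (one "period").
        u = [m + rng.gauss(0.0, sigma) for m in mean_utilities]
        # The agent picks the alternative with the highest realized utility.
        chosen = max(range(len(u)), key=u.__getitem__)
        counts[chosen] += 1
    return [c / n_draws for c in counts]


# Hypothetical mean utilities for three alternatives.
probs = rum_choice_probabilities([1.0, 0.5, 0.0])
```

With no shocks (`sigma=0.0`) the first alternative is always chosen, which recovers the deterministic benchmark; with positive `sigma` every alternative is chosen with some probability, and alternatives with higher mean utility are chosen more often.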
3.3 Neuroeconomics of Stochastic Choices
As a discipline, Neuroeconomics investigates the neurobiological mechanisms behind human behavior.
Neuroeconomics has the potential to shed light on the neuronal basis of the processes underlying economic
decision-making. Corroborated by empirical evidence, neurocomputational models have posited that
the decay of the dopaminergic brain network may explain individual changes in economic behavior over the
lifetime. In particular, research has attempted to rationalize the relationship between decision deficits in
the elderly and the age-related decline of the dopaminergic system. This section first introduces a well-known class
of models in the neuroeconomic realm, namely drift-diffusion models, and then presents results from
the neuroscience literature which highlight the role played by dopamine in encoding value signals.
3.3.1 Bounded Rationality
In humans, economic behavior depends heavily on the modulation of different neurotransmitter
systems in the brain. Neurotransmitters are molecules of the nervous system which are involved in
the transmission of neuronal signals. They amplify, modulate and relay signals between neurons, and
different neurotransmitters serve different functions. For
instance, dopamine, serotonin and norepinephrine have been found to be closely related to economic
choices. Specifically, dopamine has been associated with adaptive decision-making and value-based learning
(Schultz, 2006; Doya, 2008).
In the neuroeconomic literature, computational models of discrete choice share a simple
paradigm in which brain networks compute a decision value for each option. In these terms, economic
behavior can be explained as a decision-making process based on decision-specific values, in which the
subject compares the values of the available options and chooses the option with the highest value
(Rangel, Camerer, & Montague, 2008; Weber & Johnson, 2009). A class of models
introduces the assumption of bounded rationality in economic agents. Subjects have a specified and
deterministic preference order over the set of alternatives, but they may not choose the
utility-maximizing alternative because of some form of bounded rationality.
In the domain of economic decision-making, the drift-diffusion model has proven to be an extremely
convenient tool. The classic paradigm proposes an evolving drift process which can be applied to
the comparison of decision values between two options (Ratcliff, 1978; Ratcliff & McKoon, 2008). A decision
between options $x$ and $y$ is made by dynamically computing a relative decision value signal (Fehr &
Rangel, 2011; Glimcher & Fehr, 2013). Typically, time is treated as a discrete variable, and the
process starts at zero and evolves according to the following rule:
$$R_{t+1} = R_t + \theta \left( v(x) - v(y) \right) + \varepsilon_t,$$
where $R_t$ is the signal at time $t$, $v(x)$ and $v(y)$ denote the decision values assigned to each option, $\theta$ measures
the speed of the process and $\varepsilon_t$ is an independent and identically distributed error term with variance
$\sigma^2$. The updating process proceeds until one of two symmetric thresholds is attained. The subject
selects $x$ if the upper limit is crossed and $y$ if the lower limit is crossed. Each boundary
acts as a pivotal benchmark for an option, so that the outcome of a decision is revealed when the decision
variable reaches one of the boundaries. Crucially, DDM incorporates noisy choices which depend
proportionally on the error variance $\sigma^2$. This characteristic derives from the stochastic nature of the relative
decision value signal. Moreover, the randomness of the relative decision value is a consequence of the
inherent stochasticity of neuronal activity.
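The accumulation rule described in this section can be sketched as a short simulation. The code below is an illustration under stated assumptions rather than the procedure of any cited study: the function names `simulate_ddm` and `share_choosing_x` are hypothetical, the error term is assumed to be Gaussian, and the decision values, speed parameter `theta`, noise level `sigma` and threshold `barrier` are arbitrary example values.

```python
import random


def simulate_ddm(v_x, v_y, theta=0.1, sigma=1.0, barrier=1.0,
                 max_steps=10_000, rng=None):
    """Simulate one drift-diffusion trial between options x and y.

    Returns (choice, steps); choice is "x", "y", or None if neither
    threshold is reached within max_steps.
    """
    if rng is None:
        rng = random.Random()
    signal = 0.0  # the relative decision value signal starts at zero
    for step in range(1, max_steps + 1):
        # Drift proportional to the value difference, plus an i.i.d.
        # error term (Gaussian by assumption).
        signal += theta * (v_x - v_y) + rng.gauss(0.0, sigma)
        if signal >= barrier:     # upper threshold crossed: choose x
            return "x", step
        if signal <= -barrier:    # lower threshold crossed: choose y
            return "y", step
    return None, max_steps


def share_choosing_x(v_x, v_y, sigma, n_trials=2000, seed=0):
    """Fraction of simulated trials that end with the choice of x."""
    rng = random.Random(seed)
    choices = [simulate_ddm(v_x, v_y, sigma=sigma, rng=rng)[0]
               for _ in range(n_trials)]
    return choices.count("x") / n_trials
```

For example, comparing `share_choosing_x(1.0, 0.0, sigma=0.2)` with `share_choosing_x(1.0, 0.0, sigma=1.0)` shows how a larger error variance makes the higher-valued option less likely to be chosen, even though the decision values themselves are unchanged.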
3.3.2 Noisy Neuronal Activity Encoding Value Signal
Most neurobiological studies on value-based choices have focused on characterizing the valuation
system of the human brain. Neuronal recording experiments have provided convincing evidence
regarding the processes underlying economic decision-making at the level of the dopaminergic system
and the role played by these brain circuits (Wise, 2004; Boorman, Rushworth, & Behrens, 2013; Strait,
Blanchard, & Hayden, 2014). Relevant neuroimaging research has identified brain networks, including
the ventromedial prefrontal cortex (vmPFC), the dorsolateral prefrontal cortex (DLPFC) and the posterior
parietal cortex (PPC), all involved in computing and comparing values. In particular, neuroimaging
studies have revealed that specific regions represent the subjective values of choice alternatives (Basten,
Biele, Heekeren, & Fiebach, 2010; Hare, Schultz, Camerer, O’Doherty, & Rangel, 2011).
Value coding is prominent in brain networks linked to action selection, such as the vmPFC and the PPC, but
also the ventral and dorsal striatum. For instance, a study has found that blood oxygen level-dependent
activity in the anterior insula, the DLPFC and the PPC was correlated with the aggregate neural activity
(Hare et al., 2011). Other studies have shown that action selection stimuli are associated with a surge
in dopamine release and, accordingly, that related brain networks are activated by the dopamine increase. In
particular, the ventral striatum and the vmPFC have been identified as the networks involved in the
neuronal representation of subjective values (Delgado, Nystrom, Fissell, Noll, & Fiez, 2000; Goto & Grace,
2005). Furthermore, neural recording experiments with humans have revealed that the ventral and dorsal
striatum are involved in comparing decision values. As both these regions of the brain are characterized
by massive dopaminergic projections, the striatum has also been associated with value-specific neural
activities (Schultz, Dayan, & Montague, 1997; O’Doherty, Dayan, Friston, Critchley, & Dolan, 2003).
Overall, adaptive decision-making in humans seems to be supported by different brain regions which
constitute the dopaminergic system. Specifically, the ventral and dorsal striatum in the basal ganglia, the
substantia nigra and the ventral tegmental area in the midbrain, and areas of the frontal cortex are all
involved in the production of dopamine and in value-based decision-making neuronal processes.
3.4 Dopamine Decrease in Human Aging
As humans age, neurotransmitter systems decline. The dopaminergic system similarly undergoes substantial
decline during aging, compromising the efficacy of specific functions of various neurotransmitters.
Throughout adult life, dopamine levels diminish by approximately 5-10% every decade (Eppinger, Hämmerer,
& Li, 2011). Supporting evidence of an age-related decline in dopamine production has been
provided by various studies (Inoue et al., 2001; Lindenberger et al., 2008). The striatal areas, which are
relevant to economic decision-making, are rich in dopamine receptors. Because dopamine receptors decline
severely in old age, research has observed a consequent cognitive deterioration (Bäckman, Nyberg, &
Farde, 2013). For example, research focusing on the relationship between dopamine neurotransmission
and aging has considered the caudate and the putamen, two nuclei of the striatal complex, which are
characterized by a dense dopaminergic innervation from the substantia nigra. Convincing results have
indicated an age-related decline in the nigrostriatal dopamine system (Mohr, Li, & Heekeren, 2010). The
striatum appears to be particularly affected by age, but other areas in the prefrontal cortex also seem to be
involved (Kaasinen & Rinne, 2002). Typically, this structural age-related deterioration is accompanied
by functional declines in the cognitive processes subserved by the affected neural circuits and
neurotransmitter systems. Accordingly, this neurocognitive decline plausibly interferes
with economic decision-making in older subjects. Under the DDM assumption of inherent stochasticity
of neuronal activity, the cognitive deterioration stems from less efficient information processing in
the dopaminergic system, which in turn supports less effective decision-making, as the number of dopamine
receptors declines with age.
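To give a sense of scale for the 5-10% per-decade figure, the short sketch below compounds a constant proportional decline over several decades. The compounding form, the function name `remaining_fraction` and the five-decade horizon are assumptions made purely for illustration; the cited studies do not necessarily imply a constant rate of decline.

```python
def remaining_fraction(rate_per_decade, decades):
    """Fraction of the initial level left after a constant proportional
    decline is compounded over the given number of decades (assumed form)."""
    return (1.0 - rate_per_decade) ** decades


# Five decades of adult life, e.g. from age 20 to age 70 (hypothetical horizon).
low_decline = remaining_fraction(0.05, 5)    # about 0.77
high_decline = remaining_fraction(0.10, 5)   # about 0.59
```

Under these assumptions, roughly 59% to 77% of the initial dopamine level would remain after five decades.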
3.5 Evidence of Age-related Changes in Choice Behavior
Economic preferences are likely to remain steady in the short term, but it is also reasonable to
assume that individual-specific preferences may evolve and related utility functions may change in the
long term. The research hypothesis argues that, on average, older individuals display a similar evolution
of preferences, which differs from that of younger individuals and which empirically results in
different choice behaviors at the aggregate level between the two groups. Thus, this review summarizes
neuroscientific empirical evidence in order to corroborate the research thesis. In particular, age is a
descriptive variable which accounts for variations that may cause shifts in economic choice behavior
over the adult lifespan. However, the review also presents experimental findings that are
consistent with the hypothesis of dopamine decrease in cognitive aging, according to which the
cognitive impairments that accompany healthy aging are caused by a simultaneous decline in dopamine
production. A broad body of experimental literature has provided supporting results,
both through evidence directly related to neurological measurements of the dopaminergic system and
through different experimental designs that yield inferred empirical evidence.
The purpose is to deliver a convincing argument showing that differences in human behavior
between younger and older adults derive from a mixture of time-evolving preferences and increasing
stochasticity of neuronal activity due to the age-related dopamine drop. To do so, the chapter divides this
critical review into three sections, based on three main branches of research: learning processes over time,
intertemporal choice differences and age-related risk propensity.
3.5.1 Learning During Lifetime
In their learning process, humans acquire information from both positive and negative feedback. A
first branch of investigation has focused on the increased sensitivity to negative outcomes for older adults
in the learning process.
Evidence for the hypothesis that individuals tend to change their preferences comes from a study in
which younger and older adults were tested in a probabilistic selection task sensitive to dopaminergic
function. Older participants revealed an increased tendency to focus on negative outcomes, whereas
younger participants did not show this negative bias (Frank & Kong, 2008). Further support is provided by
experimental studies involving the Iowa gambling task. For example, one study observed that, at the beginning
of the game, both younger and older participants tended to select the bad card decks (negative feedback).
However, the learning process seemed to develop differently depending on age, suggesting opposite
preferences in the two groups. While the younger individuals gradually shifted towards the good
card decks (positive feedback) as the task progressed, the older individuals did not exhibit this change,
suggesting a tendency among the elderly to focus mainly on negative outcomes (Denburg, Tranel, & Bechara,
2005). Another study used a controlled experiment to investigate probabilistic reinforcement
learning from both positive and negative stimuli in older individuals. The evidence revealed a twofold
result which is consistent with the dopamine hypothesis of cognitive impairment. In negative feedback
learning, the elderly took more time due to noisy decision-making, while in positive feedback learning,
the choices of the elderly were sub-optimal because of an imbalance in learning from positive and negative
prediction errors (Sojitra, Lerner, Petok, & Gluck, 2018). Further evidence comes from a recent
paper which analyzes the brain activation patterns of younger and older participants. Experimental
findings indicated that hippocampal circuits supporting learning decline more than striatal circuits
in healthy aging, thus suggesting that the decline in hippocampal learning signals may be an important
predictor of deficits in learning-based decisions for older adults (Lighthall, Pearson, Huettel, & Cabeza,
2018). Other works have investigated which brain circuits are involved in learning experiments when
age-related differences are considered. For instance, two related studies focused on the role of frontostriatal
pathways in mediating the influence of age on probabilistic learning. Using diffusion tensor imaging
in a learning task, the first study revealed that white matter integrity in thalamocortical and corticostriatal
paths statistically accounts for age differences in learning. In the second study, instead, the authors collected
functional neuroimaging data and found significant reductions in the frontostriatal representation of
prediction errors in older individuals (Samanez-Larkin, Levens, Perry, Dougherty, & Knutson, 2012;
Samanez-Larkin, Worthy, Mata, McClure, & Knutson, 2014). These results further corroborate the
relevance of the dopaminergic system and related brain circuits in learning tasks for investigating
the age-related dopamine drop.
3.5.2 Intertemporal Choices
Individual differences in discounting behavior have been hypothesized to be related to differences in
dopamine function. However, differences in underlying preferences may also affect the way individuals
weigh the effect of time in intertemporal decision-making. A second line of research has focused on the
relationship between age and changes in delay discounting in decision contexts.
A recent work investigated the effects of age-related decay on cerebral activation during anticipation in
a monetary incentive delay task. Researchers found that age is associated with a decreasing cerebral
response to the anticipation of gains and an increasing brain response to the anticipation of losses. These results are
consistent with the hypothesis of an age-related constriction in sensitivity to the magnitude of monetary
stimuli (Dhingra et al., 2020). Substantial evidence that economic behavior changes over the adult lifespan
is also offered by experimental studies on delay discounting. In an experiment
involving a delay discounting task, three groups of individuals, i.e., pre-teens, young adults and older
adults, were tested and researchers observed a positive correlation between discount rate and age,
meaning that older adults exhibit stronger preferences for immediate rewards (Green, Fry, & Myerson,
1994). Another interesting paper attempted to identify how intertemporal choices evolve with
age. Stochastic preferences seem to generate a U-shaped relationship between age and
delay discounting: older adults discount more than younger ones, and middle-aged
adults discount less than both groups (Read & Read, 2004). Other papers have addressed different
issues related to intertemporal decision-making at the neuronal level. Two studies address how the subjective
value (SV) associated with activity in the medial prefrontal cortex is affected across adulthood. In the
first experiment, it was observed that better performance in the decision-making task is correlated with
stronger SV-associated activation in the canonical subjective valuation network. This suggested that a
reduced representation of value in the brain, also driven by increased neural noise, relates to sub-optimal
decision-making for older adults (Halfmann, Hedgcock, Kable, & Denburg, 2016). The second
experiment, instead, was designed to characterize the sensitivity to effort, probability and time in monetary
decisions across adulthood under functional magnetic resonance imaging (fMRI). Researchers observed
that SV was associated with activity in the medial prefrontal cortex across all the tasks performed. They
found evidence of age-related differences in performance across the tasks at the level of neural
representations of subjective value (Seaman et al., 2018).
3.5.3 Age-related Risk Propensity
Finally, a third line of research has examined the effects of risk and uncertainty in adaptive
decision-making. Several studies have provided empirical evidence supporting the hypothesis that economic
behavior under risk and the decline in the dopaminergic system are related. However, direct
dopamine measurements have been lacking, and most of the evidence that choice
behavior differs based on participants’ age is indirect.
Scholars are almost unanimous in affirming that risk-taking behavior decreases over the adult lifespan.
For instance, an experiment employing a gambling task has supported this result. The researchers observed
that the gambling strategy of older participants was significantly more conservative than that of
younger individuals. This revealed that older adults exhibited less risky behavior, thus suggesting that
age has a significant effect on the willingness to bear uncertainty (Deakin, Aitken, Robbins, & Sahakian,
2004). In another study, older individuals revealed both stronger preferences for certain gains and
stronger aversion to sure losses compared to younger individuals, corroborating the hypothesis that
older adults are more susceptible to the certainty effect as a result of age-related evolving preferences
(Mather et al., 2012). Scholars have also sought to test whether younger and older individuals
display different degrees of risk aversion in the domain of gains and of risk seeking in the domain of
losses. Supporting evidence indicates both that older adults are more risk averse than younger adults
when deciding between two potential gains, and that older adults are less risk averse than younger
adults when deciding between two potential losses (Lauriola & Levin, 2001). A more recent study
reached similar conclusions in an experiment designed to analyze decision-making across the life
span by measuring risk attitudes, ambiguity attitudes and choice consistency in the domains of gains and losses.
It was observed that, when risk is clearly specified, the decision-making performance of the elderly diverges
more from risk neutrality than that of any other age group. In the gain domain, older individuals
take fewer risks than younger individuals, whereas in the loss domain, older individuals are more
risk-seeking than younger individuals (Tymula, Rosenberg Belmaker, Ruderman, Glimcher, & Levy,
2013). Focusing on both age-based differences in decision-making and neural activity, a further work
reported fMRI evidence revealing age differences in corticostriatal regions during decision-making in
risky gambles. The authors observed age-related biases toward positively skewed risky tasks, providing
indirect empirical inferences which are consistent with the assumption that dopamine is involved in
risky behavior. Furthermore, the authors concluded that the preference of older adults for positively skewed gambles
also corroborates the hypothesis of group-specific preferences in risky choices across the two age groups
(Seaman, Leong, Wu, Knutson, & Samanez-Larkin, 2017).
3.6 Conclusion
This essay argues that differences in choice behavior observed among age-specific groups depend on
two conceptually different, yet connected factors. On the one hand, it claims that the decision-making of
younger and older adults exhibits, at the aggregate level, different characteristics because of the evolution
of underlying preferences. Drawing on neuroscientific results, it supports the thesis that, for
instance, risk aversion increases over the lifespan or that older adults show stronger preferences for
immediate rewards than younger peers in intertemporal decision contexts. On the other hand, this
work presents direct and indirect evidence corroborating the hypothesis that the randomness of choices
in older individuals is a consequence of the inherent stochasticity of neuronal activity and the age-related
deterioration of the dopaminergic system. In particular, functional neuroimaging data provide
compelling evidence of a significant decline in the brain circuits involved in the representation of value
signals.
The review also contextualizes the two sources of behavioral heterogeneity through two established
models, namely the random utility model and the drift-diffusion model, which provide the theoretical
framework to analyze stochastic preferences and the neuronal binary comparison of decision values, respectively.
The combination of behavioral and neural findings demonstrates that differences in age-related behavioral
outcomes depend on the combined effects of stochastic preferences and a subsequent noisy error
term. The implications of this result are notably relevant from an economic standpoint for the
evaluation of individual decision-making. The stochastic nature of underlying preferences cannot
necessarily be deemed welfare decreasing. Conversely, the comparison of decision values is affected by an
error term which represents computational noise. This error increases as neuronal structures deteriorate and it
does not reflect an actual shift in the underlying preferences. Hence, the worsening of the neuronal circuits
supporting decision values needs to be considered strictly welfare decreasing.
Recent studies have expanded the frontier of research on these topics. For instance, the so-called
neuronal random utility model (NRUM) has been proposed, which extends the aspects of RUM to
neuronal observables, including both the maximization of stochastic decision variables and the
possibility that heterogeneity in these variables includes information for choice prediction (R. Webb, Levy,
Lazzaro, Rutledge, & Glimcher, 2019). Another work related decision times to stochastic
choice behavior in a RUM framework. It demonstrated that a RUM can be derived from the general class
of bounded accumulation models and that the resulting distribution of random utility depends on response
time (R. Webb, 2019). A recent experiment divided stochastic choice into three broad classes of models:
models of random utility, of bounded rationality and of deliberate randomization. The experiment was
conducted with the goal of shedding light on the origin of stochastic choices (Agranov & Ortoleva, 2017).
Despite these recent developments, this chapter offers a novel perspective in the research debate related
to the aging process and provides an original contribution to the current literature. The further purpose
of this work is to show that Neuroeconomics has already led to important findings regarding
decision-making brain processes and our understanding of economic behavior. Therefore, this work may exemplify
a research paradigm that combines economic theory and the neuroeconomic approach.
References
Agranov, M., & Ortoleva, P. (2017). Stochastic choice and preferences for randomization. Journal of
Political Economy, 125(1), 40–68.
Angrist, J. D., & Imbens, G. W. (1995). Two-stage least squares estimation of average causal effects
in models with variable treatment intensity. Journal of the American statistical Association, 90(430),
431–442.
Arora, N., Dreze, X., Ghose, A., Hess, J. D., Iyengar, R., Jing, B., . . . others (2008). Putting one-to-one
marketing to work: Personalization, customization, and choice. Marketing Letters, 19, 305–321.
Arrow, K. J. (1959). Rational choice functions and orderings. Economica, 26(102), 121–127.
Athey, S., & Imbens, G. W. (2017). The econometrics of randomized experiments. In Handbook of economic
field experiments (Vol. 1, pp. 73–140). Elsevier.
Bäckman, L., Nyberg, L., & Farde, L. (2013). Dopamine and cognitive aging: A strong relationship. In
Progress in psychological science around the world. volume 1 neural, cognitive and developmental issues.
(pp. 455–469). Psychology Press.
Baik, A., Anderson, S., & Larson, N. (2023). Price discrimination in the information age: Prices,
poaching, and privacy with personalized targeted discounts. The Review of Economic Studies, 90(5),
2085–2115.
Bailey, P. E., & Leon, T. (2019). A systematic review and meta-analysis of age-related differences in trust.
Psychology and aging, 34(5), 674.
Bailey, P. E., Petridis, K., McLennan, S. N., Ruffman, T., & Rendell, P. G. (2019). Age-related preservation
of trust following minor transgressions. The Journals of Gerontology: Series B, 74(1), 74–81.
Bailey, P. E., Slessor, G., Rieger, M., Rendell, P. G., Moustafa, A. A., & Ruffman, T. (2015). Trust and
trustworthiness in young and older adults. Psychology and aging, 30(4), 977.
Bailey, P. E., Szczap, P., McLennan, S. N., Slessor, G., Ruffman, T., & Rendell, P. G. (2016). Age-related
similarities and differences in first impressions of trustworthiness. Cognition and Emotion, 30(5),
1017–1026.
Basten, U., Biele, G., Heekeren, H. R., & Fiebach, C. J. (2010). How the brain integrates costs and benefits
during decision making. Proceedings of the National Academy of Sciences, 107(50), 21767–21772.
Bavard, S., Rustichini, A., & Palminteri, S. (2020). The construction and deconstruction of sub-optimal
preferences through range-adapting reinforcement learning. bioRxiv, 2020–07.
Beck, S. R., & Riggs, K. J. (2014). Developing thoughts about what might have been. Child development
perspectives, 8(3), 175–179.
Becker, G. M., DeGroot, M. H., & Marschak, J. (1963). Stochastic models of choice behavior. Behavioral
science, 8(1), 41–55.
Bell, R., Giang, T., Mund, I., & Buchner, A. (2013). Memory for reputational trait information: Is
social–emotional information processing less flexible in old age? Psychology and aging, 28(4), 984.
Belleflamme, P., & Peitz, M. (2015). Industrial organization: markets and strategies. Cambridge University
Press.
Bergemann, D., Brooks, B., & Morris, S. (2015). The limits of price discrimination. American Economic
Review, 105(3), 921–957.
Bergemann, D., Castro, F., & Weintraub, G. (2022). Third-degree price discrimination versus uniform
pricing. Games and Economic Behavior, 131, 275–291.
Berger, J. (2013). Statistical decision theory: foundations, concepts, and methods. Springer Science & Business
Media.
Berry, S., Levinsohn, J., & Pakes, A. (1995). Automobile prices in market equilibrium. Econometrica,
63(4), 841–890.
Berry, S. T. (1994). Estimating discrete-choice models of product differentiation. The RAND Journal of
Economics, 242–262.
Bhat, C. R. (2008). The multiple discrete-continuous extreme value (mdcev) model: role of utility
function parameters, identification considerations, and model extensions. Transportation Research
Part B: Methodological, 42(3), 274–303.
Block, H. D. (1974). Random orderings and stochastic theories of responses (1960). In Economic information,
decision, and prediction: Selected essays: Volume i part i economics of decision (pp. 172–217). Springer.
Bonnefon, J.-F., Hopfensitz, A., & De Neys, W. (2017). Can we detect cooperators by looking at their face?
Current Directions in Psychological Science, 26(3), 276–281.
Boorman, E. D., Rushworth, M. F., & Behrens, T. E. (2013). Ventromedial prefrontal and anterior cingulate
cortex adopt choice and default reference frames during sequential multi-alternative choice. Journal
of neuroscience, 33(6), 2242–2253.
Breiman, L. (1996a). Arcing classifiers (Tech. Rep.). Technical report, University of California, Department
of Statistics.
Breiman, L. (1996b). Bagging predictors. Machine learning, 24, 123–140.
Brokesova, Z., Deck, C., & Peliova, J. (2014). Experimenting with purchase history based price discrimination. International Journal of Industrial Organization, 37, 229–237.
Caillaud, B., & De Nijs, R. (2014). Strategic loyalty reward in dynamic price discrimination. Marketing
Science, 33(5), 725–742.
Carroni, E. (2018). Behaviour-based price discrimination with cross-group externalities. Journal of
Economics, 125(2), 137–157.
Cassidy, B. S., Boucher, K. L., Lanie, S. T., & Krendl, A. C. (2019). Age effects on trustworthiness activation
and trust biases in face perception. The Journals of Gerontology: Series B, 74(1), 87–92.
Castle, E., Eisenberger, N. I., Seeman, T. E., Moons, W. G., Boggero, I. A., Grinblatt, M. S., & Taylor, S. E.
(2012). Neural and behavioral bases of age differences in perceptions of trust. Proceedings of the
National Academy of Sciences, 109(51), 20848–20852.
Castrellon, J. J., Seaman, K. L., Crawford, J. L., Young, J. S., Smith, C. T., Dang, L. C., . . . Samanez-Larkin,
G. R. (2019). Individual differences in dopamine are associated with reward discounting in clinical
groups but not in healthy adults. Journal of Neuroscience, 39(2), 321–332.
Chandra, S., Verma, S., Lim, W. M., Kumar, S., & Donthu, N. (2022). Personalization in personalized
marketing: Trends and ways forward. Psychology & Marketing, 39(8), 1529–1562.
Chen, Y., & Pearcy, J. (2010). Dynamic pricing: when to entice brand switching and when to reward
consumer loyalty. The RAND Journal of Economics, 41(4), 674–685.
Chernozhukov, V., Demirer, M., Duflo, E., & Fernandez-Val, I. (2018). Generic machine learning inference
on heterogeneous treatment effects in randomized experiments, with an application to immunization in india
(Tech. Rep.). National Bureau of Economic Research.
Choe, C., King, S., & Matsushima, N. (2018). Pricing with cookies: Behavior-based price discrimination
and spatial competition. Management Science, 64(12), 5669–5687.
Chu, Y., Yang, H.-K., & Peng, W.-C. (2019). Predicting online user purchase behavior based on browsing
history. In 2019 ieee 35th international conference on data engineering workshops (icdew) (pp. 185–192).
Colombo, S. (2016). Imperfect behavior-based price discrimination. Journal of Economics & Management
Strategy, 25(3), 563–583.
Colombo, S. (2018). Behavior-and characteristic-based price discrimination. Journal of Economics &
Management Strategy, 27(2), 237–250.
Coricelli, G., Critchley, H. D., Joffily, M., O’Doherty, J. P., Sirigu, A., & Dolan, R. J. (2005). Regret and its
avoidance: a neuroimaging study of choice behavior. Nature neuroscience, 8(9), 1255–1262.
Cowan, S. (2012). Third-degree price discrimination and consumer surplus. The Journal of Industrial
Economics, 60(2), 333–345.
Daniel, R., Radulescu, A., & Niv, Y. (2020). Intact reinforcement learning but impaired attentional
control during multidimensional probabilistic learning in older adults. Journal of Neuroscience,
40(5), 1084–1096.
Deakin, J., Aitken, M., Robbins, T., & Sahakian, B. J. (2004). Risk taking during decision-making in normal
volunteers changes with age. Journal of the International Neuropsychological Society, 10(4), 590–598.
De Cornière, A., Mantovani, A., & Shekhar, S. (2023). Third-degree price discrimination in two-sided
markets.
Delgado, M. R., Nystrom, L. E., Fissell, C., Noll, D., & Fiez, J. A. (2000). Tracking the hemodynamic
responses to reward and punishment in the striatum. Journal of neurophysiology, 84(6), 3072–3077.
Denburg, N. L., Tranel, D., & Bechara, A. (2005). The ability to decide advantageously declines prematurely in some normal older persons. Neuropsychologia, 43(7), 1099–1106.
Dhingra, I., Zhang, S., Zhornitsky, S., Le, T. M., Wang, W., Chao, H. H., . . . Li, C.-S. R. (2020). The
effects of age on reward magnitude processing in the monetary incentive delay task. Neuroimage,
207, 116368.
Dijk, O. (2017). For whom does social comparison induce risk-taking? Theory and Decision, 82(4), 519–541.
Doya, K. (2008). Modulators of decision making. Nature neuroscience, 11(4), 410–416.
Dubé, J.-P. (2019). Microeconometric models of consumer demand. In Handbook of the economics of
marketing (Vol. 1, pp. 1–68). Elsevier.
Dubé, J.-P., & Misra, S. (2023). Personalized pricing and consumer welfare. Journal of Political Economy,
131(1), 131–189.
Dvash, J., Gilam, G., Ben-Ze’ev, A., Hendler, T., & Shamay-Tsoory, S. G. (2010). The envious brain: the
neural basis of social comparison. Human brain mapping, 31(11), 1741–1750.
Dyczewski, E. A., & Markman, K. D. (2012). General attainability beliefs moderate the motivational
effects of counterfactual thinking. Journal of Experimental Social Psychology, 48(5), 1217–1220.
Dzhelyova, M., Perrett, D. I., & Jentzsch, I. (2012). Temporal dynamics of trustworthiness perception.
Brain research, 1435, 81–90.
Eppinger, B., Hämmerer, D., & Li, S.-C. (2011). Neuromodulation of reward-based learning and decision
making in human aging. Annals of the New York Academy of Sciences, 1235(1), 1–17.
Esteves, R. B. (2009). A survey on the economics of behaviour-based price discrimination (Tech. Rep.). NIPE-Universidade do Minho.
Esteves, R.-B. (2010). Pricing with customer recognition. International Journal of Industrial Organization,
28(6), 669–681.
Esteves, R.-B. (2014). Price discrimination with private and imperfect information. The Scandinavian
Journal of Economics, 116(3), 766–796.
Esteves, R.-B., Liu, Q., & Shuai, J. (2022). Behavior-based price discrimination with nonuniform distribution of consumer preferences. Journal of Economics & Management Strategy, 31(2), 324–355.
Esteves, R.-B., & Reggiani, C. (2014). Elasticity of demand and behaviour-based price discrimination.
International Journal of Industrial Organization, 32, 46–56.
Esteves, R. B., & Shuai, J. (2023). Behavior-based price discrimination with a general demand. Available
at SSRN 4403885.
Falmagne, J.-C. (1978). A representation theorem for finite random scale systems. Journal of Mathematical
Psychology, 18(1), 52–72.
Fehr, E., & Rangel, A. (2011). Neuroeconomic foundations of economic choice—recent advances. Journal
of Economic Perspectives, 25(4), 3–30.
Fera, F., Weickert, T. W., Goldberg, T. E., Tessitore, A., Hariri, A., Das, S., . . . others (2005). Neural
mechanisms underlying probabilistic category learning in normal aging. Journal of Neuroscience,
25(49), 11340–11348.
Fouragnan, E., Chierchia, G., Greiner, S., Neveu, R., Avesani, P., & Coricelli, G. (2013). Reputational
priors magnify striatal responses to violations of trust. Journal of Neuroscience, 33(8), 3602–3611.
Frank, M. J., & Kong, L. (2008). Learning to avoid in older age. Psychology and aging, 23(2), 392.
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an
application to boosting. Journal of computer and system sciences, 55(1), 119–139.
Friedman, J. H. (2002). Stochastic gradient boosting. Computational statistics & data analysis, 38(4), 367–378.
Fudenberg, D., & Tirole, J. (2000). Customer poaching and brand switching. RAND Journal of Economics,
634–657.
Fudenberg, D., & Villas-Boas, J. M. (2006). Behavior-based price discrimination and customer recognition.
Handbook on economics and information systems, 1, 377–436.
Fudenberg, D., & Villas-Boas, J. M. (2012). Price discrimination in the digital economy. The Oxford
handbook of the digital economy, 254.
Gehrig, T., Shy, O., & Stenbacka, R. (2012). A welfare evaluation of history-based price discrimination.
Journal of Industry, Competition and Trade, 12, 373–393.
Glimcher, P. W., & Fehr, E. (2013). Neuroeconomics: Decision making and the brain. Academic Press.
Gordon, B. R., Zettelmeyer, F., Bhargava, N., & Chapsky, D. (2019). A comparison of approaches to
advertising measurement: Evidence from big field experiments at facebook. Marketing Science,
38(2), 193–225.
Gosnell, G. K., List, J. A., & Metcalfe, R. D. (2020). The impact of management practices on employee
productivity: A field experiment with airline captains. Journal of Political Economy, 128(4), 1195–
1233.
Goto, Y., & Grace, A. A. (2005). Dopaminergic modulation of limbic and cortical drive of nucleus
accumbens in goal-directed behavior. Nature neuroscience, 8(6), 805–812.
Green, L., Fry, A. F., & Myerson, J. (1994). Discounting of delayed rewards: A life-span comparison.
Psychological science, 5(1), 33–36.
Greiner, B., & Zednik, A. (2019). Trust and age: An experiment with current and former students.
Economics Letters, 181, 37–39.
Grubb, M. A., Tymula, A., Gilaie-Dotan, S., Glimcher, P. W., & Levy, I. (2016). Neuroanatomy accounts
for age-related changes in risk preferences. Nature communications, 7(1), 13822.
Guerini, R., FitzGibbon, L., & Coricelli, G. (2020). The role of agency in regret and relief in 3-to 10-year-old
children. Journal of Economic Behavior & Organization, 179, 797–806.
Gul, F., & Pesendorfer, W. (2006). Random expected utility. Econometrica, 74(1), 121–146.
Guo, Z., Han, S., Wang, X., Wang, S., Xu, Y., Liu, S., & Zhang, L. (2021). The relationship between the
positivity effect and facial-cue based trustworthiness evaluations in older adults. Current Psychology,
40, 5801–5810.
Guttentag, R., & Ferrell, J. (2004). Reality compared with its alternatives: age differences in judgments
of regret and relief. Developmental psychology, 40(5), 764.
Haas, B. W., Ishak, A., Anderson, I. W., & Filkowski, M. M. (2015). The tendency to trust is reflected in
human brain structure. NeuroImage, 107, 175–181.
Halfmann, K., Hedgcock, W., & Denburg, N. L. (2013). Age-related differences in discounting future
gains and losses. Journal of Neuroscience, Psychology, and Economics, 6(1), 42.
Halfmann, K., Hedgcock, W., Kable, J., & Denburg, N. L. (2016). Individual differences in the neural
signature of subjective value among older adults. Social Cognitive and Affective Neuroscience, 11(7),
1111–1120.
Han, S., & Kim, H. (2019). On the optimal size of candidate feature set in random forest. Applied Sciences,
9(5), 898.
Hanemann, W. M. (1984). Discrete/continuous models of consumer demand. Econometrica: Journal of
the Econometric Society, 541–561.
Hare, T. A., Schultz, W., Camerer, C. F., O’Doherty, J. P., & Rangel, A. (2011). Transformation of stimulus
value signals into motor commands during simple choice. Proceedings of the National Academy of
Sciences, 108(44), 18120–18125.
Hitsch, G. J., Hortacsu, A., & Lin, X. (2021). Prices and promotions in us retail markets. Quantitative
Marketing and Economics, 19(3), 289–368.
Hitsch, G. J., Misra, S., & Zhang, W. (2023). Heterogeneous treatment effects and optimal targeting policy
evaluation. Available at SSRN 3111957.
Ho, T. K. (1995). Random decision forests. In Proceedings of 3rd international conference on document analysis
and recognition (Vol. 1, pp. 278–282).
Horvitz, D. G., & Thompson, D. J. (1952). A generalization of sampling without replacement from a finite
universe. Journal of the American statistical Association, 47(260), 663–685.
Hospedales, T., Antoniou, A., Micaelli, P., & Storkey, A. (2021). Meta-learning in neural networks: A
survey. IEEE transactions on pattern analysis and machine intelligence, 44(9), 5149–5169.
Huang, Y., Wood, S., Berger, D., & Hanoch, Y. (2013). Risky choice in younger versus older adults:
Affective context matters. Judgment and Decision Making, 8(2), 179–187.
Huber, M. (2014). Identifying causal mechanisms (primarily) based on inverse probability weighting.
Journal of Applied Econometrics, 29(6), 920–943.
Inoue, M., Suhara, T., Sudo, Y., Okubo, Y., Yasuno, F., Kishimoto, T., . . . Tanada, S. (2001). Age-related
reduction of extrastriatal dopamine d2 receptor measured by pet. Life sciences, 69(9), 1079–1084.
Kaasinen, V., & Rinne, J. O. (2002). Functional imaging studies of dopamine system and cognition in
normal aging and parkinson’s disease. Neuroscience & Biobehavioral Reviews, 26(7), 785–793.
Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American psychologist, 39(4), 341.
Kallus, N., & Zhou, A. (2021). Fairness, welfare, and equity in personalized pricing. In Proceedings of the
2021 acm conference on fairness, accountability, and transparency (pp. 296–314).
Kim, C., Smith, A. N., Kim, J., & Allenby, G. M. (2023). Outside good utility and substitution patterns in
direct utility models. Journal of choice modelling, 49, 100447.
Kotsiantis, S. B. (2014). Bagging and boosting variants for handling classifications problems: a survey.
The Knowledge Engineering Review, 29(1), 78–100.
Lauriola, M., & Levin, I. P. (2001). Personality traits and risky decision-making in a controlled experimental task: An exploratory study. Personality and individual differences, 31(2), 215–226.
Lee, S., & Allenby, G. M. (2014). Modeling indivisible demand. Marketing Science, 33(3), 364–381.
Lerner, I., Sojitra, R., & Gluck, M. (2018). How age affects reinforcement learning. Aging (Albany NY),
10(12), 3630.
Li, T., & Fung, H. H. (2013). Age differences in trust: An investigation across 38 countries. Journals of
Gerontology Series B: Psychological Sciences and Social Sciences, 68(3), 347–355.
Lighthall, N. R., Pearson, J. M., Huettel, S. A., & Cabeza, R. (2018). Feedback-based learning in aging:
Contributions and trajectories of change in striatal and hippocampal systems. Journal of Neuroscience,
38(39), 8453–8462.
Lindenberger, U., Nagel, I. E., Chicherio, C., Li, S.-C., Heekeren, H. R., & Bäckman, L. (2008). Age-related decline in brain resources modulates genetic effects on cognitive functioning. Frontiers in
neuroscience, 2, 399.
Liu, X. (2023). Dynamic coupon targeting using batch deep reinforcement learning: An application to
livestream shopping. Marketing Science, 42(4), 637–658.
Luce, R. D. (2012). Individual choice behavior: A theoretical analysis. Courier Corporation.
Mahmood, A. (2014). How do customer characteristics impact behavior-based price discrimination? an
experimental investigation. Journal of Strategic Marketing, 22(6), 530–547.
Mamerow, L., Frey, R., & Mata, R. (2016). Risk taking across the life span: A comparison of self-report
and behavioral measures of risk taking. Psychology and aging, 31(7), 711.
Markman, K. D., Gavanski, I., Sherman, S. J., & McMullen, M. N. (1993). The mental simulation of better
and worse possible worlds. Journal of experimental social psychology, 29(1), 87–109.
Mata, R., Josef, A. K., Samanez-Larkin, G. R., & Hertwig, R. (2011). Age differences in risky choice: A
meta-analysis. Annals of the new York Academy of Sciences, 1235(1), 18–29.
Mather, M., Mazar, N., Gorlick, M. A., Lighthall, N. R., Burgeno, J., Schoeke, A., & Ariely, D. (2012).
Risk preferences and aging: The “certainty effect” in older adults’ decision making. Psychology and
aging, 27(4), 801.
McCormack, T., & Feeney, A. (2015). The development of the experience and anticipation of regret.
Cognition and Emotion, 29(2), 266–280.
McCormack, T., O’Connor, E., Cherry, J., Beck, S. R., & Feeney, A. (2019). Experiencing regret about
a choice helps children learn to delay gratification. Journal of Experimental Child Psychology, 179,
162–175.
McCormack, T., O’Connor, E., Beck, S., & Feeney, A. (2016). The development of regret and relief about
the outcomes of risky decisions. Journal of experimental child psychology, 148, 1–19.
McFadden, D. (2001). Economic choices. American economic review, 91(3), 351–378.
McFadden, D., et al. (1973). Conditional logit analysis of qualitative choice behavior.
McFadden, D., & Train, K. (2000). Mixed mnl models for discrete response. Journal of applied Econometrics,
15(5), 447–470.
McFadden, D. L. (2006). Revealed stochastic preference: a synthesis. In Rationality and equilibrium: A
symposium in honor of marcel k. richter (pp. 1–20).
Mienye, I. D., & Sun, Y. (2022). A survey of ensemble learning: Concepts, algorithms, applications, and
prospects. IEEE Access, 10, 99129–99149.
Miller, A., & Hosanagar, K. (2020). Personalized discount targeting with causal machine learning.
Mohr, P. N., Li, S.-C., & Heekeren, H. R. (2010). Neuroeconomics and aging: neuromodulation of
economic decision making in old age. Neuroscience & Biobehavioral Reviews, 34(5), 678–688.
O’Connor, E., McCormack, T., & Feeney, A. (2014). Do children who experience regret make better
decisions? a developmental study of the behavioral consequences of regret. Child development,
85(5), 1995–2010.
O’Doherty, J. P., Dayan, P., Friston, K., Critchley, H., & Dolan, R. J. (2003). Temporal difference models
and reward-related learning in the human brain. Neuron, 38(2), 329–337.
Opitz, D., & Maclin, R. (1999). Popular ensemble methods: An empirical study. Journal of artificial
intelligence research, 11, 169–198.
Oza, N. C., & Tumer, K. (2008). Classifier ensembles: Select real-world applications. Information fusion,
9(1), 4–20.
O’Connor, E., McCormack, T., & Feeney, A. (2012). The development of regret. Journal of experimental
child psychology, 111(1), 120–127.
Pazgal, A., & Soberman, D. (2008). Behavior-based discrimination: Is it a winning play, and if so, when?
Marketing Science, 27(6), 977–994.
Radulescu, A., Daniel, R., & Niv, Y. (2016). The effects of aging on the interaction between reinforcement
learning and attention. Psychology and aging, 31(7), 747.
Rafetseder, E., & Perner, J. (2014). Counterfactual reasoning: Sharpening conceptual distinctions in
developmental studies. Child development perspectives, 8(1), 54–58.
Rangel, A., Camerer, C., & Montague, P. R. (2008). A framework for studying the neurobiology of
value-based decision making. Nature reviews neuroscience, 9(7), 545–556.
Rasmussen, E. C., & Gutchess, A. (2019). Can’t read my broker face: Learning about trustworthiness
with age. The Journals of Gerontology: Series B, 74(1), 82–86.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological review, 85(2), 59.
Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: theory and data for two-choice decision
tasks. Neural computation, 20(4), 873–922.
Read, D., & Read, N. L. (2004). Time discounting over the lifespan. Organizational behavior and human
decision processes, 94(1), 22–32.
Roalf, D. R., Mitchell, S. H., Harbaugh, W. T., & Janowsky, J. S. (2012). Risk, reward, and economic
decision making in aging. Journals of Gerontology Series B: Psychological Sciences and Social Sciences,
67(3), 289–298.
Rogers, A. R. (1994). Evolution of time preference by natural selection. The American Economic Review,
460–481.
Rossi, P. E., McCulloch, R. E., & Allenby, G. M. (1996). The value of purchase history data in target
marketing. Marketing Science, 15(4), 321–340.
Rubin, D. B. (2005). Causal inference using potential outcomes: Design, modeling, decisions. Journal of
the American Statistical Association, 100(469), 322–331.
Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining
and Knowledge Discovery, 8(4), e1249.
Salvia, E., Mevel, K., Borst, G., Poirel, N., Simon, G., Orliac, F., . . . others (2020). Age-related neural
correlates of facial trustworthiness detection during economic interaction. Journal of Neuroscience,
Psychology, and Economics, 13(1), 19.
Samanez-Larkin, G. R., Levens, S. M., Perry, L. M., Dougherty, R. F., & Knutson, B. (2012). Frontostriatal
white matter integrity mediates adult age differences in probabilistic reward learning. Journal of
Neuroscience, 32(15), 5333–5337.
Samanez-Larkin, G. R., Worthy, D. A., Mata, R., McClure, S. M., & Knutson, B. (2014). Adult age
differences in frontostriatal representation of prediction error but not reward outcome. Cognitive,
Affective, & Behavioral Neuroscience, 14, 672–682.
Samuelson, P. A. (1938). A note on the pure theory of consumer’s behaviour. Economica, 5(17), 61–71.
Samuelson, P. A. (1983). Foundations of economic analysis (Vol. 197) (No. 1). Harvard University Press
Cambridge, MA.
Santamaría-García, H., Baez, S., Reyes, P., Santamaría-García, J. A., Santacruz-Escudero, J. M., Matallana,
D., . . . Ibáñez, A. (2017). A lesion model of envy and schadenfreude: legal, deservingness and
moral dimensions as revealed by neurodegeneration. Brain, 140(12), 3357–3377.
Santos, S., Almeida, I., Oliveiros, B., & Castelo-Branco, M. (2016). The role of the amygdala in facial
trustworthiness processing: A systematic review and meta-analyses of fmri studies. PloS one, 11(11),
e0167276.
Schonlau, M., & Zou, R. Y. (2020). The random forest algorithm for statistical learning. The Stata Journal,
20(1), 3–29.
Schultz, W. (2006). Behavioral theories and the neurophysiology of reward. Annu. Rev. Psychol., 57,
87–115.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science,
275(5306), 1593–1599.
Seaman, K. L., Brooks, N., Karrer, T. M., Castrellon, J. J., Perkins, S. F., Dang, L. C., . . . Samanez-Larkin,
G. R. (2018). Subjective value representations during effort, probability and time discounting across
adulthood. Social cognitive and affective neuroscience, 13(5), 449–459.
Seaman, K. L., Leong, J. K., Wu, C. C., Knutson, B., & Samanez-Larkin, G. R. (2017). Individual
differences in skewed financial risk-taking across the adult life span. Cognitive, Affective, & Behavioral
Neuroscience, 17, 1232–1241.
Seiler, S., & Yao, S. (2017). The impact of advertising along the conversion funnel. Quantitative Marketing
and Economics, 15(3), 241–278.
Shamay-Tsoory, S. G., Tibi-Elhanany, Y., & Aharon-Peretz, J. (2007). The green-eyed monster and malicious
joy: the neuroanatomical bases of envy and gloating (schadenfreude). Brain, 130(6), 1663–1678.
Shiller, B. R. (2020). Approximating purchase propensities and reservation prices from broad consumer
tracking. International Economic Review, 61(2), 847–870.
Shiller, B. R., et al. (2013). First degree price discrimination using big data. Brandeis Univ., Department of
Economics.
Shin, J., & Sudhir, K. (2010). A customer management dilemma: When is it profitable to reward one’s
own customers? Marketing Science, 29(4), 671–689.
Shobana, G., & Umamaheswari, K. (2021). Forecasting by machine learning techniques and econometrics:
A review. In 2021 6th international conference on inventive computation technologies (icict) (pp. 1010–
1016).
Simester, D., Timoshenko, A., & Zoumpoulis, S. I. (2020). Efficiently evaluating targeting policies:
Improving on champion vs. challenger experiments. Management Science, 66(8), 3412–3424.
Smallman, R., & Summerville, A. (2018). Counterfactual thought in reasoning and performance. Social
and Personality Psychology Compass, 12(4), e12376.
Smith, A. N., Seiler, S., & Aggarwal, I. (2023). Optimal price targeting. Marketing Science, 42(3), 476–499.
Sojitra, R. B., Lerner, I., Petok, J. R., & Gluck, M. A. (2018). Age affects reinforcement learning through
dopamine-based learning imbalance and high decision noise—not through parkinsonian mechanisms.
Neurobiology of Aging, 68, 102–113.
Spellman, B. A., & Gilbert, E. A. (2014). Blame, cause, and counterfactuals: The inextricable link.
Psychological Inquiry, 25(2), 245–250.
Spreng, R. N., Cassidy, B. N., Darboh, B. S., DuPre, E., Lockrow, A. W., Setton, R., & Turner, G. R. (2017).
Financial exploitation is associated with structural and functional brain differences in healthy older
adults. Journals of Gerontology Series A: Biomedical Sciences and Medical Sciences, 72(10), 1365–1368.
Sproten, A., Diener, C., Fiebach, C., & Schwieren, C. (2010). Aging and decision making: How aging affects
decisions under uncertainty (Tech. Rep.). Discussion Paper Series.
Strait, C. E., Blanchard, T. C., & Hayden, B. Y. (2014). Reward value comparison via mutual inhibition in
ventromedial prefrontal cortex. Neuron, 82(6), 1357–1366.
Streitfeld, D. (2000). On the web price tags blur: What you pay could depend on who you are. The
Washington Post(September 27).
Suzuki, A. (2018). Persistent reliance on facial appearance among older adults when judging someone’s
trustworthiness. The Journals of Gerontology: Series B, 73(4), 573–583.
Suzuki, A., Ueno, M., Ishikawa, K., Kobayashi, A., Okubo, M., & Nakai, T. (2019). Age-related differences
in the activation of the mentalizing-and reward-related brain regions during the learning of others’
true trustworthiness. Neurobiology of Aging, 73, 1–8.
Taunton, E. (2023). Mcdonald’s accused of ’price-gouging’ loyal customers. Stuff (August 5).
Tkáč, M., & Verner, R. (2016). Artificial neural networks in business: Two decades of research. Applied
Soft Computing, 38, 788–804.
Trostel, P. A., & Taylor, G. A. (2001). A theory of time preference. Economic inquiry, 39(3), 379–395.
Tymula, A., Rosenberg Belmaker, L. A., Ruderman, L., Glimcher, P. W., & Levy, I. (2013). Like cognitive
function, decision making across the life span shows profound age-related changes. Proceedings of
the National Academy of Sciences, 110(42), 17143–17148.
Van Duijvenvoorde, A. C., Huizenga, H. M., & Jansen, B. R. (2014). What is and what could have been:
Experiencing regret and relief across childhood. Cognition & Emotion, 28(5), 926–935.
Villas-Boas, J. M. (1999). Dynamic competition with customer recognition. The Rand Journal of Economics,
604–631.
Wang, C., Xu, S., & Yang, J. (2021). Adaboost algorithm in artificial intelligence for optimizing the iri
prediction accuracy of asphalt concrete pavement. Sensors, 21(17), 5682.
Wang, F., Li, Z., He, F., Wang, R., Yu, W., & Nie, F. (2019). Feature learning viewpoint of adaboost and a
new algorithm. IEEE Access, 7, 149890–149899.
Webb, B., Hine, A. C., & Bailey, P. E. (2016). Difficulty in differentiating trustworthiness from untrustworthiness in older age. Developmental Psychology, 52(6), 985.
Webb, R. (2019). The (neural) dynamics of stochastic choice. Management Science, 65(1), 230–255.
Webb, R., Levy, I., Lazzaro, S. C., Rutledge, R. B., & Glimcher, P. W. (2019). Neural random utility: Relating
cardinal neural observables to stochastic choice behavior. Journal of Neuroscience, Psychology, and
Economics, 12(1), 45.
Weber, E. U., & Johnson, E. J. (2009). Mindful judgment and decision making. Annual review of psychology,
60, 53–85.
Weisberg, D. P., & Beck, S. R. (2010). Children’s thinking about their own and others’ regret and relief.
Journal of Experimental Child Psychology, 106(2-3), 184–191.
Weisberg, D. P., & Beck, S. R. (2012). The development of children’s regret and relief. Cognition & emotion,
26(5), 820–835.
Weisberg, D. S., & Gopnik, A. (2013). Pretense, counterfactuals, and bayesian causal models: Why what
is not real really matters. Cognitive science, 37(7), 1368–1381.
Wise, R. A. (2004). Dopamine, learning and motivation. Nature reviews neuroscience, 5(6), 483–494.
Yang, Y., Lv, H., & Chen, N. (2023). A survey on ensemble learning under the era of deep learning.
Artificial Intelligence Review, 56(6), 5545–5589.
Yoganarasimhan, H., Barzegary, E., & Pani, A. (2020). Design and evaluation of personalized free trials.
arXiv preprint arXiv:2006.13420.
Zebrowitz, L. A., Boshyan, J., Ward, N., Gutchess, A., & Hadjikhani, N. (2017). The older adult positivity
effect in evaluations of trustworthiness: Emotion regulation or cognitive capacity? PloS one, 12(1),
e0169823.
Zebrowitz, L. A., Ward, N., Boshyan, J., Gutchess, A., & Hadjikhani, N. (2018). Older adults’ neural
activation in the reward circuit is sensitive to face trustworthiness. Cognitive, Affective, & Behavioral
Neuroscience, 18, 21–34.
Appendix A
Appendix to Chapter 1
A.1 Technical Appendix
A.1.1 Proof of Proposition 1
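Demand equilibrium $\{q_l^*, q_h^*\}$ is defined in the interval $(\underline{\varepsilon}_j, \overline{\varepsilon}_j)$ of the support of the error distribution of $\varepsilon_j$, $\forall j \in \{l, h\}$, where:
\[
\underline{\varepsilon}_j = \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right) - \ln\left(\ln \frac{\gamma_j q_j^* + 1}{\gamma_j (q_j^* - \Delta) + 1}\right),
\]
\[
\overline{\varepsilon}_j = \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right) - \ln\left(\ln \frac{\gamma_j (q_j^* + \Delta) + 1}{\gamma_j q_j^* + 1}\right).
\]
Proof: From (1.4.12) the following two expressions are derived. The first compares $q_j^*$ with $q_j^* - \Delta$:
\[
\begin{aligned}
& u^*(q_j^*, q_{-j}^*) > u^*(q_j^* - \Delta, q_{-j}^*), \\
& \frac{\psi_j e^{\varepsilon_j}}{\gamma_j} \ln(\gamma_j q_j^* + 1) + \frac{\psi_{-j} e^{\varepsilon_{-j}}}{\gamma_{-j}} \ln(\gamma_{-j} q_{-j}^* + 1) + \psi_0 \left[ y - (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) q_j^* - (1 - \delta \mathbb{1}\{q_{-j}^* \geq \bar{q}\}) q_{-j}^* \right] \\
& \quad > \frac{\psi_j e^{\varepsilon_j}}{\gamma_j} \ln(\gamma_j (q_j^* - \Delta) + 1) + \frac{\psi_{-j} e^{\varepsilon_{-j}}}{\gamma_{-j}} \ln(\gamma_{-j} q_{-j}^* + 1) + \psi_0 \left[ y - (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) (q_j^* - \Delta) - (1 - \delta \mathbb{1}\{q_{-j}^* \geq \bar{q}\}) q_{-j}^* \right], \\
& \frac{\psi_j e^{\varepsilon_j}}{\gamma_j} \left[ \ln(\gamma_j q_j^* + 1) - \ln(\gamma_j (q_j^* - \Delta) + 1) \right] > \psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta, \\
& \ln \frac{\gamma_j q_j^* + 1}{\gamma_j (q_j^* - \Delta) + 1} > \frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j e^{\varepsilon_j}}, \\
& \varepsilon_j + \ln\left(\ln \frac{\gamma_j q_j^* + 1}{\gamma_j (q_j^* - \Delta) + 1}\right) > \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right), \\
& \varepsilon_j > \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right) - \ln\left(\ln \frac{\gamma_j q_j^* + 1}{\gamma_j (q_j^* - \Delta) + 1}\right).
\end{aligned}
\]
The second compares $q_j^*$ with $q_j^* + \Delta$:
\[
\begin{aligned}
& u^*(q_j^*, q_{-j}^*) > u^*(q_j^* + \Delta, q_{-j}^*), \\
& \frac{\psi_j e^{\varepsilon_j}}{\gamma_j} \ln(\gamma_j q_j^* + 1) + \frac{\psi_{-j} e^{\varepsilon_{-j}}}{\gamma_{-j}} \ln(\gamma_{-j} q_{-j}^* + 1) + \psi_0 \left[ y - (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) q_j^* - (1 - \delta \mathbb{1}\{q_{-j}^* \geq \bar{q}\}) q_{-j}^* \right] \\
& \quad > \frac{\psi_j e^{\varepsilon_j}}{\gamma_j} \ln(\gamma_j (q_j^* + \Delta) + 1) + \frac{\psi_{-j} e^{\varepsilon_{-j}}}{\gamma_{-j}} \ln(\gamma_{-j} q_{-j}^* + 1) + \psi_0 \left[ y - (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) (q_j^* + \Delta) - (1 - \delta \mathbb{1}\{q_{-j}^* \geq \bar{q}\}) q_{-j}^* \right], \\
& \frac{\psi_j e^{\varepsilon_j}}{\gamma_j} \left[ \ln(\gamma_j q_j^* + 1) - \ln(\gamma_j (q_j^* + \Delta) + 1) \right] > -\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta, \\
& \ln \frac{\gamma_j (q_j^* + \Delta) + 1}{\gamma_j q_j^* + 1} < \frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j e^{\varepsilon_j}}, \\
& \varepsilon_j + \ln\left(\ln \frac{\gamma_j (q_j^* + \Delta) + 1}{\gamma_j q_j^* + 1}\right) < \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right), \\
& \varepsilon_j < \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right) - \ln\left(\ln \frac{\gamma_j (q_j^* + \Delta) + 1}{\gamma_j q_j^* + 1}\right).
\end{aligned}
\]
Therefore, $\underline{\varepsilon}_j = \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right) - \ln\left(\ln \frac{\gamma_j q_j^* + 1}{\gamma_j (q_j^* - \Delta) + 1}\right)$ with $\varepsilon_j > \underline{\varepsilon}_j$, and $\overline{\varepsilon}_j = \ln\left(\frac{\psi_0 (1 - \delta \mathbb{1}\{q_j^* \geq \bar{q}\}) \Delta \gamma_j}{\psi_j}\right) - \ln\left(\ln \frac{\gamma_j (q_j^* + \Delta) + 1}{\gamma_j q_j^* + 1}\right)$ with $\varepsilon_j < \overline{\varepsilon}_j$. At last, it is proven that:
\[
\underline{\varepsilon}_j < \varepsilon_j < \overline{\varepsilon}_j.
\]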
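As a numerical illustration of Proposition 1, the sketch below computes the two error bounds and checks that an error drawn between them makes the observed quantity preferable to its two neighbours on the grid. It is only a sketch under stated assumptions: the utility is taken to be of the form (psi * exp(eps) / gamma) * ln(gamma * q + 1) plus a linear outside good with weight psi0, the unit price is normalised to one before the discount, and all names (`psi`, `gamma`, `psi0`, `delta`, `q_bar`, `step`) are hypothetical labels rather than the dissertation's own notation.

```python
import math


def effective_price(q, delta, q_bar):
    """Unit price after a discount `delta` that applies when q >= q_bar (assumed form)."""
    return 1.0 - delta * (1.0 if q >= q_bar else 0.0)


def error_bounds(q_star, step, psi, gamma, psi0, delta, q_bar):
    """Return (lower, upper) bounds on the error term consistent with q_star.

    Assumes q_star - step >= 0 so that both log ratios are well defined.
    The price is evaluated at q_star in both bounds, as in the proof.
    """
    price = effective_price(q_star, delta, q_bar)
    common = math.log(psi0 * price * step * gamma / psi)
    down_ratio = (gamma * q_star + 1.0) / (gamma * (q_star - step) + 1.0)
    up_ratio = (gamma * (q_star + step) + 1.0) / (gamma * q_star + 1.0)
    lower = common - math.log(math.log(down_ratio))
    upper = common - math.log(math.log(up_ratio))
    return lower, upper


if __name__ == "__main__":
    lo, hi = error_bounds(q_star=2.0, step=1.0, psi=1.5, gamma=0.8,
                          psi0=1.0, delta=0.1, q_bar=3.0)
    print(f"lower bound = {lo:.4f}, upper bound = {hi:.4f}")
```

Because ln(gamma * q + 1) is concave in q, the downward log ratio exceeds the upward one, which is why the lower bound always falls below the upper bound in this sketch.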
A.1.2 Derivation of log-Likelihood Functions
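Through logarithmic transformation of $L(\Theta)$ in Equation 1.4.14, the multinomial log-likelihood function
$\ell(\Theta)$ is derived and specified for each combination of $(\delta, \bar{q})$, namely:
\[
\ell(\Theta \mid q_l^*, \delta = 0, \bar{q} = 0) = \sum_{i=1}^{N} \Bigg( \underbrace{\mathbb{1}_{q_l^* = 0} \ln \Phi\left(\frac{\overline{\varepsilon}_l|_{q_l^* = 0} - \mu_l}{\sigma_l}\right)}_{\text{No Conversion}} + \underbrace{\mathbb{1}_{q_l^* > 0} \ln\left[\Phi\left(\frac{\overline{\varepsilon}_l|_{q_l^* > 0} - \mu_l}{\sigma_l}\right) - \Phi\left(\frac{\underline{\varepsilon}_l|_{q_l^* > 0} - \mu_l}{\sigma_l}\right)\right]}_{\text{Low Conversion}} \Bigg), \tag{A.1.2}
\]
\[
\ell(\Theta \mid q_h^*, \delta > 0, \bar{q} = 0) = \sum_{i=1}^{N} \Bigg( \underbrace{\mathbb{1}_{q_h^* = 0} \ln \Phi\left(\frac{\overline{\varepsilon}_h|_{q_h^* = 0} - \mu_h}{\sigma_h}\right)}_{\text{No Conversion}} + \underbrace{\mathbb{1}_{q_h^* > 0} \ln\left[\Phi\left(\frac{\overline{\varepsilon}_h|_{q_h^* > 0} - \mu_h}{\sigma_h}\right) - \Phi\left(\frac{\underline{\varepsilon}_h|_{q_h^* > 0} - \mu_h}{\sigma_h}\right)\right]}_{\text{High Conversion}} \Bigg), \tag{A.1.3}
\]
\[
\begin{aligned}
\ell(\Theta \mid q_l^*, q_h^*, \delta > 0, \bar{q} > 0) = \sum_{i=1}^{N} \Bigg( & \underbrace{\mathbb{1}_{q_l^* = 0} \ln \Phi\left(\frac{\overline{\varepsilon}_l|_{q_l^* = 0} - \mu_l}{\sigma_l}\right) + \mathbb{1}_{q_h^* = 0} \ln \Phi\left(\frac{\overline{\varepsilon}_h|_{q_h^* = 0} - \mu_h}{\sigma_h}\right)}_{\text{No Conversion}} \\
& + \underbrace{\mathbb{1}_{q_l^* > 0} \ln\left[\Phi\left(\frac{\overline{\varepsilon}_l|_{q_l^* > 0} - \mu_l}{\sigma_l}\right) - \Phi\left(\frac{\underline{\varepsilon}_l|_{q_l^* > 0} - \mu_l}{\sigma_l}\right)\right]}_{\text{Low Conversion}} \\
& + \underbrace{\mathbb{1}_{q_h^* > 0} \ln\left[\Phi\left(\frac{\overline{\varepsilon}_h|_{q_h^* > 0} - \mu_h}{\sigma_h}\right) - \Phi\left(\frac{\underline{\varepsilon}_h|_{q_h^* > 0} - \mu_h}{\sigma_h}\right)\right]}_{\text{High Conversion}} \Bigg),
\end{aligned} \tag{A.1.4}
\]
where the function $\ell(\Theta)$ is computed over the $N$ sessions of users. Equation A.1.2 is the log-likelihood
function of the control treatment $t = c$ (Figure A.7a), Equation A.1.3 of the semi-treatment $t = s$ (Figure A.7b)
and Equation A.1.4 of the full-treatment $t = f$ (Figure A.7c).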
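The structure of these log-likelihoods can be sketched in a few lines: a session with no conversion contributes the log of a normal CDF evaluated at its upper error bound, and a session with a positive quantity contributes the log of the probability mass between its lower and upper bounds. This is a minimal sketch assuming normally distributed errors with mean `mu` and standard deviation `sigma`; the function names and the `(q_star, lower, upper)` session layout are hypothetical and not taken from the dissertation.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function from the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def session_loglik(q_star, lower, upper, mu, sigma):
    """Log-likelihood contribution of one session.

    q_star == 0: only the upper bound binds, so the term is ln Phi((upper - mu) / sigma).
    q_star > 0: the error lies between the bounds, so the term is
                ln[Phi((upper - mu) / sigma) - Phi((lower - mu) / sigma)].
    """
    upper_mass = normal_cdf((upper - mu) / sigma)
    if q_star == 0:
        return math.log(upper_mass)
    lower_mass = normal_cdf((lower - mu) / sigma)
    return math.log(upper_mass - lower_mass)


def loglik(sessions, mu, sigma):
    """Sum the contributions over sessions given as (q_star, lower, upper) tuples."""
    return sum(session_loglik(q, lo, hi, mu, sigma) for q, lo, hi in sessions)


if __name__ == "__main__":
    # Illustrative inputs only: one non-converting and one converting session.
    sessions = [(0, None, 0.0), (2, -0.5, 0.5)]
    print(loglik(sessions, mu=0.0, sigma=1.0))
```

In the full-treatment case the same contribution would be summed separately for the low and high goods, each with its own bounds and distribution parameters.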
A.1.3 Derivation of Demand Elasticity
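Proof: The marginal rate of substitution of the utility function in (??) is
$$MRS_{q^{l(h)},\,q^{0}} = \frac{\partial U(q^{l}, q^{h}, q^{0}) / \partial q^{l(h)}}{\partial U(q^{l}, q^{h}, q^{0}) / \partial q^{0}} = \frac{\alpha^{l(h)}\beta^{l(h)}}{(\beta^{l(h)} q^{l(h)} + 1)\,\gamma_{0}},$$
where $\frac{\partial U(q^{l}, q^{h}, q^{0})}{\partial q^{l(h)}} = \frac{\alpha^{l(h)}\beta^{l(h)}}{\beta^{l(h)} q^{l(h)} + 1}$ and $\frac{\partial U(q^{l}, q^{h}, q^{0})}{\partial q^{0}} = \gamma_{0}$. By Walras's law the optimal demand lies on the
budget constraint line. At the point of tangency, the marginal rate of substitution between the two goods
$q^{l(h)}$ and $q^{0}$ is equal to their price ratio:
$$\begin{aligned}
MRS_{q^{l(h)},\,q^{0}} &= 1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\} \\
\frac{\alpha^{l(h)}\beta^{l(h)}}{(\beta^{l(h)} q^{l(h)} + 1)\,\gamma_{0}} &= 1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\} \\
q^{l(h)} &= \frac{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)}}.
\end{aligned}$$
From the budget constraint $(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\, q^{l(h)} + q^{0} = y$ it follows that:
$$\begin{aligned}
q^{0} &= y - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\, q^{l(h)} \\
q^{0} &= y - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\, \frac{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)}}.
\end{aligned}$$
Price elasticity $\varepsilon^{l(h)}$ of good $q^{l(h)}$ is derived:
$$\begin{aligned}
\varepsilon^{l(h)} &= \frac{\partial q^{l(h)} / q^{l(h)}}{\partial (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\}) / (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})} \\
&= \frac{\partial q^{l(h)}}{\partial (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})} \cdot \frac{1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\}}{q^{l(h)}} \\
&= \frac{-\gamma_{0}\,(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)} - \left(\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\right)\gamma_{0}\,\beta^{l(h)}}{\left((1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)}\right)^{2}} \cdot \frac{1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\}}{\dfrac{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)}}} \\
&= \frac{-\alpha^{l(h)}\beta^{l(h)}\,\gamma_{0}\,\beta^{l(h)}}{\left((1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)}\right)^{2}} \cdot \frac{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})^{2}\,\gamma_{0}\,\beta^{l(h)}}{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}} \\
&= -\frac{\alpha^{l(h)}\beta^{l(h)}}{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}.
\end{aligned}$$
Cross-price elasticity $\varepsilon^{0}$ of the outside good with respect to good $q^{l(h)}$ is derived:
$$\begin{aligned}
\varepsilon^{0} &= \frac{\partial q^{0} / q^{0}}{\partial (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\}) / (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})} \\
&= \frac{\partial q^{0}}{\partial (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})} \cdot \frac{1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\}}{q^{0}} \\
&= \frac{1}{\beta^{l(h)}} \cdot \frac{1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\}}{y - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\dfrac{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}\,\beta^{l(h)}}} \\
&= \frac{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}{\gamma_{0}\,\beta^{l(h)}\, y - \alpha^{l(h)}\beta^{l(h)} + (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}.
\end{aligned}$$
Therefore, price elasticity and cross-price elasticity are respectively:
$$\varepsilon^{l(h)} = -\frac{\alpha^{l(h)}\beta^{l(h)}}{\alpha^{l(h)}\beta^{l(h)} - (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}, \qquad \text{(A.1.5)}$$
$$\varepsilon^{0} = \frac{(1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}{\gamma_{0}\,\beta^{l(h)}\, y - \alpha^{l(h)}\beta^{l(h)} + (1 - d\,\mathbb{1}\{q^{l(h)} \geq MPT\})\,\gamma_{0}}. \qquad \text{(A.1.6)}$$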
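The closed forms (A.1.5) and (A.1.6) can be cross-checked numerically. The sketch below is illustrative only: it assumes a quasi-linear utility of the form alpha * ln(beta * q + 1) + gamma0 * q0, which reproduces the structure of the derivation above, and the names alpha, beta, gamma0, y and all parameter values are hypothetical placeholders rather than estimates from the thesis. It compares each closed-form elasticity with a finite-difference elasticity of the corresponding demand function.

```python
# Numerical cross-check of the closed-form elasticities (A.1.5) and (A.1.6).
# ASSUMPTION: utility U = alpha*ln(beta*q + 1) + gamma0*q0 (quasi-linear).
# alpha, beta, gamma0, y and their values are hypothetical placeholders.


def demand(p, alpha, beta, gamma0):
    """Interior demand for the discounted good at effective price p."""
    return (alpha * beta - p * gamma0) / (p * gamma0 * beta)


def outside_good(p, alpha, beta, gamma0, y):
    """Outside-good demand implied by the budget constraint p*q + q0 = y."""
    return y - p * demand(p, alpha, beta, gamma0)


def own_price_elasticity(p, alpha, beta, gamma0):
    """Closed form corresponding to (A.1.5)."""
    return -(alpha * beta) / (alpha * beta - p * gamma0)


def cross_price_elasticity(p, alpha, beta, gamma0, y):
    """Closed form corresponding to (A.1.6)."""
    return (p * gamma0) / (gamma0 * beta * y - alpha * beta + p * gamma0)


def numerical_elasticity(f, p, h=1e-6):
    """Central finite-difference approximation of (df/dp) * (p / f)."""
    return (f(p + h) - f(p - h)) / (2.0 * h) * p / f(p)


# Hypothetical parameter values; the effective price is 1 - d when the
# minimum purchase threshold is met and 1 otherwise.
alpha, beta, gamma0, y = 60.0, 0.5, 1.0, 101.0
d, complies = 0.10, True
p = 1.0 - d * (1.0 if complies else 0.0)

eps_own = own_price_elasticity(p, alpha, beta, gamma0)
eps_cross = cross_price_elasticity(p, alpha, beta, gamma0, y)

print("own-price elasticity:", eps_own)
print("cross-price elasticity:", eps_cross)
```

Under these placeholder values demand is elastic and the cross-price elasticity is small and positive, which matches the sign pattern reported in Table A.10.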
A.1.4 Proof of Proposition 2
Optimal treatment policy is defined as
() = Δ()Δ() + ()Δ() + ()Δ() − ()()()
.
Proof: From (1.5.7) the following expression is derived:
E
Π
| = , =
> E
Π
| = , =
,
()()
()(1 − ) −
+ ()
1 − ()
() −
> ()().
Marginal cost is assumed equal to zero. With treatment set = {, , }, two elements, i.e. =
and = , belong to the set −{} of Equation 1.5.10. By definition () = () = 1. Thus,
two cases follow:
(i) Semi-treatment:
() ()(1 − ) > ()(),
() () − ()() > () () .
By adding
− ()()+ ()()
and
+() ()−() ()
and
−()()+
()()
on the left-hand side, inequality can be rewritten as:
() () +
− ()() + ()()
− ()() +
() ()
−() ()
+
− ()() + ()()
> () () ,
()
() − ()
− ()
() − ()
− 2()() + () () + ()()
> () () ,
()Δ () − ()Δ () − 2()() + () () + ()()
> () () ,
() − ()
Δ () + ()
() − ()
+
() − ()
()
> () () ,
Δ ()Δ () + ()Δ () + Δ ()()
> () () ,
Δ ()Δ () + ()Δ () + ()Δ () − () () > 0.
Therefore, for = , the optimal treatment function is
() = Δ ()Δ () + ()Δ () + ()Δ () − () () .
(ii) Full-treatment:
() () ()(1 − ) + ()(1 − ()) () > ()(),
() () ()− () () () + () ()− () () () > ()(),
() () − ()() > () () () .
The remaining part of the proof resembles the case of = . Therefore, for = , the optimal
treatment function is
() = Δ ()Δ () + ()Δ () + ()Δ () − () () () .
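The step shared by both cases is a product-difference decomposition. As a reading aid, it can be stated for two generic functions; the symbols $a$ and $b$ below are placeholders introduced here for illustration and are not the notation of Section 1.5. Writing $\Delta a = a_t - a_C$ and $\Delta b = b_t - b_C$ for the differences between a treatment $t$ and the control $C$:

```latex
\begin{align*}
a_t b_t - a_C b_C
  &= (a_t - a_C)(b_t - b_C) + a_C\,(b_t - b_C) + (a_t - a_C)\,b_C \\
  &= \Delta a\,\Delta b + a_C\,\Delta b + b_C\,\Delta a .
\end{align*}
```

Expanding the right-hand side of the first line gives $a_t b_t - a_t b_C - a_C b_t + a_C b_C + a_C b_t - a_C b_C + a_t b_C - a_C b_C$, and every cross term cancels. The remaining term in each optimal treatment function is the discount cost that is moved to the left-hand side of the inequality.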
A.2 Data Description
The analysis in Sections 1.3.1, 1.3.2, 1.4.1, 1.4.2, 1.5.2, 1.6.1, 1.6.2 and in the Online Appendix explicitly
refers only to selected variables, for clarity of explanation and due to space limitations. The remaining
variables are reported by group name. The variables are separated into four groups based on their
content: (i) main variables, (ii) technology controls, (iii) time controls, and (iv) location controls.
A.2.1 Description of the Variables
1. Main variables: These variables are at the core of the investigation of this paper. They are employed
in the reduced-form analysis of Sections 1.3.1 and 1.4.1, in the causal analysis of Section
1.3.2, and in the estimation of the structural demand model of Section 1.4.2 and of the nonparametric
policy model of Section 1.5.2. Depending on the specification used, the variables of interest are
directly mentioned. Summary statistics are reported in Table A.1 at the aggregate level, and for the
control, semi-treatment, and full-treatment groups. Table A.2 reports the cross-correlation
coefficients of the continuous main variables at the aggregate level. This list follows
the chronological order of the conversion funnel: (i) characteristics of session, (ii) characteristics
of treatment, (iii) characteristics of cart, (iv) characteristics of order, and (v) characteristics of user
history.
(a) Time to session measures the length of each session in minutes. It is computed by comparing
the day and hour reported by Session start and Session end for the 5,087,685 observations. Mean
is 13 minutes with standard deviation of 17.64 minutes.
(b) Number of pages counts the number of e-commerce website pages visualized by the user in each
session. This variable is a proxy of user’s engagement in the conversion funnel. It considers
all the 5,087,685 sessions with an average of 14.3 pages viewed per session, ranging from a
minimum of 1 to a maximum of 120 pages.
(c) The dummy Treatment is equal to 1 if the session is assigned to the treatment group (either
semi-treatment or full-treatment), 0 otherwise. 4,246,443 sessions are assigned to the treatment
group (2,490,092 to semi-treatment, 1,756,351 to full-treatment), 841,242 to the control group.
In relative terms, propensity scores of control, semi-treatment, and full-treatment are 0.15, 0.5
and 0.35, respectively.
(d) The variable d reports the RCT percentage discount assigned to the treated session, which is
shown to the user if the cart page is reached. Depending on the dummy Treatment, d ∈ {0,
5, 10, 15, 20}.
(e) The variable MPT reports the RCT minimum purchase threshold that the gross order q must reach to
benefit from the discount. It is shown to the user, together with the discount percentage d, if the
cart page is reached. Unit of measure is €. For all the sessions reporting different currencies, a
conversion is made using the average exchange rate of the day the session occurred. Depending
on the dummy Treatment, MPT ∈ {0, 15, 25, 35, 45}.
(f) The dummy Treatment compliance is equal to 1 if gross order q is equal to or greater than
minimum purchase threshold MPT. Treatment compliance is 1 for the sessions assigned to
control and semi-treatment groups.
(g) The dummy Attained cart is equal to 1 if the user visualizes the cart page, 0 otherwise. On
average, between 26% and 30% of all sessions reach the cart page.
(h) Time to cart counts the minutes each user takes to visualize the cart for the first time. This
variable is a proxy of user’s interest in the conversion funnel. It records 1,456,608 sessions,
ranging from a minimum of 1 to a maximum of 338 minutes (5 hours and 38 minutes). Mean
is 10 minutes with standard deviation of 11.90 minutes.
(i) The dummy Converted order is equal to 1 if the user converts the session, 0 otherwise. Conversion rate of the sample is 5%.
(j) Time to order counts the minutes each user needs to attain the checkout order page. This
variable is a proxy of user’s commitment in the conversion funnel. It records 271,918 sessions,
ranging from a minimum of 3 to a maximum of 446 minutes (7 hours and 26 minutes). Mean
is 30 minutes with standard deviation of 24.83 minutes.
(k) Number of products counts the number of items purchased in each converted session. This
variable records 271,918 sessions with mean of 5 and standard deviation of 3.43.
(l) q measures the gross order of each converted session. Unit of measure is €. For all the sessions
reporting different currencies, a conversion is made using the average exchange rate of the
day the session occurred. This variable counts 271,918 sessions with an average of €45 per
converted session, ranging from a minimum of €0.89 to a maximum of €4,768.
(m) The dummy New user is equal to 1 if the user never visited the e-commerce website before, 0
otherwise. On average, between 73% and 77% of all users are new users.
(n) The dummy Purchase history is equal to 1 if the user has already purchased at least once from
the e-commerce website, 0 otherwise. Rate of users with previous purchase records is 3%.
(o) Number of sessions counts the number of sessions per user key during the RCT period. This
variable measures the frequency of sessions per user and it is a proxy of users' interaction with
the e-commerce website. It considers all the 5,087,685 sessions with an average of 1.77 sessions
per user, ranging from 1.16 in the control group to 1.96 and 1.78 in the semi-treatment group
and full-treatment group, respectively.
(p) Number of prior visits measures the number of sessions per user before the RCT session. This
variable is a proxy of user purchase history, and it considers 1,923,779 sessions.
(q) Number of prior purchases measures the number of converted sessions per user before the RCT
session. This variable is a proxy of user purchase history and it considers 1,923,779 sessions.
(r) Number of prior abandoned carts measures the number of abandoned carts per user before the
RCT session. This variable is a proxy of user purchase history and it considers 638,710 sessions.
2. Technology controls: These variables report information related to device types, browsing
technologies and operating systems used in each session. They are described in Table A.3. Technology variables include the following controls to check for e-commerce heterogeneity:
(a) Device (decreasing frequency order): Mobile, Computer, and Tablet. Dummies for device:
Mobile, Computer, and Tablet (2 variables).
(b) Browser (decreasing frequency order with at least 1,000 sessions): Chrome Mobile, Mobile
Safari, Chrome, Apple WebKit, Safari, Firefox, Firefox Mobile, Microsoft Edge Mobile, Firefox
Mobile iOS, and Mozilla. Dummies for browser: Chrome Mobile, Mobile Safari, Chrome, Apple
WebKit, Safari, Firefox, Firefox Mobile, and Others (7 variables).
(c) Operating system (decreasing frequency order with at least 1,000 sessions): Mac OS X (iPhone),
Android 1.x, Windows 10, Android Mobile, Mac OS X, Mac OS X (iPad), Windows 7, Android,
Chrome OS, Windows 8.1, Linux, Android 6.x, Android 5.x, and Windows 8. Dummies for
operating system: Mac OS X (iPhone), Android 1.x, Windows 10, Android Mobile, Mac OS X, Mac
OS X (iPad), Windows 7, and Others (7 variables).
3. Time controls: This is a group of additional time covariates which are inferred by comparing timing
information from both Session start and Session end. These variables are related to month, day and
hour of each session. When a session spans two periods, so that Session start and Session end
fall under different time controls, the end of the session determines the month,
day, and hour. They are described in Table A.4. Time variables include the following controls to
address time trends and seasonality:
(a) Control of the month in which the session occurred. Dummies for month: January, February,
March, April, May, June, July, August, September, October, November, and December (11 variables).
(b) Control of the time of the month in which the session took place. Dummies for time of the
month: Beginning (1st-10th), Halfway (11th-20th), and End (21st-28th/30th/31st) (2 variables).
(c) Control of the time of the day in which the session ends. Dummies for time of the day: Night
(12:00AM-6:00AM), Morning (6:00AM-12:00PM), Afternoon (12:00PM-6:00PM), and Evening
(6:00PM-12:00AM) (3 variables).
4. Location controls: These variables include the country and the region in which the sessions took
place, and their related currency. They are described in Table A.5. Location variables include the
following controls to account for nationality and currency heterogeneity:
(a) Country (decreasing frequency order with at least 10,000 sessions): France, Italy, Spain, Portugal, United Kingdom, Denmark, Netherlands, Belgium, Ukraine, United States, Brazil, Egypt,
Poland, Romania, Switzerland, Morocco, Thailand, Ireland, Russia, Greece, Croatia, Canada,
Lithuania, South Africa, Algeria, and Latvia. Dummies for countries: France, Italy, Spain,
Portugal, United Kingdom, Denmark, Netherlands, Belgium, Ukraine, United States, and Others (10
variables).
(b) Region: Since each country has its own specific administrative division into states and/or
regions (identified by number, name or acronym), it is impractical to code dummy
variables for all of them. Thus, region information is available, but
it is not considered in the analysis.
(c) Currency: Euro (EUR), British pound sterling (GBP), and United States dollar (USD). For all
sessions recorded in a country where the official currency is neither Euro, nor British pound
sterling, nor United States dollar, the local currency is converted into United States dollar at
the exchange rate of the day the session took place. Dummies for currency: EUR, GBP, and
USD (2 variables).
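The construction of several of these variables can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the code used for the thesis: the function names and dictionary keys are hypothetical, boundary instants are assigned to the later bin (for instance, 6:00AM counts as Morning), and, as described for the time controls, the end of the session is the reference for month, time of the month and time of the day.

```python
from datetime import datetime


def time_of_day(ts: datetime) -> str:
    """Bin a timestamp into the four time-of-day categories."""
    if ts.hour < 6:
        return "Night"      # 12:00AM-6:00AM
    if ts.hour < 12:
        return "Morning"    # 6:00AM-12:00PM
    if ts.hour < 18:
        return "Afternoon"  # 12:00PM-6:00PM
    return "Evening"        # 6:00PM-12:00AM


def time_of_month(ts: datetime) -> str:
    """Bin a timestamp into Beginning (1st-10th), Halfway (11th-20th), End."""
    if ts.day <= 10:
        return "Beginning"
    if ts.day <= 20:
        return "Halfway"
    return "End"


def session_controls(session_start: datetime, session_end: datetime) -> dict:
    """Derive Time to session (minutes) and the time controls of a session.

    The session end is the reference whenever start and end fall into
    different months, days or time-of-day bins.
    """
    minutes = (session_end - session_start).total_seconds() / 60.0
    return {
        "time_to_session": minutes,
        "month": session_end.month,
        "time_of_month": time_of_month(session_end),
        "time_of_day": time_of_day(session_end),
    }


def treatment_compliance(q: float, mpt: float) -> int:
    """Return 1 if gross order q reaches the minimum purchase threshold."""
    return int(q >= mpt)


# A session that starts on January 31st and ends on February 1st
# is attributed to February, Beginning, Night.
example = session_controls(
    datetime(2022, 1, 31, 23, 50), datetime(2022, 2, 1, 0, 5)
)
print(example)
```

Because MPT is 0 in the control and semi-treatment groups, `treatment_compliance` returns 1 for any non-negative gross order there, consistent with the description of Treatment compliance.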
A.2.2 Summary Statistics
Table A.1: Summary statistics of main variables
Aggregate Control Semi-Treatment Full-Treatment
Variable Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev.
Characteristics of session
Time to session (minute) 5,087,685 12.96 17.64 841,242 12.04 17.64 2,490,092 12.39 17.02 1,756,351 14.19 18.42
Number of pages 5,087,685 14.31 15.95 841,242 12.59 15.09 2,490,092 15.06 15.66 1,756,351 14.07 16.66
Characteristics of treatment
Treatment (0/1) 5,087,685 0.85 0.52 841,242 0.00 0.00 2,490,092 1.00 0.00 1,756,351 1.00 0.00
d (%) 5,087,685 9.59 5.01 841,242 0.00 0.00 2,490,092 12.46 3.36 1,756,351 10.13 0.91
MPT (€) 5,087,685 10.78 15.54 841,242 0.00 0.00 2,490,092 0.00 0.00 1,756,351 31.21 7.82
Treatment compliance (0/1) 5,087,685 0.05 0.21 841,242 1.00 0.00 2,490,092 1.00 0.00 1,756,351 0.04 0.19
Characteristics of cart
Attained cart (0/1) 5,087,685 0.29 0.45 841,242 0.26 0.44 2,490,092 0.30 0.46 1,756,351 0.28 0.45
Time to cart (minute) 1,456,608 9.83 11.90 218,753 9.64 12.49 749,093 8.74 10.82 488,762 11.59 12.94
Characteristics of order
Converted order (0/1) 5,087,685 0.05 0.22 841,242 0.05 0.22 2,490,092 0.06 0.23 1,756,351 0.05 0.22
Time to order (minute) 271,918 29.89 24.83 41,359 30.43 26.05 140,046 27.36 23.25 90,513 33.56 26.10
Number of products 271,918 5.13 3.43 41,359 5.01 3.32 140,046 4.69 3.28 90,513 5.86 3.59
q (€) 271,918 45.42 65.41 41,359 46.49 67.25 140,046 46.43 77.69 90,513 43.35 37.96
Characteristics of user history
New user (0/1) 5,087,685 0.74 0.48 841,242 0.77 0.49 2,490,092 0.73 0.48 1,756,351 0.75 0.49
Purchase history (0/1) 5,087,685 0.03 0.17 841,242 0.03 0.18 2,490,092 0.03 0.18 1,756,351 0.02 0.15
Number of sessions 5,087,685 1.77 3.36 841,242 1.16 0.73 2,490,092 1.96 4.13 1,756,351 1.78 2.82
Number of prior visits 1,923,779 8.89 22.12 330,597 9.11 23.25 928,402 8.64 22.16 664,780 9.12 21.46
Number of prior purchases 1,923,779 0.18 1.25 330,597 0.18 1.20 928,402 0.20 1.38 664,780 0.14 1.08
Number of prior abandoned carts 638,710 3.78 8.51 110,368 3.92 9.00 312,792 3.90 9.02 215,550 3.53 7.43
Table A.2: Correlation matrix for continuous variables
Columns (in order): Time to session, Number of pages, d, MPT, Time to cart, Time to order, Number of products, q, Number of sessions, Number of prior visits, Number of prior purchases, Number of prior abandoned carts
Time to session 1.000
Number of pages 0.591 1.000
d 0.005 0.054 1.000
MPT 0.000 -0.055 0.077 1.000
Time to cart 0.493 0.429 0.006 0.013 1.000
Time to order 0.386 0.360 0.009 0.002 0.327 1.000
Number of products 0.245 0.299 0.015 0.006 0.251 0.744 1.000
q 0.172 0.197 0.021 -0.012 0.155 0.483 0.611 1.000
Number of sessions 0.053 0.032 0.107 0.015 0.009 0.018 0.009 0.173 1.000
Number of prior visits 0.045 0.014 0.010 0.000 0.010 0.010 0.013 0.027 0.457 1.000
Number of prior purchases 0.041 0.033 0.021 -0.014 0.032 0.052 0.067 0.352 0.101 0.431 1.000
Number of prior abandoned carts 0.054 0.041 0.013 -0.014 0.037 0.027 0.032 0.035 0.311 0.650 0.416 1.000
Table A.3: Summary statistics of technology controls
Aggregate Control Semi-Treatment Full-Treatment
Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev.
Controls of device
Mobile (0/1) 5,087,685 0.87 0.34 841,242 0.79 0.41 2,490,092 0.89 0.32 1,756,351 0.87 0.33
Computer (0/1) 5,087,685 0.13 0.33 841,242 0.21 0.40 2,490,092 0.11 0.31 1,756,351 0.12 0.32
Tablet (0/1) 5,087,685 0.01 0.08 841,242 0.01 0.08 2,490,092 0.01 0.08 1,756,351 0.01 0.09
Controls of browser
Chrome Mobile (0/1) 5,087,685 0.44 0.50 841,242 0.40 0.49 2,490,092 0.42 0.49 1,756,351 0.48 0.50
Mobile Safari (0/1) 5,087,685 0.36 0.48 841,242 0.33 0.47 2,490,092 0.41 0.49 1,756,351 0.31 0.46
Chrome (0/1) 5,087,685 0.10 0.29 841,242 0.15 0.36 2,490,092 0.08 0.27 1,756,351 0.09 0.29
Apple WebKit (0/1) 5,087,685 0.06 0.25 841,242 0.05 0.22 2,490,092 0.06 0.23 1,756,351 0.08 0.27
Safari (0/1) 5,087,685 0.03 0.17 841,242 0.05 0.21 2,490,092 0.02 0.15 1,756,351 0.03 0.16
Firefox (0/1) 5,087,685 0.01 0.09 841,242 0.01 0.11 2,490,092 0.01 0.08 1,756,351 0.01 0.08
Firefox Mobile (0/1) 5,087,685 0.00 0.03 841,242 0.00 0.03 2,490,092 0.00 0.03 1,756,351 0.00 0.03
Other (0/1) 5,087,685 0.00 0.03 841,242 0.00 0.03 2,490,092 0.00 0.03 1,756,351 0.00 0.03
Controls of operating system
Mac OS X (iPhone) (0/1) 5,087,685 0.44 0.50 841,242 0.40 0.49 2,490,092 0.48 0.50 1,756,351 0.41 0.49
Android 1.x (0/1) 5,087,685 0.37 0.48 841,242 0.34 0.48 2,490,092 0.36 0.48 1,756,351 0.40 0.49
Windows 10 (0/1) 5,087,685 0.08 0.27 841,242 0.13 0.33 2,490,092 0.06 0.25 1,756,351 0.07 0.26
Android Mobile (0/1) 5,087,685 0.05 0.21 841,242 0.04 0.20 2,490,092 0.04 0.19 1,756,351 0.07 0.25
Mac OS X (0/1) 5,087,685 0.04 0.20 841,242 0.06 0.25 2,490,092 0.03 0.18 1,756,351 0.04 0.19
Mac OS X (iPad) (0/1) 5,087,685 0.01 0.08 841,242 0.01 0.07 2,490,092 0.01 0.08 1,756,351 0.01 0.08
Windows 7 (0/1) 5,087,685 0.00 0.06 841,242 0.01 0.08 2,490,092 0.00 0.05 1,756,351 0.00 0.00
Other (0/1) 5,087,685 0.01 0.10 841,242 0.01 0.12 2,490,092 0.01 0.09 1,756,351 0.01 0.10
Table A.4: Summary statistics of time controls
Aggregate Control Semi-Treatment Full-Treatment
Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev.
Controls of month
January (0/1) 5,087,685 0.22 0.41 841,242 0.19 0.39 2,490,092 0.19 0.39 1,756,351 0.26 0.44
February (0/1) 5,087,685 0.08 0.26 841,242 0.06 0.23 2,490,092 0.00 0.00 1,756,351 0.19 0.39
March (0/1) 5,087,685 0.05 0.22 841,242 0.09 0.29 2,490,092 0.00 0.02 1,756,351 0.10 0.30
April (0/1) 5,087,685 0.06 0.24 841,242 0.04 0.20 2,490,092 0.05 0.22 1,756,351 0.09 0.28
May (0/1) 5,087,685 0.05 0.22 841,242 0.03 0.18 2,490,092 0.04 0.19 1,756,351 0.07 0.26
June (0/1) 5,087,685 0.06 0.23 841,242 0.04 0.19 2,490,092 0.06 0.23 1,756,351 0.07 0.25
July (0/1) 5,087,685 0.09 0.29 841,242 0.06 0.24 2,490,092 0.09 0.29 1,756,351 0.12 0.32
August (0/1) 5,087,685 0.08 0.27 841,242 0.05 0.22 2,490,092 0.08 0.28 1,756,351 0.09 0.29
September (0/1) 5,087,685 0.06 0.24 841,242 0.09 0.28 2,490,092 0.08 0.27 1,756,351 0.02 0.12
October (0/1) 5,087,685 0.07 0.25 841,242 0.08 0.27 2,490,092 0.11 0.32 1,756,351 0.00 0.00
November (0/1) 5,087,685 0.06 0.24 841,242 0.13 0.33 2,490,092 0.08 0.27 1,756,351 0.00 0.00
December (0/1) 5,087,685 0.13 0.33 841,242 0.14 0.35 2,490,092 0.21 0.41 1,756,351 0.00 0.00
Controls of time of the month
Beginning (0/1) 5,087,685 0.36 0.48 841,242 0.34 0.47 2,490,092 0.34 0.47 1,756,351 0.39 0.49
Halfway (0/1) 5,087,685 0.33 0.47 841,242 0.33 0.47 2,490,092 0.33 0.47 1,756,351 0.32 0.46
End (0/1) 5,087,685 0.31 0.46 841,242 0.32 0.47 2,490,092 0.32 0.47 1,756,351 0.30 0.46
Controls of time of the day
Night (0/1) 5,087,685 0.14 0.34 841,242 0.11 0.31 2,490,092 0.16 0.36 1,756,351 0.12 0.32
Morning (0/1) 5,087,685 0.14 0.35 841,242 0.15 0.36 2,490,092 0.12 0.33 1,756,351 0.16 0.36
Afternoon (0/1) 5,087,685 0.27 0.44 841,242 0.31 0.46 2,490,092 0.23 0.42 1,756,351 0.31 0.46
Evening (0/1) 5,087,685 0.46 0.50 841,242 0.42 0.49 2,490,092 0.50 0.50 1,756,351 0.42 0.49
Table A.5: Summary statistics of location controls
Aggregate Control Semi-Treatment Full-Treatment
Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev. Obs. Mean St.Dev.
Controls of country
France (0/1) 5,087,685 0.25 0.43 841,242 0.26 0.44 2,490,092 0.27 0.44 1,756,351 0.22 0.41
Italy (0/1) 5,087,685 0.23 0.42 841,242 0.22 0.42 2,490,092 0.20 0.40 1,756,351 0.27 0.44
Spain (0/1) 5,087,685 0.15 0.36 841,242 0.16 0.37 2,490,092 0.15 0.36 1,756,351 0.15 0.36
Portugal (0/1) 5,087,685 0.07 0.25 841,242 0.06 0.25 2,490,092 0.08 0.27 1,756,351 0.06 0.24
United Kingdom (0/1) 5,087,685 0.05 0.22 841,242 0.05 0.22 2,490,092 0.06 0.25 1,756,351 0.03 0.18
Denmark (0/1) 5,087,685 0.04 0.20 841,242 0.04 0.19 2,490,092 0.04 0.19 1,756,351 0.05 0.21
Netherlands (0/1) 5,087,685 0.04 0.19 841,242 0.04 0.19 2,490,092 0.04 0.19 1,756,351 0.04 0.20
Belgium (0/1) 5,087,685 0.02 0.13 841,242 0.02 0.14 2,490,092 0.02 0.13 1,756,351 0.02 0.14
Ukraine (0/1) 5,087,685 0.02 0.13 841,242 0.02 0.12 2,490,092 0.01 0.10 1,756,351 0.03 0.16
United States (0/1) 5,087,685 0.01 0.12 841,242 0.02 0.12 2,490,092 0.01 0.12 1,756,351 0.01 0.12
Others (0/1) 5,087,685 0.12 0.32 841,242 0.12 0.32 2,490,092 0.12 0.33 1,756,351 0.12 0.32
Controls of currency
EUR (0/1) 5,087,685 0.99 0.08 841,242 0.99 0.08 2,490,092 0.99 0.09 1,756,351 0.99 0.07
GBP (0/1) 5,087,685 0.00 0.07 841,242 0.00 0.06 2,490,092 0.01 0.08 1,756,351 0.00 0.06
USD (0/1) 5,087,685 0.00 0.04 841,242 0.00 0.04 2,490,092 0.00 0.04 1,756,351 0.00 0.04
A.3 Omitted Tables
Table A.6: Likelihood of conversion with new user dummy and purchase history dummy
Likelihood of conversion Control Semi-Treatment Full-Treatment
Intercept (0) -2.396∗∗∗ -2.328∗∗∗ -2.058∗∗∗
(0.279) (0.134) (0.174)
New user dummy ( ) -0.121∗∗∗[11] -0.101∗∗∗[10] -0.141∗∗∗[13]
(0.013) (0.007) (0.009)
Purchase history dummy () 0.472∗∗∗[60] 0.657∗∗∗[93] 0.608∗∗∗[84]
(0.030) (0.014) (0.017)
Main variables X X X
Technology controls X X X
Time controls X X X
Location controls X X X
Pseudo R-squared 0.095 0.077 0.100
Observations 218,753 749,093 488,762
Note: ∗ – p < 0.1; ∗∗ – p < 0.05; ∗∗∗ – p < 0.01. Logit regression of the likelihood of conversion on new user dummy, purchase history
dummy, and covariates. The control variables are defined in Section A.2.1. Standard errors in parentheses. Absolute
percentage changes in the odds ratio in brackets.
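The bracketed figures can be reproduced from the reported logit coefficients. The sketch below assumes that the bracketed value is the absolute percentage change in the odds implied by a one-unit change in the regressor, that is, |exp(coefficient) − 1| × 100 rounded to the nearest integer; the function name is a hypothetical helper introduced here.

```python
import math


def odds_ratio_pct_change(coef: float) -> float:
    """Absolute percentage change in the odds for a logit coefficient.

    The odds ratio is exp(coef); the change relative to 1 is reported
    in absolute value and in percent.
    """
    return abs(math.exp(coef) - 1.0) * 100.0


# Coefficients from the Control column of Table A.6.
for label, coef in [("New user dummy", -0.121), ("Purchase history dummy", 0.472)]:
    print(label, round(odds_ratio_pct_change(coef)))
```

For instance, the coefficient −0.121 implies an odds ratio of about 0.886, an 11% reduction in the odds of conversion for new users in the control group.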
Table A.7: Gross order with new user dummy and purchase history dummy
Gross order Control Semi-Treatment Full-Treatment
Intercept (0) 3.940∗∗∗ 3.739∗∗∗ 3.822∗∗∗
(0.116) (0.054) (0.065)
New user dummy ( ) -0.025∗∗∗ -0.028∗∗∗ -0.047∗∗∗
(0.005) (0.003) (0.003)
Purchase history dummy () 0.083∗∗∗ 0.104∗∗∗ 0.014∗∗∗
(0.010) (0.006) (0.007)
Main variables X X X
Technology controls X X X
Time controls X X X
Location controls X X X
Adj. R-squared 0.166 0.182 0.136
Observations 41,359 140,046 90,513
Note: ∗ – p < 0.1; ∗∗ – p < 0.05; ∗∗∗ – p < 0.01. OLS regression of the gross order on new user dummy, purchase history dummy, and
covariates. The control variables are defined in Section A.2.1. Standard errors in parentheses.
Table A.8: Likelihood of conversion with discount percentage and minimum purchase threshold
Likelihood of conversion Control Semi-Treatment Full-Treatment
Intercept (0) -2.452∗∗∗ -2.419∗∗∗ -2.260∗∗∗
(0.279) (0.135) (0.177)
Discount percentage () 0.019∗∗∗[2] 0.028∗∗∗[3]
(0.001) (0.002)
Minimum purchase threshold ( ) -0.003∗∗∗[0]
(0.001)
Main variables X X X
Technology controls X X X
Time controls X X X
Location controls X X X
Pseudo R-squared 0.095 0.081 0.102
Observations 218,753 749,093 488,762
Note: ∗ – p < 0.1; ∗∗ – p < 0.05; ∗∗∗ – p < 0.01. Logit regression of the likelihood of conversion on discount percentage, minimum
purchase threshold and covariates. The control variables are defined in Section A.2.1. Standard errors in parentheses. Absolute
percentage changes in the odds ratio in brackets.
Table A.9: Gross order with discount percentage and minimum purchase threshold
Gross order Control Semi-Treatment Full-Treatment
Intercept (0) 3.940∗∗∗ 3.460∗∗∗ 3.893∗∗∗
(0.116) (0.053) (0.066)
Discount percentage () 0.021∗∗∗ -0.011∗∗∗
(0.000) (0.001)
Minimum purchase threshold ( ) 0.002∗∗∗
(0.000)
Main variables X X X
Technology controls X X X
Time controls X X X
Location controls X X X
Adj. R-squared 0.166 0.194 0.137
Observations 41,359 140,046 90,513
Note: ∗ – p < 0.1; ∗∗ – p < 0.05; ∗∗∗ – p < 0.01. OLS regression of the gross order on discount percentage, minimum purchase
threshold and covariates. The control variables are defined in Section A.2.1. Standard errors in parentheses.
Table A.10: Estimates of treatment elasticity
Control Semi-Treatment Full-Treatment Aggregate Demand
Low () High (ℎ) Low () High (ℎ) Low () High (ℎ)
Discount 5%
(ℎ) -1.294 -1.256 -1.592 -2.285 -2.178 -1.222
[-1.359,-1.127] [-1.431,-1.209] [-1.655,-1.492] [-2.374,-2.229] [-2.240,-2.029] [-1.335,-1.135]
0 0.021 0.022 0.015 0.011 0.011 0.018
[-0.124,0.087] [-0.136,0.111] [-0.025,0.128] [-0.084,0.061] [-0.106,0.104] [-0.053,0.143]
Discount 10%
(ℎ) -1.312 -1.272 -1.639 -2.434 -2.307 -1.235
[-1.468,-1.221] [-1.381,-1.147] [-1.791,-1.633] [-2.536,-2.384] [-2.407,-2.186] [-1.305,-1.116]
0 0.022 0.023 0.015 0.012 0.011 0.018
[-0.047,0.204] [-0.178,0.028] [-0.046,0.108] [-0.043,0.092] [-0.113,0.097] [-0.029,0.178]
Discount 15%
(ℎ) -1.331 -1.288 -1.688 -2.604 -2.453 -1.249
[-1.450,-1.196] [-1.449,-1.226] [-1.835,-1.680] [-2.729,-2.576] [-2.617,-2.390] [-1.266,-1.042]
0 0.023 0.024 0.016 0.012 0.012 0.019
[-0.011,0.229] [-0.134,0.107] [-0.039,0.102] [0.028,0.165] [-0.095,0.125] [-0.166,0.061]
Discount 20%
(ℎ) -1.351 -1.304 -1.740 -2.799 -2.618 -1.262
[-1.399,-1.165] [-1.339,-1.114] [-1.827,-1.685] [-2.901,-2.751] [-2.765,-2.559] [-1.378,-1.167]
0 0.024 0.025 0.017 0.013 0.012 0.020
[-0.177,0.042] [-0.051,0.162] [-0.047,0.092] [-0.081,0.064] [-0.114,0.098] [-0.203,0.006]
Note: Refer to Section A.1.3 in the Online Appendix. Endowment is set to €101. 95% bootstrap confidence intervals are in
square brackets (501 repetitions). The dataset includes sessions in the control group and only sessions where the treatment was
received with a gross order equal to or lower than €100.
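As an illustration of how bootstrap intervals of this kind can be obtained, the sketch below implements a percentile bootstrap with 501 repetitions. It is a sketch under stated assumptions: the source does not say which bootstrap variant is used, so the percentile method is an assumption, the sample values are made up, and the sample mean stands in for the elasticity estimator, which in practice would be re-estimated on every resample.

```python
import random
import statistics


def bootstrap_percentile_ci(sample, statistic, repetitions=501, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for `statistic`.

    Resamples `sample` with replacement `repetitions` times, evaluates
    the statistic on each resample, and returns the empirical quantiles
    that leave (1 - level) / 2 of the draws in each tail.
    """
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(repetitions)
    )
    cut = int((1.0 - level) / 2.0 * repetitions)  # draws trimmed per tail
    return estimates[cut], estimates[repetitions - 1 - cut]


# Hypothetical data: placeholder values, not estimates from the thesis.
sample = [-1.35, -1.21, -1.40, -1.18, -1.29, -1.33, -1.25, -1.38, -1.22, -1.31]
lower, upper = bootstrap_percentile_ci(sample, statistics.mean)
print(lower, upper)
```

An odd number of repetitions such as 501 is convenient for percentile intervals because the median of the bootstrap distribution is then a single draw.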
Table A.11: Performance of classification ensemble methods
Accuracy ((), b()) Control Semi-Treatment Full-Treatment
BAG (BaggingClassifier) 0.768 0.778 0.777
RF (RandomForestClassifier) 0.813 0.815 0.817
AB (AdaBoostClassifier) 0.813 0.815 0.817
GBDT (GradientBoostingClassifier) 0.814 0.817 0.819
NN (MLPClassifier) 0.814 0.817 0.818
Observations 218,753 749,093 488,762
Note: Classification methods' performance through the accuracy $\mathrm{accuracy}(y(x), \hat{y}(x)) = \frac{1}{n}\sum_{i=0}^{n-1} \mathbb{1}\{\hat{y}_i(x) = y_i(x)\}$. The accuracy score
computes the fraction (or the count) of correct predictions in the binary classification. The dataset includes only sessions where
the treatment was received (1,456,608 observations).
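The accuracy score in the note can be written out directly. The following is a minimal, dependency-free sketch of the fraction-of-correct-predictions definition; the function name and the toy labels are illustrative and are not taken from the thesis data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed binary outcome."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy example: 1 = converted session, 0 = non-converted session.
observed = [1, 0, 0, 1, 0]
predicted = [1, 0, 1, 1, 0]
print(accuracy(observed, predicted))
```

Multiplying the result by the number of observations gives the count of correct predictions mentioned in the note.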
Table A.12: Performance of regression ensemble methods
Coefficient of determination 2((), b()) Control Semi-Treatment Full-Treatment
BAG (BaggingRegressor) 0.012 0.340 -0.119
RF (RandomForestRegressor) 0.015 0.393 -0.067
AB (AdaBoostRegressor) 0.014 -0.589 -0.402
GBDT (GradientBoostingRegressor) 0.176 0.404 0.062
NN (MLPRegressor) 0.096 0.380 0.057
Observations 218,753 749,093 488,762
Note: Regression methods' performance through the coefficient of determination
$R^{2}(y(x), \hat{y}(x)) = 1 - \frac{\sum_{i=1}^{n}(y_i(x) - \hat{y}_i(x))^{2}}{\sum_{i=1}^{n}(y_i(x) - \bar{y}(x))^{2}}$. It
represents the proportion of variance of $y(x)$ that has been explained by the independent variables in the model. It provides
an indication of goodness of fit and thus a measure of how well unseen samples are likely to be predicted by the model.
The best possible score is 1.0, and the score can be negative because the model can be arbitrarily
worse. The dataset includes only sessions where the treatment was received (1,456,608 observations).
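The negative entries in Table A.12 follow directly from the definition of the coefficient of determination. The sketch below is a plain-Python illustration with made-up numbers; the function name is hypothetical. A model that predicts the sample mean scores 0, and any model with a larger squared error than that benchmark scores below 0.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for a constant outcome")
    return 1.0 - ss_res / ss_tot


# Toy outcome vector (illustrative values only).
y = [1.0, 2.0, 3.0, 4.0]
print(r_squared(y, y))                     # perfect prediction
print(r_squared(y, [2.5, 2.5, 2.5, 2.5]))  # always predicts the mean
print(r_squared(y, [4.0, 3.0, 2.0, 1.0]))  # worse than the mean
```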
Table A.13: Estimates of counterfactual simulations of conversion rate
Semi-Treatment Full-Treatment Combined Treatments
Policy b
() Δc () b
() Δc () b
{ , }() Δc{ , }()
Uniform 1.3036 – 1.0077 – 2.3064 –
Non-targeted 0.4601 –64.71% 1.0077 0.00% 2.6982 +16.99%
(0.0044) (0.0080) (0.0227)
Conversion-targeted 1.3055 +0.14% 1.1188 +11.03% 2.6775 +16.09%
(0.0114) (0.0074) (0.0190)
Revenue-targeted 1.4962 +14.77% 1.1457 +13.69% 2.8457 +23.38%
(0.0125) (0.0075) (0.0200)
Conversion-revenue 1.4973 +14.86% 1.1461 +13.73% 2.8466 +23.42%
(0.0125) (0.0075) (0.0200)
Optimal treatment 1.5501 +18.91% 1.1601 +15.13% 2.8642 +24.18%
(0.0128) (0.0076) (0.0200)
Note: The results of the simulation analysis of the conversion rate under alternative policies and the optimal treatment policy. The first and second
columns consider the treatment spaces {C, ST} and {C, FT}, respectively. The third column simulates the pricing
algorithm with the combined treatment space {C, ST, FT}. Comparison between alternative policies is framed in terms
of percentage variation, defined as the difference between the estimate under a policy and the estimate under the benchmark,
divided by the estimate under the benchmark, where the non-treatment, uniform policy stands for the
benchmark case. The estimates of the conversion rate of (1.5.17) per counterfactual policy and optimal treatment policy use
the sessions where the treatment was received. The testing dataset is set to 20%. The semi-treatment testing dataset features
19,908 observations, the full-treatment testing dataset has 36,748 sessions, and the combined treatments testing dataset counts 56,656 data
points. Standard errors in parentheses.
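The percentage variations in the table can be recomputed from the reported point estimates. The sketch below assumes the variation is the difference from the uniform benchmark divided by the benchmark, expressed in percent; the function name is a hypothetical helper.

```python
def pct_variation(estimate: float, benchmark: float) -> float:
    """Percentage variation of a policy estimate relative to the benchmark."""
    return (estimate - benchmark) / benchmark * 100.0


# Optimal treatment versus uniform policy, semi-treatment and combined columns
# of Table A.13.
print(round(pct_variation(1.5501, 1.3036), 2))
print(round(pct_variation(2.8642, 2.3064), 2))
```

Applied to the optimal treatment policy, this reproduces the +18.91% and +24.18% entries of the semi-treatment and combined-treatments columns.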
Table A.14: Estimates of counterfactual simulations of conversion rate for new and returning users
New users
Semi-Treatment Full-Treatment Combined Treatments
Policy ^
() Δc () ^
() Δc () ^
{ , }() Δc{ , }()
Uniform 1.4390 – 1.0874 – 2.5461 –
Non-targeted 1.4390 0.00% 1.0874 0.00% 2.2065 -13.33%
(0.0275) (0.0171) (0.0422)
Conversion-targeted 1.3902 -3.39% 1.2542 +15.33% 2.7037 +6.20%
(0.0227) (0.0166) (0.0386)
Revenue-targeted 1.5844 +10.10% 1.2068 +10.97% 2.9324 +15.18%
(0.0247) (0.0162) (0.0394)
Conversion-revenue 1.5861 +10.21% 1.2528 +15.21% 2.8718 +12.80%
(0.0247) (0.0166) (0.0395)
Optimal treatment 1.6403 +13.99% 1.2204 +12.23% 2.9568 +16.14%
(0.0252) (0.0163) (0.0395)
Returning users
Semi-Treatment Full-Treatment Combined Treatments
Policy ^
() Δc () ^
() Δc () ^
{ , }() Δc{ , }()
Uniform 1.4746 – 1.1341 – 2.6089 –
Non-targeted 0.4985 –66.19% 1.1341 0.00% 3.0189 +15.71%
(0.0114) (0.0222) (0.0588)
Conversion-targeted 1.8063 +22.49% 1.2406 +9.39% 3.3554 +28.61%
(0.0338) (0.0200) (0.0518)
Revenue-targeted 1.6273 +10.35% 1.2701 +11.99% 3.1621 +21.20%
(0.0319) (0.0204) (0.0514)
Conversion-revenue 1.8054 +22.44% 1.2714 +12.11% 3.2922 +26.19%
(0.0338) (0.0204) (0.0518)
Optimal treatment 1.6916 +14.71% 1.2949 +14.18% 3.1803 +21.90%
(0.0326) (0.0206) (0.0514)
Note: The estimates of the counterfactual analysis of alternative policies and the optimal treatment policy, conditional on new
and returning users' sessions. The first and second columns consider the treatment spaces {C, ST} and {C, FT}, respectively. The
third column simulates the pricing algorithm described in (1.5.11) and implemented by Algorithm 6, where the treatment space
{C, ST, FT} combines the semi-treatment and the full-treatment. The first section presents the estimates of the conversion rate's simulation
analysis of (1.5.17) for new users per counterfactual policy and optimal treatment policy. The second section shows the results
of the conversion rate's simulation analysis of (1.5.17) for returning users per counterfactual policy and optimal treatment policy.
Comparison between alternative policies is framed in terms of percentage variation, defined as the difference between the estimate
under a policy and the estimate under the benchmark, divided by the estimate under the benchmark,
where the non-treatment, uniform policy stands for the benchmark case. For new users, the simulation dataset includes only
sessions where the treatment was received and the testing dataset is set to 20%. The semi-treatment testing dataset features 5,517
observations, while the full-treatment testing dataset has 8,353 observations. The combined treatments testing dataset counts 13,870
sessions. For returning users, the semi-treatment testing dataset features 3,146 observations, while the full-treatment testing dataset has 5,073
observations. The combined treatments testing dataset counts 8,219 sessions. Standard errors in parentheses.
A.4 Omitted Figures
Figure A.1: Screenshots of McDonald’s app (Source: Taunton (2023))
Figure A.2: Homepage of Alpha website
Note: Screenshot of Alpha homepage. The text advertises that Alpha’s clients would raise the likelihood of conversion and
would increase net revenue per user by means of “personalized multichannel consumer experiences” which leverage e-commerce
behavioral data. Images have been edited to comply with MNDA.
(a) Control group
(b) Semi-treatment group
(c) Full-treatment group
Figure A.3: Treatment per control, semi-treatment and full-treatment group
Note: Screenshots of the online experience of receiving the treatment in the form of a promotion pop-up based on group
assignment. (a) Control group C with d = 0 and MPT = 0. The promotion pop-up does not appear at the cart page. (b)
Semi-treatment group ST with d = 15% and MPT = 0. The cart is worth $39.50 and consists of four items across three product
types. The promotion pop-up is received when the user reaches the cart page for the first time. (c) Full-treatment group
FT with d = 10% and MPT = $25. The cart consists of a $13.50 product and estimated shipping costs equal to $7.50. Accessed
on November 4th, 2022 at 7:35 PM from the San Francisco area using a MacBook Pro laptop with no prior purchases, but some
prior visits to the e-commerce website. Images have been edited to comply with MNDA.
(a) E-commerce homepage (b) Session started
(c) First item added to the cart page (d) Treatment received
(e) Promotion added to the cart page (f) Second item added to the cart page
(g) Gross order updated (h) Order converted
Figure A.4: Example of conversion funnel experience
Note: Images of the conversion funnel on Alpha's client e-commerce website for a session assigned to the full-treatment group.
(a) The e-commerce homepage displays the site architecture, divided by product typology (i.e., make-up, skin care, accessories) and further features (i.e., offers, services, best seller). Each page is divided according to the products' purpose. For instance, the make-up section shows groups for face, eyes, lips, hands, palette, and sets and kits. (b) The session begins as the user logs in to the e-commerce site and starts browsing the website in search of different products. (c) The first item is selected and a light-green banner at the bottom of the screen notifies the user that the item has been added to the cart. The notification offers two options: at the bottom left, to continue visiting the products' section, or at the bottom right, to view the cart page. In this case, the user chooses a Smart Hydrating Foundation at a price of $13.50. (d) The user clicks the bottom-right alternative and the cart page is displayed for the first time during the session. The screen dims as a central promotional pop-up appears. The treatment of the full-treatment group consists of a promotion with two components: the discount percentage and the minimum purchase threshold. This session is assigned to a treatment consisting of a 10% discount on the gross order and a minimum purchase of $25. The promotion is valid only for this online session. (e) On the left side, the cart page summarizes the selected items with their quantities, characteristics and prices. The promotion is automatically added to the order summary located on the right side of the cart page that the user was visiting. In this case, the voucher code FP_E25CA in the promotion section is activated, as the green bottom banner confirms. (f) The session continues and a second product is added to the cart page. As a second item, the user picks a Skin Trainer Cream at a price of $33.00. The same green banner appears. (g) Again, the user clicks on the right button "Go to your bag". The order summary shows the number of items in the cart, 2, the order subtotal, $46.50, the promo discount, $4.65, and the order total, $41.85, with the checkout option located at the bottom. (h) As the user moves to the checkout page, the order summary reports all the relevant information in terms of selected products, prices, subtotal, and order total. Finally, the user adds the payment method information, the order is confirmed, and the session ends. Accessed on July 7th, 2022 at 9:55 PM from the San Francisco area using a MacBook Pro laptop with no prior purchases and no prior visits to the e-commerce website. Images have been edited to comply with MNDA.
Panels: histograms of Number of Sessions against Gross Order (q), from 0 to 100, for Control, Returning Users, and New Users.
(a) d = 5, MPT = 0
(b) d = 10, MPT = 0
(c) d = 15, MPT = 0
(d) d = 20, MPT = 0
Figure A.5: Number of converted sessions by gross order per semi-treatment discount percentage
Note: Panels show distributions of converted sessions for the semi-treatment group per discount percentage treatment, conditional on the returning users' or new users' group. Panels exclude sessions which were converted with a gross order above €100. Black histograms show the control group demand, which counts 39,552 sessions with €40.00 mean and €17.36 standard deviation. Semi-treatment demand counts 134,274 converted sessions in total, of which 47% are returning users (red plot), while 53% are new users (blue plot). (a) Semi-treatment demand with 5% discount percentage reports 391 converted sessions in total, where 35% are returning users (€54.14 mean, €18.88 st.dev.), while 65% are new users (€53.56 mean, €17.82 st.dev.). (b) Semi-treatment demand with 10% discount percentage reports 73,581 converted sessions in total, where 44% are returning users (€38.34 mean, €15.86 st.dev.), while 56% are new users (€36.70 mean, €15.22 st.dev.). (c) Semi-treatment demand with 15% discount percentage reports 46,677 converted sessions in total, where 51% are returning users (€40.97 mean, €16.48 st.dev.), while 49% are new users (€38.86 mean, €15.82 st.dev.). (d) Semi-treatment demand with 20% discount percentage reports 13,625 converted sessions in total, where 55% are returning users (€42.85 mean, €18.78 st.dev.), while 45% are new users (€39.95 mean, €17.62 st.dev.).
Panels: histograms of Number of Sessions against Gross Order (q), from 0 to 100, for Control, Returning Users, and New Users.
(a) d > 0, MPT = 15
(b) d > 0, MPT = 25
(c) d > 0, MPT = 35
(d) d > 0, MPT = 45
Figure A.6: Number of converted sessions by gross order per full-treatment minimum purchase threshold
Note: Panels show distributions of converted sessions for the full-treatment group per minimum purchase threshold treatment, conditional on the returning users' or new users' group. Panels exclude sessions which were converted with a gross order above €100. Black histograms show the control group demand, which counts 39,552 sessions with €40.00 mean and €17.36 standard deviation. Full-treatment demand counts 87,331 converted sessions in total, of which 43% are returning users (red plot), while 57% are new users (blue plot). (a) Full-treatment demand with MPT = 15 reports 7,609 converted sessions in total, where 37% are returning users (€38.85 mean, €15.78 st.dev.), while 63% are new users (€36.51 mean, €14.80 st.dev.). (b) Full-treatment demand with MPT = 25 reports 45,658 converted sessions in total, where 42% are returning users (€39.91 mean, €15.85 st.dev.), while 58% are new users (€39.14 mean, €15.85 st.dev.). (c) Full-treatment demand with MPT = 35 reports 28,055 converted sessions in total, where 41% are returning users (€40.77 mean, €17.07 st.dev.), while 59% are new users (€36.24 mean, €14.97 st.dev.). (d) Full-treatment demand with MPT = 45 reports 6,009 converted sessions in total, where 57% are returning users (€43.21 mean, €18.21 st.dev.), while 43% are new users (€40.72 mean, €18.60 st.dev.).
(a) Control group (d = 0, MPT = 0)
(b) Semi-treatment group (d > 0, MPT = 0)
(c) Full-treatment group (d > 0, MPT > 0)
Figure A.7: Graphical intuition of constrained utility maximization problem
Note: Each panel graphically describes the choice problem in (1.4.9). The control group has no treatment, the semi-treatment group gets unconditional discounts, and the full-treatment group receives conditional discounts. (a) The control group exhibits two demand equilibria, namely No Conversion and Low Conversion. The slope of the budget constraint is 1. (b) The semi-treatment group shows two demand equilibria, that is, No Conversion and High Conversion. The slope of the budget constraint is equal to 1 − d. (c) The full-treatment group has three demand equilibria: No Conversion, Low Conversion and High Conversion. The slope of the budget constraint is 1 if q < MPT and 1 − d otherwise.
(a) Δ() > 0, Δ() > 0 (b) Δ() < 0, Δ() < 0
(c) Δ() < 0, Δ() > 0 (d) Δ() > 0, Δ() < 0
Figure A.8: Graphical analysis of optimal policy marginal variation
Note: The optimal treatment policy function (1.5.9) varies depending on the sign and the magnitude of the CATEs for the demand extensive margin (the likelihood of conversion) and the demand intensive margin (the expected gross order). The solid square depicts the expected profit for the control group as the product of the likelihood of conversion and the expected gross order. Positive and negative CATEs of both dimensions are represented by red and blue dashed lines, respectively, on the right and upper square sides. The dashed parallelogram portrays the expected profit for the treatment group as the product of the treated likelihood of conversion and the treated expected gross order, each equal to its control-group counterpart plus the corresponding CATE. The blue dashed quadrilateral at the lower left corner stands for the expected cost of treatment t, which is equal to the expected profit for the treatment group times the expected discount, namely the discount percentage d_t multiplied by the likelihood of complying with MPT_t. Four scenarios are represented and described to study marginal changes in the optimal treatment policy. In all four cases the expected cost of treatment t enters with a negative sign (graphically blue).
For cases (a) and (b) the product of the two CATEs is positive (graphically red), while the control effects, namely each control-group quantity multiplied by the CATE of the other dimension, are either positive or negative, depending on the case. For cases (c) and (d) the product of the two CATEs is negative (graphically blue), while the control effects are again either positive or negative, depending on the case. In each of the cases (a) to (d) the policy function is the sum of the same four terms:
policy function = (CATE of conversion × CATE of gross order) + (control likelihood of conversion × CATE of gross order) + (control expected gross order × CATE of conversion) − (expected cost of treatment t).
Panels: histograms of Frequency against the estimated optimal policy function, from −100 to 100, split into values ≤ 0 and values > 0.
(a) Semi-treatment group
(b) Full-treatment group
Figure A.9: Distributions of the estimated optimal policy function per treatment
Note: Panels show the distribution of the estimated optimal treatment policy function for semi-treatment and full-treatment, computed through the GBDT ensemble method as described in Algorithm 6. The semi-treatment testing sample has 19,908 observations, with mean equal to 0.49, standard deviation 31.43 and median 0.00. The full-treatment testing sample features 36,748 observations, with mean equal to -7.68, standard deviation 35.03 and median 0.02. Histograms focus on the interval centered at 0 between −100 and +100. Given testing samples of 19,908 and 36,748 observations for semi-treatment and full-treatment respectively, (a) the semi-treatment plot includes 5,577 negative and 5,415 positive data points, while (b) the full-treatment graph counts 10,570 negative and 24,857 positive observations.
Panels: histograms of Number of Sessions against Profit (EUR) in (a) and (b) and against Indirect Utility (EUR) in (c) and (d), for the aggregate (Aggregate Supply or Aggregate Demand), Control, Semi-Treatment, and Full-Treatment.
(a) Profit distribution of uniform policy per treatment
(b) Profit distribution of optimal treatment policy per treatment
(c) Indirect utility distribution of uniform policy per treatment
(d) Indirect utility distribution of optimal treatment policy per treatment
Figure A.10: Treatment distributions of profit and indirect utility per policy
Note: Panels (a) and (b) show the profit distribution of uniform policy and of optimal treatment policy at the aggregate level
and at each treatment-group level. Panels (c) and (d) display the indirect utility distribution of uniform policy and of optimal
treatment policy at the aggregate level and at each treatment-group level.
Panels: histograms of Number of Sessions against Profit (EUR) in (a) and (b) and against Indirect Utility (EUR) in (c) and (d), for the aggregate (Aggregate Supply or Aggregate Demand), Control, Semi-Treatment, and Full-Treatment.
(a) Profit distribution of uniform policy per treatment
(b) Profit distribution of optimal treatment policy per treatment
(c) Indirect utility distribution of uniform policy per treatment
(d) Indirect utility distribution of optimal treatment policy per treatment
Figure A.11: Treatment distributions of profit and indirect utility per policy for new users
Note: Panels (a) and (b) show the profit distribution of uniform policy and of optimal treatment policy at the aggregate level
and at each treatment-group level for new users. Panels (c) and (d) display the indirect utility distribution of uniform policy
and of optimal treatment policy at the aggregate level and at each treatment-group level for new users.
Panels: histograms of Number of Sessions against Profit (EUR) in (a) and (b) and against Indirect Utility (EUR) in (c) and (d), for the aggregate (Aggregate Supply or Aggregate Demand), Control, Semi-Treatment, and Full-Treatment.
(a) Profit distribution of uniform policy per treatment
(b) Profit distribution of optimal treatment policy per treatment
(c) Indirect utility distribution of uniform policy per treatment
(d) Indirect utility distribution of optimal treatment policy per treatment
Figure A.12: Treatment distributions of profit and indirect utility per policy for returning users
Note: Panels (a) and (b) show the profit distribution of uniform policy and of optimal treatment policy at the aggregate level and at each treatment-group level for returning users. Panels (c) and (d) display the indirect utility distribution of uniform policy and of optimal treatment policy at the aggregate level and at each treatment-group level for returning users.
Panels: Uniform and Optimal Treatment policies compared.
(a) Profit distribution per policy: Number of Sessions against Profit (EUR)
(b) Producer surplus distribution by profit per policy: Producer Surplus (EUR) against Profit (EUR)
(c) Indirect utility distribution per policy: Number of Sessions against Indirect Utility (EUR)
(d) Consumer surplus distribution by indirect utility per policy: Consumer Surplus (EUR) against Indirect Utility (EUR)
Figure A.13: Welfare analysis for new users
Note: Panel (a) shows the distribution of profit of new users. Panel (b) displays the distribution of producer surplus of new
users by profit levels. Panel (c) illustrates the distribution of indirect utility of new users. Panel (d) reports the distribution of
consumer surplus of new users by indirect utility levels.
Panels: Uniform and Optimal Treatment policies compared.
(a) Profit distribution per policy: Number of Sessions against Profit (EUR)
(b) Producer surplus distribution by profit per policy: Producer Surplus (EUR) against Profit (EUR)
(c) Indirect utility distribution per policy: Number of Sessions against Indirect Utility (EUR)
(d) Consumer surplus distribution by indirect utility per policy: Consumer Surplus (EUR) against Indirect Utility (EUR)
Figure A.14: Welfare analysis for returning users
Note: Panel (a) shows the distribution of profit of returning users. Panel (b) displays the distribution of producer surplus of
returning users by profit levels. Panel (c) illustrates the distribution of indirect utility of returning users. Panel (d) reports the
distribution of consumer surplus of returning users by indirect utility levels.
A.5 Machine Learning and Algorithms
A.5.1 Ensemble Machine Learning Methods
Ensemble learning is a set of machine learning techniques that combine multiple base learners into a single ensemble learner. According to Opitz and Maclin (1999), ensemble learning is divided into ensemble averaging and ensemble boosting. Averaging methods construct several estimators independently and then average their predictions, whereas boosting methods build base estimators sequentially, and the weighted combination of estimators aims at reducing the bias. Bagging (BAG) and random forest (RF) belong to ensemble averaging, while adaptive boosting (AB), gradient boosting (GBDT), and neural network (NN) are examples of ensemble boosting. These methods are discussed below and the related algorithms are summarized in Section A.5.2. Mienye and Sun (2022) provide a thorough survey on ensemble learning.
Bagging. The bagging method, short for bootstrap aggregating, was first suggested by Breiman (1996b) to enhance the classification performance of machine learning models by combining the predictions from randomly generated training sets. The BAG estimator is an ensemble meta-algorithm designed to improve both the stability and the accuracy of the machine learning algorithms used in statistical classification and regression. It fits base classifiers (regressors) on random subsets of the original dataset and then aggregates their individual predictions into a combined outcome. Such a meta-estimator is typically used to reduce the variance of a black-box estimator (e.g., a decision tree or neural network) by introducing randomization into its construction (Oza & Tumer, 2008). In practice, the bagging method draws a different random subset of the training data for each base learner. The base learners are then combined using majority voting to obtain a strong classifier. The most significant advantage is an efficient decrease in variance without an increase in bias (Algorithm 1). A further advantage of BAG is the diversity that the bootstrapping approach introduces in the input data (Kotsiantis, 2014). Random forests are a standard implementation of the bagging technique.
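The bagging procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the base learner is a one-feature decision stump, and the names `fit_stump`, `bagging`, and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a one-feature threshold classifier (hypothetical base learner)."""
    best = None
    for threshold, _ in sample:
        for left, right in ((0, 1), (1, 0)):
            # Count misclassified points for this threshold and labelling.
            errors = sum((left if x <= threshold else right) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right

def bagging(data, n_learners, seed=0):
    """Train n_learners stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_learners):
        bootstrap = [rng.choice(data) for _ in data]  # sampling with replacement
        learners.append(fit_stump(bootstrap))
    def predict(x):
        votes = Counter(h(x) for h in learners)
        return votes.most_common(1)[0][0]  # mode of the base predictions
    return predict

# Toy one-dimensional data: class 0 for small x, class 1 for large x.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
model = bagging(data, n_learners=25)
print(model(0.0), model(5.0))
```

Each stump sees a different bootstrap sample, so individual thresholds vary, while the majority vote smooths out that variation.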
Random forest. A RF is a meta-estimator that fits a number of decision tree classifiers on different subsamples of the dataset and uses averaging to improve the predictive accuracy and to control overfitting. In random forests, each tree in the ensemble is built from a bootstrap sample drawn from the training set. For classification, the output of the random forest is the class selected by most trees. For regression, the average prediction of the individual trees is returned. Following the pioneering work by Ho (1995), the random forest algorithm uses the bagging technique to build multiple decision trees by means of bootstrapped samples. The bagging technique generates random samples with replacement from the input data and trains the decision trees on those samples (Algorithm 2). A main advantage of this algorithm is its ability to mitigate the overfitting that is common in decision tree models. Furthermore, when the input dataset features high-dimensional data, the prediction performance of random decision forests tends to improve significantly (e.g., Han & Kim, 2019; Schonlau & Zou, 2020).
Adaptive boosting. The adaptive boosting algorithm, also known as AdaBoost, is a type of boosting algorithm capable of using weak learners to obtain a robust estimator. This boosting method was developed by Freund and Schapire (1997) and it is among the most efficient ML algorithms. An AB classifier (regressor) is a statistical meta-estimator that begins by fitting a classifier (regressor) on the original dataset. Then, AdaBoost fits additional copies of the classifier (regressor) on the same dataset, but with the weights of incorrectly classified instances adjusted such that subsequent estimators focus more on difficult cases. The working principle of the algorithm is to fit a sequence of weak learners on repeatedly modified versions of the data. All predictions are then combined through a weighting rule to produce the aggregate prediction. The data modifications at each boosting iteration consist of applying weights w_1, ..., w_N to each of the training samples. Initially, those weights are all set to w_i = 1/N, so that the first step simply trains a weak learner on the original data. For each successive iteration, the sample weights are individually modified and the learning algorithm is reapplied to the reweighted data. At a given step, those training examples that were incorrectly predicted by the boosted model induced at the previous step have their weights increased, whereas the weights are decreased for those that were predicted correctly. Each subsequent weak learner is thereby forced to concentrate on the examples that are missed by the previous ones in the sequence (Algorithm 3) (F. Wang et al., 2019; C. Wang et al., 2021). The AB is flexible, relatively easy to implement, and accommodates a variety of algorithms as the base learner.
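The reweighting logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: labels are coded as +1/−1, the weak learner is a threshold stump, weights are renormalized after each round, and the bootstrap step listed in Algorithm 3 is omitted. The names `best_stump` and `adaboost` and the toy data are hypothetical.

```python
import math

def best_stump(xs, ys, w):
    """Return (weighted error, threshold, sign) of the best threshold stump."""
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            # The stump predicts `sign` for x <= thr and `-sign` otherwise.
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if (sign if xi <= thr else -sign) != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(xs, ys, n_rounds):
    n = len(xs)
    w = [1.0 / n] * n  # initial weights w_i = 1/N
    ensemble = []
    for _ in range(n_rounds):
        err, thr, sign = best_stump(xs, ys, w)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak learner
        ensemble.append((alpha, thr, sign))
        # Increase weights of misclassified points, decrease the others.
        w = [wi * math.exp(-alpha * yi * (sign if xi <= thr else -sign))
             for xi, yi, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]  # renormalize (an assumption of this sketch)
    def predict(x):
        score = sum(a * (s if x <= t else -s) for a, t, s in ensemble)
        return 1 if score >= 0 else -1
    return predict

# Toy pattern that no single stump can classify correctly.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, n_rounds=3)
print([model(x) for x in xs])
```

No single stump separates this pattern, but the weighted vote of three stumps does, which is the sense in which weak learners are combined into a stronger one.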
Gradient boosting. Gradient boosting is a ML algorithm that uses the boosting technique to create strong ensembles. It is also called gradient boosted decision tree (GBDT), since it mainly adopts decision trees as the base learner to produce a robust ensemble classifier or regressor. GBDT was first introduced by Breiman (1996a) and then developed in the seminal work of Friedman (2002). Gradient boosting for classification (regression) is an algorithm that builds an additive model in a forward stage-wise fashion. It allows for the optimization of arbitrary differentiable loss functions and, at each stage, trees are fit on the negative gradient of the loss function. Specifically, the learning process of this algorithm involves sequentially training new models to obtain a robust classifier (regressor). The ensemble is built incrementally, similarly to other boosting techniques, but its core intuition consists of developing base learners that are highly correlated with the negative gradient of the loss function of the entire ensemble (Algorithm 4). The major advantage of GBDT is that it can learn complex patterns from the input data, since each new learner is trained to correct the errors of the previous model.
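The stage-wise fitting on the negative gradient can be sketched for the special case of squared-error loss, where the negative gradient is simply the residual. This is a minimal illustration under stated assumptions, not the thesis implementation: the base learner is a least-squares stump, the line search is replaced by a fixed learning rate, and the names `fit_regression_stump`, `gradient_boost`, `mse`, and the toy data are hypothetical.

```python
def fit_regression_stump(xs, residuals):
    """Least-squares stump: one threshold and one constant prediction per side."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # drop the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda x: lmean if x <= thr else rmean

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.5):
    f0 = sum(ys) / len(ys)  # constant that minimizes squared error
    stumps = []
    def predict(x):
        return f0 + learning_rate * sum(h(x) for h in stumps)
    for _ in range(n_rounds):
        # For squared error, the negative gradient equals the current residual.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_regression_stump(xs, residuals))
    return predict

def mse(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 5.2, 4.8]
mean_y = sum(ys) / len(ys)
baseline_mse = mse(lambda x: mean_y, xs, ys)
boosted_mse = mse(gradient_boost(xs, ys), xs, ys)
print(boosted_mse < baseline_mse)
```

Each round fits the part of the target that the current ensemble still misses, so the training error of the boosted model falls below that of the constant starting model.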
Neural network. A multilayer perceptron (MLP) is a feed-forward artificial neural network consisting of fully connected neurons with a nonlinear activation function, organized in at least three layers, and able to distinguish data that are not linearly separable. Mathematically, the multilayer perceptron is a supervised learning algorithm that learns a function f(·) : R^m → R^o by training on a dataset, where m is the number of dimensions for input and o is the number of dimensions for output. Given a set of features X = x_1, x_2, ..., x_m and a target y, it can learn a non-linear function approximation for either classification or regression. Between the input and the output layer, there can be one or more non-linear layers, called hidden layers (Algorithm 5). The important advantage of NN consists in learning non-linear models (e.g., Tkáč & Verner, 2016; Hospedales et al., 2021; Shobana & Umamaheswari, 2021).
A.5.2 Ensemble Learning: Bagging, Random Forest, Adaptive Boosting, Gradient Boosting
and Neural Network
Algorithm 1 Bagging
Input: D = {(x_1, y_1), ..., (x_N, y_N)} training data; A base ML algorithm; T number of base learners
Procedure:
for t = 1, ..., T do
  (i) Generate a bootstrap sample D_t from the training data
  (ii) Fit a base learner h_t using D_t, i.e. h_t = A(D_t)
  (iii) Combine the outputs of the base learners H(x) = mode(h_1(x), ..., h_T(x))
end for
Output: Return the bagging final learner H(x)
Algorithm 2 Random forest
Input: D = {(x_1, y_1), ..., (x_N, y_N)} training data; M attributes and class variables; T number of trees in the forest; K number of class labels
Procedure:
for t = 1, ..., T do
  (i) Generate a bootstrap sample D_t from the training data
  (ii) Fit a base learner h_t using D_t and, for a given node,
    (a) Randomly select m attributes (usually m = √M)
    (b) Compute the best split features using the randomly selected feature subset
    (c) Split the node using the optimal split features obtained in (b)
    (d) Repeat (a), (b) and (c) until the stopping criterion is achieved
  (iii) Repeat (i) and (ii) T times to build a forest of trees
  (iv) For a given test sample x, combine the outputs from the trees in the final predicted class label, i.e. H_k(x) = Σ_{t=1}^{T} 1{h_t(x) = k} ∀ k = 1, ..., K
end for
Output: Return the random forest final predicted class label H(x)
Algorithm 3 Adaptive boosting
Input: D = {(x_1, y_1), ..., (x_N, y_N)} training data; A base ML algorithm; {w, α} set of vectors of weights; T number of iterations
Procedure: Initialize the model with initial vector w_1 using w_1(i) = 1/N, i = 1, ..., N
for t = 1, ..., T do
  (i) Generate a bootstrap sample D_t from the training data
  (ii) Train the base classifier h_t(x) by minimizing ε_t = Σ_{i=1}^{N} w_t(i) 1{h_t(x_i) ≠ y_i}
  (iii) Calculate the weight of the classifier h_t(x) using α_t = (1/2) ln((1 − ε_t)/ε_t)
  (iv) Update the vector of weights w_{t+1}(i) = w_t(i) exp(−α_t y_i h_t(x_i)), i = 1, ..., N
  (v) Apply H(x) = sgn(Σ_{t=1}^{T} α_t h_t(x))
end for
Output: Return the adaptive boosting final classifier H(x)
Algorithm 4 Gradient boosting
Input: D = {(x_1, y_1), ..., (x_N, y_N)} training data; L(y, F(x)) specified differentiable loss function; T number of iterations
Procedure: Initialize the model with constant Γ using F_0 = argmin_Γ Σ_{i=1}^{N} L(y_i, Γ)
for t = 1, ..., T do
  (i) Generate a bootstrap sample D_t from the training data
  (ii) Calculate the pseudo-residuals r_{it} = −[∂L(y_i, F(x_i)) / ∂F(x_i)] evaluated at F(x) = F_{t−1}(x), ∀ i = 1, ..., N
  (iii) Train a base learner closed under scaling h_t(x) using the training set {(x_i, r_{it})}_{i=1}^{N}
  (iv) Compute the multiplier γ_t by performing the line search optimization γ_t h_t(x) = argmin_{γ,h} Σ_{i=1}^{N} L(y_i, F_{t−1}(x_i) + γ h(x_i))
  (v) Update the model F_t(x) = F_{t−1}(x) + γ_t h_t(x)
end for
Output: Return the gradient boosting final model F_T(x)
Algorithm 5 Neural network
Input: D = {(x_1, y_1), ..., (x_N, y_N)} training data; L(y, F(x), w) specified differentiable loss function; w vector of weights for respective layers; {b_1, b_2} model parameters; η learning rate parameter; K number of layers; T number of iterations
Procedure: Initialize the model with initial vector w_0 and constant Γ using F_0 = argmin_{Γ, w_0} Σ_{i=1}^{N} L(y_i, Γ, w_0)
for t = 1, ..., T do
  (i) Generate a bootstrap sample D_t from the training data
  (ii) Compute the vector of weights w(t) by minimizing L(y_i, F(x_i), w)
  (iii) Update the vector of weights w_t = w_{t−1} − η [∂L(y_i, F(x_i), w) / ∂w] evaluated at w = w_{t−1}, ∀ i = 1, ..., N
  (iv) Update the model F_t(x) = g(w_t x + b_1) + b_2, with g the activation function
end for
Output: Return the neural network final model F_T(x)
A.5.3 Estimation of Primitives of the Policy Model
Algorithm 6 Pseudocode for estimation of optimal treatment policy and counterfactual policies
Input:
  Representative sample of the RCTs dataset with, for each user in the experiment, the observed conversion decision, compliance response, realized gross order, assigned treatment, and covariates matrix;
  Covariates matrix of the RCTs dataset with time to session, number of pages, attained cart dummy, time to cart, new user dummy, purchase history dummy, number of sessions, number of prior visits, number of prior purchases, and number of prior abandoned carts;
  Treatment space {C, ST, FT} with control set C where t = (0, 0), semi-treatment set ST where t ∈ R_+ × {0}, and full-treatment set FT where t ∈ R_+ × R_+;
  Hyperparameters set for ensemble method calibration
Output:
  Estimated treatment policy functions for every treatment t other than control, associated to the optimal treatment policy and the counterfactual policies
Procedure:
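function Estimate Response Functions (dataset, hyperparameters)
  for Treatment pair t ∈ {C, ST, FT} do
    estimated conversion response of t ←− train selected supervised ensemble classifier method
    estimated gross order response of t ←− train selected supervised ensemble regressor method
    estimated compliance response of t ←− train selected supervised ensemble classifier method for t = FT
  end for
  return Estimated response functions {conversion for C, ST, FT; gross order for C, ST, FT; compliance for FT}
end function
function Estimate CATEs (estimated response functions)
  for Treatment pair t ∈ {ST, FT} do
    estimated conversion CATE of t ←− estimated conversion response of t − estimated conversion response of C
    estimated gross order CATE of t ←− estimated gross order response of t − estimated gross order response of C
  end for
  return Estimated CATE functions {conversion CATE for ST, FT; gross order CATE for ST, FT}
end function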
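function Estimate Treatment Policy Functions (dataset, hyperparameters)
  estimated response functions ←− Estimate Response Functions (dataset, hyperparameters)
  estimated CATE functions ←− Estimate CATEs (estimated response functions)
  Non-treatment policy
    policy function of ST ←− −1
    policy function of FT ←− −1
  Non-targeted treatment policy
    policy function of ST ←− conversion effect × gross order effect + control conversion × gross order effect + control gross order × conversion effect − discount × treated conversion × treated gross order
    policy function of FT ←− same expression, with the last term also multiplied by the compliance response
  Conversion-targeted treatment policy
    policy function of ST ←− estimated conversion CATE of ST
    policy function of FT ←− estimated conversion CATE of FT
  Profit-targeted treatment policy
    policy function of ST ←− estimated gross order CATE of ST
    policy function of FT ←− estimated gross order CATE of FT
  Double-positive treatment policy
    policy function of ST ←− 1{estimated conversion CATE > 0} × estimated conversion CATE × 1{estimated gross order CATE > 0} × estimated gross order CATE
    policy function of FT ←− same expression, multiplied by the estimated compliance response
  Optimal treatment policy
    policy function of ST ←− estimated conversion CATE × estimated gross order CATE + estimated control conversion × estimated gross order CATE + estimated control gross order × estimated conversion CATE − discount × estimated treated conversion × estimated treated gross order
    policy function of FT ←− same expression, with the last term also multiplied by the estimated compliance response
  return Estimated treatment policy functions of ST and FT for the optimal treatment policy and the counterfactual policies.
end function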
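The optimal treatment policy step can be illustrated with a small numerical sketch in Python. It follows the decomposition described in the note to Figure A.8: expected profit is the likelihood of conversion times the expected gross order, and the cost of treatment is the discount percentage times the likelihood of compliance times the treated expected profit. All names (`policy_value`, `assign_treatment`, the argument labels) and all numbers are hypothetical, and the rule of treating when the value is positive is an assumption suggested by the sign split in Figure A.9, not a statement of the thesis procedure.

```python
import math

def policy_value(conv_c, order_c, conv_t, order_t, discount, compliance=1.0):
    """Expected profit gain from treating one session relative to control.

    Argument names are hypothetical labels, not the thesis notation.
    """
    d_conv = conv_t - conv_c      # CATE on the extensive margin (conversion)
    d_order = order_t - order_c   # CATE on the intensive margin (gross order)
    expected_cost = discount * compliance * conv_t * order_t
    return d_conv * d_order + conv_c * d_order + order_c * d_conv - expected_cost

def assign_treatment(value):
    """Assumed decision rule: treat only when the expected gain is positive."""
    return value > 0

# Hypothetical session: 10% discount, 50% chance of meeting the threshold.
value = policy_value(conv_c=0.10, order_c=40.0, conv_t=0.12, order_t=42.0,
                     discount=0.10, compliance=0.5)

# The same quantity computed directly as net treated profit minus control profit.
direct = 0.12 * 42.0 * (1 - 0.10 * 0.5) - 0.10 * 40.0
print(round(value, 3), assign_treatment(value))
```

The four-term expression and the direct difference agree because (control + effect) × (control + effect) expands into the control profit plus the three effect terms.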
Appendix B
Appendix to Chapter 2
B.1 Omitted Tables
Table B.1: Descriptive statistics of experiment participants
Age    Frequency  Percentage  Male Frequency  Male Percentage  Female Frequency  Female Percentage
3      16         8.247       8               10.390           8                 6.838
4      25         12.887      9               11.688           16                13.675
5      22         11.340      6               7.792            16                13.675
6      20         10.309      6               7.792            14                11.966
7      15         7.732       7               9.091            8                 6.838
8      20         10.309      8               10.390           12                10.256
9      20         10.309      11              14.286           9                 7.692
10     31         15.979      13              16.883           18                15.385
11     25         12.887      9               11.688           16                13.675
Total  194        100         77              100              117               100
Table B.2: Upward comparison: Regret and envy
(I) OLS; (II) OLS with random effect; (III) OLS with random effect and random slopes
(I) (II) (III)
Feedback -0.052 -0.048 -0.045
(-0.134, 0.031) (-0.125, 0.029) (-0.132, 0.041)
Gender -0.049 -0.051 -0.051
(-0.131, 0.034) (-0.150, 0.049) (-0.151, 0.048)
Age -0.075∗∗∗ -0.074∗∗∗ -0.073∗∗∗
(-0.107, -0.044) (-0.111, -0.036) (-0.111, -0.036)
Choice -0.183∗∗∗ -0.172∗∗∗ -0.170∗∗∗
(-0.263, -0.103) (-0.250, -0.094) (-0.247, -0.092)
Feedback × Gender -0.008 -0.010 -0.011
(-0.091, 0.074) (-0.087, 0.067) (-0.097, 0.075)
Feedback × Age 0.121∗∗∗ 0.123∗∗∗ 0.123∗∗∗
(0.090, 0.153) (0.094, 0.152) (0.090, 0.156)
Gender × Age 0.015 0.015 0.015
(-0.017, 0.046) (-0.023, 0.053) (-0.022, 0.053)
Feedback × Gender × Age -0.016 -0.016 -0.015
(-0.047, 0.015) (-0.045, 0.014) (-0.048, 0.018)
Constant -0.497∗∗∗ -0.493∗∗∗ -0.490∗∗∗
(-0.580, -0.414) (-0.592, -0.394) (-0.590, -0.391)
Note: Model estimates for models predicting children's emotion ratings in upward comparison trials (95% confidence intervals in parentheses; ∗ p < 0.1; ∗∗ p < 0.05; ∗∗∗ p < 0.01).
Table B.3: Downward comparison: Relief and gloating
(I) OLS; (II) OLS with random effect; (III) OLS with random effect and random slopes
(I) (II) (III)
Feedback 0.109∗∗∗ 0.111∗∗∗ 0.111∗∗∗
(0.052, 0.166) (0.061, 0.162) (0.060, 0.162)
Gender -0.011 -0.012 -0.012
(-0.068, 0.046) (-0.087, 0.063) (-0.087, 0.063)
Age 0.075∗∗∗ 0.076∗∗∗ 0.076∗∗∗
(0.053, 0.096) (0.048, 0.104) (0.048, 0.104)
Choice 0.070∗∗ 0.071∗∗∗ 0.070∗∗
(0.013, 0.127) (0.018, 0.125) (0.016, 0.123)
Feedback × Gender 0.030 0.029 0.029
(-0.027, 0.087) (-0.021, 0.080) (-0.022, 0.080)
Feedback × Age -0.045∗∗∗ -0.044∗∗∗ -0.044∗∗∗
(-0.066, -0.024) (-0.063, -0.025) (-0.063, -0.025)
Gender × Age 0.029∗∗∗ 0.029∗∗∗ 0.029∗∗∗
(0.008, 0.050) (0.008, 0.050) (0.008, 0.050)
Feedback × Gender × Age -0.007 -0.007 -0.007
(-0.028, 0.014) (-0.026, 0.012) (-0.026, 0.012)
Constant 1.398∗∗∗ 1.400∗∗∗ 1.399∗∗∗
(1.341, 1.455) (1.325, 1.475) (1.324, 1.474)
Note: Model estimates for models predicting children’s emotion ratings in downward comparison trials (95% confidence
intervals in parentheses; ∗ p < 0.1; ∗∗ p < 0.05; ∗∗∗ p < 0.01).
Table B.4: Choice adaptation
Odds Ratio
Evaluation 1.018∗∗∗ (1.001, 1.037)
Feedback 1.051∗∗∗ (1.025, 1.078)
Gender 0.977∗ (0.953, 1.003)
Age 1.011∗∗ (1.001, 1.021)
Choice 1.006 (0.979, 1.033)
Evaluation × Feedback 0.958∗∗∗ (0.943, 0.974)
Evaluation × Gender 1.003 (0.985, 1.021)
Evaluation × Age 1.002 (0.995, 1.009)
Feedback × Gender 0.991 (0.966, 1.016)
Feedback × Age 0.994 (0.985, 1.004)
Feedback × Choice 1.040∗∗∗ (1.015, 1.065)
Gender × Age 0.994 (0.984, 1.004)
Gender × Choice 0.990 (0.964, 1.017)
Age × Choice 1.012∗∗ (1.002, 1.022)
Evaluation × Choice 1.015∗ (0.998, 1.031)
...
Evaluation × Feedback × Gender × Age × Choice 1.001 (0.995, 1.008)
Constant 1.624∗∗∗ (1.582, 1.665)
Note: Model estimates for model predicting shifts in children’s choices at trial t + 1 (95% confidence intervals in parentheses;
∗ p < 0.1; ∗∗ p < 0.05; ∗∗∗ p < 0.01).
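As a reading aid for Table B.4, the following minimal sketch converts an odds ratio into the implied percent change in the odds and checks whether a 95% confidence interval excludes 1. The function names are illustrative assumptions and are not part of the estimation code behind the table; the example values are copied from the Feedback and Choice rows.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in the odds implied by an odds ratio (1 means no change)."""
    return (odds_ratio - 1.0) * 100.0


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


# Values copied from Table B.4; the helper names above are hypothetical.
feedback_or, feedback_ci = 1.051, (1.025, 1.078)
choice_or, choice_ci = 1.006, (0.979, 1.033)

for label, odds_ratio, ci in [
    ("Feedback", feedback_or, feedback_ci),
    ("Choice", choice_or, choice_ci),
]:
    change = round(percent_change_in_odds(odds_ratio), 1)
    print(label, change, ci_excludes_one(*ci))
```

Under this reading, the Feedback odds ratio of 1.051 corresponds to odds roughly 5.1% higher, with an interval that excludes 1, whereas the interval for Choice includes 1.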
Table B.5: Estimates of demand model parameters
Relief Regret Gloating Envy
^ 0.4966 0.5129 0.5455 0.5395
^ 0.5034 0.4871 0.4545 0.4605
1.2871 -0.4070 1.4512 -0.2143
1.4300 -0.5608 1.5317 -0.6778
0.5998 0.9958 0.6123 0.8513
0.0878 0.0634 0.2998 0.0734
0 ( ) 0.0878 0.0658 0.0829 0.1980
0 ( ) 0.5875 0.9832 0.1177 0.7264
^ 17.4511 17.5590 17.4815 17.5071
^ 17.4649 17.5074 17.2992 17.3488
3.3233 5.1095 3.3250 14.5181
12.0089 -30.6382 10.6432 -24.0233
N 582 388 451 519
Table B.6: Estimates of demand model parameters. Downward comparison: Relief and gloating
Relief Gloating
3 - 5 years 6 - 8 years 9 - 11 years 3 - 5 years 6 - 8 years 9 - 11 years
^ 0.3968 0.5333 0.5526 0.4305 0.6219 0.5912
^ 0.6032 0.4667 0.4474 0.5696 0.3782 0.4088
0.7600 1.5909 1.3889 1.2154 1.7162 1.4112
1.0789 1.7273 1.5980 1.3605 1.8222 1.5541
0.6281 0.5865 0.5922 0.6786 0.6069 0.5757
0.0865 0.0785 0.0962 0.2909 0.3104 0.3037
0 ( ) 0.0806 0.0982 0.0849 0.0797 0.0833 0.0846
0 ( ) 0.5778 0.6025 0.5872 0.1186 0.1177 0.1167
^ 17.1446 17.5284 17.4443 17.0502 17.8737 17.6980
^ 17.5633 17.3948 17.2330 17.3302 17.3763 17.3292
8.4148 2.9303 4.8507 3.7446 4.7951 6.5409
15.6037 9.9376 10.4918 11.9377 8.7190 9.8726
N 189 165 228 151 119 181
Table B.7: Estimates of demand model parameters. Upward comparison: Regret and envy
Regret Envy
3 - 5 years 6 - 8 years 9 - 11 years 3 - 5 years 6 - 8 years 9 - 11 years
^ 0.5000 0.5000 0.5329 0.4939 0.5833 0.5427
^ 0.5000 0.5000 0.4671 0.5061 0.4167 0.4573
0.6825 -0.9455 -0.8889 -0.5556 -0.0659 -0.0833
0.3175 -1.3091 -0.7606 -0.6386 -0.9385 -0.5275
1.0000 1.0000 0.9897 0.7365 0.8704 0.8623
0.0625 0.0639 0.0639 0.0717 0.0793 0.0706
0 ( ) 0.0625 0.0625 0.0706 0.2065 0.2001 0.1897
0 ( ) 1.0000 0.9716 0.9773 0.7063 0.7270 0.7442
^ 17.5459 17.4018 17.5297 17.4418 17.7780 17.4939
^ 17.5459 17.4018 17.5297 17.4662 17.4415 17.3226
23.3101 5.0676 -3.0641 3.3315 19.1398 17.3382
59.8588 -13.0458 -23.1325 -26.9788 -16.9674 -30.5186
N 126 110 152 164 156 199
B.2 Omitted Figures
Figure B.1: Propensity of risk by age and by gender
Note: Male participants appear to be less risk averse than female participants, and risk aversion seems to decrease with age
with a net drop after 6 and 7 years of age. Dashed lines represent the average proportion of risky choices of male participants (in
blue, 0.47% with s.d. = 0.09%) and of female participants (in red, 0.46% with s.d. = 0.08%).
Figure B.2: Propensity of risk by age and by comparison with partial feedback
Note: Dashed lines represent the average proportion of risky choices for relief (in dark green, 0.45% with s.d. = 0.10%) and for
regret (in dark blue, 0.49% with s.d. = 0.05%).
Figure B.3: Propensity of risk by age and by comparison with complete feedback
Note: Dashed lines represent the average proportion of risky choices for gloating (in light green, 0.45% with s.d. = 0.16%) and for
envy (in light blue, 0.44% with s.d. = 0.11%).
Figure B.4: Emotional evaluation by age and by gender
Note: Emotional ratings appear to be steady across ages and close to the gender-specific means. Dashed lines represent averages
of emotional ratings of male participants (in blue, 0.39 with s.d. = 0.25) and of female participants (in red, 0.49 with s.d. = 0.14).
Figure B.5: Emotional evaluation by age and by comparison with partial feedback
Note: Dashed lines represent averages of emotional ratings for relief (in dark green, 1.31 with s.d. = 0.43) and for regret (in dark
blue, -0.49 with s.d. = 0.80).
Figure B.6: Emotional evaluation by age and by comparison with complete feedback
Note: Dashed lines represent averages of emotional ratings for gloating (in light green, 1.51 with s.d. = 0.29) and for envy (in light
blue, -0.52 with s.d. = 0.19).
(a) Partial downward comparison: Relief (b) Partial upward comparison: Regret
(c) Complete downward comparison: Gloating (d) Complete upward comparison: Envy
Figure B.7: Propensity of risk by age and by gender conditional on comparisons with partial and complete
feedback
(a) Partial downward comparison: Relief (b) Partial upward comparison: Regret
(c) Complete downward comparison: Gloating (d) Complete upward comparison: Envy
Figure B.8: Emotional evaluation by age and by gender conditional on comparisons with partial and
complete feedback
Appendix C
Appendix to Chapter 3
C.1 Critical Review
Authors Title Publication Journal
Fouragnan et al. (2013) Reputational priors magnify striatal responses to violations of trust 2013 Journal of Neuroscience
Reputational priors affect decisions to trust in at least two ways: (1) they influence expectations and
decisions in initial stages of the interaction and (2) they influence the learning mechanisms involved in
repeated interactions. From a neural point of view, the presence of priors (versus absence of priors)
generates enhanced activation in the mPFC and an inverse activation pattern in bilateral anterior insula.
mPFC and dlPFC encode the value of reputation priors. Reputation priors magnify striatal responses to
violation of trust (the caudate nucleus tracks estimated prediction errors, triggering learning). However,
when such priors are available, other brain regions (vlPFC) may contribute to keep decisions anchored to
the priors, thus relatively discounting the weight of conflicting evidence. The vlPFC has strong functional
connectivity with the caudate after violation of trust in the prior compared with no-prior conditions, so that
striatal de-activation is observed in this condition, which does not correlate with learning rates. Finally, the
activity in the vlPFC is inversely correlated with individual retaliation rates after violations of trust.
B. Webb, Hine, and Bailey (2016) Difficulty in Differentiating Trustworthiness From Untrustworthiness in Older Age 2016 Developmental Psychology
There are no age-related differences in learning to recognize the trustworthiness or untrustworthiness
of trustees at varying social distances. Young and older adults do not differ in their investment with
socially close (trustworthy) trustees, but older adults invest more than young adults with neutral and
distant (trustworthy) trustees. Also, older adults invest more than younger adults with untrustworthy
(and socially distant) trustees. Moreover, longer-term evidence about the trustworthiness of a socially close
trustee (compared to a distant one) makes it more difficult to update perceptions of trustworthiness based
on more recent experience. This is true for both young and older adults.
Bailey et al. (2016) Age-related Similarities and Differences in First Impressions of Trustworthiness 2016 Cognition and Emotion
When deciding whom to trust we are influenced by facial appearance and reputational information. There is
no major age group effect, but, compared to young adults, older adults invest more money with trustees with
a reputation for providing low returns and for being uncooperative. There is no interaction between facial
appearance and reputational information, and this fits with the trust decay model. There is no age-related
increase in reliance on facial appearance: both young and older adults give more weight to third-party
reputational information, which is more informative (and deliberative), than to facial information, which
is more superficial (and automatically processed).
Bailey and Leon (2019) A Systematic Review and Meta-Analysis of Age-Related Differences in Trust 2019 Psychology and Aging
The meta-analysis shows that there is an overall effect of age-group on trust. Under no circumstances
are young adults more trusting than older adults. Older adults were significantly more trusting than young
adults in response to negative indicators of trustworthiness, regardless of the type of trust (financial vs.
non-financial) or type of responding (self-reported vs. behavioral). Expressions of trust are greater among
older adults relative to younger adults in response to positive and neutral indicators of trustworthiness, but
only when expressed non-financially. Older adults self-reported being more trusting than young adults in
response to neutral cues of trustworthiness. These findings, together with the lack of increased financial
trust in the trustworthy, suggest a heightened risk of financial exploitation and abuse in older age.
Bailey et al. (2015) Trust and Trustworthiness in Young and Older Adults 2015 Psychology and Aging
Two studies show a dissociation between the stereotype of older adults as more trustworthy and the
lack of increased investing with older adults. In the context of a ‘real life’ investment task involving risk
(study 2), older adults are generally no more likely than young adults to exhibit trust. Older adults are
disproportionately influenced by social partners who they feel closer to and they have reduced concerns
for reputation: they are more trustworthy when interacting with anonymous same-age partners. Instead,
young adults demonstrate greater reputational concerns: they are more trustworthy when interacting
face-to-face. These findings suggest that the influence of age on trust and trustworthiness may rely, at least
in part, on own-age biases and anonymity.
Castle et al. (2012) Neural and Behavioral Bases of Age Differences in Perceptions of Trust 2012 Proceedings of the National Academy of Sciences
Two studies, one behavioral and one using neuroimaging methodology, found age differences in perceptions of trust. Older adults did not discriminate trustworthy from untrustworthy faces as sharply as
younger adults did: they perceived untrustworthy faces to be significantly more trustworthy and approachable than younger adults did. In addition, compared to younger adults, older adults showed a muted
activation of the anterior insula, both when making ratings of trustworthiness and when viewing untrustworthy faces. These results suggest that older adults may have a lower visceral warning signal in response
to cues of untrustworthiness, which could make deciding whom to trust difficult, and may at least partially
underlie their vulnerability to fraud.
Spreng et al. (2017) Financial Exploitation Is Associated With Structural and Functional Brain Differences in Healthy Older Adults 2017 Journals of Gerontology Series A: Biomedical Sciences and Medical Sciences
Structural and functional brain differences in regions implicated in socioemotional processing are associated
with financial exploitation risk. Indeed, financially exploited older adults showed: (a) cortical thinning in the
anterior insula, a core node of the salience network associated with salience detection, affect-based decision-making, and reward anticipation (affective information); (b) cortical thinning in the right posterior superior
temporal cortices, which are part of the default network and are associated with social reasoning; (c) greater
functional interactions between the salience and default networks, suggesting that exploited older adults may
place greater reliance on low-fidelity, and possibly misleading, social information to guide affectively based
decision-making.
Zebrowitz, Ward, Boshyan, Gutchess, and Hadjikhani (2018) Older Adults’ Neural Activation in the Reward Circuit is Sensitive to Face Trustworthiness 2018 Cognitive, Affective, and Behavioral Neuroscience
There are significant effects of face trustworthiness on the activation of the reward circuit that are exclusive
to older adults: the amygdala shows stronger activation to high than to medium trustworthy faces, the
right caudate shows stronger activation to high than to low trustworthy faces, and the left caudate shows
stronger activation to high than to medium trustworthy faces. There is an effect that is not moderated by age
in the dACC: significantly stronger activation to high than low trustworthy faces. There are no significant
effects of face trustworthiness on activation in the insula, mOFC, NAcc, or vmPFC. Finally, older adults’
ratings of face trustworthiness are more positive than those of young adults, consistent with an age-related
positivity effect.
Suzuki (2018) Persistent Reliance on Facial Appearance Among Older Adults When Judging Someone’s Trustworthiness 2018 The Journals of Gerontology: Series B
Young and older adults give similar trustworthiness ratings to the trustworthy-looking faces and
untrustworthy-looking faces. Even though both young and older adults show a learning effect across
blocks in repeated trust games, young adults are faster than older adults at learning to invest with good
trustees and not to invest with bad trustees. Unlike young adults, older adults judge someone’s trustworthiness based on facial appearance even after repeatedly experiencing that such face-based judgments are
invalid. These results suggest a decline in older adults’ ability to adjust judgments regarding the trustworthiness of others in response to their cooperative or cheating behaviors during repeated social exchanges.
In conclusion, aging may interfere with learning-based adjustments in decision making.
Santos, Almeida, Oliveiros, and Castelo-Branco (2016) The Role of the Amygdala in Facial Trustworthiness Processing: A Systematic Review and Meta-Analyses of fMRI Studies 2016 PLoS ONE
This systematic review and meta-analyses provide evidence for a role of the amygdala in trustworthiness
processing. Specifically, the amygdala shows larger activation for untrustworthy compared to trustworthy
faces, with a right lateralization pattern. In addition to the amygdala, the thalamus and the insula also
show the same negative correlation between their activation and facial trustworthiness. However, other
areas, such as the posterior cingulate and medial frontal gyrus, display a positive correlation pattern: they
show larger activation for trustworthy compared to untrustworthy faces.
Guo et al. (2021) The Relationship Between the Positivity Effect and Facial-cue Based Trustworthiness Evaluations in Older Adults 2021 Current Psychology
This study confirms that older adults from an Eastern cultural background do display a positivity effect
similar to older adults from a Western cultural background: older adults’ trustworthiness ratings are
higher than those of younger adults for both untrustworthy and trustworthy faces. The age-related positivity
effect is due to increased processing of positive information, and not to a reduced processing of negative
information. Younger adults’ processing of negative information is regulated by attention capacity, which
implies that younger adults process more negative information and so they have a negative bias.
Haas, Ishak, Anderson, and Filkowski (2015) The Tendency to Trust is Reflected in Human Brain Structure 2015 NeuroImage
This study shows that individual differences in the tendency to trust are associated with regional gray
matter volume within the ventromedial prefrontal cortex (vmPFC), amygdala and anterior insula. There
is a positive association between self-report and behavioral trustworthiness evaluations and increased gray
matter volume within the bilateral vmPFC and bilateral anterior insula. There is a negative association
between these measures of evaluation and gray matter volume within the bilateral cerebellum. Greater
right amygdala volume is associated with the tendency to rate faces as more trustworthy and distrustworthy
(U-shaped function). A whole brain analysis also shows that individual differences in the tendency to trust
are associated with greater gray matter volume within the dorsomedial prefrontal cortex (dmPFC).
Dzhelyova, Perrett, and Jentzsch (2012) Temporal Dynamics of Trustworthiness Perception 2012 Brain Research
There is a relationship between sex and trustworthiness: people tend to regard male faces as untrustworthy
and female faces as trustworthy. This behavioral bias is reflected in the amplitude of two ERP components,
the N170 (only in the right hemisphere) and the EPN (early posterior negativity), with more negative
amplitudes for trustworthy female and untrustworthy male faces (faces congruent with the bias) than
for faces not congruent with the bias. These results suggest a rapid and spontaneous processing of
trustworthiness, and suggest that posterior-occipital regions are implicated in trustworthiness perception.
Li and Fung (2013) Age Differences in Trust: An Investigation across 38 Countries 2013 Journals of Gerontology Series B: Psychological Sciences and Social Sciences
This study found a universal pattern that older age was positively related to a higher level of trust across
38 countries around the world. Different types of trust (particularized and generalized trust) were all
positively associated with age. The age differences in trust toward closer groups (i.e., family or friends)
were relatively smaller than those in trust toward more distant groups. There is also evidence suggesting that
country-level contextual factors (individualism level, developing status and income inequality) moderated
the positive associations between age and certain types of trust: higher individualism of a country was
significantly associated with stronger positive associations between age and trust toward friends and
strangers; the positive association between age and trust in neighbors was stronger in more developed
countries; the positive association between age and trust in strangers was stronger in countries with greater
income inequality.
Cassidy, Boucher, Lanie, and Krendl (2019) Age Effects on Trustworthiness Activation and Trust Biases in Face Perception 2019 The Journals of Gerontology: Series B
Although older adults and young adults agree on faces being trustworthy or untrustworthy, older adults
perceive faces as more trustworthy than young adults. This age-related trust bias is modulated by dynamic
category activation: dynamic category activation toward untrustworthy and trustworthy faces increases
and decreases, respectively, the trust bias. Specifically, older adults activated trustworthiness and untrustworthiness categories when evaluating untrustworthy faces, but tended to activate only trustworthiness
toward trustworthy faces. Young adults did not exhibit this difference. This means that older adults
disproportionately activate the category of trustworthiness when perceiving faces. In addition, age and
dynamic category activation relate to the trust bias: older adults’ greater dynamic activation of multiple categories (trustworthy-untrustworthy) when evaluating untrustworthy (vs. trustworthy)
faces increases the trust bias.
Zebrowitz, Boshyan, Ward, Gutchess, and Hadjikhani (2017) The Older Adult Positivity Effect in Evaluations of Trustworthiness: Emotion Regulation or Cognitive Capacity? 2017 PLoS ONE
The older adults’ positivity effect in evaluative ratings may be due to age-related declines in cognitive
capacity rather than to increases in the regulation of negative emotions. In fact, in contrast with the emotion
regulation theory, the positivity effect tends to be absent for strongly valenced or arousing stimuli and,
in addition, cognitive load is shown to increase evaluation positivity. This latter effect, which
is not moderated by age, suggests that more cognitive resources are needed to process negative stimuli
because they are more cognitively elaborated than positive ones. Thus, interfering with such elaboration
(i.e., increasing the cognitive load) should decrease negative evaluations.
Salvia et al. (2020) Age-Related Neural Correlates of Facial Trustworthiness Detection During Economic Interaction 2020 Journal of Neuroscience, Psychology, and Economics
The findings of this study indicate that there is an age-related amygdala activation increase when we face
trustees who will abuse our trust. Specifically, there was a positive correlation between age and activation
of the left amygdala: when adult investors were looking at the picture of a trust-abusing trustee, the left
amygdala was relatively more activated than when they were looking at a trust-honoring or a neutral player.
Younger adolescents did not show this pattern and responded with a more pronounced deactivation when
facing a trust-abusing trustee. However, the more pronounced activation of the left amygdala when faced
with abusers did not result in a decreased transfer to these trustees. In addition to the amygdala, a number
of additional parietal (e.g., angular gyrus, inferior parietal lobule, and precuneus), temporal (e.g., inferior
temporal gyrus), and frontal regions (e.g., medial frontal gyrus) also show an age-related increase in activation when
facing trust abusers. Many of these regions are implicated in theory-of-mind and perspective-taking
tasks.
Bonnefon, Hopfensitz, and De Neys (2017) Can We Detect Cooperators by Looking at Their Face? 2017 Current Directions in Psychological Science
People can detect cooperation after personally interacting with or seeing (even brief, even mute) video clips
of another person, but pictures do not seem informative enough to allow for cooperation detection.
However, cooperation detection can be improved by degrading the informational content of the pictures
(pictures can be converted to grayscale and full pictures can be cropped in order to display only inner
features of the face). One possible reason is that transformed pictures discourage people from thinking too
much, which suggests that successful cooperation detection can be supported by intuitive processes.
Greiner and Zednik (2019) Trust and Age: An Experiment with Current and Former Students 2019 Economics Letters
This study finds generally linear and positive effects of age on both trustingness and trustworthiness. In
addition, women are more trustworthy but not more trusting.
Suzuki et al. (2019) Age-related Differences in the Activation of the Mentalizing- and Reward-related Brain Regions during the Learning of Others’ True Trustworthiness 2019 Neurobiology of Aging
During a face-based trustworthiness judgment task, older adults’ striatal activity is lower for incongruent
than for congruent feedback, while the difference is not significant in young adults. This pattern is
associated with subsequent retrieval failure in a memory-based trustworthiness judgment task. In addition, fMRI data during the face-based task show larger activation of mentalizing-related regions (dmPFC
and precuneus) and reward-related regions (anterior cingulate cortex) in younger than in older participants during the processing of both congruent and incongruent feedback. Overall, the results suggest
that age-related differences in the striatum engagement may underlie older adults’ inefficiency in learning
impression-incongruent information about others’ trustworthiness.
Bailey, Petridis, McLennan, Ruffman, and Rendell (2019) Age-Related Preservation of Trust Following Minor Transgressions 2019 The Journals of Gerontology: Series B
Both young and older adults are able to disregard initial impressions based on facial expression to use
more reliable behavioral information when making investment decisions in the trust game: (1) both groups
decreased their trustworthiness ratings from pre- to post-trust game for smiling and neutral (but not angry)
trustees with a low return rate (cheating behavior) in the trust game; (2) both groups increased their trustworthiness ratings from pre- to post-trust game for neutral and angry (but not smiling) trustees with a high
return rate (cooperative behavior). These findings suggest that both age groups were more likely to update
trust beliefs when behavior was incongruent relative to congruent with appearances. However, only young
(and not older) adults recall information that violates a positive expectancy better than information that
violates a negative expectancy: only young adults’ trustworthiness ratings of smiling trustees decreased
after the trustees demonstrated a high return rate but occasionally cheated (in 2 out of 10 games). This suggests that for young but not older adults, minor transgressions elicit punishment, particularly when they
are accompanied by a smile.
Bell, Giang, Mund, and Buchner (2013) Memory for Reputational Trait Information: Is Social–Emotional Information Processing Less Flexible in Old Age? 2013 Psychology and Aging
Older adults have good source memory for the reputational information (in this study they were asked to
remember if a certain face (trustworthy or untrustworthy looking) had been previously associated with a
cheating or cooperative behavior). However, the results also suggest that older adults’ emotional source
memory is less flexible than that of younger adults, because older adults do not adapt their encoding
strategies to their expectations. In fact, young adults show an expectancy-violation effect on source memory: they remember expectancy-incongruent information better than expectancy-congruent information.
Instead, older adults show an unconditional source-memory advantage for cheater faces over cooperator
faces, regardless of whether the faces looked trustworthy or untrustworthy.
Rasmussen and Gutchess (2019) Can’t Read my Broker Face: Learning About Trustworthiness With Age 2019 The Journals of Gerontology: Series B
Results show that, although both young and older adults learn to distinguish good and bad brokers from
neutral ones, younger adults demonstrate better learning of trustworthiness information than older adults:
(1) they invest in good brokers more than older adults and in bad brokers less than older adults; (2) they learn to distinguish
among broker types sooner than older adults; (3) they explicitly remember the behavior of brokers better than older adults. In
addition, the study demonstrates a role of explicit memory in learning the trustworthiness of investment
partners: the performance on the broker investment task and the broker classification task (explicit memory
test) was significantly correlated. However, the relationship between explicit memory and investment in
good brokers is weaker for older than younger adults. In conclusion, findings demonstrate age-related
impairments in learning about trustworthiness, which may rely on explicit memory.
Learning During Lifetime
Fera et al. (2005) Neural Mechanisms Underlying Probabilistic Category Learning in Normal Aging 2005 Journal of Neuroscience
The research studies the effect of aging on the physiological mechanisms underlying probabilistic category
learning using a version of the weather prediction task which elicits dorsolateral prefrontal cortex and
striatal activity in healthy young adults. The experimental evidence indicates that equivalent nondeclarative
probabilistic category learning in healthy young and older adults elicits differential activation of a similar
neural network, involving areas of the brain such as the dorsolateral prefrontal cortex, caudate nucleus,
and posterior parietal cortex. Furthermore, the authors find greater prefrontal cortex and caudate activation
in healthy young adults, and greater parietal cortex activation in healthy older adults. Overall, differential
activation within a circumscribed neural network in the context of equivalent learning suggests that some
brain regions may provide a compensatory mechanism for healthy older adults in the context of deficient
prefrontal cortex and caudate nuclei responses.
Frank and Kong (2008) Learning to Avoid in Older Age 2008 Psychology and Aging
This work discusses an experiment in which 44 older adults are tested with a probabilistic selection task
sensitive to dopaminergic function and designed to assess relative biases to learn more from positive or
negative feedback. The paper presents several results. First, individuals tend to become more risk averse
with age. Second, experimental findings are consistent with the dopamine hypothesis of cognitive aging,
according to which many of the cognitive impairments that accompany normal aging are caused by a
simultaneous decline in dopamine availability. Third, age significantly affects the bias to avoid negative
outcomes. On the one hand, older seniors present an enhanced tendency to learn from negative compared
with positive consequences of their decisions. On the other hand, younger seniors fail to reveal this negative
learning bias.
Samanez-Larkin et al. (2012) Frontostriatal White Matter Integrity Mediates Adult Age Differences in Probabilistic Reward Learning 2012 Journal of Neuroscience
In this paper the authors adopt Diffusion Tensor Imaging (DTI) with a probabilistic reward learning
task in a community adult life span sample in order to assess whether the white matter integrity of
frontostriatal pathways mediates the influence of age on reward learning. As expected, the authors find that
increased white matter integrity in thalamocortical and corticostriatal paths correlates with better reward
learning performance. Moreover, the most relevant result is that white matter integrity in
these paths statistically accounts for age differences in learning. This suggests that the integrity
of frontostriatal white matter pathways critically supports reward learning.
Samanez-Larkin et al. (2014) Adult Age Differences in Frontostriatal Representation of Prediction Error but not Reward Outcome 2014 Cognitive, Affective, and Behavioral Neuroscience
This paper investigates adult age differences in neural activity for monetary gain outcomes. Using
functional neuroimaging, the experiment has 39 healthy adults complete reward-based
tasks that either depend on or are independent of probabilistic learning. The authors report significant reductions in
the frontostriatal representation of prediction errors in older individuals during probabilistic learning. In
sharp contrast, the authors observe no significant reduction but age-independent stability in the representation
of reward outcome in a task not involving probabilistic learning. In general, these results suggest that
neural representation of prediction error decreases in old age, while reward outcome remains unchanged.
Lerner, Sojitra, and Gluck (2018) How Age Affects Reinforcement Learning 2018 Aging
This editorial summarizes the results of the authors’ paper. In a nutshell, 252 participants divided into
three age groups perform a learning task based on probabilistic feedback, which consists of associating abstract
images with their likely outcomes. A notable feature is that the feedback includes rewards, punishments,
and also a no-feedback condition. Finally, a computational model of reinforcement learning is adopted to
characterize the behavioral results. The authors find that aging impairs the ability to learn from both rewards
and punishments. They also find that these deficits arise for two different reasons. On the one
hand, the impairment in learning from punishments results from elderly individuals having increasingly
noisy decision-making processes. On the other hand, reward learning deficits stem from older individuals
settling for a sub-optimal solution where they prefer to avoid any feedback rather than to find a response
that leads to reward.
Daniel, Radulescu, and Niv (2020)
Intact Reinforcement Learning but Impaired Attentional Control During Multidimensional Probabilistic Learning in Older Adults
2020 Journal of Neuroscience
The authors present a computational description of the neural correlates of age-related changes in the interaction between learning and attention. Behavioral and fMRI data are recorded from both a unidimensional reinforcement learning task and a multidimensional reinforcement learning task that requires selective attention to focus learning on relevant dimensions. The results suggest that behavior and neural signals during the simple reinforcement learning task are not significantly affected by age, whereas the decisions of older adults worsen with the higher attentional demands of multidimensional environments. Model-based analysis finds that older adults might use reinforcement learning as a fallback strategy for directing attention. The finding that older adults rely more heavily on suboptimal reinforcement learning strategies is supported at the neural level by fMRI activity in the ventral striatum, whereas the results on younger adults’ use of attentional processes are corroborated by imaging of cortical networks.
Lighthall et al. (2018)
Feedback-based Learning in Aging: Contributions and Trajectories of Change in Striatal and Hippocampal Systems
2018 Journal of Neuroscience
The authors design an experiment to examine brain activation patterns of younger and older participants during learning tasks in which choice feedback is presented immediately or after a brief delay. Under fMRI, individuals first complete the learning tasks and then complete an unexpected memory task. The purpose of this task is to test participants' recognition of trial-unique feedback stimuli, followed by assessments of post-learning cue preference, outcome probability awareness, and willingness to pay. The authors report three main results. First, prediction error responses of older adults in the striatum reveal greater feedback-timing sensitivity relative to prediction error responses in the hippocampus. Second, unlike younger adults, older adults do not exhibit enhanced episodic memory for outcome stimuli in the delayed-feedback condition. Third, older adults show an impaired ability to subsequently transfer their learning to measures of cue preference, outcome probability awareness, and willingness to pay, even though they exhibit learning rates similar to those of younger adults across feedback-timing conditions. Overall, the results indicate that hippocampal circuits supporting learning and memory decline more than striatal circuits in healthy aging. This suggests that declines in hippocampal learning signals may be an important predictor of deficits in learning-dependent economic decisions among older adults.
Radulescu, Daniel, and Niv (2016)
The Effects of Aging on the Interaction Between Reinforcement Learning and Attention
2016 Psychology and Aging
The goal of this paper is to investigate how age affects the interaction between reinforcement learning and selective attention in the multidimensional environments that characterize real-world learning and decision-making scenarios. To do so, two experiments are discussed. In the first experiment, young and older adults are tested on a set of probabilistic learning tasks in which the number of stimulus dimensions and the availability of hints about the identity of the rewarding target feature or relevant dimension are manipulated. The authors find that group-related differences in accuracy and reaction times vary systematically as a function of the number of dimensions and the type of hint available to participants. In the second experiment, the researchers implement trial-by-trial computational modeling of the learning process in order to test for differences in learning strategies between older and younger adults. The modeling suggests that a reinforcement learning model that integrates selective attention to constrain learning accounts for the observed behavior of both young and older participants. However, the model reveals that older adults tend to restrict their learning to fewer features, employing more focused attention than younger adults.
Sojitra et al. (2018)
Age Affects Reinforcement Learning through Dopamine-based Learning Imbalance and High Decision Noise—not through Parkinsonian Mechanisms
2018 Neurobiology of Aging
This research addresses some inconsistencies in the existing literature on the mechanisms involved in probabilistic reinforcement learning from both positive and negative stimuli in older individuals. The authors test 252 adults divided into three age groups on a probabilistic reinforcement learning task that distinguishes between learning from positive and negative reinforcement. They analyze trial-by-trial performance with a Q-learning reinforcement learning model, and correlate both fitted model parameters and behavior with polymorphisms in dopamine-related genes. The most relevant finding of the paper is that learning from both positive and negative feedback declines with age, but through different mechanisms. On the one hand, when learning from negative feedback, older adults take more time due to noisy decision-making. On the other hand, when learning from positive feedback, older adults tend to settle for a suboptimal solution due to an imbalance in learning from positive and negative prediction errors.
Bavard, Rustichini, and Palminteri (2020)
The Construction and Deconstruction of Sub-Optimal Preferences through Range-Adapting Reinforcement Learning
2020 bioRxiv
This paper studies range adaptation in the framework of context-dependent reinforcement learning. The authors test a large cohort of human participants online on eight variants of a behavioral task, in which an initial learning phase with fixed pairs of options is followed by a transfer phase where the options are rearranged into new pairs. Bavard and colleagues show that reinforcement learning values are learned in a context-dependent manner that is compatible with a range adaptation process aimed at increasing the signal-to-noise ratio. Under this paradigm, the authors argue that a counter-intuitive prediction can be formulated, namely that decreasing outcome uncertainty should increase range adaptation and, in turn, extrapolation errors. The experimental results confirm that range adaptation induces systematic extrapolation errors and is stronger when outcome uncertainty decreases.
Eppinger et al. (2011)
Neuromodulation of Reward-based Learning and Decision Making in Human Aging
2011 Annals of the New York Academy of Sciences
This paper by Eppinger, Hämmerer, and Li provides a selective review of the literature to highlight relations between age-associated decline in dopaminergic and serotonergic neuromodulation and adult age differences in adaptive goal-directed behavior. After an overview of the neural systems involved in adaptive decision making and learning, the review presents a collection of recent findings on adult age differences in three major aspects of learning and decision making: first, the acquisition of goal-directed behavior under reward uncertainty; second, the ability to abandon established habits and learn new behavior during reversal learning; third, asymmetries in the valuation of rewarding and aversive outcomes during learning and decision making. The main conclusion of the review is that aging-related deficits in neuromodulation of the frontostriatal-limbic networks contribute to adult age differences in the motivational regulation of behavioral control.
Intertemporal Choices
Halfmann, Hedgcock, and Denburg (2013)
Age-Related Differences in Discounting Future Gains and Losses
2013 Journal of Neuroscience, Psychology, and Economics
This paper argues that conflicting results in the existing literature may be accounted for by both demographic confounds and substantial variability in the healthy aging process. The study considers three groups, namely middle-aged, unimpaired older, and impaired older adults. The decision framework employs the Iowa Gambling Task (IGT), which incorporates aspects of learning, ambiguity, uncertainty, reward, and punishment, and which constitutes a good alternative to neuroimaging. Participants in all three groups choose between a smaller, sooner and a larger, later monetary reward or loss, with a focus on the differences between older adults identified as impaired or unimpaired on the IGT and middle-aged adults. The findings suggest that impaired older adults discount future financial streams more than unimpaired older adults in the domains of both gains and losses. By contrast, middle-aged and impaired older adults discount future gains at a similar rate, whereas the former discount future losses less than the latter, and at a rate quite similar to that of unimpaired older adults. Halfmann et al. (2013) conclude that unimpaired older adults exhibit a compensatory mechanism that generates more cautious, patient choices.
Halfmann et al. (2016)
Individual Differences in the Neural Signature of Subjective Value among Older Adults
2016 Social Cognitive and Affective Neuroscience
This research finds that older adults with a stronger representation of subjective value (SV) during intertemporal choice also have lower neural signal variability and perform better on a separate probabilistic decision-making task. The paper describes an experiment in which thirty-three healthy older adults with varying scores on the Iowa Gambling Task (IGT) are recruited. The participants perform an intertemporal decision-making task while undergoing functional magnetic resonance imaging (fMRI). The purpose of this design is to examine whether SV representation in the canonical valuation network differs across older adults according to their complex decision-making ability. Halfmann and colleagues list four main findings. First, better performance on the IGT correlates with stronger SV-associated activation in the canonical subjective valuation network, including the VMPFC and striatum. Second, individual differences in age, education, intellect, and numeracy are not correlated with differences in the neural SV representation. Third, better performance on the IGT is not significantly related to objective magnitude, i.e., delay to reward, or task-associated activation. Lastly, worse performance on the IGT corresponds to increased variability in the striatum and in the VMPFC. The authors conclude that a reduced representation of value in the brain, possibly driven in part by increased neural noise, relates to suboptimal decision-making in a subset of older adults.
Seaman et al. (2018)
Subjective Value Representations during Effort, Probability and Time Discounting across Adulthood
2018 Social Cognitive and Affective Neuroscience
The main goal is to characterize both sensitivity to and tolerance of effort, probability, and time in monetary decisions across adulthood. To this end, young, middle-aged, and older adults complete three different types of forced-choice decision-making tasks with two alternatives. In each trial, participants express their preferences over effort requirements, probabilities, or temporal delays while undergoing functional magnetic resonance imaging (fMRI). The experiment shows that preferences for lower physical effort, higher probability, and shorter time delays are uncorrelated. Despite this finding, Seaman et al. (2018) find that subjective value (SV) is associated with overlapping activity in the medial prefrontal cortex across all three tasks. This evidence indicates that, while tolerance of these decision features is behaviorally dissociable, the discounted value signals related to these decisions share a common neural representation. In addition, the paper presents no evidence of age-related differences across the three tasks, including at the level of neural representations of SV. This suggests that decision-making preferences may be relatively stable across adulthood, and also that the neural mechanisms supporting SV are preserved with age.
Castrellon et al. (2019)
Individual Differences in Dopamine Are Associated with Reward Discounting in Clinical Groups But Not in Healthy Adults
2019 Journal of Neuroscience
This study examines whether the discounting of monetary rewards is associated with individual differences in dopamine (DA) function in humans. The paper discusses two analyses based on PET imaging that aim to quantify the relation between individual differences in DA function and time, probability, and physical effort discounting. The first study is empirical, while the second is a meta-analysis of DA PET studies of reward discounting. The empirical work reveals no correlations between individual differences in DA D2-like receptors (D2Rs) and time or probability discounting of monetary rewards in healthy subjects. Moreover, associations with physical effort discounting are largely inconsistent across adults of different ages. The meta-analysis comparing correlations between discounting and striatal DA function fails to detect a positive correlation. This result corroborates the empirical finding of a minimal effect of DA measures on discounting in healthy individuals. Furthermore, the meta-analysis indicates that the association between individual differences in DA and reward discounting depends on clinical conditions such as addiction, Parkinson’s disease, obesity, and attention-deficit/hyperactivity disorder. Overall, the evidence collated indicates that preferences for shorter time delays, greater probability, and lower physical effort are generally uncorrelated with DA D2R availability across brain regions in healthy adults.
Dhingra et al. (2020)
The Effects of Age on Reward Magnitude Processing in the Monetary Incentive Delay Task
2020 Neuroimage
The authors examine age-related effects on cerebral activations during anticipation and feedback for participants in the Monetary Incentive Delay Task (MIDT). The purpose of the experiment is to investigate whether age is associated with a diminished response to the anticipation of winning a large versus a small amount of money, as well as to the outcomes of wins and losses of large versus small rewards. To test these hypotheses, Dhingra and colleagues present an experiment in which 54 participants, ranging from 22 to 74 years of age, play the MIDT with explicit cues and a timed response to win. The authors measure brain activations during anticipation and feedback, as well as the effects of age on these regional activations. The experimental results suggest that age is associated with a decreasing cerebral response to the anticipation of large versus small monetary rewards and an increasing cerebral response to the outcome of small versus large monetary losses, together reflecting an age-related constriction in sensitivity to the magnitude of monetary reward.
Age-related Risk Propensity
Sproten, Diener, Fiebach, and Schwieren (2010)
Aging and Decision Making: How Aging Affects Decisions under Uncertainty
2010 Discussion Paper Series
This paper investigates the effects of aging on decision making under financial uncertainty in the distinct domains of risk and ambiguity. The researchers test three hypotheses, namely that older adults are less prone to financial risks than young subjects, that ambiguity behavior evolves over the lifespan, and that both young and older people are willing to gamble less under ambiguity than under risk. Experimental data from a card game composed of risky and ambiguous conditions reject the first hypothesis, confirm the second, and partially confirm the third. Overall, older subjects appear to be as willing to engage in risky behavior as young adults, age explains changes in ambiguity behavior, and young adults behave more conservatively in ambiguous conditions than in risky conditions, while there are no differences between risk and ambiguity behavior for older adults.
Roalf, Mitchell, Harbaugh, and Janowsky (2012)
Risk, Reward, and Economic Decision Making in Aging
2012 Journals of Gerontology Series B: Psychological Sciences and Social Sciences
Older subjects seem to be less impulsive, more risk averse, and less sensation seeking than younger subjects. The authors find that age does not influence the measure of nonsocial economic decision making. Moreover, older adults are less likely to accept unfair divisions of money during an economic social-bargaining game and more willing to propose fair divisions of money during a social-giving game. Furthermore, the degree to which subjects are risk averse predicts their acceptance rate during social bargaining, but is not related to discounting or to behavior in the social-giving game. Together, these results suggest that age effects on risk taking modify older adults’ social economic decision making. The results indicate that age-related differences are specific to the decision-making domain and that some social economic decisions depend on individuals’ risk attitudes. Older adults are less risk seeking, but this feature does not affect their willingness to postpone future rewards in a nonsocial context. Perceiving more risk is related to a reluctance to agree to an unequal offer, and this results in poorer outcomes for older adults.
Denburg et al. (2005)
The Ability to Decide Advantageously Declines Prematurely in Some Normal Older Persons
2005 Neuropsychologia
This study presents an experiment in which 80 subjects divided according to age, i.e., younger (aged 26–55) and older (aged 56–85), perform a “Gambling Task” game. The “Gambling Task” provides a real-world decision-making framework that yields a sensitive index of ventromedial prefrontal function, taking into consideration reward, punishment, and unpredictability. Notably, a subgroup of the older adults shows a decision-making impairment on the Gambling Task. The empirical results are consistent with the notion that some older individuals have significant difficulty with reasoning and decision-making, as indexed by the Gambling Task. This finding suggests that older individuals with a decision-making impairment are affected by disproportionate aging of the ventromedial prefrontal cortex.
Mata, Josef, Samanez-Larkin, and Hertwig (2011)
Age Differences in Risky Choice: A Meta-Analysis
2011 Annals of the New York Academy of Sciences
The authors present a literature review and apply a meta-analysis to measure age-related differences in risky choice for tasks involving decisions from both description and experience. Decisions from description are those in which full information about probabilities and outcomes is known. By contrast, decisions from experience involve no explicit information about probabilities and outcomes, so decision makers rely on experience acquired over time. The findings indicate that the pattern of age-related differences varies as a function of the task used. The results for decisions from description show that younger and older adults do not differ in their willingness to take risks when learning components are excluded from task demands, at least in tasks involving a choice between a gamble and a safe amount and in tasks based on the Blackjack card game. The findings for decisions from experience suggest that older adults are more risk averse than younger adults, at least in the IGT (Iowa Gambling Task) and in the BIAS (Behavioral Investment Allocation Strategy). However, the analysis indicates significant age differences in the CGT (Cambridge Gambling Task) in decisions from description, and higher risk aversion of older adults relative to younger adults in the BART (Balloon Analogue Risk Task) in decisions from experience. Overall, these results conflict with the theoretical expectation of general differences in risk-taking behavior over the life span.
Mohr et al. (2010)
Neuroeconomics and Aging: Neuromodulation of Economic Decision Making in Old Age
2010 Neuroscience and Biobehavioral Reviews
This review hypothesizes a relationship between economic decision making, dopaminergic and serotonergic neuromodulation, and aging. Regarding age-related changes in economic behavior, the review covers several decision-making topics such as risky decision making, decisions under ambiguous conditions, and delay discounting effects. Many studies present evidence supporting age-related differences in economic behavior, specifically in risk-taking behavior, delay discounting, and the ability to make advantageous decisions in the IGT (Iowa Gambling Task). These papers provide no evidence on the underlying mechanisms that drive age-related changes in economic decision making.
Mather et al. (2012)
Risk Preferences and Aging: The “Certainty Effect” in Older Adults’ Decision Making
2012 Psychology and Aging
Combining the results of four experiments, the authors find that, relative to young adults, older adults weight certain outcomes more heavily when offered a choice between a certain outcome and a risky option. Older individuals have a stronger preference for certain gains and a stronger aversion to sure losses compared to younger individuals. Conversely, the experimental evidence shows no age differences in overall risk aversion when participants choose between two risky options.
Grubb, Tymula, Gilaie-Dotan, Glimcher, and Levy (2016)
Neuroanatomy Accounts for Age-related Changes in Risk Preference
2016 Nature Communications
The main finding is that loss of grey matter volume (GMV) in the right posterior parietal cortex (rPPC) accounts for changing risk preferences better than age does. This result makes it possible to link changes in behavior to neurobiological processes that unfold across the lifespan rather than to chronological age itself. In aging, there are multiple changing factors that can account for GMV decline. All these factors can affect efficient neural coding and are thus consistent with computational theories proposing that risk aversion results from limited neural computational capacity.
Mamerow, Frey, and Mata (2016)
Risk Taking Across the Life Span: A Comparison of Self-Report and Behavioral Measures of Risk Taking
2016 Psychology and Aging
The objective of the experiment is to analyze the cross-sectional age–risk trends arising from three different measures (i.e., trajectories for self-report measures, for description-based behavioral measures, and for experience-based behavioral measures), and to assess their convergent validity as a function of age. All three measures provide some evidence of reduced risk taking with increased age. In particular, compared to younger individuals, older adults report higher risk aversion, pump less in the Balloon Analogue Risk Task (BART) in the low-risk condition, and choose the risky gamble less often when the risky and safe options have the same expected value (EV). Overall, the trajectories of age differences in behavioral tasks are likely to depict a decline in the propensity for risk across the life span, with a strong dependence on specific task characteristics.
Huang, Wood, Berger, and Hanoch (2013)
Risky Choice in Younger versus Older Adults: Affective Context Matters
2013 Judgment and Decision Making
This study uses two forms of the Columbia Card Task (CCT) to measure risky decision-making across the lifespan. To do so, younger and older individuals play both the “warm” and the “cold” CCT, tasks designed to trigger affective and deliberative decision-making, respectively. The authors find that, across conditions, overall risk seeking is similar between younger and older subjects. However, only older adults behave significantly differently in the “cold” CCT compared to the “warm” CCT, suggesting that, in the absence of emotional information, older adults are relatively more risk seeking. In brief, this study finds no relevant overall age differences in risky decision-making.
Seaman et al. (2017)
Individual Differences in Skewed Financial Risk-Taking across the Adult Life Span
2017 Cognitive, Affective, and Behavioral Neuroscience
This research studies adult age-related differences in decision making together with neural activity during skewed financial risky gambles. The experiment includes three classes of gambles: symmetric (50% chance of gain/loss), positively skewed (50% chance of large gain), and negatively skewed (50% chance of large loss). The authors find that older adults are more willing to take positively skewed gambles compared to symmetric gambles. At the neural level, fMRI evidence suggests that, relative to younger adults, older subjects are characterized by an increase in anticipatory activity for negatively skewed gambles. The same whole-brain fMRI analyses also indicate that anticipatory activity decreases for positively skewed gambles in both the anterior cingulate and lateral prefrontal regions. In summary, these findings reveal age biases toward positively skewed risky tasks and age differences in corticostriatal regions during skewed risk-taking.
Tymula et al. (2013)
Like Cognitive Function, Decision Making across the Life Span Shows Profound Age-related Changes
2013 Proceedings of the National Academy of Sciences
Identifying age-related patterns in decision making under uncertainty, the authors present three main findings. First, older adults in the age range between 65 and 90 behave inconsistently in their decision making relative to younger subjects. Second, ambiguity aversion is domain-specific: it arises when subjects expect gains and does not occur when subjects face losses. Third, risk attitudes across the life span follow an inverse U-shaped pattern, implying that both older and younger individuals are less risk seeking than their midlife peers.
Table C.1: Summary of critical review
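
Several of the studies summarized in Table C.1 (for example, Sojitra et al., 2018) characterize behavior with reinforcement learning models in which learning from positive and negative prediction errors and decision noise enter as separate parameters. The block below is a minimal sketch of that general class of model, not the implementation used in any of the cited papers. The function names, the parameter names (`alpha_pos`, `alpha_neg`, `beta`), and the two-option task are illustrative assumptions.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A under a two-option softmax rule.

    beta is the inverse temperature (hypothetical name): a lower beta
    means noisier choices, and beta = 0 means random choice.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def update_q(q, outcome, alpha_pos, alpha_neg):
    """One Q-value update with separate learning rates.

    alpha_pos is applied to non-negative prediction errors and
    alpha_neg to negative ones (both names are illustrative).
    """
    delta = outcome - q  # prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta


def simulate(n_trials, p_reward_a, p_reward_b, alpha_pos, alpha_neg, beta, seed=0):
    """Simulate a two-option probabilistic task; return the share of A choices."""
    rng = random.Random(seed)
    q = {"A": 0.0, "B": 0.0}
    chose_a = 0
    for _ in range(n_trials):
        p_a = softmax_choice_prob(q["A"], q["B"], beta)
        choice = "A" if rng.random() < p_a else "B"
        p_reward = p_reward_a if choice == "A" else p_reward_b
        outcome = 1.0 if rng.random() < p_reward else 0.0
        q[choice] = update_q(q[choice], outcome, alpha_pos, alpha_neg)
        chose_a += choice == "A"
    return chose_a / n_trials
```

In a sketch of this kind, an imbalance between `alpha_pos` and `alpha_neg` stands in for asymmetric learning from positive and negative prediction errors, while a lower `beta` stands in for noisier decisions. Comparing `simulate` across parameter values is one way to explore how each channel alone changes choice behavior.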
Asset Metadata
Creator
Gabriele, Francesco (author)
Core Title
Essays on behavioral economics
Contributor
Electronically uploaded by the author (provenance)
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Degree Conferral Date
2025-05
Publication Date
04/17/2025
Defense Date
04/01/2025
Publisher
University of Southern California (original), Los Angeles, California (original), University of Southern California. Libraries (digital)
Tag
causal inference, counterfactual simulation, counterfactual thinking, decision-making, demand estimation, dopamine evolution, drift diffusion, field experiment, human aging, laboratory experiment, machine learning, neuroeconomics, policy model, price discrimination, random utility, stochastic choice, structural demand, welfare analysis
Format
theses (aat)
Language
English
Advisor
Oliva, Paulina (committee chair), Metcalfe, Robert (committee member), Proserpio, Davide (committee member), Coricelli, Giorgio (committee member)
Creator Email
fg14007@usc.edu,francescogabriele78@gmail.com
Unique identifier
UC11399KBHM
Identifier
etd-GabrieleFr-13950.pdf (filename)
Legacy Identifier
etd-GabrieleFr-13950
Document Type
Dissertation
Rights
Gabriele, Francesco
Internet Media Type
application/pdf
Type
texts
Source
20250418-usctheses-batch-1254 (batch), University of Southern California Dissertations and Theses (collection), University of Southern California (contributing entity)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
uscdl@usc.edu
Abstract
This dissertation consists of three essays that explore different topics in Behavioral Economics. Each chapter presents a research perspective on economic behavior in a different context. The first covers topics at the intersection of field experiments, empirical industrial organization, and quantitative marketing; the second combines experimental economics and the development of human emotions; and the third reviews both decision theory and neuroeconomics.
The first chapter focuses on how information technology enables digital platforms to use consumers’ data, and how such information asymmetries shape marketing strategies and economic outcomes. In particular, the first essay seeks to understand the role of personalization in pricing strategies and the welfare consequences of e-commerce retailers’ power to price discriminate based on users’ past online purchase behavior. The chapter develops a structural model of consumer demand and a pricing policy model to quantify the welfare effects of behavior-based price discrimination (BBPD). Using data from a randomized controlled experiment on a cosmetics e-commerce site, the structural estimation reveals that demand is elastic with respect to price discount treatments. With nonparametric estimation via machine learning, the counterfactual analysis tests different pricing algorithms and shows that personalized price discrimination increases e-commerce profit by 24% and consumer surplus by 4%, relative to uniform pricing. This result suggests that exploiting past purchase history is profitable for the monopolistic e-commerce retailer: BBPD complements targeted discounts by generating an additional 11% gain in producer surplus without harming loyal customers. The analysis contributes to the current public policy debate about pricing strategies in digital markets, as the welfare analysis has implications for privacy policy.
The second chapter studies the development of counterfactual emotions in childhood. Counterfactual thinking involves imagining hypothetical alternatives to reality in order to compare experienced reality with counterfactual scenarios. Renewed interest in behavioral research has focused on the development of the human counterfactual emotions that enable this fundamental cognitive process. Despite the use of simple paradigms in human development studies, a coherent picture of the evolution of these emotions in childhood has yet to emerge. The chapter provides an original contribution by using laboratory experimental data to estimate a discrete-choice demand model in which children consider counterfactual scenarios. The analysis measures the effects of counterfactual emotions in a risky decision-making task and controls for age within different cohorts of children. Results are consistent with previous findings. Counterfactual emotions such as regret, relief, envy, and gloating tend to emerge around 6 years of age. The incidence of counterfactual emotions is especially salient for regret and envy, suggesting that children are more sensitive to negative than to positive emotions. The novelty of this essay consists in applying a microeconometric demand framework to a behavioral research question in order to provide estimates that corroborate previous findings.
The third chapter investigates the hypothesis that randomness in the decision-making of the elderly results from two different, yet connected, effects. The essay argues that, on the one hand, the observed differences in choice behavior between younger and older individuals are caused by the evolution of underlying preferences. On the other hand, variation in the decision-making of older individuals is a consequence of the intrinsic stochasticity of neuronal activity and the age-related deterioration of the dopaminergic system. The first effect results from evolving preferences and stochastic utility functions. The chapter discusses the random utility framework, which accounts for non-deterministic preferences that change over the lifetime. The second effect derives from the decay of brain circuits encoding decision-specific value signals. The essay presents the drift-diffusion model, which treats the increasing stochasticity of neuronal activity as a biological consequence of dopamine decay in older individuals. By reviewing recent results in the neuroscientific literature, the chapter supports the thesis that both components play a role in decision-making in the elderly. The implications of this novel perspective are particularly relevant for welfare evaluation. Indeed, the stochastic nature of underlying preferences cannot necessarily be deemed welfare decreasing. Conversely, the comparison of decision values is affected by an error term that is computational noise. Thus, the deterioration of the neuronal circuits supporting decision values is strictly welfare decreasing.