Efficient policies and mechanisms for online platforms
EFFICIENT POLICIES AND MECHANISMS FOR ONLINE PLATFORMS

by Negin Golrezaei

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (BUSINESS ADMINISTRATION)

August 2017

Copyright 2017 Negin Golrezaei

To my loving husband, and my dear parents, for their support, encouragement, and constant love.

Acknowledgments

This thesis is a product of my close collaboration with my research advisor Prof. Hamid Nazerzadeh. Words cannot express my feelings of gratitude towards him for his constant support, encouragement, and guidance. He is the main reason I can now look back and say that I have had an incredible last five years at USC.

I am also grateful to Prof. Paat Rusmevichientong and Prof. Ramandeep Randhawa, who are also my co-authors. They have always been a constant source of guidance and inspiration. I am also grateful to Prof. Phebe Vayanos for being on my thesis committee. I am greatly indebted to her for her encouragement and enthusiasm about my work. I also greatly appreciate the genuine concern, constant encouragement, and help of Professors Raj Rajagopalan and Greys Sosic.

During the course of my stay at Marshall School of Business, I also had the good fortune of interacting with Professors Yehuda Bassok, Amy Ward, Vishal Gupta, Kimon Drakopoulos, Leon Zhu, Song-Hee Kim, and Ashok Srinivasan. They have all been highly inspirational in addition to being instrumental in making me feel part of the greater USC family. I would like to thank the collegial faculty members and friendly staff at the Marshall School of Business, who were always available when I needed help in research or life. Special thanks to my Ph.D. friends for their sympathetic ear, altruistic help, and stimulating discussions we had about our research and lives.
Finally, I want to thank my family for their constant support all through my life. No achievement in my life would have been possible without them. I am forever indebted to my husband, Sajjad Beygi, and my parents, Mohammad Hossein Golrezaei and Batool Najafi. Their love, encouragement, and support have been the core of my strength throughout the quest of attaining my Ph.D. degree. This thesis is dedicated to them. I also express a feeling of bliss for having loving siblings like Zahra, Saeed, and Mahnaz. They have always been genuinely concerned and ever encouraging.

Contents

Acknowledgments
List of Tables
List of Figures
Abstract
1 Real-time Optimization of Personalized Assortments
  1.1 Introduction
    1.1.1 Literature Review
  1.2 Preliminaries and Problem Formulation
  1.3 Inventory-Balancing Algorithms
    1.3.1 The Tight Upper Bound on the Competitive Ratio
  1.4 Stochastic I.I.D. Arrivals
    1.4.1 Motivating IB Algorithms via Dual-Based Heuristics
  1.5 Extensions
    1.5.1 Incorporating Partial Information and Learning Customer types?
    1.5.2 Incorporating (Uncertain) Information About Arrival Patterns
    1.5.3 Beyond Substitutability
  1.6 Numerical Experiments
    1.6.1 Dataset and Simulation Setting
    1.6.2 Performance Evaluation
    1.6.3 Benefits of Personalization: The Importance of Knowing Customer Types
    1.6.4 Comparison to Myopic Policy
  1.7 Conclusion and Future Work
2 Auctions with Dynamic Costly Information Acquisition
  2.1 Introduction
  2.2 Setting
  2.3 Direct Mechanisms
  2.4 The Efficient Mechanism
    2.4.1 What If All Agents Are Allowed to Access the Additional Information?
  2.5 Maximizing Revenue
    2.5.1 Sequential Second-Price Mechanisms
    2.5.2 Revenue-Optimal Mechanism
    2.5.3 What If All Agents Are Allowed to Access the Additional Information?
  2.6 Who Will Be Selected?
    2.6.1 Impacts of the Cost and Variance of Second Signals on Selection Rules
  2.7 Extensions
    2.7.1 Information Acquisition as an Entry Cost
    2.7.2 Multi-unit
  2.8 Conclusion
3 Dynamic Pricing for Heterogeneous Time-Sensitive Customers
  3.1 Model
  3.2 Direct Mechanisms and Optimality
  3.3 Optimal Mechanism with Exponential Valuation Functions
  3.4 Optimal Mechanism with Production and Holding Costs
    3.4.1 Positive Production Cost
    3.4.2 Positive Holding Cost
    3.4.3 Positive Production and Holding Costs
  3.5 General Valuation Functions
  3.6 Conclusion
A Technical Appendix to Chapter 1
  A.1
    A.1.1 Proof of Lemma 1.2.1
    A.1.2 Proof of Theorem 1.3.3
    A.1.3 Proof of Proposition 1.4.2
    A.1.4 Learning the Customer Types under the Multinomial Logit Model
  A.2 Computational Complexity of Inventory-Balancing Algorithms
  A.3 Numerical Experiments: Appendix to Section 1.6
    A.3.1 Known Length of the Horizon
    A.3.2 Worst-Case Performance
    A.3.3 Learning the Customer Types
  A.4 Relegated Proofs
    A.4.1 Appendix to Section 1.3.1
    A.4.2 Proof of Proposition A.1.2
    A.4.3 Appendix to Section 1.5.2
  A.5 Asymptotic Optimality of the Dynamic Programming Policy
B Technical Appendix to Chapter 2
  B.1 Sequential Weighted Second-Price Mechanism & Proofs of Theorems 2.4.1 and 2.5.1
    B.1.1 Proof of Theorem B.1.4
    B.1.2 Proof of Lemma B.1.9
  B.2 Proof of Theorem 2.6.1
  B.3 Appendix to Sections 2.4.1 and 2.5.3
    B.3.1 Proof of Theorem 2.4.2
    B.3.2 The All-Access Mechanism and Single Crossing Conditions
    B.3.3 Proof of Theorems 2.4.4 and 2.5.5
  B.4 Delegated Proofs
    B.4.1 Proof from Section 2.5
    B.4.2 Proofs from Section B.1
    B.4.3 Proofs from Section B.1.1
    B.4.4 Proofs from Section B.1.2
  B.5 Numerical Experiments
    B.5.1 Payments in the First Round
    B.5.2 The SSP Mechanism versus the Optimal Mechanism
    B.5.3 More Agents
C Technical Appendix to Chapter 3
  C.1 Proof of Main Results
    C.1.1 Appendix to Section 3.2
    C.1.2 Lower Bound on the Revenue Gain of the Dynamic Pricing Policy
    C.1.3 Proof of Theorem 3.4.4
    C.1.4 Optimal Mechanism for a Low Holding Cost
    C.1.5 Optimal Mechanism for a Medium Holding Cost
    C.1.6 Optimal Mechanism for a High Holding Cost
    C.1.7 Proof of Theorems 3.4.1 and 3.4.5
    C.1.8 Discussing the Assumption in Theorem 3.4.4
    C.1.9 Proof of Theorem 3.5.1
  C.2 Technical Proofs
    C.2.1 Proof of Lemma C.1.6
    C.2.2 Proofs of Lemmas in Sections C.1.4 and C.1.5
    C.2.3 Proof of Lemmas in Section C.1.9

List of Tables

1.1 Top 10 DVDs nationally and the percentage of customers in each location who purchase each DVD.
1.2 The locations used in the simulations and the top three best-selling DVDs in each location.
1.3 Revenue comparison when the length of the horizon is random. The standard errors of all numbers are less than 0.1%.
1.4 The average revenue for IB algorithms in multiple-type model and the improvement over the single-type model. All numbers are statistically significant and the standard errors are less than 0.1%.
1.5 Revenue Comparison. The standard errors of all numbers are less than 0.1%.
A.1 Revenue comparison when the length of horizon is known in advance. The standard errors of all numbers are less than 0.1%.
A.2 Worst-Case Performance Comparison when the length of horizon is unknown.
A.3 The average revenue for the LIB and EIB algorithms when the underlying parameters are unknown, and each algorithm uses the estimated parameters based on data collected in the previous periods.
B.1 Revenue of the SSP mechanism (with revenue-maximizing $r$) as a percentage of the optimal revenue with $F = N(0.5, 0.5)$. Here, the standard errors of all numbers are less than 1%.

List of Figures

1.1 The C.C.D.F. of revenues with LF = 1.8 and CV = 2. For each algorithm, the curve shows the fractions of problem instances (out of 250) whose revenue exceed certain percentages of the upper bound.
2.1 The average number of agents with the updated valuations versus the cost in the efficient and All-Access (with no reserve) mechanisms with $n = 2$, $F = \mathrm{Uniform}(0, 1)$, and $G_i = \mathrm{Uniform}(-1, 1)$ for $i = 1, 2$. Here, $c_1 = c_2 = c$.
2.2 The average social welfare versus the cost in the efficient and All-Access (with no reserve) mechanisms with $n = 2$, $F = \mathrm{Uniform}(0, 1)$, and $G_i = \mathrm{Uniform}(-1, 1)$ for $i = 1, 2$. Here, $c_1 = c_2 = c$.
2.3 Depicting the results in Theorem 2.5.5 with $n = 2$, $F = \mathrm{Uniform}(0, 1)$, and $G_i = \mathrm{Uniform}(-1, 1)$ for $i = 1, 2$. Here, $c_1 = c_2 = c$.
2.4 The revenue of the revenue-optimal and All-Access (with revenue-maximizing reserve price) mechanisms versus the cost with $n = 2$, $F = \mathrm{Uniform}(0, 1)$, and $G_i = \mathrm{Uniform}(-1, 1)$ for $i = 1, 2$. Here, $c_1 = c_2 = c$.
2.5 Selected agents in the optimal mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n = 2$, $c_1 = 0.01$, $c_2 = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$ for $i = 1, 2$.
2.6 Selected agents in the efficient mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n = 2$, $c_1 = 0.01$, $c_2 = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$ for $i = 1, 2$.
2.7 Average number of selected agents in the revenue-optimal and efficient mechanisms versus the standard deviation of second signals, $\sigma$, with $G_i = N(0, \sigma^2)$ and $c_i = 0.05$ for $i = 1, 2$.
2.8 Average number of selected agents in the revenue-optimal and efficient mechanisms versus the cost of information, $c$, with $G_i = N(0, 0.5)$.
2.9 Average revenue of the OPT and OPT-E mechanisms versus the cost with $G_i = N(0, 0.5)$, and $c_i = c$ for $i = 1, 2$. In all the figures, $n = 2$ and $F = N(0.5, 0.5)$.
3.1 At time 0, customer 1 has a higher value than customer 2. But, her value decreases faster than customer 2, and beyond $t_0$, customer 2 has a higher value.
3.2 Time of purchase versus type in the fixed and dynamic pricing policies with the customer type U(0, 1), V(;t) = e t, = 0.1, and h = c = 0.
3.3 Payment versus type in the fixed and dynamic pricing policies with the customer type U(0, 1), V(;t) = e t, = 0.1, and h = c = 0.
3.4 Utility versus type in the fixed and dynamic pricing policies with the customer type U(0, 1), V(;t) = e t, = 0.1, and h = c = 0.
3.5 The cut-off c versus the production cost c for the mechanism described in Theorem 3.4.1 and the FP policy. Here, the customer type U(0, 1), V(;t) = e t, = 0.1, and the holding cost h = 0.
3.6 The social welfare and revenue gain of DP (relative to FP) as the production cost varies. Here, the customer type U(0, 1), V(;t) = e t, = 0.1, and the holding cost h = 0.
3.7 The structure of the optimal mechanism as a function of the holding cost h. Here, the customer type U(0, 1), = 0.1, V(;t) = e t, and the production cost c = 0.
3.8 The social welfare and the revenue gain of the optimal mechanism (relative to the FP policy) as a percentage versus the holding cost. Here, the customer type U(0, 1), = 0.1, V(;t) = e t, and the production cost c = 0.
3.9 Time of purchase versus type. Here, H l = 0.004, H h = 0.025, the customer type U(0, 1), V(;t) = e t, = 0.1, and production cost c = 0.
3.10 Payment versus type. Here, H l = 0.004, H h = 0.025, the customer type U(0, 1), V(;t) = e t, = 0.1, and production cost c = 0.
3.11 Utility versus type. Here, H l = 0.004, H h = 0.025, the customer type U(0, 1), V(;t) = e t, = 0.1, and production cost c = 0.
3.12 The thresholds versus the exponent a for the mechanism described in Theorem 3.5.1. Here, the customer type U(0, 1), V(;t) = e g()t, g() = a, and h = c = 0.
3.13 The social welfare and revenue gain of DP (relative to the FP policy) in percentage as a function of the exponent a. Here, the customer type U(0, 1), V(;t) = e g()t, g() = a, and h = c = 0.
3.14 Time of purchase versus type in the optimal mechanism described in Theorem 3.5.1. Here, the customer type U(0, 1), V(;t) = e g()t, g() = a, and h = c = 0.
3.15 Payment versus type in the optimal mechanism described in Theorem 3.5.1. Here, the customer type U(0, 1), V(;t) = e g()t, g() = a, and h = c = 0.
3.16 Utility versus type in the optimal mechanism described in Theorem 3.5.1. Here, the customer type U(0, 1), V(;t) = e g()t, g() = a, and h = c = 0.
A.1 The cumulative revenue over time for LF = 1.8 and CV = 2 when the length of the horizon is known in advance.
B.1 The payment of agent 1 in the first round, $t_1$, in the optimal mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n = 2$, $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.
B.2 The payment of agent 1 in the first round, $t_1$, in the efficient mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n = 2$, $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.
B.3 Revenue in the optimal and efficient mechanisms versus number of agents, $n$, with $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.
B.4 Social welfare in the optimal and efficient mechanisms versus number of agents, $n$, with $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.
B.5 The average number of selected agents of the optimal and efficient mechanisms versus number of agents, $n$, with $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.

Abstract

Markets have been studied for centuries. But, because of the rise of online markets in the past decade, our ability to experiment with markets and get immediate feedback has radically changed. Such opportunities exist because online platforms can collect unprecedented amounts of data. That data can be incorporated into designing optimal marketplaces, and contribute to making informed operational decisions. For instance, data enables online markets to provide personalized services by exploiting heterogeneity among their customers. My research interests lie in understanding the economic and operational properties of online markets and contributing to their design and advancement. In this dissertation, I focus on designing and developing fast, robust algorithms and mechanisms for two types of online markets that have been significantly influenced by the availability of massive amounts of data: online retailing markets and online advertising markets.

In Chapter 1, I study the problem of optimizing personalized assortment planning in online retail stores. Online retail stores provide a platform that enables consumers to directly buy goods or services from a seller over the Internet. These platforms allow online retailers to store data from customers to get information about their behavior and needs. This gives online retailers an opportunity to use such information to personalize product offering (aka assortment planning) based on customers’ preferences, interests, and other characteristics. By personalizing assortment planning, online retailers can gain substantial potential revenue improvements.
In this chapter, we propose a family of simple and effective algorithms, called Inventory-Balancing, for real-time personalized assortment optimization, that does not require any forecasting. Our proposed algorithm has a strong performance guarantee. We show that our algorithms obtain at least $1 - 1/e \approx 63\%$ of the benchmark revenue, even when there are sudden shocks in the customers’ arrival patterns, either from seasonality or other non-stationarity effects. The results in this chapter are published as the journal paper by Golrezaei et al. (2014). Furthermore, this paper was nominated for the MSOM Student Paper Competition in 2016 and the Production and Operations Management Society College of Supply Chain Management 2017 Student Paper Competition.

In Chapter 2, we study the impacts of data in two-sided online markets where two sets of independent self-interested agents interact with each other, and the decisions of each set of agents affect the outcomes of the other set of agents. In two-sided platforms such as online advertising markets, data impacts both sets of agents in possibly different ways. The challenge is to design selling mechanisms that effectively disclose and/or use data, and incentivize self-interested agents to act in a globally optimal manner. In this chapter, we study the mechanism design problem for a seller (web publisher) of an indivisible good in a setting where privately informed buyers (advertisers) can acquire additional information and refine their valuations for the good at a cost. For this setting, we propose optimal (revenue-maximizing) and efficient (welfare-maximizing) mechanisms that induce the right level of investment in information acquisition. We show that because information is costly, in the optimal and even the efficient mechanisms, not all the buyers would obtain the additional information. In fact, these mechanisms incentivize buyers with higher initial valuations to acquire information.
The results in this chapter are published as the journal paper by Golrezaei & Nazerzadeh (2016).

In Chapter 3, we study the problem of pricing in online retailers in the presence of strategic customers. In online retailing, time-based pricing (dynamic pricing), which is a pricing strategy in which firms set flexible prices for products based on current market demands, has become increasingly prevalent. One of the main advantages of dynamic pricing is that it helps mitigate the risk associated with demand uncertainty (see, for instance, Aviv & Pazgal 2008 and Cachon & Swinney 2011). In this chapter, we show that dynamic pricing can play an important role in differentiating between customers over time even in the absence of demand uncertainty. In many settings, especially in fashion and electronic gadget retail, a customer’s willingness-to-pay (or valuation) for the product is time-sensitive and decreases over time. In these situations, customers are not only different in terms of their initial willingness-to-pay for these products when they are first introduced to the market, but they are also different in terms of how rapidly they lose their interest in these products. We characterize the optimal mechanism for selling durable products in such environments and show that delayed allocation and dynamic pricing can be an effective screening tool for maximizing the profit of a firm. This chapter is based on a joint work with Professors Hamid Nazerzadeh and Ramandeep Randhawa.

Chapter 1

Real-time Optimization of Personalized Assortments

1.1 Introduction

The availability of real-time data on customer characteristics has encouraged companies to personalize operational decisions to each arriving customer.
For instance, the product recommendations Amazon.com makes to each customer dynamically change depending on recent reviews, ratings, purchases of the customer herself, other customers with similar interests to hers, and several other factors (Amazon’s Recommendation Systems 2012). Orbitz.com, as another example, has found that users of Apple Macintosh computers spend as much as 30% more per night on hotels; consequently, the company can show Mac users different and more expensive assortments of hotels and travel options than Windows users (Mattioli 2012). Location-based deals and coupons are offered by Groupon, Yelp, Foursquare and other Internet companies (Wortham 2012). In online advertising, the advertisements that are displayed to a user browsing a website are routinely personalized based on the user’s browsing history, demographic information, and her location (Helft & Vega 2010). Even brick-and-mortar grocery stores are starting to offer personalized real-time coupons based on each customer’s purchasing history and the available products on the shelf in the aisle where each customer is currently shopping (Clifford 2012). These examples raise a key question that motivates our work: Given the complexity of coordinating the real-time, front-end, customer-facing decision with the back-end supply chain constraints, what policies should companies use to take advantage of such data?

We answer this question by formulating a real-time, personalized, choice-based assortment optimization problem, involving multiple products with limited inventories, and arbitrary customer types. The type of each arriving customer can be arbitrary, and it is indexed by a (possibly infinite) set $\mathcal{Z}$. Examples of types include the customer’s computer (Mac vs PC), her current location, purchasing history, the average household income in her neighborhood, the competitors’ current offerings and prices, time of day, or a combination of other observable characteristics.
For an arriving customer of type $z \in \mathcal{Z}$, the company must decide, in real-time, on the assortment of products to offer. Given an assortment $S$, the customers make choices on which products to buy, if any, according to a general choice model that is specific to each customer type. Our goal is to develop a revenue-maximizing policy that determines the assortment to offer to each arriving customer, taking into account the customer type and the current inventories.

The above formulation captures the essential features of the situation faced by companies that sell services, products, or advertisements to heterogeneous customer types, a situation that requires real-time decision-making under inventory constraints. We first observe that differentiating customer types (even just their locations) can significantly increase revenues. We consider the top ten DVDs with the highest national sales volumes during the summer of 2005, and compare their sales in two locations: Urbana-Champaign, IL and Miami, FL. Table 1.1 shows the sales rate of each DVD in each location, which is defined as the proportion of the potential customers in each location¹ who purchased each DVD.

Rank | DVD Title | Sales Rate: National | Sales Rate: Miami | Sales Rate: Urbana-Champaign
1 | Lost - The Complete First Season | 7.6% | 8.9% | 6.7%
2 | Firefly - The Complete Series | 6.9% | 0.0% | 12.0%
3 | The Simpsons - The Complete Sixth Season | 6.0% | 5.0% | 6.7%
4 | Star Wars: Episode III - Revenge of the Sith | 5.5% | 7.4% | 4.0%
5 | Sin City | 5.2% | 9.5% | 2.0%
6 | Family Guy Presents Stewie Griffin - The Untold Story | 4.8% | 3.3% | 6.0%
7 | Batman Begins | 4.3% | 4.8% | 4.0%
8 | What the Bleep Do We Know!? | 4.2% | 9.8% | 0.0%
9 | Curb Your Enthusiasm: The Complete Fourth Season | 3.8% | 1.8% | 5.3%
10 | Seinfeld: Season Four | 3.7% | 3.3% | 4.0%

Table 1.1: Top 10 DVDs nationally and the percentage of customers in each location who purchase each DVD.

We observe that these two locations exhibit very different purchasing behaviors, which are significantly different from the national sales pattern.
Consider the sci-fi DVD “Firefly – The Complete Series”, which is the second most popular DVD nationally, with a sales rate of 6.9%. None of the customers in Miami purchased this DVD, while 12% of the customers in Urbana-Champaign, almost twice the national rate, bought the series. On the other hand, none of the customers from Urbana-Champaign bought “What the Bleep Do We Know!?”, while almost 10% of the customers in Miami bought it. In Section 1.6.3, we evaluate the performance of our algorithms and observe that customizing the assortments of DVDs to each customer’s location leads to over 10% increase in revenues. The improvement can be as high as 21% when the size of the assortments is constrained.

¹ The potential customers in each location correspond to the people in the location who bought one of the top 200 DVDs with the highest national sales volumes during the summer of 2005. We choose 200 DVDs as the cutoff because they account for a large proportion of the sales volumes.

As our main contribution, we propose a family of simple and effective algorithms, called Inventory-Balancing, for real-time personalized assortment optimization, that do not require any forecasting: An Inventory-Balancing algorithm maintains a “discounted-revenue index” for each product, in which the (actual) revenue is multiplied by a (virtual) discount factor that depends on the fraction of the product’s remaining inventory. Upon the arrival of each customer, based on the customer’s type, the algorithm offers to her the assortment that maximizes the expected discounted revenue. Each Inventory-Balancing algorithm is characterized by the penalty function that discounts the marginal revenue of each product, as the inventory level reduces. By adjusting the revenue of each product according to its remaining inventory, the algorithms hedge against the uncertainty in the types of future customers, by reducing the rate at which products with low inventory are offered.
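The per-customer decision described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the thesis: the multinomial logit choice model, the exponential penalty function, the brute-force search over assortments, and all function and variable names (`exponential_penalty`, `mnl_choice_probs`, `inventory_balancing_offer`) are choices made here for illustration.

```python
import math
from itertools import combinations


def exponential_penalty(x):
    """Assumed penalty function: increasing, concave, 0 at x=0 and 1 at x=1.

    x is the fraction of a product's initial inventory that remains.
    """
    return (math.e / (math.e - 1.0)) * (1.0 - math.exp(-x))


def mnl_choice_probs(assortment, weights):
    """Purchase probabilities under an assumed multinomial logit model.

    weights[i] is the preference weight of product i for the arriving
    customer's type; the no-purchase option has weight 1.
    """
    denom = 1.0 + sum(weights[i] for i in assortment)
    return {i: weights[i] / denom for i in assortment}


def inventory_balancing_offer(prices, remaining, initial, weights,
                              penalty=exponential_penalty):
    """Return the assortment maximizing expected discounted revenue.

    Each product's revenue is multiplied by penalty(remaining / initial),
    so products that are running low are offered less often.
    Brute force over all subsets: only suitable for a handful of products.
    """
    in_stock = [i for i in prices if remaining[i] > 0]
    best_set, best_value = (), 0.0
    for k in range(1, len(in_stock) + 1):
        for subset in combinations(in_stock, k):
            probs = mnl_choice_probs(subset, weights)
            value = sum(
                prices[i] * penalty(remaining[i] / initial[i]) * probs[i]
                for i in subset
            )
            if value > best_value:
                best_set, best_value = subset, value
    return set(best_set), best_value


# Hypothetical two-product example.
prices = {"A": 10.0, "B": 8.0}
initial = {"A": 10, "B": 10}
weights = {"A": 1.0, "B": 1.0}

print(inventory_balancing_offer(prices, {"A": 10, "B": 10}, initial, weights)[0])
print(inventory_balancing_offer(prices, {"A": 1, "B": 10}, initial, weights)[0])
```

With full inventory the sketch offers both products; once product A is down to one unit out of ten, its discounted revenue is small enough that only product B is offered, which illustrates how the index holds back scarce inventory for later customers.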
Thus, the discounted-revenue index serves as a simple mechanism that coordinates the front-end customer-facing decision with the back-end supply chain constraints. Our Inventory-Balancing algorithms offer the following benefits.

No forecasting: A traditional approach for dynamic assortment optimization, both in the literature and in practice, is to forecast the demand over time by estimating the distribution of the number of customers of each type, and then find an optimal policy based on the forecast using re-optimization methods (Gallego & van Ryzin 1994a, Jasin & Kumar 2012) or dynamic programming (Bernstein et al. 2011). In our sales data, we observe a large variability in the number of customers, across time and locations. Such a large variability in the demand process often makes forecasting difficult, and not surprisingly, could lead to poor performance for the policies obtained under this approach. An alternative approach for making a real-time decision is to solve the “off-line” assortment optimization problem repeatedly, using the most up-to-date inventory levels of each product and the latest demand forecast. This can be done by repeatedly solving a series of linear programs; see, for example, Jasin & Kumar (2012). When the number of customers is known in advance, the re-optimization methods work extremely well and yield nearly optimal revenue because they can effectively ration the inventory to all customers. However, when there is a significant uncertainty in the market size, the problem becomes more challenging. In this setting, our Inventory-Balancing algorithms perform very well, yielding 5%-11% more revenues than re-optimization methods.
Strong performance under both non-stationary and stationary demand processes: As a performance benchmark, we compare the revenue of our algorithms to the revenue of a clairvoyant optimal solution that has complete knowledge of the sequence of the types of the customers that arrive in the future, but does not know the (random) choice of each future customer.² We prove that Inventory-Balancing algorithms with a strictly concave penalty function always obtain more than 50% of the optimal revenue; see Theorem 1.3.3 and Corollary 1.3.4. We also provide an Inventory-Balancing algorithm that obtains at least $(1 - \frac{1}{e}) \approx 63\%$ of the benchmark revenue. This implies that even when there are sudden shocks in the customers’ arrival patterns, either from seasonality or other non-stationarity effects, the algorithm maintains a strong performance guarantee. The $(1 - \frac{1}{e})$ fraction of the benchmark revenue is optimal for non-stationary stochastic arrivals in the sense that, in the worst case, no deterministic or randomized policy can achieve a higher competitive ratio; see Theorem 1.3.6.

When customer arrivals are stationary, our algorithms perform even better. We show that when the types of arriving customers are independently and identically distributed, our algorithm is guaranteed to obtain at least 75% of the benchmark; see Proposition 1.4.2. In our numerical experiments, presented in Section 1.6, our algorithms perform even better than what is predicted by the worst-case bound, obtaining revenues that are within 96% - 99% of the benchmark.

Simplicity, robustness, and flexibility: In contrast to the existing methods, our Inventory-Balancing algorithms are extremely simple and fast. We do not need to solve any offline assortment optimization problem, and consequently, we can compute the decision for each customer quickly. Our formulation allows for infinite customer types, and thus, our algorithms are robust to changing customer types over time.
In addition, under mild assumptions, our analysis and performance guarantees continue to hold when the choice models of the customers are learned over time and the algorithm uses estimates of the parameters of the choice models; see Proposition 1.5.1.

Footnote 2: For each future customer and each assortment, the clairvoyant algorithm knows the probability that the customer would purchase a product from that assortment, but does not know the exact choice that the customer would make. In other words, the clairvoyant algorithm knows the choice model but does not know the realization of the random choices of a future customer.

Our proposed algorithms are also flexible, and they can be easily combined with existing re-optimization methods while maintaining worst-case performance guarantees; see Section 1.5.1 and Proposition 1.5.2. Our numerical experiments show that such a hybrid method brings out the advantages of all methods, especially when there is uncertainty in the number of future customers.

The key message in this chapter is that real-time optimization of personalized assortments can be done efficiently and robustly. Our proposed policies maintain a simple index for each product, which balances the nominal revenue with the value of each unit of remaining inventory. These indices are easy to implement, and they serve as a simple mechanism that coordinates between the front-end real-time decision and the back-end supply chain constraints. As the volume of data on customer profiles and preferences continues to grow, we believe that companies will consider personalization of other operational decisions, such as pricing or shipping options. The framework and analysis in this chapter can serve as a starting point for more complex models.

1.1.1 Literature Review

Our work is related to the growing literature on assortment planning. We give a brief overview of the area to provide context for our work.
Assortment planning problems focus on the relationships among assortment offerings, customer choices, and inventory constraints. van Ryzin & Mahajan (1999) introduced one of the first models that capture the tradeoffs between inventory costs and product variety. Mahajan & van Ryzin (2001) followed up on this work with a study on the optimal inventory levels in the presence of stockouts and substitution behavior. Since their seminal work, researchers have considered a variety of choice models and studied how such models affect the optimal assortment and the inventory level of products we should carry. Examples include the demand substitution model (Smith & Agrawal 2000), Lancaster choice model (Gaur & Honhon 2006), ranked-list preferences (Honhon et al. 2010, Goyal et al. 2011), and multinomial logit models (Talluri & van Ryzin 2004b, Gallego et al. 2004, Liu & van Ryzin 2008a, Topaloglu 2013). Recently, Farias et al. (2013) introduced a very general class of choice models based on a distribution over permutations, and developed efficient algorithms for determining the optimal assortment. For a survey of the assortment planning literature, the reader is referred to Kök et al. (2008).

Two of the most important decisions in modeling an assortment planning problem are determining the customers’ choice models and capturing the arrival process of the customers. The family of choice models considered in our work is quite general and includes most of the choice models used in practice or previously studied by researchers. What distinguishes this work from the existing literature is the fact that we do not impose any restrictions on the arrival process, and most of our results hold even if an adversary chooses the sequence of customers. In the following, we briefly discuss the prevalent approaches to modeling the arrival process.

A common approach to modeling customer arrivals is to assume that arrivals follow a stochastic process.
In this model, the optimal sequence of assortments can be planned by solving a multi-dimensional dynamic program. Not surprisingly, this approach suffers from the curse of dimensionality, even for stationary processes. Recently, Bernstein et al. (2011) studied the aforementioned assortment planning model under the assumption that the type of customers (represented by multinomial logit choice models) is drawn identically and independently from a stationary distribution; i.e., I.I.D. arrivals. For two products with equal revenue, two customer types (with each type following a multinomial logit choice model), and Poisson arrivals, they provide structural properties of the optimal solution. Interestingly, they show that the optimal dynamic program may withhold products with low remaining inventory for future customers that are more interested in them. Based on this observation, they propose a heuristic that, roughly speaking, reduces the general problem with multiple products to a two-product problem by separating the products into two groups based on their inventory-to-demand ratio. They do not provide any performance guarantees for the heuristic.

In practice, we do not expect the distribution of customer types to remain constant over time because of seasonality effects or changing popular trends. In the context of airline revenue management, for example, the fraction of business customers tends to increase as the departure date approaches. Prior to our work, to the best of our knowledge, the best known performance guarantee for a heuristic with respect to a clairvoyant optimal solution in non-stationary stochastic environments was a ratio of $\frac{1}{2}$, which follows from Chan & Farias (2009). When the arrivals are stochastic, with some adjustments, the assortment planning problem studied in this chapter fits into the stochastic depletion framework proposed by Chan & Farias (2009). They show that in their framework the competitive ratio of a myopic policy is at least $\frac{1}{2}$.
Re-optimization policies (Jasin & Kumar 2012) are applicable to non-stationary stochastic environments. However, we are not aware of any results that provide a performance guarantee.³ The closest work to this line of research is by Ciocan & Farias (2013), who studied re-optimization policies for a network revenue management problem where the distribution of the valuation of the customer (i.e., the distribution of the types) is constant over time, but the size of the market changes over time according to a stochastic (e.g., multi-variate Gaussian) process. They showed that a re-optimization policy that adjusts prices of the products by solving a linear program obtains about one-third of the optimal revenue; see also Chen & Farias (2013). In contrast to dynamic pricing problems, where firms manage their profits and capacities by controlling the prices, in assortment planning models, product prices are exogenously determined and remain constant over the horizon, and the firms decide on the selection of the assortment to offer to each customer.

We choose the competitive ratio as our performance benchmark because it allows for arbitrary non-stationary, even adversarial, arrivals and it does not require any prior knowledge about the arrival patterns. This notion has been previously applied by Ball & Queyranne (2009) to the problem of capacity allocation. Besbes & Zeevi (2011) and Besbes & Sauré (2012) also used similar notions of optimality when they studied revenue management problems where the demand may change dramatically because of shocks.

The problem we study here resembles some aspects of the Adwords problem (Mehta et al. 2007, Buchbinder et al. 2007, Goel, Mahdian, Nazerzadeh & Saberi 2010, Azar et al. 2009), where the goal is to allocate a sequence of advertisement spaces associated with search queries to budget-constrained advertisers.
Both the Adwords and personalized assortment optimization problems contain the b-matching problem as a special case (Kalyanasundaram & Pruhs 2000). Mehta et al. (2007) proposed an algorithm that achieves an optimal competitive ratio for the Adwords problem by taking into account both the bid and the budget of the advertisers; see Buchbinder & Naor (2007) for a survey on online algorithms and also Acimovic & Graves (2011) for another application in the context of inventory management.

Organization: In Section 1.2, we formally define our problem. Our algorithm and main results are presented in Section 1.3. We discuss the performance of our algorithm under stationary stochastic arrivals in Section 1.4, followed by discussions of extensions of our original model in Section 1.5. We present the numerical experiments in Section 1.6. The conclusion and directions for future work are given in Section 1.7. Finally, the proof of the competitive ratio and discussions of computational complexity are in Appendix B.1.1.

Footnote 3: The analysis of Liu & van Ryzin (2008a) and Jasin & Kumar (2012) extends to the environment with time-varying demand if the demand varies slowly over time, but not when the demand is volatile.

1.2 Preliminaries and Problem Formulation

Consider a firm that sells $n$ products, indexed by $1, 2, \ldots, n$, to customers that arrive sequentially over time. The firm obtains a revenue $r_i > 0$ for selling each unit of product $i$, which has an initial inventory of $c_i \in \mathbb{Z}_+$, with no replenishment. We denote the no-purchase option as product 0, with $r_0 = 0$.

Let $\mathcal{Z}$ denote the set of possible customer types. Once a customer arrives, her type, denoted by $z \in \mathcal{Z}$, is revealed. For instance, the type of a customer can correspond to his or her web browser, i.e., $\mathcal{Z} = \{\text{Mac}, \text{PC}\}$.⁴ As mentioned in the introduction about assortment personalization by Orbitz.com, the type $z = \text{Mac}$ may suggest that the user is more likely to choose expensive travel options.
If we are interested in the location of each customer, then the type of the customer can correspond to his or her zip code.⁵ In addition, the revelation of each customer’s type can happen when the customer logs in to a website, e.g., Amazon or eBay.

Footnote 4: This information is communicated to the website that the user is visiting.

Footnote 5: This information may be identified through each customer’s IP address or (opt-in) cellphone GPS signals; see Steel & Angwin (2010).

Based on the customer’s type and the remaining inventory, the firm offers an assortment $S \in \mathcal{S}$, where $\mathcal{S}$ denotes the set of all feasible assortments; we assume $\{0\} \in \mathcal{S}$, i.e., the firm has the option to not offer any product. The set $\mathcal{S}$ allows us to incorporate a variety of constraints on the assortments, such as shelf-space or size constraints.

Each customer type $z \in \mathcal{Z}$ corresponds to a general choice model, which specifies the probability of purchasing each product under each assortment. We denote by $\phi_i^z(S)$ the probability that a customer of type $z$ purchases product $i$ when assortment $S$ is offered. In fact, all of our results continue to hold when each customer may purchase more than one product at a time: define $\phi^z : \mathcal{S} \times \mathcal{S} \to [0, 1]$, where $\phi^z(S', S)$ is the probability that a customer of type $z$ purchases exactly the products in set $S'$ when assortment $S$ is offered; in addition, $\phi^z(S', S) = 0$ when $S' \not\subseteq S$ or $S \notin \mathcal{S}$, and $\sum_{S' \subseteq S} \phi^z(S', S) = 1$. Hence, we have $\phi_i^z(S) = \sum_{S' : i \in S',\, S' \subseteq S} \phi^z(S', S)$. In the remainder of the chapter, we only use the notation $\phi_i^z(\cdot)$.

Our goal is to design an algorithm that offers an assortment to each arriving customer in order to maximize the total expected revenue. Let the vector $\{z_t\}_{t=1}^T = (z_1, z_2, \ldots, z_T)$ represent the sequence of types of the arriving customers, where for each $t$, $z_t \in \mathcal{Z}$ denotes the type of the customer that arrives in period $t$.

Definition 1.
For any algorithm $A$ and any sequence of customer types $\{z_t\}_{t=1}^T$, we denote by $\mathrm{Rev}_A(\{z_t\}_{t=1}^T)$ the expected revenue obtained by algorithm $A$ from the customers $\{z_t\}_{t=1}^T$, where the expectation is taken with respect to the choices made by each customer, and possibly random selections of the algorithm (if the algorithm is not deterministic).

We do not assume any arrival patterns, and the algorithm does not know the sequence of the customers in advance. Therefore, we use the notion of competitive ratio, defined below, to measure the performance of an algorithm. The following lemma establishes an upper bound on the expected revenue that can be obtained by any algorithm from a sequence of customers.

Lemma 1.2.1 (Revenue Upper Bound). For any sequence of customers $\{z_t\}_{t=1}^T$ and any algorithm $A$, $\mathrm{Rev}_A(\{z_t\}_{t=1}^T)$ is bounded by the optimal value of the linear program $\mathrm{Primal}(\{z_t\}_{t=1}^T)$ defined below:
\[
\begin{aligned}
\text{maximize} \quad & \sum_{t=1}^{T} \sum_{S \in \mathcal{S}} \sum_{i=1}^{n} r_i \, \phi_i^{z_t}(S) \, y^t(S) \\
\text{subject to} \quad & \sum_{t=1}^{T} \sum_{S \in \mathcal{S}} \phi_i^{z_t}(S) \, y^t(S) \le c_i, && 1 \le i \le n, \\
& \sum_{S \in \mathcal{S}} y^t(S) = 1, && 1 \le t \le T, \\
& y^t(S) \ge 0, && 1 \le t \le T, \ \forall S \in \mathcal{S}.
\end{aligned}
\qquad (\mathrm{Primal}(\{z_t\}_{t=1}^T))
\]

In the linear program above, $y^t(S)$ corresponds to the probability that the set $S$ is offered to the customer of type $z_t$ in period $t$. With a slight abuse of notation, we denote the optimal value of the linear program above by $\mathrm{Primal}(\{z_t\}_{t=1}^T)$ as well. The proof, given in Appendix A.1.1, follows from the fact that $\mathrm{Primal}(\{z_t\}_{t=1}^T)$ is an upper bound on the expected revenue of the optimal clairvoyant solution that knows the sequence of the customer types in advance. Namely, we construct a feasible solution for the linear program above based on the optimal clairvoyant solution, taking into account the realizations of the customers’ choice models.

By the above lemma, no algorithm without hindsight would obtain revenue equal to $\mathrm{Primal}(\{z_t\}_{t=1}^T)$. However, an algorithm with no knowledge of the future types might be able to obtain a fraction of the revenue of this clairvoyant optimal solution.
Therefore, the competitive ratio of an algorithm is defined as follows:

Definition 2 (Competitive Ratio). An algorithm $A$ is $\alpha$-competitive if
\[
\inf_{T \ge 1} \ \inf_{\{z_t\}_{t=1}^T :\, z_t \in \mathcal{Z} \ \forall t} \ \frac{\mathrm{Rev}_A(\{z_t\}_{t=1}^T)}{\mathrm{Primal}(\{z_t\}_{t=1}^T)} \ \ge \ \alpha.
\]
The infimum is taken over all possible sequences of customer arrivals of arbitrary lengths. In other words, the competitive ratio is defined as the worst-case ratio between the “expected revenues” of an algorithm and the optimal clairvoyant solution over a (possibly infinite) sequence of customer types, where the expectation is with respect to the realization of customers’ choice models.⁶

Footnote 6: Note that we use the upper-bound linear program $\mathrm{Primal}(\{z_t\}_{t=1}^T)$ as the proxy for the revenue of the optimal clairvoyant algorithm. This can be interpreted as giving even more power to the clairvoyant algorithm since it can now respect inventory constraints only in expectation. However, such additional power would be negligible with large inventory levels, which we believe are the more interesting and realistic instances of the problem.

One potential criticism of the notion of competitive ratio could be that it compares algorithms with a benchmark that is too strong. However, as we show in the following sections, in the context of assortment planning, it leads to simple algorithms that perform very well with respect to this benchmark. Moreover, our numerical simulations demonstrate the practical relevance of our method, and show that our algorithms outperform existing methods in the literature.

1.3 Inventory-Balancing Algorithms

We present a family of algorithms called Inventory-Balancing (IB), which take into account both the revenue that would be obtained from the customer and the current inventory levels in order to decide which assortments to offer. Each Inventory-Balancing algorithm is defined by a penalty function $\Psi : [0, 1] \to [0, 1]$, which is an increasing function with $\Psi(0) = 0$ and $\Psi(1) = 1$.

Recall that $c_i$ is the initial inventory of product $i$. Let $I_i^t$ denote the remaining inventory of product $i$ at the end of period $t$. Note that $I_i^0 = c_i$, and for $t \ge 1$, $I_i^t = \max\{I_i^{t-1} - Q_i^t, 0\}$, where $Q_i^t$ is a binary random variable that is equal to 1 if the customer in period $t$ has chosen product $i$, and is 0 otherwise. We are now ready to describe the algorithm.

INVENTORY-BALANCING WITH A PENALTY FUNCTION $\Psi$
Upon the arrival of the customer in period $t \in \{1, \ldots, T\}$, of type $z_t$, offer an assortment $S^t$:
\[
S^t = \arg\max_{S \in \mathcal{S}} \ \sum_{i \in S} \Psi\!\left(I_i^{t-1} / c_i\right) r_i \, \phi_i^{z_t}(S).
\]

The assortment $S^t$ can be found in polynomial time for a broad class of choice models; see Appendix A.2 for details. In the case of ties, we choose any of the sets with the smallest number of products. We can think of $r_i \Psi(I_i^{t-1}/c_i)$ as the discounted revenue associated with product $i$, where the discount factor $\Psi(I_i^{t-1}/c_i)$ is determined by the penalty function and depends on the fraction of the initial inventory that remains. As we discuss in Section 1.4.1, $\Psi(I_i^{t-1}/c_i)$ corresponds to a dual solution to $\mathrm{Primal}(\{z_t\}_{t=1}^T)$. Namely, for each $t$, using $\Psi(I_i^{t-1}/c_i)$, we can construct a feasible solution for the dual of $\mathrm{Primal}(\{z_t\}_{t=1}^T)$, and the value of this feasible dual solution is within a “constant” factor of $\mathrm{Primal}(\{z_t\}_{t=1}^T)$.

The main idea behind the algorithm is simple. Sometimes it might be better to sell a product with a lower marginal revenue but a high inventory level than to sell a product with a high marginal revenue but little remaining inventory. This is because future customers might only be interested in the products with low (or no) inventory, and if we have already sold those products, we would lose these profitable opportunities.⁷ The penalty function thus protects against the uncertainty in future customer types. The following example shows what would happen if we ignore the inventory level and only offer assortments with the highest revenues; see Mehta et al. (2007) and Bernstein et al. (2011) for similar examples.

Example 1.3.1.
(Myopic Policy and Why Inventory Levels Matter) Consider the Myopic policy that does not take into account any inventory level. The policy corresponds to the following penalty function: $\Psi(x) = \mathbb{1}[x > 0]$, i.e., $\Psi(x)$ is equal to 1 if the remaining inventory of the product is positive and is equal to 0 otherwise. This algorithm, at any period, offers the assortment, among products with positive inventory, that maximizes the expected revenue. The following scenario shows that the competitive ratio of the Myopic policy is at most $\frac{1}{2}$.

Suppose the length of the horizon is equal to $T$. There are two products with the following parameters: $r_1 = 1 + \epsilon$, $r_2 = 1$, and $c_1 = c_2 = \frac{T}{2}$. We have two customer types. The first type arrives during periods $1, \ldots, \frac{T}{2}$, and the second type arrives in periods $\frac{T}{2} + 1, \ldots, T$. For $t \in \{1, \ldots, \frac{T}{2}\}$, $\phi_1^{z_t}(\{1\}) = \phi_2^{z_t}(\{2\}) = 1$, and $\phi_i^{z_t}(S) = 0$ otherwise. For $t \in \{\frac{T}{2} + 1, \ldots, T\}$, $\phi_1^{z_t}(\{1\}) = 1$ and $\phi_i^{z_t}(S) = 0$ otherwise. In this setting, a customer of the first type is interested in both products 1 and 2, while the second type is only interested in product 1.

The Myopic policy will allocate all the inventory of product 1 to the customers that arrive in periods $1, 2, \ldots, \frac{T}{2}$, and obtains a revenue of $\frac{T}{2}(1 + \epsilon)$. However, the optimal solution of $\mathrm{Primal}(\{z_t\}_{t=1}^T)$ allocates all the inventory of product 2 to the customers that arrive in periods $1, 2, \ldots, \frac{T}{2}$, and then sells product 1 to the customers that arrive afterwards in periods $\frac{T}{2} + 1, \ldots, T$, yielding a revenue of $\frac{T}{2}(2 + \epsilon)$. Note that
\[
\frac{\frac{T}{2}(1 + \epsilon)}{\frac{T}{2}(2 + \epsilon)} \le \frac{1}{2} + \epsilon.
\]
Since $\epsilon$ can be arbitrarily small, the competitive ratio of the Myopic policy is at most $\frac{1}{2}$.⁸

Footnote 7: Bernstein et al. (2011) show that (under certain assumptions) the behavior of the optimal dynamic program is similar to this intuition.

The above example, though rather stylized, highlights the importance of inventory levels in assortment planning.
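The scenario of Example 1.3.1 is small enough to simulate directly. The sketch below is only an illustration under stated assumptions: the function names `simulate`, `myopic`, and `linear`, and the values T = 100 and eps = 0.01, are hypothetical choices that do not come from the text. Because every purchase in this example is deterministic, choosing the assortment reduces to comparing the discounted-revenue index of each single product the customer is interested in. The linear penalty is included only for comparison with the Myopic indicator penalty.

```python
def simulate(psi, T=100, eps=0.01):
    """Run an index policy on the two-product instance of Example 1.3.1.

    psi maps the remaining-inventory fraction in [0, 1] to a discount factor.
    Purchases are deterministic: a customer buys the single offered product
    whenever her type is interested in it.
    """
    revenue = {1: 1.0 + eps, 2: 1.0}
    capacity = {1: T // 2, 2: T // 2}
    inventory = dict(capacity)
    total = 0.0
    for t in range(1, T + 1):
        # First half of the horizon: interested in products 1 and 2.
        # Second half: interested in product 1 only.
        interested = (1, 2) if t <= T // 2 else (1,)
        # Discounted-revenue index r_i * psi(I_i / c_i) for each candidate.
        index = {i: revenue[i] * psi(inventory[i] / capacity[i]) for i in interested}
        best = max(index, key=index.get)
        if index[best] > 0:  # a zero index means the product is sold out
            inventory[best] -= 1
            total += revenue[best]
    return total


def myopic(x):
    """Indicator penalty: ignores inventory as long as some remains."""
    return 1.0 if x > 0 else 0.0


def linear(x):
    """Linear penalty: discount proportional to remaining inventory."""
    return x


T, eps = 100, 0.01
upper_bound = (T / 2) * (2 + eps)  # optimal value of the LP in the example
myopic_revenue = simulate(myopic, T, eps)
linear_revenue = simulate(linear, T, eps)
print("Myopic ratio:", myopic_revenue / upper_bound)
print("Linear-penalty ratio:", linear_revenue / upper_bound)
```

On this instance the linear penalty alternates between the two products during the first half of the horizon, so part of the inventory of product 1 is still available when the second customer type arrives.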
We will show that, by discounting the revenue of each product based on its remaining inventory, the Inventory-Balancing algorithms obtain a better competitive ratio.

Footnote 8: On the other hand, it is not difficult to show that the Myopic policy obtains at least $\frac{1}{2}$ of the benchmark revenue. Hence, the ratio $\frac{1}{2}$ is tight.

Throughout this chapter, we impose the following mild assumption on the choice models.

Assumption 1 (Substitutability). For all $z \in \mathcal{Z}$, $S \in \mathcal{S}$, and $i \ne j$, $\phi_i^z(S) \ge \phi_i^z(S \cup \{j\})$.

The above assumption implies that adding another product to an assortment does not increase the probability of selling other products in the assortment. It is easy to verify that the above assumption encompasses all choice models that are consistent with random utility maximization,⁹ including the multinomial logit choice model, the nested logit, and many others; see Appendix A.2 for details. The above assumption leads to the following desirable property. The proof is in Appendix A.4.

Footnote 9: This is because, under random utility maximization (cf. Talluri & van Ryzin 2004c), $\phi_i^z(S) = \Pr\{U_i^z \ge \max_{\ell \in S \cup \{0\}} U_\ell^z\}$, where $(U_0^z, U_1^z, \ldots, U_n^z)$ is the random utility vector that a customer of type $z$ assigns to each product. The random variables $(U_0^z, U_1^z, \ldots, U_n^z)$ may be correlated and can have arbitrary distributions. If $j \ne i$, then $\phi_i^z(S \cup \{j\}) = \Pr\{U_i^z \ge \max_{\ell \in S \cup \{j\} \cup \{0\}} U_\ell^z\} \le \Pr\{U_i^z \ge \max_{\ell \in S \cup \{0\}} U_\ell^z\} = \phi_i^z(S)$.

Lemma 1.3.2. Under Assumption 1, the Inventory-Balancing algorithm never offers an assortment that includes a product with zero remaining inventory.

In Section 1.5.3, we relax Assumption 1 and extend our results to a more general setting where stock-outs are allowed. We now use Lemma 1.3.2 to establish the competitive ratio. The proof is given in Appendix B.1.1.

Theorem 1.3.3 (Competitive Ratio). Let $c_{\mathrm{MIN}} = \min_{i=1,\ldots,n} c_i$. Suppose $\Psi$ is an increasing, concave, and twice-differentiable penalty function.
The competitive ratio of the Inventory-Balancing algorithm with penalty function $\Psi$ is at least $\alpha_{c_{\mathrm{MIN}}}(\Psi)$, where
\[
\alpha_{c_{\mathrm{MIN}}}(\Psi) = \min_{x \in \left[0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]} \left\{ \frac{1 - x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\, dy} \right\}.
\]

We emphasize that the competitive ratio in the above theorem only depends on $c_{\mathrm{MIN}}$ and the penalty function $\Psi$, and the ratio does not depend on the length of the horizon $T$. Hence, the above result holds when $T$ increases to infinity. It also holds for any sequence of customer types of arbitrary length, even if the sequence is chosen by an adversary.

Many of the previous results in the literature are established for an asymptotic regime where the size of the initial inventories $c_{\mathrm{MIN}}$ and the length of the horizon $T$ tend to infinity. The justification for the asymptotic analysis is that often the initial inventory of products and the number of customers are large. In this asymptotic regime, we can simplify the expression for the competitive ratio. Define:
\[
\alpha(\Psi) := \alpha_{\infty}(\Psi) = \min_{x \in [0, 1]} \left\{ \frac{1 - x}{1 - \Psi(x) + \int_{x}^{1} \Psi(y)\, dy} \right\}.
\]

We observe that the competitive ratio of the algorithm improves slightly as $c_{\mathrm{MIN}}$ becomes larger. For instance, for the polynomial penalty function $\Psi(x) = \sqrt{x}$, the competitive ratio with $c_{\mathrm{MIN}} = 2$, 5, and 10 is, respectively, 0.52, 0.55, and 0.57. This ratio approaches $\alpha(\sqrt{x}) = 0.60$ as $c_{\mathrm{MIN}}$ grows.

As a corollary of Theorem 1.3.3, we can show that the competitive ratio is at least $\frac{1}{2}$ for any increasing concave function. Therefore, by taking into account the remaining inventory levels, we obtain a better performance guarantee than a Myopic policy that ignores inventory.

Corollary 1.3.4. For the Inventory-Balancing algorithm with linear penalty function (LIB), $\Psi(x) = x$, the competitive ratio $\alpha_{c_{\mathrm{MIN}}}(x)$ is equal to $\frac{1}{2}$ for any $c_{\mathrm{MIN}} \ge 1$. For any increasing, strictly concave, and differentiable penalty function, the competitive ratio is strictly greater than $\frac{1}{2}$.

The proof is given in Appendix A.4.
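The bound in Theorem 1.3.3 can be evaluated numerically for a given penalty function. The following is a minimal sketch, not the procedure used in the thesis: it assumes the bound has the form (1 - x) / (1/c_MIN + 1 - Psi(x) + integral of Psi from x + 1/c_MIN to 1), minimized over x in [0, 1 - 1/c_MIN], and it approximates the minimum on a uniform grid with a Simpson rule for the integral. The names `integrate`, `competitive_ratio_bound`, and `exp_penalty`, as well as the grid sizes, are hypothetical.

```python
import math


def integrate(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def competitive_ratio_bound(psi, c_min, grid=1000):
    """Grid approximation of the lower bound for penalty psi and inventory c_min."""
    inv_c = 1.0 / c_min
    x_max = 1.0 - inv_c
    best = float("inf")
    for k in range(grid + 1):
        x = x_max * k / grid
        # The denominator is at least 1/c_min, so the ratio is well defined.
        denom = inv_c + 1.0 - psi(x) + integrate(psi, x + inv_c, 1.0)
        best = min(best, (1.0 - x) / denom)
    return best


def exp_penalty(x):
    """Exponential penalty (e / (e - 1)) * (1 - exp(-x))."""
    return math.e / (math.e - 1.0) * (1.0 - math.exp(-x))


for c_min in (2, 5, 10):
    print("sqrt penalty, c_min =", c_min, competitive_ratio_bound(math.sqrt, c_min))
print("linear penalty, c_min = 5:", competitive_ratio_bound(lambda x: x, 5))
print("exponential penalty, c_min = 1000:", competitive_ratio_bound(exp_penalty, 1000))
```

Because the minimum is taken over a finite grid, the printed values are approximations; a finer grid, or a closed-form integral for a specific penalty function, would tighten them.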
Note that this is a worst-case performance guarantee and does not imply that the Inventory-Balancing algorithms outperform the Myopic policy on every sequence of customers. In practice, as suggested by our numerical simulations, we expect the IB algorithms, and even the Myopic policy, to often perform better than their theoretical worst-case bounds; see Section 1.6.2.

The choice of the penalty function determines the trade-off between the revenue from selling a product and the value of the remaining inventory. For a linear penalty function $\Psi(x) = x$, the derivative is always 1, and a reduction of one unit of inventory has the same penalty regardless of the inventory level. On the other hand, the derivative of the exponential penalty function $\Psi(x) = \frac{e}{e-1}(1 - e^{-x})$ is given by $\frac{e}{e-1} e^{-x}$, which decreases from 1.58 at $x = 0$ to 0.58 at $x = 1$. Under the exponential penalty function, consuming one unit of inventory incurs a higher penalty when the inventory is scarce. In regimes with high demand and low inventory, we would expect the Inventory-Balancing algorithm with an exponential penalty function (EIB) to be more conservative and hold back more products to hedge against future arrivals. As we show in the next section, the best competitive ratio can be obtained using the exponential penalty function $\Psi(x) = \frac{e}{e-1}(1 - e^{-x})$.

The complexity of the IB algorithm is determined by the complexity of determining the optimal assortment for each arriving customer, which corresponds to solving the combinatorial optimization problem $\max_{S \in \mathcal{S}} \sum_{i \in S} r_i \Psi(I_i^{t-1}/c_i)\, \phi_i^{z_t}(S)$. This problem can be solved efficiently for a broad class of choice models; see Appendix A.2.

1.3.1 The Tight Upper Bound on the Competitive Ratio

We start this section by providing an upper bound on the competitive ratio. Then, in Theorem 1.3.6, we show that an IB algorithm with an exponential penalty function achieves this upper bound, showing that our proposed method achieves an optimal competitive ratio.
Lemma 1.3.5 (Upper Bound on the Competitive Ratio). For any number of products $n$, we can construct a non-stationary stochastic process for customer arrivals where, for every deterministic algorithm (including the optimal dynamic program), there exists a sequence of customer types $\{z_t\}_{t=1}^T$ such that the revenue of the algorithm is at most a fraction
\[
\alpha_n = \frac{1}{n} \sum_{j=1}^{n} \min\left\{ \sum_{t=1}^{j} \frac{1}{n - t + 1},\ 1 \right\}
\]
of $\mathrm{Primal}(\{z_t\}_{t=1}^T)$.

For instance, for $n = 2$, 5, and 20, the upper bound $\alpha_n$ is respectively equal to 0.75, 0.69, and 0.64, and $\alpha_n$ approaches $\lim_{n \to \infty} \alpha_n = 1 - \frac{1}{e} \approx 63\%$ as the number of products $n$ increases. (For $n = 2$, the two inner sums are $\frac{1}{2}$ and $\frac{3}{2}$, so $\alpha_2 = \frac{1}{2}\left(\frac{1}{2} + 1\right) = 0.75$.)

In the proof of the above lemma, we construct a stochastic process that consists of $n$ products. The per-unit revenue from each product is equal to 1 and the initial inventories are equal to $\frac{T}{n}$. Think of $T$, the length of the horizon, as a very large number (that would tend to infinity) and a multiple of $n$. The number of types is equal to $2^n - 1$. Each type corresponds to a nonempty set of products that a customer of that type equally likes; the “no-purchase” probability for all types is equal to zero. Note that this is a special case of the multinomial logit choice model where all the products have weight either 0 or 1.

The arrival process is defined as follows: customers arrive in $n$ phases of equal length, that is, the number of customers in each phase is $\frac{T}{n}$. All the customers in each phase have the same type. Customers in the first phase are interested in all the products. After that, in each phase, customers randomly lose interest in one of the products of interest in the previous phase; i.e., there are $n!$ sequences of customer arrivals, each with equal probability.

Now, we show that the Exponential Inventory-Balancing algorithm achieves the optimal competitive ratio.

Theorem 1.3.6 (Exponential IB Achieves The Optimal Competitive Ratio).
The competitive ratio of the Inventory-Balancing algorithm with exponential penalty function (EIB), $\Psi(x) = \frac{e}{e-1}(1 - e^{-x})$, $x \in [0, 1]$, approaches $1 - \frac{1}{e}$ as $c_{\mathrm{MIN}}$ increases to infinity, i.e., $\alpha\!\left(\frac{e}{e-1}(1 - e^{-x})\right) = 1 - \frac{1}{e}$. Moreover, no algorithm, deterministic or randomized, that does not know the sequence of customer types in advance can obtain a competitive ratio better than $1 - \frac{1}{e}$.

The proof is given in Appendix A.4.1. The first part of the proof is based on Theorem 1.3.3. The second part follows from applying Yao’s Lemma (Yao 1977) to Lemma 1.3.5. Yao’s Lemma implies that the competitive ratio of any randomized algorithm that does not know the input sequence in advance is bounded by the competitive ratio of any deterministic algorithm that knows the distribution over the input sequence.

We note that the upper bound of $(1 - \frac{1}{e})$ applies to all deterministic algorithms, including the optimal dynamic programming (Bernstein et al. 2011) and re-optimization (Jasin & Kumar 2012) policies. Thus, by the theorem above, in terms of the competitive ratio, the Inventory-Balancing algorithm with an exponential penalty function is optimal for this problem. We remark that this notion of optimality does not imply that the algorithm would obtain the highest revenue from every sequence of customers.

By Theorem 1.3.3, the competitive ratio of the Exponential Inventory-Balancing algorithm with limited inventory, for instance for $c_{\mathrm{MIN}} = 5$, 10, 20, and 30, is, respectively, equal to 0.57, 0.60, 0.61, and 0.62. The ratio approaches 0.63 rather rapidly as $c_{\mathrm{MIN}}$ grows. We emphasize that these ratios hold for all values of $T$, including the asymptotic regime where $T$ increases to infinity (at a possibly faster rate than $c_{\mathrm{MIN}}$). Moreover, as shown in our numerical experiments, our algorithms often perform much better than the worst-case guarantees.

1.4 Stochastic I.I.D. Arrivals

The competitive ratios of our IB algorithm in Theorems 1.3.3 and 1.3.6 hold for any arbitrary, possibly adversarially chosen, sequence of customer types. It turns out that our IB algorithm performs even better if the customer arrivals follow a stochastic process; that is, when the sequence of customers $\{z_t\}_{t=1}^T$ is generated by a stochastic process that is known in advance. In this model, the optimal sequence of assortments can be planned by solving a multi-dimensional dynamic program; for more details, see Appendix A.5. Not surprisingly, this approach suffers from the curse of dimensionality, even for stationary processes; see Bernstein et al. (2011).

Under stochastic models, although a dynamic programming approach may be intractable, there is room for natural and powerful heuristics. First observe that for any algorithm $A$, the expected revenue of the algorithm, denoted by $\mathbb{E}_{\{z_t\}_{t=1}^T}[\mathrm{Rev}_A(\{z_t\}_{t=1}^T)]$, is well-defined, where $\mathbb{E}_{\{z_t\}_{t=1}^T}$ is the expectation with respect to the sequence of customers. Recall that, by definition, the expectation over customers’ choices is taken into account by $\mathrm{Rev}_A(\cdot)$. Furthermore, we can establish an upper bound on the revenue of the algorithm. The proof is omitted due to its similarity to that of Lemma 1.2.1.

Lemma 1.4.1 (Revenue Upper Bound for I.I.D. Arrivals). Suppose the types of the customers are drawn, independently and identically, from a known distribution. Let $\eta_z$ be the expected number of customers of type $z \in \mathcal{Z}$. The expected revenue of any algorithm $A$, $\mathbb{E}_{\{z_t\}_{t=1}^T}[\mathrm{Rev}_A(\{z_t\}_{t=1}^T)]$, is bounded by the optimal value of the linear program Primal-S defined below:¹⁰
\[
\begin{aligned}
\text{maximize} \quad & \sum_{z \in \mathcal{Z}} \sum_{S \in \mathcal{S}} \sum_{i \in S} \eta_z \, r_i \, \phi_i^{z}(S) \, y^z(S) \\
\text{subject to} \quad & \sum_{z \in \mathcal{Z}} \sum_{S \in \mathcal{S}} \eta_z \, \phi_i^{z}(S) \, y^z(S) \le c_i, && 1 \le i \le n, \\
& \sum_{S \in \mathcal{S}} y^z(S) = \eta_z, && \forall z \in \mathcal{Z}, \\
& y^z(S) \ge 0, && \forall z \in \mathcal{Z}, \ S \in \mathcal{S}.
\end{aligned}
\qquad (\text{Primal-S})
\]

In the above linear program, $y^z(S)$ is the probability of offering the set $S$ to a customer of type $z$. As before, we also denote the optimal value of the above linear program by Primal-S.
[10] See Gallego et al. (2004), Liu & van Ryzin (2008a) for a similar linear programming formulation in the context of choice-based network revenue management.

Note that Lemma 1.2.1 provides a stronger upper bound since it holds for every customer sequence, while the above upper bound holds only in expectation. Theorem 1.3.3 provides a bound on the performance of our algorithms with respect to the upper bound in Lemma 1.2.1. When we have I.I.D. arrivals and we use Primal-S as the benchmark, as stated by the proposition below, we obtain an even stronger performance guarantee for our algorithms.

Proposition 1.4.2 (Improved Performance Guarantee in the I.I.D. Arrival Model). Suppose in every period $t$, the type of the arriving customer is drawn independently and identically from a common distribution over the set of types $\mathcal{Z}$. In the asymptotic regime where $c_{\mathrm{MIN}}$ and $T$ increase to infinity with $T / c_{\mathrm{MIN}} = k$ for some positive integer $k$, with high probability, the Inventory-Balancing algorithms with linear (LIB) and exponential (EIB) penalty functions satisfy the following inequalities:

$$
\lim_{T,\, c_{\mathrm{MIN}} \to \infty} \frac{E_{\{z_t\}_{t=1}^{T}}\left[\mathrm{Rev}_{\mathrm{LIB}}\left(\{z_t\}_{t=1}^{T}\right)\right]}{\text{Primal-S}} \ge 0.72
\qquad \text{and} \qquad
\lim_{T,\, c_{\mathrm{MIN}} \to \infty} \frac{E_{\{z_t\}_{t=1}^{T}}\left[\mathrm{Rev}_{\mathrm{EIB}}\left(\{z_t\}_{t=1}^{T}\right)\right]}{\text{Primal-S}} \ge 0.75,
$$

where the expectation $E_{\{z_t\}_{t=1}^{T}}[\cdot]$ is taken with respect to the sequence of arriving customers.

The proof is given in Appendix A.4.2. The basic idea is to construct a (Factor Revealing) linear program denoted by FRLP. With high probability, every solution obtained by the Inventory-Balancing algorithm corresponds to a feasible solution of FRLP, and the objective corresponds to the ratio of the expected revenue $E_{\{z_t\}_{t=1}^{T}}\left[\mathrm{Rev}_{\mathrm{EIB}}\left(\{z_t\}_{t=1}^{T}\right)\right]$ of the IB algorithm to Primal-S. FRLP is parameterized by a discretization parameter. For each value of this parameter, we can solve the linear program to determine the lower bound on the competitive ratio.
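For comparison with the I.I.D. guarantees above, the worst-case figures quoted earlier for the EIB algorithm (0.57, 0.60, 0.61, and 0.62 for $c_{\mathrm{MIN}} = 5, 10, 20, 30$) can be checked numerically. The following is only a minimal sketch: it assumes the finite-inventory bound has the form restated in Proposition 1.5.1 with zero estimation error, and the function names, grid search, and midpoint integration are illustrative choices, not part of the thesis.

```python
import math

def exp_penalty(x):
    """Exponential penalty Psi(x) = e/(e-1) * (1 - exp(-x))."""
    return math.e / (math.e - 1.0) * (1.0 - math.exp(-x))

def linear_penalty(x):
    """Linear penalty Psi(x) = x."""
    return x

def integrate(f, a, b, steps=400):
    """Composite midpoint rule for the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def ratio_bound(psi, c_min, grid=200):
    """Grid approximation (assumed form of the bound) of
    min over x in [0, 1 - 1/c_min] of
    (1 - x) / (1/c_min + 1 - psi(x) + integral_{x + 1/c_min}^{1} psi(y) dy)."""
    x_max = 1.0 - 1.0 / c_min
    best = float("inf")
    for k in range(grid + 1):
        x = x_max * k / grid
        denom = (1.0 / c_min + 1.0 - psi(x)
                 + integrate(psi, x + 1.0 / c_min, 1.0))
        best = min(best, (1.0 - x) / denom)
    return best

for c in (5, 10, 20, 30):
    print(c, round(ratio_bound(exp_penalty, c), 2),
          round(ratio_bound(linear_penalty, c), 2))
```

Under this assumed form, the exponential-penalty values agree with the figures reported in the text to two decimals, and the bound tends to $1 - 1/e$ as `c_min` grows; the linear-penalty column is printed only for comparison.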
1.4.1 Motivating IB Algorithms via Dual-Based Heuristics

In this section, we provide motivation and intuition for our Inventory-Balancing algorithm. The discounted revenue indices in our IB algorithm are proxies (approximations) for the dual variables. To see this, consider the following policy for the I.I.D. arrival model.

1. Observe the types of the first $\epsilon T$ customers. [11]

2. Solve the dual of $\mathrm{Primal}\left(\{z_t\}_{t=1}^{\epsilon T}\right)$ for the first $\epsilon T$ customers:

$$
\begin{aligned}
\text{MINIMIZE} \quad & \sum_{t=1}^{\epsilon T} \lambda_t + \sum_{i=1}^{n} \theta_i c_i \\
\text{SUBJECT TO:} \quad & \lambda_t \ge \sum_{i \in S} (r_i - \theta_i)\, \phi_i^{z_t}(S), \qquad 1 \le t \le \epsilon T,\ S \in \mathcal{S}, \\
& \theta_i \ge 0, \qquad 1 \le i \le n.
\end{aligned}
\tag{1.1}
$$

   Let $\theta_i(\epsilon T)$, $i = 1, 2, \ldots, n$, be the solution of the linear program above.

3. For each subsequent customer of type $z \in \mathcal{Z}$, offer an assortment $S^t$:

$$
S^t = \arg\max_{S \in \mathcal{S}} \sum_{i \in S} \left(r_i - \theta_i(\epsilon T)\right) \phi_i^{z}(S).
$$

Note that the algorithm does not need to know the distribution in advance. Following from the results of Devanur & Hayes (2009), Agrawal et al. (2009), Feldman et al. (2010), Jaillet & Lu (2012), we can show that this algorithm is asymptotically optimal for I.I.D. stochastic arrivals. [12] Since the proof is similar to the existing literature, we omit the details.

[11] For the purpose of analysis, assume that no product is shown to these customers, that is, $S^t = \{0\}$ for $t \le \epsilon T$.

[12] The dual heuristic is $1 - O(\epsilon)$-competitive for the real-time assortment optimization problem, with high probability, if: (i) $\dfrac{\max_i r_i}{\mathrm{Primal}\left(\{z_t\}_{t=1}^{T}\right)} \le \dfrac{\epsilon}{(n+1)\left(\ln(T) + \ln(2^n)\right)}$, and (ii) $\dfrac{1}{c_{\mathrm{MIN}}} \le \dfrac{\epsilon^3}{(n+1)\left(\ln(T) + \ln(2^n)\right)}$.

Note that the selection rule of the above heuristic is similar to that of our algorithms, with $r_i - \theta_i$ replaced by $r_i \Psi\left(I_i^{t-1}/c_i\right)$. In our Inventory-Balancing algorithms, however, we do not assume any patterns for the arrival of future customers. Thus, we can think of the discounted revenue index $r_i \Psi\left(I_i^{t-1}/c_i\right)$ as the "estimate" of the dual parameters based on the current inventory levels. This index value does not require any forecasting, and by choosing an appropriate penalty function $\Psi$, we can obtain an optimal competitive ratio, as shown by Theorem 1.3.3. See Feldman et al.
(2010), who apply similar ideas to the online allocation of display advertisements.

1.5 Extensions

In this section, we discuss how our policies can be extended to more general settings and incorporate additional information about the customers' choice models or arrival patterns.

1.5.1 Incorporating Partial Information and Learning Customer Types

So far, we assumed that in each period, the algorithm knows the customer choice models. Namely, $\phi_i^{z}(S)$ of the customer of type $z$ is the exact value of the selection probability. But this may not always be the case. The firm may learn the choice model associated with each customer type over time. For instance, in our numerical simulations, we associate the type of each customer with his or her location. We consider the case where the firm estimates the parameters of an MNL model for each location by learning from purchases of the previous customers from that location.

Suppose the arriving customer in period $t$ is of type $z$. We denote by $\bar{\phi}_i^{z}(S)$ the true selection probability of product $i$ when assortment $S$ is offered to a customer of type $z$. In this environment, $\phi_i^{z_t}(S)$ represents the current estimate of the selection probabilities. [13] These estimates can be obtained using partial information or historical data. We do not require specifics about how these estimates are made. A good example is when the customer types are drawn from a stationary distribution and the parameters of the choice model are learned from observing customers' choices. Under standard assumptions, we expect that the estimated selection probabilities would converge to the true selection probabilities; see Appendix A.1.4 for an example.

[13] Note that $\bar{\phi}_i^{z}(S)$ does not depend on the mechanism; however, $\phi_i^{z_t}(S)$ is a function of the mechanism, since the estimates of the mechanism for each customer type depend on the assortments offered in the past.
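To make the gap between estimated and true selection probabilities concrete, the sketch below computes MNL selection probabilities from preference weights and the largest absolute difference between estimated and true probabilities over all products and assortments, which is the kind of quantity that enters the guarantee stated next. It uses the standard MNL formula $v_i / (v_0 + \sum_{j \in S} v_j)$; the function names and the numeric weights are hypothetical, and the brute-force enumeration is only sensible for a handful of products.

```python
from itertools import combinations

def mnl_prob(weights, assortment, i):
    """Standard MNL probability that product i is chosen from the assortment.
    weights[0] is the no-purchase weight; products are numbered 1..n."""
    if i not in assortment:
        return 0.0
    return weights[i] / (weights[0] + sum(weights[j] for j in assortment))

def max_estimation_error(true_w, est_w):
    """Largest |estimated - true| selection probability over every product
    and every non-empty assortment (brute force over all subsets)."""
    n = len(true_w) - 1
    worst = 0.0
    for size in range(1, n + 1):
        for S in combinations(range(1, n + 1), size):
            for i in S:
                gap = abs(mnl_prob(est_w, S, i) - mnl_prob(true_w, S, i))
                worst = max(worst, gap)
    return worst

# Hypothetical weights for one customer type with three products.
true_w = [1.0, 0.8, 0.5, 0.2]   # (v_0, v_1, v_2, v_3)
est_w = [1.0, 0.7, 0.6, 0.2]    # an imperfect estimate of the same weights
print(max_estimation_error(true_w, est_w))
```

When the estimate coincides with the true weights the error is zero, so the sketch reduces to the full-information setting.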
The following proposition provides a lower bound on the competitive ratio when we have estimation errors.

Proposition 1.5.1 (Competitive Ratio with Estimation Errors). For each $t$, let $\epsilon_t = \max_{i, S} \left| \phi_i^{z_t}(S) - \bar{\phi}_i^{z_t}(S) \right|$ be the random variable corresponding to the maximum estimation error in period $t$. Suppose the Inventory-Balancing algorithm sells at least one unit of each product. Then, the competitive ratio of the Inventory-Balancing algorithm is at least equal to

$$
\min_{x \in \left[0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]}
\left\{
\frac{1 - x}{\dfrac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \displaystyle\int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\, dy + \dfrac{2}{c_{\mathrm{MIN}}}\, E\left[\sum_{t=1}^{T} \epsilon_t\right]}
\right\}.
$$

The proof is given in Appendix A.4.3. Note that when there is no estimation error, i.e., $\phi_i^{z_t}(S) = \bar{\phi}_i^{z_t}(S)$, the above expression is the same as the competitive ratio in Theorem 1.3.3. Furthermore, the only assumption made on the estimation errors is that the algorithm should sell at least one unit of each product. This assumption is made mainly for technical reasons, to rule out situations in which the estimates are so far off that the algorithm never sells the products sold by the optimal solution (which knows the true selection probabilities). We expect this condition to be satisfied when the estimation errors are small or if they vanish over time as the mechanism gathers more data about each type.

In Appendix A.1.4, we present an example that demonstrates how learning customer types can be incorporated in our framework. Furthermore, in Appendix A.3.3, using numerical simulations, we evaluate the performance of the IB algorithms when the selection probability for each customer type is unknown and must be estimated from data collected in earlier periods.

1.5.2 Incorporating (Uncertain) Information About Arrival Patterns

The Inventory-Balancing algorithms do not rely on any forecast of future customer arrivals; however, if such a forecast exists, it could potentially be used to improve the performance of the algorithms.
Consider a heuristic $L$, for example, the linear program re-optimization, that relies on the distribution (e.g., the estimated number) of the customers of each type. This heuristic performs well if the estimates are accurate; however, it performs poorly when the estimates turn out to be inaccurate or there is a high degree of uncertainty; see Section 1.6. We propose a family of algorithms, called the Hybrid algorithm, that combines the solution of such heuristics with the IB algorithms. These algorithms incorporate additional information about the arrival sequence and at the same time maintain a reasonable competitive ratio in unpredictable scenarios; see Mahdian et al. (2007, 2012). The Hybrid algorithm, given below, is parameterized by a number $\gamma \ge 1$. This parameter controls the extent to which one relies on heuristic $L$.

THE HYBRID ALGORITHM WITH PARAMETER $\gamma$

Upon the arrival of the customer in period $t \in \{1, \ldots, T\}$, of type $z_t$:

  Let $S_L^t$ be the set that heuristic $L$ recommends in period $t$.

  Offer the assortment $S_L^t$ if:

$$
\gamma \left( \sum_{i \in S_L^t} \Psi\left(I_i^{t-1}/c_i\right) r_i\, \phi_i^{z_t}\left(S_L^t\right) \right) \ge \max_{S \in \mathcal{S}} \left\{ \sum_{i \in S} \Psi\left(I_i^{t-1}/c_i\right) r_i\, \phi_i^{z_t}(S) \right\}.
$$

  Otherwise, offer an assortment $S^t \in \arg\max_{S \in \mathcal{S}} \sum_{i \in S} \Psi\left(I_i^{t-1}/c_i\right) r_i\, \phi_i^{z_t}(S)$.

The next proposition provides a lower bound on the competitive ratio of the Hybrid algorithm.

Proposition 1.5.2 (Competitive Ratio of the Hybrid Algorithm). Suppose $\Psi$ is an increasing, concave, and twice-differentiable penalty function. As $c_{\mathrm{MIN}} \to \infty$, the competitive ratio of the Hybrid algorithm with penalty function $\Psi$ and parameter $\gamma$, and for any heuristic $L$, is at least equal to

$$
\min_{x \in [0, 1]} \left\{ \frac{1 - x}{\gamma \left(1 - \Psi(x)\right) + \displaystyle\int_{x}^{1} \Psi(y)\, dy} \right\}.
$$

For example, for the exponential penalty function $\Psi(x) = \frac{e}{e-1}\left(1 - e^{-x}\right)$, this bound for $\gamma = 1.5$ and $\gamma = 2$ is approximately equal to 0.48 and 0.39, respectively.

The proof is very similar to the proof of Theorem 1.3.3 and is omitted. The main idea is to assign $\lambda_t = \gamma \sum_{i \in S^t} r_i\, \Psi\left(I_i^{t-1}/c_i\right) \phi_i^{z_t}\left(S^t\right)$.
Intuitively, we are extending the feasible region of the dual problem, which allows the algorithm to follow the heuristic on the recommendations that are considered "safe".

In our simulations in Section 1.6.2 and Appendix A.3.1, we consider a Hybrid algorithm that combines the EIB algorithm (Theorem 1.3.6) and the linear program re-optimization heuristic. We show that the Hybrid algorithm outperforms the Inventory-Balancing algorithm when the number of customers is known in advance by the re-optimization policies. On the other hand, when the number of customers is uncertain, the Hybrid algorithm outperforms the re-optimization methods.

1.5.3 Beyond Substitutability

In this section, we explain how we can relax Assumption 1. Recall that, by Lemma 1.3.2, Assumption 1 implies that our algorithm does not benefit from showing a product with no remaining inventory. But sometimes the assumption may not hold, for instance, when the dissimilarity parameters in the nested logit model are larger than 1 (Davis et al. 2011, Bhat 2002), or when there are externalities among the products. [14]

More specifically, we assume that the choice model satisfies the following property: Suppose a customer is offered a set $S$ and she chooses product $i \in S$. Then, the customer buys product $i$ if it has positive inventory; otherwise, she leaves without making any purchase. Under this choice model, we can use the Inventory-Balancing algorithm in exactly the same way as before. The inventory level of the products that are stocked out remains at 0 even if the product is shown to the customer. In this model, the optimal revenue can be upper-bounded by the linear program below.
$$
\begin{aligned}
\text{MAXIMIZE} \quad & \sum_{t=1}^{T} \sum_{S \in \mathcal{S}} \sum_{i=1}^{n} r_i\, \phi_i^{z_t}(S)\, y^t(S) - \sum_{i} r_i w_i \\
\text{SUBJECT TO:} \quad & \sum_{t=1}^{T} \sum_{S \in \mathcal{S}} \phi_i^{z_t}(S)\, y^t(S) - w_i \le c_i, \qquad 1 \le i \le n, \\
& \sum_{S \in \mathcal{S}} y^t(S) = 1, \qquad 1 \le t \le T, \\
& y^t(S) \ge 0, \qquad 1 \le t \le T,\ S \in \mathcal{S}, \\
& w_i \ge 0, \qquad 1 \le i \le n.
\end{aligned}
$$

[14] For example, William Poundstone, in his book Priceless, documented the following case: "Williams-Sonoma added a $429 breadmaker next to their $279 model: Sales of the cheaper model doubled even though practically nobody bought the $429 machine." (Thompson 2012)

This linear program is the same as the linear program $\mathrm{Primal}\left(\{z_t\}_{t=1}^{T}\right)$ given in Section 3.1, with a new set of variables: $w_i$ denotes the number of times that a customer selects product $i$ after its inventory hits 0. In this case, the product will not be allocated to the customer.

We now argue that the algorithm obtains the same competitive ratio as before. The argument is based on the following observation. The dual of the linear program above is as follows:

$$
\begin{aligned}
\text{MINIMIZE} \quad & \sum_{t=1}^{T} \lambda_t + \sum_{i=1}^{n} \theta_i c_i \\
\text{SUBJECT TO:} \quad & \lambda_t \ge \sum_{i=1}^{n} \phi_i^{z_t}(S)\, (r_i - \theta_i), \qquad 1 \le t \le T,\ S \in \mathcal{S}, \\
& \theta_i \le r_i, \qquad 1 \le i \le n, \\
& \theta_i \ge 0, \qquad 1 \le i \le n.
\end{aligned}
$$

Note that, compared to the previous dual in Section B.1.1, the above linear program has a new set of constraints: $\theta_i \le r_i$. However, these constraints are satisfied by our construction of the feasible solution in the proof of Theorem 1.3.3. Therefore, the ratio of the primal and dual solutions and the competitive ratio of the algorithm are the same as described in Theorem 1.3.3.

1.6 Numerical Experiments

In this section, we numerically evaluate our Inventory-Balancing (IB) and Hybrid algorithms, and compare them with existing methods in the literature. The first IB algorithm is the Linear Inventory-Balancing (LIB) algorithm with the linear penalty function $\Psi(x) = x$. The second is the Exponential Inventory-Balancing (EIB) algorithm with penalty function $\Psi(x) = \frac{e}{e-1}\left(1 - e^{-x}\right)$. In the next section, we describe the dataset and the simulation setting.
Then, we compare the revenues and running times of our IB and Hybrid algorithms with the Myopic policy and existing heuristics based on resolving linear programs. Section 1.6.3 shows the benefits of differentiating customers by their types. Finally, in Section 1.6.4, we present a synthetic simulation to compare our algorithm with the Myopic policy.

| Location | Top Three DVDs |
| Jersey City, NJ | Lost; Sin City; The Simpsons |
| Manhattan, NY | Lost; Sin City; Bob Dylan - No Direction Home |
| Orlando, FL | Lost; The Muppet Show: Season One; Firefly |
| Miami, FL | What the Bleep Do We Know!?; Sin City; Lost |
| San Jose, CA | Star Wars: Episode III; Lost; The Complete Thin Man Collection |
| Beverly Hills, CA | The Simpsons; L'Auberge Espagnole; Sin City |
| Inglewood, CA | Lost; Sin City; Final Fantasy VII - Advent Children |
| Dallas, TX | My Big Fat Greek Wedding; Lost; Shark Tale |
| Urbana Champaign, IL | Firefly; Lost; The Simpsons |
| Charlotte, NC | Lost; Curb Your Enthusiasm; Family Guy Presents Stewie Griffin |

Table 1.2: The locations used in the simulations and the top three best-selling DVDs in each location.

1.6.1 Dataset and Simulation Setting

We use the DVD sales data from a large online retailer, and consider DVDs that were sold during a four-month period from June 1, 2005 through September 30, 2005. During this period, the retailer sold over 5.7 million DVDs in the United States, spanning 55,875 DVD titles. We consider the location of each customer as her type, and for our analysis, we choose 10 different locations that reflect diverse purchasing patterns of the customers. Table 1.2 shows the list of locations and the top three DVDs with the highest sales volumes in each location. [15] In our experiments, we consider a total of 73 DVD titles ($n = 73$), obtained by taking the union of the top 20 DVDs in each location and removing duplicates. For each DVD $i = 1, 2, \ldots, 73$, we set the revenue $r_i$ to be the average selling price of the DVD during this period.
The DVD prices range from $9 to $81, with more than 50% of the DVDs priced below $20; the average and standard deviation of the DVD selling prices are $25.2 and $12.9, respectively. We assume that all DVDs have the same initial inventory, with no replenishment. Finally, we assume a multinomial logit choice model for each customer type. For each type $z \in \mathcal{Z}$, we estimate the preference weight parameters $(v_0^z, v_1^z, \ldots, v_n^z) \in \mathbb{R}_+^{n+1}$ by computing the maximum likelihood estimate based on the sales data in each location.

[15] Two of the names of the DVDs in Table 1.2 are shortened; "Lost" and "The Simpsons" correspond to "Lost - The Complete First Season" and "The Simpsons - The Complete Sixth Season", respectively.

In the simulations, each problem class is characterized by two parameters: the loading factor (LF) and the coefficient of variation (CV) associated with the proportions of customers of each type. Note that the coefficient of variation is the ratio of the standard deviation to the mean; that is, for a given fixed mean, a larger CV means that there is greater uncertainty in the type of an arriving customer. The loading factor (LF) is the ratio between the (expected) total number of customers and the total number of units. For a given loading factor, we can determine the (expected) total number of customers. The number of customers of each type is generated by multiplying the total number of customers by a 10-dimensional Dirichlet random variable $\beta = (\beta_1, \ldots, \beta_{10})$, where $\beta_i$ represents the proportion of customers of type $i$. [16] Thus, on average, each type is equally likely. [17] Once the number of customers of each type is generated, the order of arrivals is random. We further assume that a single customer arrives in each period.

1.6.2 Performance Evaluation

In this section, we compare the performance of our two IB algorithms to the Hybrid algorithm, the Myopic policy, and the LP-based heuristics.
The Myopic policy offers the assortment with the maximum expected revenue among all assortments that include only products with positive remaining inventory. [18]

The first LP-based algorithm, called the LP One-Shot (LPO), is a heuristic based on the solution of the linear program Primal-S where the expected number of customers of each type $z$, $\eta_z$, is equal to $0.1 \times E[T]$. [19] This follows because each of the 10 customer types is, on average, equally likely, and $E[T]$ is the expected number of customers. Although the linear program (LP) Primal-S has exponentially many variables, we can solve it efficiently using techniques from Gallego et al. (2004), Liu & van Ryzin (2008a), Topaloglu (2013). Under the LPO, we solve the linear program Primal-S exactly once and denote the solution by $\{\bar{y}^z(S) : z \in \mathcal{Z},\ S \in \mathcal{S}\}$. The LPO policy is constructed as follows: upon the arrival of a customer of type $z$, offer her assortment $S$ with probability $\bar{y}^z(S) / \sum_{S' \in \mathcal{S}} \bar{y}^z(S')$.

[16] For a $k$-dimensional Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_k$, the mean and variance of $\beta_i$ for $i = 1, 2, \ldots, k$ are respectively given by $\frac{\alpha_i}{\alpha_0}$ and $\frac{\alpha_i (\alpha_0 - \alpha_i)}{\alpha_0^2 (\alpha_0 + 1)}$, where $\alpha_0 = \sum_{j=1}^{k} \alpha_j$. Given that $\alpha_1 = \alpha_2 = \cdots = \alpha_k = \alpha$, the coefficient of variation is $\sqrt{\frac{k - 1}{k \alpha + 1}}$. The parameter of the Dirichlet distribution, $\alpha$, is chosen so that $E[\beta_i] = 0.1$ for all $i$ and $\beta_i$ has the desired coefficient of variation.

[17] We get similar results when each type is not equally likely.

[18] The Myopic policy corresponds to an Inventory-Balancing algorithm with penalty function $\Psi(x) = \mathbb{1}[x > 0]$.

[19] When $T$ is known, $E[T]$ is replaced by $T$.

We also consider an adaptive variation of this policy, called ALPO, that excludes any product with zero remaining inventory from an offered set. Note that ALPO is an adaptive policy because the assortment that we offer to each customer may change over time, depending on the product availability.

The second LP-based algorithm is called LP Resolving (LPR).
Under this algorithm, the linear program Primal-S is re-solved periodically every $h$ periods, with the up-to-date inventory levels and forecasts of the proportion of customers of each type. We denote this heuristic by LPR$_h$. In this heuristic, at the beginning of a resolving period $t$, $\eta_z$ is replaced by $\eta_z(t)$, the empirically estimated expected number of customers of type $z$ that will arrive between periods $t$ and $T$. Specifically, $\eta_z(t)$ is equal to the fraction of customers of type $z$ among those that arrived between periods $1$ and $t - 1$ (their count divided by $t - 1$), multiplied by $E[T - t + 1 \mid T \ge t]$, where $E[T - t + 1 \mid T \ge t]$ is the expected number of future customers given that $t$ customers have already arrived. [20] Also, we set $\eta_z(1) = 0.1$. As the system evolves, the realized number of customers of each type will differ from the expected value, and $\eta_z(t)$ incorporates these differences into the estimates. The LPR algorithm also updates the inventory level to reflect the current remaining inventory of each product.

The Hybrid algorithm combines IB algorithms with LP-based heuristics; see Section 1.5.2. [21] In particular, we assume that the Hybrid algorithm incorporates the solutions of the EIB algorithm and the LPR$_{500}$ heuristic with parameter $\gamma = 1.5$ and $\gamma = 2$. For $\gamma = 1.5$ and $\gamma = 2$, the competitive ratio of the Hybrid algorithm is at least equal to 0.48 and 0.39, respectively. Recall that the parameter $\gamma$ should be at least 1, and the larger $\gamma$ is, the more the Hybrid algorithm applies the LPR heuristic. We denote the Hybrid algorithm with parameter $\gamma$ by $H_\gamma$.

We evaluate the aforementioned algorithms by comparing their revenue with the upper bound given by the clairvoyant optimal solution, which knows the arrival sequence of all the customers in advance; see Lemma 1.2.1 for more details.

Simulation Parameters: We consider 9 problem classes, corresponding to loading factors of 1.4, 1.6, and 1.8, and coefficients of variation of 2.0, 1.0, and 0.1. We choose loading factors greater than 1 because we are interested in cases where the demand exceeds supply.
Each problem class consists of 250 problem instances. In each problem instance, we set the initial inventory levels to 100, i.e., $c_i = 100$, $i = 1, 2, \ldots, 73$, and the length of the horizon is chosen uniformly at random from an interval $[\underline{T}, \overline{T}]$. [22] In Table 1.3, for each problem class, we present the average revenue of each algorithm as a percentage of the upper bound, averaged over the 250 instances in each problem class. We present the results for periodic LPR with $h = 50$ and $h = 500$.

[20] Note that when the length of the horizon is known, $E[T - t + 1 \mid T \ge t]$ is simply $T - t + 1$.

[21] We do not present the results for the Hybrid algorithm that combines the LIB solution with LP resolving heuristics, since its performance is very close to the one we considered here.

As expected, the upper bound increases as the loading factor increases and the variability decreases. Both the LIB and EIB algorithms surpass all other policies. In addition, LPR policies cannot obtain more than 92% of the upper bound, even if we increase the frequency of resolving. [23] In all cases, the $H_{1.5}$ algorithm performs better than the $H_2$ algorithm because the larger $\gamma$ is, the more the Hybrid algorithm relies on the LPR heuristic. Note that the Hybrid algorithms obtain 5%-12% more revenue than the LP resolving heuristics. This highlights the benefits of combining the solutions of IB policies and LP-based heuristics. As we will discuss later, we also consider the worst-case performance of all policies; see Figure 1.1 and Appendix A.3.1.

In most problem classes, the LPO algorithm has the lowest revenue, and its performance decreases as the CV and the loading factor increase. The main reason for the poor performance of the LPO algorithm is the lack of strategy adjustment. The LPO algorithm solves the linear program exactly once and does not update its decision based on the current inventory level (in the case of ALPO, it only checks whether the product is available), nor does it consider the proportion of each customer type.
When the uncertainty in the type of arriving customers and in the number of customers is high, this lack of adjustment can result in a significant reduction in the performance of the LPO algorithm; e.g., for CV = 0.1 and a loading factor of 1.4, the ALPO algorithm obtains 93.5% of the optimal solution, while for CV = 2, it obtains 81.4%. Note that the ALPO algorithm, which adjusts its offered set based on the product availability (offering only those with positive inventories), yields a 1%-8% increase in revenue compared with the non-adaptive LPO algorithm. This also emphasizes the need for a real-time algorithm that adjusts its strategy based on the remaining inventory of products and customers' choices.

[22] We choose the interval $[\underline{T}, \overline{T}]$ by first computing the expected number of customers $E[T]$ from the loading factor and the total initial inventory of each problem class. Then, we set $\underline{T} = 0.5\,E[T]$ and $\overline{T} = 1.5\,E[T]$.

[23] When the degree of uncertainty in the number of customers is reduced, the LPR heuristic performs better; e.g., when $\overline{T} = 1.3\,E[T]$, $\underline{T} = 0.7\,E[T]$, LF = 1.4, and CV = 1, the LPR$_{500}$ obtains 94% of the upper bound. Even in this case, the IB algorithms beat the LPR heuristic by 3%.

Average revenue under different policies (as % of the upper bound); the upper bound is in $1000.

| LF | CV | Upper Bound | EIB | LIB | Myopic | LPO | ALPO | LPR 500 | LPR 50 | H 1.5 | H 2 |
| 1.4 | 2.0 | 165 | 97.0 | 96.9 | 96.4 | 73.6 | 81.4 | 91.3 | 91.3 | 96.5 | 95.8 |
| 1.4 | 1.0 | 169 | 96.8 | 96.9 | 95.8 | 84.3 | 88.4 | 89.8 | 90.0 | 96.6 | 96.5 |
| 1.4 | 0.1 | 172 | 97.5 | 97.5 | 96.4 | 92.8 | 93.5 | 90.9 | 90.8 | 96.8 | 96.7 |
| 1.6 | 2.0 | 170 | 97.3 | 97.3 | 96.8 | 66.7 | 73.6 | 88.5 | 88.6 | 96.8 | 95.9 |
| 1.6 | 1.0 | 174 | 97.5 | 97.6 | 96.5 | 82.5 | 86.2 | 88.1 | 88.1 | 97.2 | 96.2 |
| 1.6 | 0.1 | 178 | 98.4 | 98.5 | 97.3 | 91.8 | 92.4 | 89.5 | 89.7 | 98.3 | 97.7 |
| 1.8 | 2.0 | 174 | 98.0 | 97.9 | 97.2 | 62.9 | 68.8 | 86.9 | 87.0 | 97.9 | 97.2 |
| 1.8 | 1.0 | 178 | 98.0 | 98.1 | 97.1 | 78.2 | 81.9 | 86.5 | 86.6 | 98.1 | 97.9 |
| 1.8 | 0.1 | 178 | 97.8 | 97.9 | 96.7 | 85.1 | 85.2 | 86.1 | 86.3 | 97.8 | 97.5 |

Table 1.3: Revenue comparison when the length of the horizon is random.
The standard errors of all numbers are less than 0.1%.

Explanation of LPR performance: In Appendix A.3.1, we show that when the horizon length $T$ (corresponding to the total number of customers) is known in advance, the LPR heuristic performs very well and beats all other policies. We also observe that the LPR revenue increases linearly over time, which shows that the LPR method exploits the known length of the horizon to effectively ration the inventory to all customers. However, when $T$ is random and the length of the horizon is less than its expectation, the LPR heuristic may end up with rather large left-over inventory (of more profitable products). On the other hand, if $T$ is larger than its expectation, then the LPR heuristic may run out of some of the products, and customers would face stock-outs.

We note that, in all problem classes, the Myopic policy performs well because there is only a small chance that a customer will not purchase a product; that is, most of the preference weight parameters of the MNL model are positive for all customer types. Thus, if the Myopic policy runs out of the most profitable products quickly by offering them to customers, it can still collect high revenues because future customers continue to like the rest of the available products. To demonstrate this point, in Section 1.6.4, we conduct a simulation on synthetic data where each customer type is only interested in a subset of products, and the preference weights are small for most products. We observe that in this setting, our IB algorithms can beat the Myopic policy by up to 7.4%.

Running Time: Although the LPR$_{50}$ and LPR$_{500}$ heuristics do not solve the linear program very often, they are still twenty-five and four times slower than the Inventory-Balancing algorithms, respectively. We note that, in the special case of the multinomial logit choice model, the linear program Primal-S with $O(2^n)$ variables and $O(|\mathcal{Z}|)$ constraints can be reduced to an equivalent LP with just $O(n|\mathcal{Z}|)$ variables and $O(n|\mathcal{Z}|)$ constraints (Topaloglu 2013). However, for the general model where this reduction is not possible, the LPR heuristic will be even slower.

[Figure 1.1 plot. Horizontal axis: Revenue (as % of the Upper Bound), from 90 to 100. Vertical axis: Fraction of Problem Instances, from 0 to 1. Curves: ALPO, LPO, LPR$_{500}$, Myopic, $H_{1.5}$, LIB, EIB.]

Figure 1.1: The C.C.D.F. of revenues with LF = 1.8 and CV = 2. For each algorithm, the curve shows the fraction of problem instances (out of 250) whose revenue exceeds certain percentages of the upper bound.

Distribution of revenues across problem instances: To get more insight into the performance of the different policies, in Figure 1.1, we depict the Complementary Cumulative Distribution Function (C.C.D.F.) of revenue, as a percentage of the optimal clairvoyant revenue, across the 250 problem instances in the problem class with LF = 1.8 and CV = 2, for the Myopic, LIB, EIB, LPR$_{500}$, LPO, ALPO, and $H_{1.5}$ algorithms. In Figure 1.1, the curve for each algorithm shows, for $x \in [90, 100]$, the fraction of problem instances whose revenues are at least $x$% of the upper bound. It can be seen that both IB policies almost stochastically dominate all other policies. [24] In addition, the Myopic and Hybrid policies stochastically dominate the LPR$_{500}$ heuristic, which demonstrates the weak performance of LPR heuristics, in the worst-case sense, when there is uncertainty in the number of customers. We observe that for 80% of problem instances, the EIB and LIB algorithms obtain more than 95.8% of the optimal clairvoyant revenue. However, the LPR$_{500}$ heuristic obtains more than 95.8% of the upper bound for only about 33% of all instances.

[24] Policy $A$ stochastically dominates policy $B$ if, for any $x$, $F_A(x) \ge F_B(x)$, where $F_A$ and $F_B$ are the complementary cumulative distribution functions of revenue (as % of the upper bound) for policies $A$ and $B$, respectively. Note that when policy $A$ dominates policy $B$, the worst-case performance of policy $A$ is also better (higher) than that of policy $B$.
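The dominance notion used for Figure 1.1 can be illustrated on per-instance revenue percentages. The following is a minimal sketch: the function names and the sample values are hypothetical and are not taken from the experiments. It computes an empirical C.C.D.F. with the "at least $x$%" convention and checks dominance at every observed threshold, which suffices because the empirical C.C.D.F. changes only at sample points.

```python
def ccdf(samples, x):
    """Empirical C.C.D.F.: fraction of samples that are at least x."""
    return sum(1 for s in samples if s >= x) / len(samples)

def dominates(a, b):
    """True if the empirical C.C.D.F. of `a` is at least that of `b`
    at every threshold observed in either sample."""
    thresholds = sorted(set(a) | set(b))
    return all(ccdf(a, x) >= ccdf(b, x) for x in thresholds)

# Hypothetical revenue percentages (of the upper bound) over four instances.
policy_a = [96.0, 97.5, 98.2, 99.0]
policy_b = [90.5, 95.0, 96.0, 98.0]

print(dominates(policy_a, policy_b))   # policy_a dominates policy_b here
print(dominates(policy_b, policy_a))
print(min(policy_a), min(policy_b))    # worst cases are ordered the same way
```

In this toy data the dominating policy also has the higher minimum, matching the remark that dominance implies a better worst-case performance.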
1.6.3 Benefits of Personalization: The Importance of Knowing Customer Types

To quantify the benefits of personalization, we compare the revenue in two cases: multiple-type versus single-type. We consider the same 10 locations as in Section 1.6.1. The total number of DVDs is 73, corresponding to the union of the top 20 DVDs with the highest sales volumes in each location. The initial inventory of each DVD is set at 30. The setting for the multiple-type case is exactly the same as in Section 1.6.1, where we estimate the parameters $(v_0^z, v_1^z, \ldots, v_n^z)$ for each type $z$, and customize the decision to the type of each arriving customer. On the other hand, in the single-type case, we estimate a single parameter vector $(v_0, v_1, \ldots, v_n)$, and use it to make decisions for all customers. In our simulation, when we compute the revenue, we assume that each customer makes a decision based on her own type. Since the multiple-type model is more accurate, we expect that the multiple-type setting will yield higher revenue, but the key question is the magnitude of the improvement.

Columns 4-5: revenue in the multiple-type case (as % of the upper bound). Columns 6-7: improvement over the single-type case (as % of the upper bound). The upper bound is in $100 and is listed once per loading factor.

| Loading Factor (LF) | Assortment Size (C) | Upper Bound | LIB | EIB | LIB | EIB |
| 1.2 | 10 | 526 | 82.3 | 82.5 | 17.4 | 15.4 |
| 1.2 | 20 |     | 90.8 | 90.7 | 11.6 | 9.8 |
| 1.4 | 10 | 546 | 88.0 | 88.0 | 19.3 | 17.5 |
| 1.4 | 20 |     | 95.9 | 95.6 | 11.1 | 10.0 |
| 1.6 | 10 | 549 | 95.3 | 94.9 | 21.3 | 19.2 |
| 1.6 | 20 |     | 98.9 | 98.7 | 6.5 | 5.6 |

Table 1.4: The average revenue for IB algorithms in the multiple-type model and the improvement over the single-type model. All numbers are statistically significant and the standard errors are less than 0.1%.

We impose a constraint $C$ on the size of the assortment that we can offer to each customer, and we consider $C \in \{10, 20\}$. Columns 4 and 5 in Table 1.4 show the revenues of the IB algorithms in the multiple-type case, for different values of the size constraint.
The corresponding difference between the revenues in the multiple-type and single-type cases is reported in columns 6 and 7. We consider loading factors of 1.2, 1.4, and 1.6, but set the coefficient of variation at 0.2 because the results for the other coefficients of variation are similar. [25] As expected, the revenue from the multiple-type case exceeds that of the single-type case, regardless of the algorithm we use. Moreover, the benefits of customization are especially significant when the assortment size is small. For example, for a loading factor of 1.6 and $C = 10$, using the LIB algorithm in the multiple-type case yields over 21% improvement. As the size of the assortment increases, however, the benefits of customization decrease.

[25] Note that we did not use the optimal solution because it is rather challenging to solve the corresponding linear programs with constrained assortment size. When there is no restriction on the size of the assortment, though the LP has exponentially many variables, we can solve it efficiently using the techniques from Topaloglu (2013).

1.6.4 Comparison to Myopic Policy

Theoretically, it is easy to show that our algorithm can perform significantly better than a Myopic policy. Using the experiment on real data (for choice models) in Section 1.6.2, we also observe an improvement of 1%-2% over the Myopic policy. In this section, we run a synthetic simulation in which our algorithm obtains up to 7.4% more revenue compared to the Myopic policy.

In this simulation there are 73 DVDs and 10 types. As in our earlier simulations, for each DVD $i = 1, 2, \ldots, 73$, the revenue $r_i$ is the average selling price of the DVD obtained from the dataset, and the initial inventory level $c_i$ is set to 30. We also assume that customers purchase according to the MNL model. However, in this simulation we do not obtain the preference weight parameters $(v_0^z, v_1^z, \ldots, v_n^z)$ for each type $z \in \mathcal{Z}$ from the dataset.
Rather, we order the products based on their prices so that product 1 has the highest price and product 73 has the lowest price. We assume that for each of the first 9 customer types (\(z \in \{1, 2, \ldots, 9\}\)), customers of type \(z\) are only interested in the products indexed by \(A_z = \{1, 2, \ldots, 7z\}\), with \(v^z_i = 1\) for \(i \in A_z\) and \(v^z_i = 0.001\) for \(i \notin A_z\). Customers of type \(z = 10\) are equally interested in all products. We further assume that \(v^z_0 = 1\) for all 10 types. The way we construct the preference weight parameters reflects the fact that some customers are only interested in recently released DVDs, which are of course more expensive, while other customers are less sensitive to this issue.

The arrival process is exactly the same as in the previous sections; namely, the number of customers of each type is computed by multiplying the total number of customers by a 10-dimensional Dirichlet random variable. After generating the number of customers of each type, we uniformly permute them to determine the order of arrival. Table 1.5 compares different policies in this setting when the length of the horizon is drawn from the uniform distribution with mean \(\mathbb{E}[T]\), for coefficients of variation 0.5 and 1.0 and loading factors of 1.2, 1.4, and 1.6. We observe that both the EIB and LIB algorithms beat the Myopic policy by 4.5%-7.4%. In addition, the IB policies surpass all other policies as well.

Problem Class   Upper        Avg. Revenue Under Different Policies (as % of the Upper Bound)
                Bound        Inventory-Balancing   Myopic    One-shot LP       LP Resolving
LF     CV       (in $100)    EIB       LIB         Policy    LPO      ALPO     LPR 500
1.2    1.0      499          95.5      96.0        90.1      65.2     66.6     90.5
       0.5      516          94.9      95.5        88.1      80.8     82.2     90.8
1.4    1.0      514          96.1      96.6        90.8      63.9     65.3     89.8
       0.5      530          95.6      96.2        89.5      79.4     80.7     90.7
1.6    1.0      524          96.8      97.3        92.3      71.3     72.7     91.3
       0.5      538          96.5      97.0        90.9      88.7     89.8     92.7

Table 1.5: Revenue Comparison. The standard errors of all numbers are less than 0.1%.
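The synthetic setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, type 10 is assumed to have weight 1 on every product, the Dirichlet concentration parameter (not given in the text) is assumed to be 1, and fractional customer counts are rounded down.

```python
import random

NUM_PRODUCTS = 73
NUM_TYPES = 10


def preference_weights(z, num_products=NUM_PRODUCTS):
    """Return (v0, weights) for type z; products are indexed 1..n by decreasing price."""
    if z == NUM_TYPES:
        # Assumption: type 10 puts weight 1 on every product.
        interested = set(range(1, num_products + 1))
    else:
        # Types 1..9 care only about the 7z most expensive products.
        interested = set(range(1, 7 * z + 1))
    weights = [1.0 if i in interested else 0.001
               for i in range(1, num_products + 1)]
    return 1.0, weights


def mnl_choice_probabilities(assortment, v0, weights):
    """MNL purchase probability of each (1-indexed) product in the assortment."""
    denom = v0 + sum(weights[i - 1] for i in assortment)
    return {i: weights[i - 1] / denom for i in assortment}


def arrival_sequence(total_customers, rng, alpha=1.0):
    """Customer types in order of arrival.

    A Dirichlet draw is obtained by normalizing independent Gamma draws.
    alpha = 1.0 and rounding down are assumptions of this sketch.
    """
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(NUM_TYPES)]
    total = sum(gammas)
    counts = [int(total_customers * g / total) for g in gammas]
    sequence = [z for z, c in enumerate(counts, start=1) for _ in range(c)]
    rng.shuffle(sequence)  # uniform random order of arrival
    return sequence
```

For instance, a type-1 customer offered products 1, 2, and 70 sees weights 1, 1, and 0.001, so she buys product 1 with probability 1/3.001 and leaves without a purchase with the same probability.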
1.7 Conclusion and Future Work

Motivated by the availability of instantaneous data on customer characteristics, we formulated a real-time, personalized, choice-based assortment optimization problem with arbitrary customer types. Our proposed Inventory-Balancing algorithms are simple, intuitive, and effective. We establish the competitive ratio for our algorithms and prove that it is the best possible for this problem. The managerial insight from our work is that companies can increase their revenue by personalizing operational decisions to each customer, using real-time information about the customer's characteristics. This process requires coordination between front-end customer-facing decisions and back-end supply chain constraints. We demonstrate that such coordination can be achieved using simple index-based algorithms that can be easily implemented. As the volume of real-time data and the speed at which it becomes available increase, the opportunity in this area will continue to grow, and we believe this work can serve as a starting point for more complex models.

Our proposed Inventory-Balancing algorithms maintain an index for each product, in which the marginal revenue is discounted by a penalty based on the product's remaining inventory. The index serves as a simple mechanism for coordinating between the fast-moving customer-facing decisions and the back-end operational constraints. In addition to inventory, it would be interesting to consider other supply chain constraints. For example, in network revenue management, each product corresponds to an itinerary and uses a common pool of resources, corresponding to seats on flights. When resources are shared among products, coordinating and valuing the benefit of each additional unit of resource becomes challenging. One way to extend our index-based framework is to consider an index that depends on the inventory of all resources simultaneously.
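As a rough illustration of the index idea summarized above, the following Python sketch discounts each product's price by a penalty that depends on the fraction of inventory remaining and then picks the assortment with the highest discounted expected revenue under an MNL model. The two penalty functions (one linear, one exponential) and all function names are assumptions made for this sketch and are not taken from this section; the brute-force search over assortments is only practical for very small catalogs.

```python
import math
from itertools import combinations


def linear_penalty(x):
    """Assumed linear penalty on the fraction x of inventory remaining."""
    return x


def exponential_penalty(x):
    """Assumed exponential penalty, normalized so that penalty(1) = 1."""
    return (math.e / (math.e - 1.0)) * (1.0 - math.exp(-x))


def mnl_probs(assortment, weights, v0):
    """MNL purchase probabilities for the (0-indexed) products offered."""
    denom = v0 + sum(weights[i] for i in assortment)
    return {i: weights[i] / denom for i in assortment}


def inventory_balancing_assortment(prices, remaining, initial, weights, v0,
                                   max_size, penalty):
    """Return the assortment maximizing the penalty-discounted expected revenue."""
    available = [i for i in range(len(prices)) if remaining[i] > 0]
    best, best_value = (), 0.0
    for size in range(1, max_size + 1):
        for assortment in combinations(available, size):
            probs = mnl_probs(assortment, weights, v0)
            value = sum(prices[i] * penalty(remaining[i] / initial[i]) * probs[i]
                        for i in assortment)
            if value > best_value:
                best, best_value = assortment, value
    return best
```

With two equally attractive products priced at 10 and 9, where the first has 1 of 30 units left and the second is fully stocked, a rule that ignores inventory (a constant penalty of 1) offers the first product, whereas both penalty functions above steer the single-slot assortment to the second product and protect the scarce inventory.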
Another exciting direction is to explicitly model the mechanism for learning the choice model of each customer. Recent advances by Farias et al. (2013) have allowed one to model and estimate a very rich class of choice models, and it would be interesting to understand how we can incorporate the learning mechanism into the real-time decision-making process considered in this chapter.

Finally, it would be interesting to consider personalization of other operational decisions, such as pricing, warranty services, or shipping options. We believe that the framework in this work can serve as a starting point for analyzing the personalization of these decisions.

Chapter 2
Auctions with Dynamic Costly Information Acquisition

2.1 Introduction

In traditional auction theory, it is often assumed that bidders have full information about their valuations for the items that are sold at the auction, and the challenge for the auctioneer is to design a mechanism that elicits the preferences and valuations of the bidders. However, this assumption may not always hold, for instance, in the context of online advertising. We explain this below.

The Internet has provided an unprecedented platform for advertisers to reach their consumers and customize their advertisements at an individual level. For instance, major retailers and department stores, such as Macy's or Nordstrom, target users who have previously visited their websites by showing them ads featuring products viewed by the users. These ads usually include a picture of the product(s), often with a promotional price or a discount linked to the retailer's website. This level of targeting has been made possible by using HTTP cookies that allow advertisers to track the users that visit their websites. A cookie contains the information that browsers communicate to the websites visited by the users (see RFC6265 (2011)).
A growing trend in online advertising is the emergence of companies that provide information about Internet users; we will refer to these as information providers. [1] The information about the users is usually gathered via third-party cookies, which are often installed by a website that a user may not have visited. Using this technology, it is possible to track a user across different websites in order to build a browsing history. The browsing history of the user provides valuable information for advertisers, including the user's interests and intentions. For example, consider a user who has searched a website for flights to Hawaii. Later, when the user visits other websites, the advertisers can follow her and show her deals for vacation packages to Hawaii (cf. Helft & Vega (2010)). Some companies, such as Epsilon and Acxiom, take this one step further by merging the information gathered online with offline marketing databases (cf. Singer (2012)). The information providers, who have a detailed and up-to-date history for each user, sell their data to the advertisers, for instance, via cookie-matching services. The advertisers use this data to update their valuations and target their potential customers.

[1] Examples include BlueKai, eXelate, Acxiom, PulsePoint, LiveRamp, Neustar, DataLogix, and OpenTracker.

In many other environments, bidders can obtain additional information about their valuations at a cost. For instance, in the sale of complex financial or business assets, to make a better investment decision, the bidders invest heavily in the due diligence process to uncover the value of the assets (Vallen & Bullinger 1999). In timber auctions, bidders need to examine the volume and composition of wood on tracts (Roberts & Sweeting 2010, Athey & Levin 1999). Similarly, in oil and gas auctions, bidders can conduct seismic studies to better assess the likelihood of finding oil and gas (Hendricks et al. 2003).
In all of the examples above, the bidders incur the cost of information. However, the seller in fact bears these costs indirectly. The intuition is that these costs affect the bidders' (ex-ante) willingness to pay, which in turn impacts their participation and bidding. Therefore, the seller should ensure that bidders do not over-invest or under-invest in information acquisition. On the one hand, to avoid under-investment, the seller should motivate some of the bidders to obtain information. On the other hand, to avoid over-investment, the seller may want to restrict some of the bidders from accessing costly information, if he is able to control access to information.

In the aforementioned applications, the seller can exert such control. For instance, in the sale of complex financial assets, the seller may control how much detail is disclosed to the potential buyers. In timber (oil and gas) auctions, the bidders can only obtain the additional information if the auctioneer allows them to examine the tracts (fields). In the context of online advertising, the seller (online publisher) can release the identity of the user to only a subset of the advertisers. The identity of the user can be revealed to advertisers by disclosing HTTP cookies (Kristol 2001). If the publisher releases these cookies to an advertiser, the advertiser can subsequently take the cookies to the aforementioned information providers and obtain (purchase) information. On the flip side, if the publisher does not disclose these cookies to some advertisers, then these advertisers cannot acquire any information about the user. [2]

[2] Major ad exchanges, such as Google AdX, allow publishers to run private auctions (Google DoubleClick Documentation 2016) where they can control the information that is provided to the advertisers.
In this work, we answer the following question: If the seller controls access to (additional) information, how can he incentivize the right set of bidders to invest in information acquisition?

Our Contributions

We present a model to study costly information acquisition in auctions. Our model consists of an auctioneer that sells an item to a set of agents. The agents have an initial private valuation for the item and can obtain additional information at a cost. However, "access" to this additional information is controlled by the auctioneer, and the mechanism may grant access only to a subset (or none) of the agents.

We present a two-stage efficient mechanism for our setting in Section 2.4. A mechanism is efficient if it maximizes the sum of the social welfare of the auctioneer and the agents, taking into account the cost incurred to obtain the additional information. When there is no such cost, the efficient mechanism allows all agents to obtain the additional information. However, when the information is costly, the efficient mechanism grants access to the additional information only to a subset of the agents.

The efficient mechanism works as follows: In the first stage, agents bid in an initial round of bidding. Then, based on their initial bids, the auctioneer selects a subset of agents and grants them access to obtain additional information. Each selected agent can acquire the additional information by incurring a cost. The selected agents then update their valuations and bid in the second round of bidding. The second stage corresponds to the second-price auction with no reserve. [3]

In order to increase the revenue, the seller may want to set a reserve price in the second stage. In Section 2.5.1, we extend our analysis to mechanisms where the item is allocated via a second-price auction with a reserve.
These mechanisms are appealing from a practical perspective because they allocate the item using second-price auctions, which are the prevalent mechanism used in ad exchanges (Muthukrishnan 2009).

We further present a revenue-optimal mechanism for our setting. It turns out that the allocation stage of the revenue-optimal mechanism is a bit more complicated than a second-price auction. In order to optimize the revenue, the mechanism selects a set of agents so that it maximizes the "virtual revenue" minus the cost of information. The item is allocated via a weighted second-price auction where the weights favor the agents with higher initial bids.

[3] In this work, we consider two-stage mechanisms where all the agents are selected at the same time (in the first stage) and the item is allocated in the second stage. See the discussion on adaptive selection rules at the end of Section 2.4.

In the above mechanisms, the auctioneer controls access to information. To study the impact of such control, in Sections 2.4.1 and 2.5.3, we investigate a mechanism called All-Access, where the auctioneer does not control access to the additional information. In this mechanism, agents obtain additional information if they choose to incur the cost. That is, the mechanism leaves the decision on obtaining the costly information to the agents. The item is allocated via a standard second-price auction (with a reserve). We show that the mechanism always admits a pure strategy Nash equilibrium. The equilibria, however, might not be efficient or revenue-optimal. We observe that under the All-Access mechanism, the agents tend to over-invest in information acquisition when the cost is low. Similarly, when the cost is high, the agents tend to under-invest in information acquisition.

We numerically compare the above mechanisms. Interestingly, on average, the revenue-optimal mechanism allows fewer agents to obtain the additional information compared to the efficient mechanism.
It is well established that the revenue-optimal mechanism distorts the allocation and creates inefficiencies in order to extract more revenue from agents with higher valuations. We observe that the revenue-optimal mechanism distorts the revelation of information in addition to the allocation; see Section 2.6. In addition, we observe that the revenue-optimal mechanism can significantly increase the revenue compared with the aforementioned All-Access mechanism; see Section 2.5.3.

Our proposed mechanisms are flexible and can be generalized to a setting where multiple units of the item are sold; see Section 2.7.2. Furthermore, they can be extended to environments where the cost of information can be seen as an entry cost, and all agents who may receive the item with non-zero probability must invest in obtaining information; see Section 2.7.1.

Related Work

In this section, we briefly discuss the literature related to our work.

Dynamic Mechanism Design: Our work belongs to the growing body of research on mechanism design; see Bergemann & Said (2011b) for a survey. In particular, our work is closely related to that of Eső & Szentes (2007) and generalizes their model to a setting where information acquisition is costly. In the absence of this cost, Eső & Szentes (2007) show that the revenue-optimal mechanism grants all agents access to the additional information. In contrast, we show that when obtaining the additional information is costly, the auctioneer, even in the efficient mechanism, may not allow all bidders to acquire the additional information. The selection rule of our mechanism determines the set of agents who can access (and obtain) the additional information. From a technical perspective, as we discuss later in Appendix B.1.1, this makes our proof more challenging. See Kakade et al. (2013), Pavan et al. (2014), Battaglini & Lamba (2012), Boleslavsky & Said (2013), and Lobel & Xiao (2013) for recent results on designing optimal dynamic mechanisms.
Costly Information Acquisition: Most previous work on information acquisition considers settings where the bidders do not have any private information prior to entering the auction. In such a setting, where the auctioneer controls the bidders' access to information, Crémer et al. (2009) show that the auctioneer can extract all the surplus by imposing an admission fee; see also Pancs (2013). Information acquisition has also been studied in the principal-agent context (Cremer & Khalil 1992, Szalay 2009) and in reverse auctions (Beil et al. 2015).

Shi (2012) studies costly information acquisition in a setting where bidders do not have any private information prior to entry and can decide how much to invest in order to obtain information. He shows that the optimal mechanism takes the form of standard auctions (e.g., second-price) with a reserve price. In contrast, in our setting the bidders are privately informed before they decide on obtaining additional information.

Ye (2007) and Quint & Hendricks (2012) study indicative bidding auctions that are commonly used in selling financial assets. The auction works as follows: bidders submit non-binding bids to indicate their interest in the assets. The auctioneer then selects some of the bidders that have higher valuations to proceed to the second round, which involves a costly due diligence process and final bidding. Ye (2007) shows that in indicative bidding, efficient entry of the bidders is not guaranteed and the most qualified bidders might not be selected by the auctioneer. Note that in contrast to our work, the number of selected bidders is predetermined. In addition, only selected bidders who invest in obtaining the additional information may participate in the allocation stage of the mechanism.

One of the closest works in the literature to ours is that by Lu & Ye (2014), who study the design of a two-stage revenue-maximizing mechanism when acquiring information is costly.
Similar to indicative bidding auctions, they assume that obtaining the costly additional information is necessary for agents to participate in the second stage. Under this assumption, as the initial valuations of the agents increase, fewer agents will be allowed to acquire information. Specifically, the selection rule of the mechanism is "monotone" in initial valuations. In contrast, we observe in Section 2.6 that the selection rule of our proposed mechanisms is non-monotone. The reason is that in our model all agents, including those who do not update their valuations, participate in the second stage and have a chance to receive the item; see Section 2.7 for details. The non-monotonicity of the allocation rule makes our proofs more complicated. Furthermore, they assume that the seller can observe who obtains the additional information, which may not be a realistic assumption in many practical contexts. In contrast, in our setting the seller does not observe who obtains the additional information. Therefore, our mechanisms should be designed in a way that all selected agents willingly acquire the costly information. This requirement makes the mechanism design problem more challenging.

In addition, we observe that the seller earns (significantly) higher revenue in our setting. We provide a numerical example in Section 2.7.1. This is quite intuitive: when the cost of information is high, the seller in our setting can allocate the item even when no agent invests in information. Another reason that the seller can obtain higher revenue in our setting is that he can allocate the item to one of the agents that did not obtain information when the updated valuations are all low.

In Section 2.7.1, we discuss how our mechanisms can be extended to the setting studied by Lu & Ye (2014). In particular, we extend their revenue-maximizing mechanism to a setting where the cost of information is not the same across agents.
Note that in Lu and Ye's paper, all agents incur the same cost when they obtain information. In addition, we present a two-stage efficient mechanism in their setting.

Another related work is that by Hatfield et al. (2015). They focus on efficient mechanisms where bidders can invest in costly information acquisition to determine their valuations. They show that bidders make efficient investment choices when the utility of an agent is equal to his marginal contribution to the social welfare; see also Bergemann & Välimäki (2002). In contrast, we observe that for the All-Access mechanism, there might exist an equilibrium where agents do not make efficient investment decisions; see Example 2.4.3 in Section 2.4.1. For a discussion on settings where the computation of valuations is costly, see Larson & Sandholm (2001).

Information Disclosure: Our work also contributes to the vast literature on information disclosure. In the following, we briefly discuss this literature, focusing on the works motivated by applications in online advertising.

Recently, several papers have studied the effect of sharing cookies and targeting in advertising. Abraham et al. (2011) show that in a common value setting where some advertisers are able to better utilize information obtained from cookies, asymmetry of information can sometimes lead to low revenue in this market; see also Syrgkanis et al. (2013). Several recent papers, such as Ghosh et al. (2007), Rayo & Segal (2010), Bergemann & Bonatti (2011), Emek et al. (2012), Hummel & McAfee (2012), Bergemann & Bonatti (2013), and Bhawalkar et al. (2014), analyze the role of providing more (targeting) information in the context of online advertising and show that more information may reduce the revenue. Our proposed mechanisms can control the access to information in order to maximize the revenue.

Information disclosure has been studied in other applications. For instance, Jing (2011) studies customer learning for new durable goods.
In his model, the seller invests in informing customers before releasing the goods to the market. See also Lewis (2011).

The remainder of this chapter is organized as follows: In Section 2.2, we formally define our model. Direct mechanisms are defined in Section 2.3. We present our efficient mechanism followed by the All-Access mechanism in Section 2.4. Section 2.5 discusses revenue maximization and presents the revenue-optimal mechanism. We discuss the selection rule of our mechanisms in Section 2.6. Finally, Section 2.7 explores some of the extensions of our mechanisms.

2.2 Setting

We consider a setting with a seller of one (indivisible) item and \(n\) agents. The initial valuation of each agent \(i\) for the item is denoted by \(v_{i,0} \in [\underline{v}, \bar{v}]\), which is drawn independently from distribution \(F\), with probability density function (p.d.f.) \(f\). Distribution \(F\) is known to the seller and all the agents. However, \(v_{i,0}\) is known only to agent \(i\).

The seller may allow some of the agents to obtain (additional) information about their valuations. Suppose agent \(i\) is one of the agents to whom the seller has "granted" access to the additional information. In this case, agent \(i\) may decide to incur cost \(c_i\) and obtain signal \(\delta_i\) about his valuation, where \(\delta_i\) is drawn independently (of \(v_{i,0}\) and other agents' second signals) from distribution \(G_i\). The distributions \(G_i\), \(i = 1, 2, \ldots, n\), are publicly known, but the second signals are private information. Without loss of generality, we assume \(\mathbb{E}[\delta_i] = 0\). Note that if \(\mathbb{E}[\delta_i] = \mu > 0\), we can add \(\mu\) to \(v_{i,0}\) and then subtract \(\mu\) from \(\delta_i\).

If agent \(i\) obtains second signal \(\delta_i\), his updated final valuation, denoted by \(v_{i,1}\), is equal to \(v_{i,0} + \delta_i\). For the agents who did not learn their second signals, either because the seller denied them access or by their own choice, let \(v_{i,1} = v_{i,0}\).

As an example, suppose an advertiser values male users at $0 and female users at $6. Assume that each user has the same chance of being male as of being female.
Thus, when the user's gender is unknown, the advertiser's expected value, that is, his initial valuation, is \(\frac{6+0}{2} = 3\). Revealing the gender changes the valuation of the advertiser: with probability \(\frac{1}{2}\), his valuation increases by $3, and with probability \(\frac{1}{2}\), it decreases by the same amount. That is, the second signal is either \(3\) or \(-3\) with equal probability.

Throughout this chapter, we denote the vectors of the initial and final valuations of all agents by \(v_0\) and \(v_1\). Also, \(v_{-i,0}\) and \(v_{-i,1}\) respectively denote the vectors of the initial and final valuations of all agents except for agent \(i\).

The agents are risk neutral. The utility of an agent \(i\) who has received the item is equal to his valuation, \(v_{i,1}\), minus his payment to the mechanism and the (possible) cost of information acquisition. We will specify the utility of the agents more precisely in the next section.

2.3 Direct Mechanisms

In this section, we consider direct revelation mechanisms (Myerson 1986) where agents report their valuations in two rounds. First, they report their initial valuations to the mechanism. Then the mechanism decides on the set of agents that will have access to information. Those agents report their updated valuations to the mechanism in the second round and, finally, the mechanism allocates the item.

More precisely, any direct mechanism \(M\) is defined by a tuple \((s, q, p)\), whose components respectively represent its selection, allocation, and payment rules. The seller announces the mechanism to the agents and commits to \((s, q, p)\). The stages of the mechanism are as follows:

1. Initial Bidding: Agents report in the first round. The initial report (bid) of agent \(i\) is denoted by \(b_{i,0}\). Throughout this chapter, we will use "reporting" and "bidding" interchangeably.

2. Selection: Based on the initial reports, the mechanism selects a set of agents that we call selected agents.
The mechanism grants each selected agent \(i\) access to acquire additional information (signal \(\delta_i\)) and in return charges him price \(t_i\). More precisely, selection rule \(s: \mathbb{R}^n \to (\{0, 1\} \times \mathbb{R})^n\) maps the initial bids to a pair \((s_i, t_i)\) for each agent \(i\). If agent \(i\) is selected, \(s_i(b_0)\) is equal to 1. Otherwise, \(s_i\) is equal to 0. Each selected agent pays amount \(t_i\) to the mechanism to access his signal (the agent would still need to incur an additional cost \(c_i\) to learn the signal). To simplify the presentation, we assume \(t_i\) is equal to 0 for non-selected agents. [4]

3. Obtaining Information: Each selected agent \(i\) decides whether to incur cost \(c_i\) and learn \(\delta_i\). We define \(e_i\) to denote the decision variable of agent \(i\) for incurring cost \(c_i\) and updating his valuation: \(e_i\) is equal to 1 if the selected agent \(i\) learns \(\delta_i\). For non-selected agents, \(e_i\) is defined to be 0. Neither the mechanism nor other agents can observe decision \(e_i\) of an agent \(i\); it is only known to that agent.

4. Final Bidding: In the final round of reporting, only selected agents get a chance to update their reports. For any selected agent \(i\), \(b_{i,1}\) denotes the updated (and final) report of agent \(i\). For all other agents, let \(b_{j,1} = b_{j,0}\).

5. Allocation and Payments: Based on the initial and final bids of all agents (both selected and unselected), the seller decides to whom to allocate the item, via allocation rule \(q: (\mathbb{R} \times \mathbb{R})^n \to \mathbb{R}_+\), and how much to charge each agent, via payment rule \(p: (\mathbb{R} \times \mathbb{R})^n \to \mathbb{R}\). Namely, given all the bids and decision variables, \(q_i(b_0, b_1)\) is the allocation probability and \(p_i(b_0, b_1)\) is the payment of agent \(i\).

The utility of agent \(i\) participating in the mechanism is equal to \(q_i v_{i,1} - p_i - t_i - e_i c_i\) (more precisely, \(q_i(b_0, b_1)\, v_{i,1} - p_i(b_0, b_1) - t_i(b_0) - e_i c_i\)).
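The bookkeeping in the five stages above can be sketched in Python as follows. This is a minimal illustration rather than an implementation of any particular mechanism: the class and function names are hypothetical, and the numbers in the usage note are made up.

```python
from dataclasses import dataclass


def final_valuation(initial_valuation, signal, selected, acquired):
    """Return v_{i,1}: v_{i,0} + signal if the agent was selected and paid for
    the signal, and v_{i,0} otherwise."""
    if selected and acquired:
        return initial_valuation + signal
    return initial_valuation


@dataclass
class AgentOutcome:
    allocation_prob: float  # q_i(b_0, b_1)
    final_valuation: float  # v_{i,1}
    payment: float          # p_i(b_0, b_1)
    access_price: float     # t_i(b_0); 0 for non-selected agents
    selected: bool          # s_i = 1
    acquired: bool          # agent's private decision to learn the signal
    info_cost: float        # c_i


def agent_utility(outcome):
    """Utility q_i * v_{i,1} - p_i - t_i - e_i * c_i of a risk-neutral agent."""
    # e_i is defined to be 0 for agents who were not selected.
    e_i = 1 if (outcome.selected and outcome.acquired) else 0
    return (outcome.allocation_prob * outcome.final_valuation
            - outcome.payment
            - outcome.access_price
            - e_i * outcome.info_cost)
```

For example, take the advertiser with initial valuation 3 who is selected, learns a signal of +3, and therefore has final valuation 6. If he wins with probability 1, pays 4 for the item, pays an access price of 0.5, and incurs an information cost of 1, his utility is 6 - 4 - 0.5 - 1 = 0.5.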
Each agent chooses a best-response strategy to deal with the mechanism and the strategies of the other agents in order to maximize his expected utility, where the expectation is taken with respect to the second signals of the (selected) agents. More formally, the best-response strategy of each agent \(i\) can be described with the following mappings: \(b_{i,0}: \mathbb{R} \to \mathbb{R}\), \(e_i: \mathbb{R}^3 \to \{0, 1\}\), and \(b_{i,1}: \mathbb{R}^6 \to \mathbb{R}\). With slight abuse of notation, we denote the decision variables and functions with the same notation. Function \(b_{i,0}\) maps \(v_{i,0}\), the initial valuation of the agent, to the bid in the first round, \(b_{i,0}\). Function \(e_i\) is a function of the initial valuation, \(v_{i,0}\), and initial bid, \(b_{i,0}\), of the agent and his payment \(t_i\). Finally, \(b_{i,1}\) is a function of the whole history of agent \(i\) (i.e., \(\langle v_{i,0}, b_{i,0}, s_i, t_i, e_i, v_{i,1} \rangle\)) and determines the final bid. Given the strategies of the other agents, agent \(i\) optimizes over the tuple \((b_{i,0}, e_i, b_{i,1})\) to obtain his best (utility-maximizing) strategy.

[4] This assumption to a large extent is without loss of generality. In general, a mechanism can charge any agent in the first round independent of being selected or not. We are not putting any restrictions on the payment in the second round; therefore, any such payment in the first round can be added to the payment in the second round.

The truthful strategy for agent \(i\) consists of i) reporting truthfully in the first round (\(b_{i,0} = v_{i,0}\)); ii) obtaining additional information if selected (\(e_i = 1\) if \(s_i = 1\)); and iii) reporting truthfully in the final round (\(b_{i,1} = v_{i,1}\)). A dynamic direct mechanism is incentive compatible (IC) if for every agent and every truthful history, truth-telling is a best response given that all other agents report truthfully.
More precisely, a mechanism is IC if

\[
\mathbb{E}_{\text{TRUTHFUL}}\big[ q_i(v_0, v_1)\, v_{i,1} - p_i(v_0, v_1) - t_i(v_0) - e_i c_i \big]
= \max_{b_{i,0},\, e_i,\, b_{i,1}} \mathbb{E}\big[ q_i\big((b_{i,0}, v_{-i,0}), (b_{i,1}, v_{-i,1})\big)\, v_{i,1} - p_i\big((b_{i,0}, v_{-i,0}), (b_{i,1}, v_{-i,1})\big) - t_i(b_{i,0}, v_{-i,0}) - e_i c_i \big],
\]

where the expectations are taken assuming other agents are truthful; that is, they report truthfully in both rounds, i.e., \(b_{-i,0} = v_{-i,0}\) and \(b_{-i,1} = v_{-i,1}\), and obtain information if selected, that is, \(e_j = 1\) if \(s_j = 1\) for \(j \neq i\). In addition, on the l.h.s., the expectation is taken under the truthful strategy of agent \(i\).

We show that our proposed mechanisms satisfy stronger incentive compatibility properties. Namely, selected agents always bid truthfully in the final round even if they have deviated from the truthful strategy in the past. In addition, each selected agent prefers to obtain additional information even if he observes other agents' initial valuations. Currently, we assume that the agent only observes \(s_i\) and \(t_i\), which may contain information about other agents' valuations.

We can now define the participation constraints for the mechanism. An IC mechanism is individually rational (IR) if for each agent \(i\), the expected utility under the truthful strategy is non-negative, that is,

\[
\mathbb{E}_{\text{TRUTHFUL}}\big[ q_i(v_0, v_1)\, v_{i,1} - p_i(v_0, v_1) - t_i(v_0) - e_i c_i \big] \ge 0.
\]

2.4 The Efficient Mechanism

The social welfare of a mechanism is defined as the sum of the utility of the agents and the seller minus the cost incurred to obtain additional information. Note that an IC and IR mechanism is efficient if it obtains the maximum social welfare, equal to

\[
\max_{S \subseteq \{1, \ldots, n\}} \Big\{ \mathbb{E}_S\Big[ \sum_{i=1}^{n} q_i v_{i,1} \Big] - \sum_{i \in S} c_i \Big\} = \max_{S \subseteq \{1, \ldots, n\}} \big\{ \Psi(v_0, S) \big\},
\]

where \(\mathbb{E}_S\) denotes the expectation with respect to the realizations of the second signals when all agents in set \(S\) (and only those agents) obtain the additional information. In addition, \(\Psi(v_0, S)\), defined below, denotes the maximum social welfare that can be obtained when set \(S\) of agents acquires information.
\[
\Psi(v_0, S) = \mathbb{E}_S\Big[ \max\Big\{ \max_{j \in S} \{ v_{j,0} + \delta_j \},\ \max_{j \notin S} \{ v_{j,0} \},\ 0 \Big\} \Big] - \sum_{j \in S} c_j. \tag{2.1}
\]

To gain intuition, let us consider the following scenario. Assume all the agents bid truthfully, agents in set \(S\) obtain information, and subsequently, each agent \(j\) who updates his valuation incurs cost \(c_j\). The total cost of information is equal to \(\sum_{j \in S} c_j\). The mechanism maximizes the social welfare by allocating the item to the agent with the highest non-negative final bid, that is, agent \(i^\star \in \arg\max \big\{ \max_{j \in S} \{ v_{j,0} + \delta_j \},\ \max_{j \notin S} \{ v_{j,0} \} \big\}\), where \(\max_{j \in S} \{ v_{j,0} + \delta_j \}\) and \(\max_{j \notin S} \{ v_{j,0} \}\) are, respectively, the maximum updated bid of agents who obtain information and the maximum bid of agents who do not. The item is allocated to \(i^\star\) if his (final) bid is positive; otherwise, there is no allocation.

Throughout this chapter, we assume that the maximum expected social welfare that can be obtained is bounded; that is, \(\mathbb{E}\big[ \max_{S \subseteq \{1, 2, \ldots, n\}} \{ \Psi(v_0, S) \} \big] < \infty\), where the expectation is with respect to initial valuations \(v_0\). We now present an efficient mechanism.

\(M^{\text{EFF}}\) Mechanism: The selection, allocation, and payment rules are defined as follows:

Selection: Select a set of agents such that granting them access to information maximizes the social welfare of the seller and agents, taking into account the cost of information. Specifically, select the following set of agents: \(S(b_0) = \arg\max_{S \subseteq \{1, \ldots, n\}} \{ \Psi(b_0, S) \}\), where \(\Psi(b_0, S)\) is defined in Eq. (2.1). The payment of selected agent \(i\) is equal to

\[
t_i(b_0) = -c_i + \mathbb{E}\big[ q_i v_{i,1} - p_i \,\big|\, v_0 = b_0 \big] - \int_{\underline{v}}^{b_{i,0}} \mathbb{E}\big[ q_i \,\big|\, v_{i,0} = z,\ v_{-i,0} = b_{-i,0} \big]\, dz, \tag{2.2}
\]

where the expectations are with respect to the second signals. Notation \(\mathbb{E}[ q_i \mid v_{i,0} = z,\ v_{-i,0} = x_{-i} ]\) denotes the expected probability of allocation to agent \(i\), where the initial valuations of agent \(i\) and other agents are, respectively, equal to \(z\) and \(x_{-i}\), assuming all the agents, including agent \(i\), are truthful.
Note that the first term in the payment implies that the seller subsidizes the cost of information, $c_i$, for each selected agent $i$. In Lemmas B.1.6 and B.1.9 and their proofs, we discuss how this payment is calculated using the Envelope Theorem and show that it incentivizes the agents to be truthful. Recall that for non-selected agents, $t_i$ is equal to 0. See Appendix B.5.1 for examples that depict initial payments.

Allocation and Payments: Allocate the item to the agent with the highest non-negative bid at a price equal to the second highest bid or a reserve. More precisely, consider an agent $i^\star\in\arg\max_i\{b_{i,1}\}$. If $b_{i^\star,1}\geq 0$, agent $i^\star$ receives the item and pays $p_{i^\star}=\max\{\max_{i\neq i^\star}\{b_{i,1}\},\ r\}$, where $r:\mathbb{R}^n\to\mathbb{R}$ is a function of the initial bids defined in Eq. (2.3). Let agent $\ell\in\arg\max_{j\notin S(b_0)}\{b_{j,0}\}$ be an unselected agent with the highest initial bid. The reserve price $r$ is simply zero when $b_{\ell,0}<0$ or all agents are selected; otherwise, $r$ is the solution of the equation

$$\int_{\max\{r,0\}}^{b_{\ell,0}}\Pr\Big[z\geq\max_{j\in S(b_0)}\{b_{j,0}+\delta_j\}\Big]\,dz=\int_{\underline{v}}^{b_{\ell,0}}\mathbb{E}\big[q_\ell\mid v_{\ell,0}=z,\ v_{-\ell,0}=b_{-\ell,0}\big]\,dz.\tag{2.3}$$

Lemma B.1.1 and Corollary B.1.2 in Online Appendix, Section B.1, show that there exists an $r\in[0,b_{\ell,0}]$ that satisfies the above equation. If there are multiple solutions, we choose the largest one.

Observe that because $r\in[0,b_{\ell,0}]$, the reserve price does not change the allocation or the payment of the selected agents. Specifically, if a selected agent $i$ wins the item, he pays $\max\{\max_{j\neq i}\{b_{j,1}\},\ r\}$, which is identical to $\max\{\max_{j\neq i}\{b_{j,1}\},\ 0\}$. Then, one can describe the payment of the mechanism as follows: If agent $i^\star$ was a selected agent, then he pays $\max\{\max_{j\neq i^\star}\{b_{j,1}\},\ 0\}$. If $i^\star$ was not a selected agent, then he pays $\max\{\max_{j\neq i^\star}\{b_{j,1}\},\ r\}$. In fact, by introducing the reserve price $r$, the mechanism charges agent $\ell$ differently in the second round to incentivize him to bid truthfully in the first round.
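The second-round rule just described can be illustrated with a short Python sketch. It covers only the allocation and second-round price, taking the reserve as a given number rather than solving Eq. (2.3); the function name `second_round_outcome` and its arguments are hypothetical names chosen for this illustration, and ties are broken by lowest index, which is an assumption.

```python
from typing import Optional, Sequence, Set, Tuple


def second_round_outcome(
    final_bids: Sequence[float], selected: Set[int], reserve: float
) -> Tuple[Optional[int], float]:
    """Return (winner, price) for the second round, or (None, 0.0) if no sale.

    final_bids[i] is agent i's final bid, `selected` holds the indices of
    agents who were granted access to information, and `reserve` stands in
    for the value r of Eq. (2.3), which is assumed to be computed elsewhere.
    """
    # Highest final bid wins; ties go to the lowest index (an assumption).
    winner = max(range(len(final_bids)), key=lambda i: final_bids[i])
    if final_bids[winner] < 0:
        return None, 0.0  # no agent has a non-negative bid, so no allocation

    # Highest competing bid (minus infinity if there is no competitor).
    runner_up = max(
        (b for i, b in enumerate(final_bids) if i != winner),
        default=float("-inf"),
    )
    # Selected winners face a floor of 0; unselected winners face the reserve.
    floor = 0.0 if winner in selected else reserve
    return winner, max(runner_up, floor)


# Agent 1 is unselected, wins with bid 5, and pays the reserve 4.5.
print(second_round_outcome([3.0, 5.0, -1.0], selected={0}, reserve=4.5))
```

The first-round payment of Eq. (2.2) is deliberately left out, since it requires the expected allocation probabilities under truthful play.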
Note that the initial bid of agent $\ell$ can change the set of selected agents. In addition, similar to all selected agents, agent $\ell$ has a chance to win the item. However, unlike the selected agents, agent $\ell$ was not charged in the first round.

We now present the main result of this section.

Theorem 2.4.1 (Efficient Mechanism). Mechanism $M^{\text{EFF}}$ is individually rational, incentive compatible, and efficient.

From its construction, it is not difficult to see that if mechanism $M^{\text{EFF}}$ is IC, then it is also efficient. We prove the incentive compatibility of the mechanism in Appendix B.1. Observe that selected agents bid truthfully in the second round because the item is allocated using a second-price auction. We then show that any selected agent that bids truthfully in the first round obtains information. To this aim, we show that the mechanism's selection rule aligns with the agent's incentive. Specifically, the marginal change in the utility of a selected agent from not obtaining information is equal to the change in the social welfare. The challenging part of the proof is showing that agents bid truthfully in the first round because agents' bids in the first round determine the set of selected agents endogenously.

The selection rule ensures that the right set of agents invests in information acquisition. Note that as more agents obtain information, there is a higher chance that an agent has a high valuation for the item, which could increase the social welfare. On the other hand, the increase in the highest valuation comes at the cost of information acquisition. Thus, there is a trade-off here, and the selection rule aims to avoid over- or under-investment in information acquisition. In fact, as we show in Section 2.4.1, over- or under-investment in information acquisition may not be avoided if the seller cannot control access to information via a selection rule.
We also note that the selection rule of the efficient mechanism chooses a set of agents and allows them to update their valuations simultaneously. Alternatively, one can consider an "adaptive" selection rule that discloses information step-by-step (cf. McAfee & McMillan (1988)). During the selection stage, at each step, the mechanism selects one of the agents to obtain information. Then, based on the report of that agent, the mechanism decides whether to obtain more information or to proceed to the allocation stage. In this work, we consider a two-stage information disclosure, which could be more appealing from a practical perspective. The sequential search can be time-consuming and complex. For instance, in the example from online advertising, the mechanism should be executed in milliseconds, and sequential information acquisition may not be feasible.

2.4.1 What If All Agents Are Allowed to Access the Additional Information?

Intuitively, the ability of the seller to control access to the additional information could impact the social welfare and the utilities of the agents. To formalize this intuition, in this section, we present a mechanism called "All-Access." In this mechanism, the seller allows all agents to obtain information, if they wish, i.e., $s_i=1$ and $t_i=0$ for $i=1,2,\dots,n$.

We observe that when the seller leaves the decision on acquiring costly information to the agents, the agents' decisions can create inefficiency. Without the seller's control, the agents may over-invest or under-invest in information acquisition. In addition, we observe that the agents tend to invest in information acquisition when their initial valuations are not too high or too low. This is in contrast with the efficient mechanism that incentivizes the agents with higher initial valuations to invest in information acquisition. We provide examples in Section 2.6.

The All-Access mechanism works as follows: First, agents observe their initial valuations.
Next, each agent $i$ makes a decision on incurring cost $c_i$ and obtaining his second signal. The item is allocated via a second-price auction with no reserve price. Similar to our original setting, the investment decision of an agent is only known to him. In addition, initial valuations and second signals are private information and only their distributions are publicly known.

In this mechanism, agents bid only once, after they decide on refining their valuations. To be consistent with the notation of our original setting, we denote the bid of an agent $i$ in the second-price auction by $b_{i,1}$; the initial bid of an agent $i$, $b_{i,0}$, is set to be zero. The item is allocated to an agent with the highest non-negative bid, that is, $i^\star\in\arg\max_i\{b_{i,1}\}$ if $b_{i^\star,1}\geq 0$; in case of ties, the item is allocated at random to one of the tied agents. Agent $i^\star$ pays the second highest bid, $p_{i^\star}=\max\{\max_{i\neq i^\star}\{b_{i,1}\},\ 0\}$. If agent $i^\star$ has obtained additional information, his utility is equal to $(v_{i^\star,0}+\delta_{i^\star})-p_{i^\star}-c_{i^\star}$. Otherwise, his utility is equal to $v_{i^\star,0}-p_{i^\star}$. For any agent $i\neq i^\star$ that does not receive the item, $p_i=0$.

We now consider the Nash Equilibrium (NE) of the All-Access mechanism. We assume that all the agents bid truthfully ($b_{i,1}=v_{i,1}$) in the auction because bidding truthfully is a weakly dominant strategy for any agent $i$, independent of his and other agents' decisions on obtaining information. Therefore, to characterize the Nash Equilibrium, we focus on the decision of the agents on information acquisition. We define $\tilde{e}_i(v_{i,0})=e_i(v_{i,0},\ b_{i,0}=0,\ t_i=0)$ as the investment strategy of agent $i$ with initial valuation $v_{i,0}$ in the All-Access mechanism. The next theorem shows that the All-Access mechanism always admits a pure strategy Nash equilibrium.

Theorem 2.4.2 (Equilibria of the All-Access Mechanism).
The All-Access mechanism induces a game of incomplete information among agents where the strategies of the agents are defined by $\tilde{e}=\langle\tilde{e}_1,\tilde{e}_2,\dots,\tilde{e}_n\rangle$. For this game, there exists a pure strategy Nash equilibrium such that no agent $i$ can gain by changing his investment strategy $\tilde{e}_i(\cdot)$ if the investment strategies of the other agents remain unchanged.

All the proofs of this section are provided in Appendix B.2. To gain insight into the All-Access mechanism, in the rest of this section, we consider the following example.

Example 2.4.3. Assume that there are two agents that participate in the All-Access mechanism with no reserve. The cost of obtaining second signals is the same for both agents, $c_1=c_2$. The initial valuations of the agents are drawn from a uniform distribution over $[0,1]$, i.e., Uniform$(0,1)$, and the second signals are drawn from Uniform$(-1,1)$.

Although the setting in Example 2.4.3 is seemingly simple, it highlights challenges in characterizing the equilibrium of the All-Access mechanism. To characterize the equilibrium of a game of incomplete information, the single crossing conditions are often used (Athey 2001). However, for the setting in Example 2.4.3, we show that the single crossing conditions do not hold; see Proposition B.3.1 in Appendix B.3.2. In light of this observation, in the next theorem, we present the equilibria of the All-Access mechanism for a wide range of the cost.

Theorem 2.4.4. Consider the All-Access mechanism with no reserve price and the setting in Example 2.4.3. Then,
- when cost $c\leq\frac{7}{96}$, there exists an equilibrium in which both agents always obtain the additional information, i.e., $\tilde{e}_i(v_{i,0})=1$ for $i=1,2$ and $v_{i,0}\in[0,1]$, and
- when cost $c\geq\frac{7}{48}$, there exists an equilibrium in which none of the agents obtain the additional information, i.e., $\tilde{e}_i(v_{i,0})=0$ for $i=1,2$ and $v_{i,0}\in[0,1]$.
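As a rough numerical sanity check on the second threshold of Theorem 2.4.4 (not a substitute for the proof in the appendix), the following Python sketch computes the expected gain of a single agent who deviates by acquiring information when the rival never does. It assumes the setting of Example 2.4.3: the rival bids his Uniform(0, 1) initial valuation truthfully, the second signal is Uniform(-1, 1), and there is no reserve. The function names and the midpoint-rule integration are choices made for this illustration. If the largest gain over all initial valuations is at most c, no agent wants to deviate, so this largest gain should match 7/48.

```python
def auction_utility(value: float) -> float:
    """Expected second-price utility E[(value - X)^+] against X ~ Uniform(0, 1)."""
    if value <= 0.0:
        return 0.0  # a non-positive valuation never yields a profitable win
    upper = min(value, 1.0)
    # Integral of (value - x) over x in [0, upper].
    return value * upper - upper * upper / 2.0


def gain_from_information(v0: float, n_grid: int = 2000) -> float:
    """Expected utility gain (before paying c) from learning delta ~ Uniform(-1, 1)."""
    total = 0.0
    for k in range(n_grid):
        delta = -1.0 + 2.0 * (k + 0.5) / n_grid  # midpoint rule on [-1, 1]
        total += auction_utility(v0 + delta)
    return total / n_grid - auction_utility(v0)


# Largest deviation gain over a grid of initial valuations in [0, 1].
best_gain = max(gain_from_information(i / 100.0) for i in range(101))
print(best_gain, 7.0 / 48.0)
```

Under these assumptions the gain works out to $\frac{1}{12}+\frac{v_{i,0}}{4}-\frac{v_{i,0}^2}{4}$, which peaks at $v_{i,0}=\frac{1}{2}$; this agrees with the observation that agents with intermediate valuations have the strongest incentive to acquire information.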
Theorem 2.4.4 characterizes the equilibrium of the All-Access mechanism when the cost is less than $\frac{7}{96}\approx 0.072$ and greater than $\frac{7}{48}\approx 0.145$. In order to compare the All-Access and the efficient mechanisms, we numerically (using an iterative procedure) find the equilibrium of the All-Access mechanism when the cost is within $(\frac{7}{96},\frac{7}{48})$; see Online Appendix, Section B.3.1, for details. Our numerical studies show that there exist equilibria in which agents follow interval-based investment decisions.

An equilibrium with interval-based investment decisions can be defined with four parameters $a_1$, $a_2$, $\bar{a}_1$, and $\bar{a}_2$, where $a_1\leq\bar{a}_1\in[0,1]$ and $a_2\leq\bar{a}_2\in[0,1]$. In the equilibrium, each agent $i$ obtains information only when his initial valuation $v_{i,0}$ lies in $[a_i,\bar{a}_i]$. We note that the range $[a_1,\bar{a}_1]$ is not necessarily equal to $[a_2,\bar{a}_2]$. That is, there exist equilibria in which the investment decisions of the agents are not symmetric (footnote 5).

Under an interval investment strategy, agents do not obtain information when their initial valuations are too low or too high. The intuition is that an agent with a high initial valuation does not have an incentive to invest in information acquisition as he already has a high chance of winning the item without incurring the cost of information. On the other hand, an agent with a low initial valuation is not willing to acquire costly information because he has a slim chance of winning the item.

Figures 2.1 and 2.2 compare the All-Access mechanism with the efficient mechanism. Observe that the equilibria of the All-Access mechanism may not be efficient, as agents in the All-Access mechanism tend to over-invest in information acquisition when the cost is low, and they tend to under-invest when the cost is high. Yet another source of inefficiency comes from the fact that agents follow interval-based investment strategies and, as a result, may not acquire information when their initial valuations are high.
This is in contrast with the efficient mechanism that motivates agents with higher initial valuations to acquire information; see Section 2.6.

Footnote 5. We note that when the cost is less than $\frac{7}{96}$, the All-Access mechanism has an interval-based equilibrium with $a_1=a_2=0$ and $\bar{a}_1=\bar{a}_2=1$. Similarly, when the cost is greater than $\frac{7}{48}$, the All-Access mechanism has an interval-based equilibrium with $a_1=\bar{a}_1$ and $a_2=\bar{a}_2$.

2.5 Maximizing Revenue

In the previous sections, we presented a welfare-maximizing (efficient) mechanism. Our goal here is to design a revenue-optimal mechanism. We start with a heuristic called the "Sequential Second-Price (SSP) Mechanism." In this mechanism, the allocation and payments are determined via a second-price auction, which makes the mechanism appealing from a practical perspective. Furthermore, this class of mechanisms is motivated in part by the structure of the efficient mechanism.

Despite the desirable properties of the SSP mechanism, it is not able to achieve the maximum revenue. Therefore, in Section 2.5.2, we present a revenue-optimal mechanism. The allocation rule of this mechanism favors agents with higher initial valuations and extracts more revenue from those agents in the first round.

Both these mechanisms control agents' access to the additional information. In Section 2.5.3, we investigate the impacts of such a control by re-visiting the All-Access mechanism.

2.5.1 Sequential Second-Price Mechanisms

Second-price auctions and their variations are prevalent in online advertising and are used by Google and other major platforms. In this section, we present a class of mechanisms, called Sequential Second-Price (SSP), which extends the second-price auction to our setting with dynamic information acquisition.

The SSP mechanisms are similar to the efficient mechanism. The main difference is that the SSP mechanism sets a lower bound (reserve price) $r$ on the final bid of the agents. That is, it allocates the item to the
That is, it allocates the item to the 0 0.05 0.1 0.15 0.2 Cost (c) 0 0.5 1 1.5 2 E[e 1 +e 2 ] Efficient All-Access with No Reserve Figure 2.1: The average number of agents with the updated valuations versus the cost in the efficient and All-Access (with no reserve) mechanisms withn = 2,F = Uniform(0; 1), andG i = Uniform(1; 1) for i = 1; 2. Here,c 1 =c 2 =c. 50 agent with the highest bid as long as his bid is greater than or equal tor. In fact, the mechanism selects agents in setS r , where S r (b 0 )2 arg max Sf1;;ng f r (b 0 ;S)g Here, r (b 0 ;S) = arg max Sf1;;ng E S max n max i2S fb i;0 + i g; max i= 2S fb i;0 g;r o X i2S c i : Each selected agenti payst i (b 0 ) in the first round, wheret i (b 0 ) is given in Eq. (2.2). The parameter r in the SSP mechanism can be optimized to maximize the revenue of the seller. In Appendix B.5.2, we compare the revenue of the SSP mechanism with the optimal reserve and the revenue- optimal mechanism. In our examples, the SSP mechanism yields more than 84% of the optimal revenue. In practice, the SSP mechanism can be implemented via private auctions (Google AdX Documentation 2015) and pre-negotiated contracts that grant advertisers access to additional information in advance and advertisers bid for the impressions over time. For instance, consider a set of advertisers that are willing to display their ads on a specific website over a period of time. Their initial valuations, which depend on the contents of the website and advertisers’ products and services, remain constant over time. However, advertisers’ final valuation may vary over time because it also depends on the demographic or behavioral attributes of the user(s). In this case, there is no need for advertisers to report their initial valuations for every impression; they only need to report their updated valuations. 
[Figure 2.2: The average social welfare versus the cost $c$ in the efficient and All-Access (with no reserve) mechanisms with $n=2$, $F=\text{Uniform}(0,1)$, and $G_i=\text{Uniform}(-1,1)$ for $i=1,2$. Here, $c_1=c_2=c$.]

2.5.2 Revenue-Optimal Mechanism

For revenue maximization, without loss of generality, using the revelation principle (cf. Myerson (1986)), we focus on IC and IR mechanisms.

Definition 3 (Optimality). An incentive compatible and individually rational mechanism is optimal if it maximizes the revenue, equal to $\mathbb{E}\big[\sum_{i=1}^{n}(t_i+p_i)\big]$, among all incentive compatible and individually rational mechanisms (footnote 6).

Let $\alpha(v_{i,0})=-\frac{1-F(v_{i,0})}{f(v_{i,0})}$ be the negative of the inverse hazard rate associated with distribution $F$. We make the following assumption about $\alpha(\cdot)$.

Assumption 2 (Monotone Hazard Rate). Distribution $F$, with p.d.f. $f$, has a monotone hazard rate; that is, $\alpha(\cdot)$ is non-decreasing in $v_{i,0}$. Furthermore, assume that $\alpha(\cdot)$ is differentiable and $\sup_{v_{i,0}\in[\underline{v},\bar{v}]}\{\alpha'(v_{i,0})\}<\infty$.

The above assumption is standard in revenue-optimal mechanism design and ensures that the virtual valuations of the agents are increasing in their initial valuations (Myerson 1981). We now present a revenue-optimal mechanism.

$M^{\text{OPT}}$ Mechanism: The selection, allocation, and payment rules are defined as follows:

Selection: Select the following set of agents:

$$S^{\text{OPT}}(b_0)\in\arg\max_{S\subseteq\{1,2,\dots,n\}}\Big\{\mathbb{E}_S\Big[\max\Big\{\max_{i\in S}\{b_{i,0}+\alpha(b_{i,0})+\delta_i\},\ \max_{i\notin S}\{b_{i,0}+\alpha(b_{i,0})\},\ 0\Big\}\Big]-\sum_{i\in S}c_i\Big\},\tag{2.4}$$

and $t_i$ is defined the same as before; see Eq. (2.2).

Allocation and Payments: Allocate the item to the agent with the highest non-negative weighted bid. More precisely, consider an agent $i^\star\in\arg\max_i\{b_{i,1}+\alpha(b_{i,0})\}$. If $b_{i^\star,1}+\alpha(b_{i^\star,0})\geq 0$, then the item is allocated to agent $i^\star$ at price $p_{i^\star}=\max\{\max_{i\neq i^\star}\{b_{i,1}+\alpha(b_{i,0})\},\ r\}-\alpha(b_{i^\star,0})$, where $r:\mathbb{R}^n\to\mathbb{R}$ is a function of the initial bids and is defined below.
Footnote 6. The optimality is defined among all two-stage mechanisms. As discussed in the previous section, a mechanism with an adaptive selection rule may obtain higher revenue.

With slight abuse of notation, let $\ell\in\arg\max_{j\notin S^{\text{OPT}}(b_0)}\{b_{j,0}+\alpha(b_{j,0})\}$ be an unselected agent with the highest weighted bid. Then, if $b_{\ell,0}+\alpha(b_{\ell,0})<0$ or all agents are selected, $r=0$. Otherwise, $r$ solves the following equation:

$$\int_{\max\{r,0\}}^{b_{\ell,0}+\alpha(b_{\ell,0})}\Pr\Big[z\geq\max_{j\in S^{\text{OPT}}(b_0)}\{b_{j,0}+\delta_j+\alpha(b_{j,0})\}\Big]\,dz=\int_{\underline{v}}^{b_{\ell,0}}\mathbb{E}\big[q_\ell\mid v_{\ell,0}=z,\ v_{-\ell,0}=b_{-\ell,0}\big]\,dz.\tag{2.5}$$

Lemma B.1.1 and Corollary B.1.3 in Appendix B.1 show that there exists an $r\in[0,\ b_{\ell,0}+\alpha(b_{\ell,0})]$ that solves the above equation. Note that, similar to the efficient mechanism, $r$ does not change the allocation or payment for selected agents.

Mechanism $M^{\text{OPT}}$ is built upon the virtual value formulation of Myerson (1981) for static revenue-maximizing mechanism design. It allocates the item to the agent that has the highest (final) virtual valuation $v_{i,1}+\alpha(v_{i,0})$. The mechanism maximizes the virtual value of the winner minus the cost of information acquisition. The following theorem establishes the optimality of $M^{\text{OPT}}$.

Theorem 2.5.1 (Revenue-Optimal Mechanism). Suppose Assumption 2 holds. Mechanism $M^{\text{OPT}}$ described above is incentive compatible, individually rational, and optimal.

In Appendix B.1, we show that mechanism $M^{\text{OPT}}$ is IC and IR. To complete the proof of Theorem 2.5.1, in the following we show that mechanism $M^{\text{OPT}}$ is revenue-optimal. First, via the following lemma, we show that the selection rule aims to optimize the revenue by maximizing the expected "virtual revenue."

Lemma 2.5.2 (Revenue of $M^{\text{OPT}}$).
If all the agents follow the truthful strategy, then the expected revenue of $M^{\text{OPT}}$ is equal to

$$\mathbb{E}\Big[\max_{S\subseteq\{1,2,\dots,n\}}\Big\{\mathbb{E}_S\Big[\max\Big\{\max_{i\in S}\{v_{i,0}+\alpha(v_{i,0})+\delta_i\},\ \max_{i\notin S}\{v_{i,0}+\alpha(v_{i,0})\},\ 0\Big\}\Big]-\sum_{i\in S}c_i\Big\}\Big],\tag{2.6}$$

where the inner expectation is with respect to the second signals and the outer expectation is with respect to the initial valuations.

The proof is given in Appendix B.4.1. We now provide an upper bound on the revenue of any IC mechanism that matches the revenue of mechanism $M^{\text{OPT}}$.

Lemma 2.5.3 (Upper Bound). The expected revenue of the seller is at most equal to Eq. (2.6).

The upper bound is established using a closely related problem with fewer constraints called the relaxed problem, cf. Eső & Szentes (2007), Kakade et al. (2013), and Pavan et al. (2014). In the relaxed problem, the mechanism, on behalf of an agent, can decide to obtain information, and then both the agent and the mechanism learn his second signal. Because any mechanism that is IC in the original setting would also be IC in the relaxed setting, the revenue of the optimal relaxed mechanism provides an upper bound for the revenue of the revenue-optimal mechanism in the original setting; see Appendix B.4.1 for details.

The revenue-optimal mechanism is a generalization of the handicap mechanism of Eső & Szentes (2007) and matches that mechanism when information acquisition is costless (i.e., $c_i=0$), in which case the seller allows all agents access to information. We show that when the information is costly, the seller grants access to additional information only to a subset of agents. The intuition is that the seller indirectly bears the cost of information because the costs will affect the agents' (ex-ante) willingness to pay. An important distinction between our mechanism and the handicap mechanism is the selection rule that grants access to additional information to the right set of agents.
A technical challenge that we need to address is that the selection rule depends on the initial bids and, as we discuss in Section 2.6, is "non-monotone" in the initial valuations of the agents; also see Lemma B.1.9 for more details.

2.5.3 What If All Agents Are Allowed to Access the Additional Information?

Here, we re-visit the All-Access mechanism to study the impacts of controlling access to the additional information on the revenue-optimal mechanism. Specifically, we introduce a reserve price $r$ to the All-Access mechanism to ensure that the seller has a degree of freedom to extract more revenue from the agents. In this mechanism, privately informed agents decide on updating their valuations and then participate in a second-price auction with reserve price $r$. One can easily extend Theorem 2.4.2 to show that the All-Access mechanism with a reserve price always admits a pure strategy Nash equilibrium.

In the following, we re-examine Example 2.4.3 when agents participate in the All-Access mechanism with reserve price $r$.

Example 2.5.4. Assume that there are two agents that participate in the All-Access mechanism with reserve price $r\geq 0$. The cost of obtaining second signals is the same for both agents, $\delta_i\sim\text{Uniform}(-1,1)$, and $v_{i,0}\sim\text{Uniform}(0,1)$ for $i=1,2$.

The following theorem, which generalizes Theorem 2.4.4, sheds light on the equilibria of the All-Access mechanism with reserve price $r$ for a wide range of the cost.

Theorem 2.5.5. Consider the All-Access mechanism with reserve price $r\in[0,1]$ and the setting in Example 2.5.4. Then,
- when cost $c\leq\min\Big\{\frac{4r^3-3r^2-6r+5}{48},\ \frac{8r^3+6r^2+7}{96}\Big\}$, there exists an equilibrium in which both agents always obtain information, i.e., $\tilde{e}_i(v_{i,0})=1$, $i=1,2$, for $v_{i,0}\in[0,1]$, and
- when $c\geq\frac{3r^4+8r^3+6r^2+7}{48}$ and $r\leq\sqrt{2}-1$, or $c\geq\frac{3r-r^3+1}{12}$ and $r\in(\sqrt{2}-1,1]$, there exists an equilibrium in which none of the agents obtain information, i.e., $\tilde{e}_i(v_{i,0})=0$, $i=1,2$, for $v_{i,0}\in[0,1]$.
The proof is given in Appendix B.3.3. Figure 2.3 depicts the results of Theorem 2.5.5. Next, in Figure 2.4, we compare the All-Access mechanism (with revenue-maximizing reserve price) with mechanism $M^{\text{OPT}}$ in terms of their collected revenue for different values of the cost. As expected, the All-Access mechanism fails to obtain the maximum revenue. Surprisingly, the revenue of the All-Access mechanism is not monotone in the cost. For instance, the revenue of the All-Access mechanism when the cost $c$ is 0.18 is less than when the cost is 0.2. This follows from the agents' investment strategies. For $c=0.2$, none of the agents acquire information, while for $c=0.18$, agents update their valuations when their initial valuations are close to 0.5. In the latter case, it is likely that the updated valuations of the agents fall below the reserve price, considering that the revenue-maximizing reserve price is almost 0.5 and second signals are drawn from Uniform$(-1,1)$. This reduces the seller's revenue. In fact, the seller prefers that none of the agents obtain information.

2.6 Who Will Be Selected?

Here, we discuss the selection rule of our mechanisms in more detail. We first present an example that shows that the selection rule may not be monotone in initial valuations. We then show that under certain symmetry assumptions, the selection rule favors agents with higher initial valuations. Finally, we demonstrate how the selection rule reacts to an increase in the cost and variance of the second signals.

Figures 2.5 and 2.6 depict the selected agents in the revenue-optimal and efficient mechanisms, respectively, for all realizations of $v_{1,0}$ and $v_{2,0}$ in the range of $[-1.5,2.5]$ with $n=2$, $F=N(0.5,0.5)$, and $G_i=N(0,0.5)$ for $i=1,2$. The cost of information for the first and second agents is, respectively, 0.01 and 0.05, that is, $c_1=0.01$ and $c_2=0.05$.
The x-axis is the initial valuation of the second agent, and the y-axis is the initial valuation of the first agent. The areas in the figures are divided into several regions. In the white and green regions, the number of selected agents is zero and two, respectively. In the purple regions, only agent 1, whose cost of information is lower, is selected, while in the yellow regions, only agent 2 is selected.

Non-monotonicity of Selection Rule: Observe that the number of selected agents does not always increase as we move along one of the axes. Furthermore, when the initial valuation of a selected agent increases and the initial valuation of the other agent remains the same, he will not necessarily remain selected. For instance, the efficient mechanism selects both agents when $v_{1,0}=v_{2,0}=1$ but does not select any agents when $v_{1,0}=2.4$ and $v_{2,0}=1$. The reason that the selection rule is not monotone is that all agents, including those who did not update their valuations, participate in the second round and have a chance to win the item; see Section 2.7.1. The selection rule will remain non-monotone even if the cost of information is the same across agents.

Also observe that the selection rule of both mechanisms favors agents with higher initial valuations and lower costs, as agents with high initial valuations are more likely to win the item in the second round.

[Figure 2.3: Depicting the results in Theorem 2.5.5 with $n=2$, $F=\text{Uniform}(0,1)$, and $G_i=\text{Uniform}(-1,1)$ for $i=1,2$. Here, $c_1=c_2=c$. The x-axis is the reserve price $r$ and the y-axis is the cost $c$; the two marked regions are "both agents obtain information for any initial valuations" and "none of the agents obtain information for any initial valuations."]

We can formalize this intuition under certain symmetry and independence assumptions.

Theorem 2.6.1. Suppose for each agent $i$, the second signal $\delta_i$ is drawn, independently of other agents' signals, from distribution $G_i=G$.
In addition, assume that distribution $G$ is symmetric and $c_i=c$ for $i=1,2,\dots,n$. If an agent is selected in the revenue-optimal or efficient mechanisms, then all agents with higher initial bids will also be selected.

A distribution $G$ is symmetric if $G(y)=1-G(-y)$. For instance, normal and uniform distributions centered at zero satisfy the assumption. The proof is presented in Appendix B.2. The intuition is that the seller's objective is an increasing function of the maximum valuation of the agents. When agents are symmetric in terms of the cost and distribution of second signals, the seller would rather select agents with higher initial valuations. These are the agents who have a greater chance of receiving the item in the second round. We note that when the distributions of the valuations are asymmetric, it is likely that the mechanism selects an agent with a lower initial valuation but a higher variance of the second signal.

Theorem 2.6.1 provides a simple way to find the selected agents. One can sort the agents according to initial valuations (bids) in descending order, evaluate the value of the selection rule's objective function for each of the $n+1$ subsets $\emptyset$, $\{1\}$, $\{1,2\}$, $\dots$, and $\{1,2,\dots,n\}$, and then select the subset that maximizes the objective. See Guha et al. (2006) and Goel, Guha & Munagala (2010) for optimization problems similar to our selection rule in more general settings.

[Figure 2.4: The revenue of the revenue-optimal and All-Access (with revenue-maximizing reserve price) mechanisms versus the cost $c$ with $n=2$, $F=\text{Uniform}(0,1)$, and $G_i=\text{Uniform}(-1,1)$ for $i=1,2$. Here, $c_1=c_2=c$.]

2.6.1 Impacts of the Cost and Variance of Second Signals on Selection Rules

To get more insight about the selection rule, we numerically study how the expected number of selected agents changes as the cost and variance of second signals increase.
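Returning to the prefix search that follows from Theorem 2.6.1, a minimal Python sketch is given here for the efficient mechanism's objective in Eq. (2.1), under the theorem's assumptions of a common cost and i.i.d. symmetric second signals. The expectation is estimated by Monte Carlo with common random numbers, which is a choice made for this illustration rather than the procedure used in the thesis; the names `prefix_selection` and `draw_signal` are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def prefix_selection(
    bids: Sequence[float],
    cost: float,
    draw_signal: Callable[[random.Random], float],
    n_samples: int = 5000,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Pick the best of the n + 1 prefix sets and return (set, estimated value)."""
    rng = random.Random(seed)
    n = len(bids)
    # Agents sorted by initial bid, highest first.
    order = sorted(range(n), key=lambda i: bids[i], reverse=True)
    # One signal per agent per sample, reused for every candidate set.
    signals = [[draw_signal(rng) for _ in range(n)] for _ in range(n_samples)]

    best_set: List[int] = []
    best_value = float("-inf")
    for k in range(n + 1):
        chosen, rest = order[:k], order[k:]
        # Best option without new information: top unselected bid or no sale.
        outside = max([bids[j] for j in rest] + [0.0])
        total = 0.0
        for draw in signals:
            updated = max((bids[i] + draw[i] for i in chosen), default=outside)
            total += max(updated, outside)
        value = total / n_samples - cost * k  # estimated welfare minus costs
        if value > best_value:
            best_set, best_value = sorted(chosen), value
    return best_set, best_value


def normal_signal(rng: random.Random) -> float:
    """Symmetric second signal; the standard normal is an illustrative choice."""
    return rng.gauss(0.0, 1.0)


# With a prohibitive cost, nobody is selected and welfare is the top bid.
print(prefix_selection([1.0, 0.2], cost=10.0, draw_signal=normal_signal))
```

The same loop would apply to the revenue-optimal selection rule in Eq. (2.4) if each bid were replaced by its weighted counterpart, at the price of evaluating only $n+1$ candidate sets instead of all $2^n$.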
Figure 2.7 illustrates the impact of the variance of second signals, $\sigma^2$, on the average number of selected agents with $n=2$, $F=N(0.5,0.5)$, $G_i=N(0,\sigma^2)$, and $c_i=0.05$ for $i=1,2$ (footnote 7). When the second signals are more "uncertain," the average number of selected agents in the revenue-optimal (OPT) and efficient (EFF) mechanisms increases. The intuition is that for larger variance, the seller anticipates seeing larger second signals and selects more agents.

Figure 2.8 depicts the impacts of the cost of information, $c_i=c$, on the average number of selected agents when $n=2$, $F=N(0.5,0.5)$, and $G_i=N(0,0.5)$. Both revenue-optimal and efficient mechanisms react to an increase in the cost of information by restricting the number of agents that can obtain information. Interestingly, to obtain higher revenue, the revenue-optimal mechanism selects fewer agents. This implies that the revenue-optimal mechanism distorts the revelation of information to extract more revenue from the agents.

Footnote 7. See Appendix B.5.3 regarding the impact of the number of agents.

[Figure 2.5: Selected agents in the optimal mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n=2$, $c_1=0.01$, $c_2=0.05$, $F=N(0.5,0.5)$, and $G_i=N(0,0.5)$ for $i=1,2$.]

2.7 Extensions

In this section, we discuss some of the extensions of our mechanisms. In particular, we extend our mechanism to a setting where agents are "extremely risk-averse" in the sense that they do not engage in the auction without obtaining additional information. Moreover, we generalize our mechanisms to a setting with multiple units for sale.

2.7.1 Information Acquisition as an Entry Cost

In our model, the seller may allocate the item to an agent who has not obtained additional information. However, in some applications, such as the sale of high-valued assets, buyers might face a significant risk if they purchase the item without gathering enough information.
For these applications, one can interpret the cost of obtaining information as an entry cost. That is, buyers must invest in information to be considered in the allocation round.

[Figure 2.6: Selected agents in the efficient mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n=2$, $c_1=0.01$, $c_2=0.05$, $F=N(0.5,0.5)$, and $G_i=N(0,0.5)$ for $i=1,2$.]

Our proposed mechanisms can be extended to this setting by excluding unselected agents from the allocation stage. More precisely, an efficient mechanism in this setting selects the following set of agents:

$$S^{\text{EFF-E}}(b_0)=\arg\max_{S\subseteq\{1,2,\dots,n\}}\Big\{\mathbb{E}_S\Big[\max\Big\{\max_{j\in S}\{b_{j,0}+\delta_j\},\ 0\Big\}\Big]-\sum_{j\in S}c_j\Big\},$$

and similarly, a revenue-optimal mechanism selects the following set of agents:

$$S^{\text{OPT-E}}(b_0)=\arg\max_{S\subseteq\{1,2,\dots,n\}}\Big\{\mathbb{E}_S\Big[\max\Big\{\max_{j\in S}\{b_{j,0}+\alpha(b_{j,0})+\delta_j\},\ 0\Big\}\Big]-\sum_{j\in S}c_j\Big\}.$$

Note that when $S=\emptyset$, the mechanisms do not allocate the item. We refer to the corresponding efficient and revenue-optimal mechanisms as EFF-E and OPT-E, respectively.

We note that mechanism OPT-E is a generalization of a mechanism proposed by Lu & Ye (2014), who study the problem of designing a revenue-maximizing mechanism with an entry cost when the cost of information is the same across agents. They show that the selection rule in this setting is monotone, that is, the number of selected agents decreases as the initial valuation of an agent increases. Using monotonicity in the selection rule, Lu & Ye (2014) show that their proposed mechanism is incentive compatible. Note that, as we saw in Section 2.6, the selection rule will not be monotone if the item can be allocated to an unselected agent (footnote 8).

Footnote 8. We also relax an assumption in Lu & Ye (2014) where the mechanism can verify whether or not a selected agent has invested in obtaining information.
Figure 2.7: Average number of selected agents in the revenue-optimal and efficient mechanisms versus the standard deviation of second signals, $\sigma$, with $G_i = N(0, \sigma^2)$ and $c_i = 0.05$ for $i = 1, 2$. (Plot: average number of selected agents versus $\sigma$; series OPT and EFF.)

To get more insight, we numerically compare mechanisms $\mathcal{M}^{\text{OPT}}$ and OPT-E. We assume that the cost of information is the same across agents, the number of agents is $n = 2$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$ for $i = 1, 2$. Figure 2.9 compares the revenue of the mechanisms. For any value of the cost, mechanism $\mathcal{M}^{\text{OPT}}$ yields more revenue than OPT-E. The revenue of mechanism OPT-E approaches zero as acquiring information becomes more costly. However, mechanism $\mathcal{M}^{\text{OPT}}$ is more robust to the cost of information because, when the cost is high, it can allocate the item without allowing any agents to update their valuations.

Figure 2.8: Average number of selected agents in the revenue-optimal and efficient mechanisms versus the cost of information, $c$, with $G_i = N(0, 0.5)$. (Plot: average number of selected agents versus cost; series OPT and EFF.)

2.7.2 Multi-unit

In this section, we discuss how our mechanisms can be extended to a setting where $m \ge 1$ units of the item are sold to $n > m$ agents. Specifically, we assume that each agent $i$ needs at most one unit of the item, and his initial valuation for the item is $v_{i,0} \in [\underline{v}, \bar{v}]$. Similar to our original setting, $v_{i,0}$ is agent $i$'s private information and is drawn (independently) from distribution $F$.

We start with an efficient mechanism. The mechanism allocates the item to the $m$ agents with the highest (non-negative) bids. In case the number of agents with a positive bid is less than $m$, the mechanism allocates the item to the agents with positive bids. The efficient mechanism selects the following set of agents

$$\arg\max_{S \subseteq \{1,2,\ldots,n\}} \Big\{ \mathbb{E}_{S}\Big[\max_{A \subseteq \{1,2,\ldots,n\},\, |A| \le m} \sum_{i \in A} b_{i,1}\Big] - \sum_{i \in S} c_i \Big\}, \tag{2.7}$$
where $A$ is the set of agents that the item will be allocated to, $|A|$ is the cardinality of set $A$, and the condition $|A| \le m$ implies that we cannot sell more than $m$ units. In addition, $b_{i,1} = b_{i,0} + \delta_i$ if $i \in S$, and $b_{i,1} = b_{i,0}$ otherwise. It is easy to see that the selection rule of the efficient mechanism selects a welfare-maximizing set of agents provided that it is incentive compatible.

The initial payment of the selected agents is the same as before and is given in Eq. (2.2). In addition, only agents that receive the item will pay in the second round. Specifically, any selected agent that wins the item pays $\max\{b^{(m+1)}, 0\}$, where, with slight abuse of notation, $b^{(m+1)}$ is the $(m+1)$th highest final bid. Furthermore, any unselected agent $j$ that receives the item pays $\max\{b^{(m+1)}, r_j\}$, where $r_j$ solves the following equation

$$\int_{\max\{r_j, 0\}}^{b_{j,0}} \Pr\big[z \ge b^{(m)}_{-j}\big]\, dz = \int_{\underline{v}}^{b_{j,0}} \mathbb{E}\big[q_j \,\big|\, v_{j,0} = z,\ v_{-j,0} = b_{-j,0}\big]\, dz, \tag{2.8}$$

where $b^{(m)}_{-j}$ is the $m$th highest final bid among all agents except for agent $j$. The term inside the integral on the l.h.s., $\Pr[z \ge b^{(m)}_{-j}]$, is the cumulative distribution function of the random variable $b^{(m)}_{-j}$ at point $z$. Using an argument similar to that of Lemma B.1.1 in Appendix B.1, one can show that there exists $r_j \in [0, b_{j,0}]$ that solves the above equation.

Figure 2.9: Average revenue of the OPT and OPT-E mechanisms versus the cost with $G_i = N(0, 0.5)$ and $c_i = c$ for $i = 1, 2$. In all the figures, $n = 2$ and $F = N(0.5, 0.5)$. (Plot: revenue versus cost; series OPT-E and OPT.)

Next, we present a revenue-optimal mechanism in this setting. Similar to mechanism $\mathcal{M}^{\text{OPT}}$, the mechanism allocates the item to the $m$ agents with the highest (non-negative) weighted bids, where the weighted bid of an agent $i$ is $b_{i,1} + \alpha(b_{i,0})$. The mechanism allows the following set of agents

$$\arg\max_{S \subseteq \{1,2,\ldots,n\}} \Big\{ \mathbb{E}_{S}\Big[\max_{A \subseteq \{1,2,\ldots,n\},\, |A| \le m} \sum_{i \in A} \big(b_{i,1} + \alpha(b_{i,0})\big)\Big] - \sum_{i \in S} c_i \Big\} \tag{2.9}$$

to acquire information. The initial payment is given in Eq. (2.2).
In the second round, any selected agent that receives the item has to pay the $(m+1)$th highest weighted bid if the $(m+1)$th highest weighted bid is positive, and zero otherwise. In addition, any unselected agent $j$ that wins the item has to pay the maximum of the $(m+1)$th highest weighted bid and $r_j$. Here, $r_j$ solves Eq. (2.8), where $b^{(m)}_{-j}$ should be replaced by the $m$th highest final weighted bid among all agents except for agent $j$. Agents that do not receive the item pay nothing in the second round.

2.8 Conclusion

Information structure plays a crucial role in the outcome of auctions. This role becomes even more important when information acquisition is costly. We observe that in such environments, agents may over- or under-invest in information. We also presented efficient and revenue-optimal mechanisms that show how the auctioneer should control access to information via a selection rule and prices. In the previous section, we discussed some of the extensions of our mechanism.

An important direction for future research is to extend the results to settings with adaptive selection rules where information is disclosed sequentially over time, and the mechanism makes "selection decisions" based on updated reported valuations of the agents (McAfee & McMillan 1988). Another important direction is studying environments where the agents are risk-averse.

Chapter 3
Dynamic Pricing for Heterogeneous Time-Sensitive Customers

Dynamic pricing is increasingly prevalent in many industries. One of the main advantages of dynamic pricing is that it helps mitigate the risk associated with demand uncertainty (see, for instance, Aviv & Pazgal 2008 and Cachon & Swinney 2011). In this paper, we show that dynamic pricing (DP) can play an important role in differentiating between customers over time even in the absence of demand uncertainty.
In many settings, especially in fashion and electronic retail, a customer's willingness to pay (or valuation) for a product is time-sensitive and decreases over time. In these situations, customers are not only different in terms of their initial willingness to pay for these items when they are first introduced to the market, but they are also different in terms of how rapidly they lose interest in these products. Thus, we may have customers who initially value the product at a high level, but as time progresses, they lose interest in the product completely. We may also have customers who initially value the product at a low level, but still remain interested in the product as time progresses. That is, the willingness to pay of the lower valuation customers diminishes at a lower rate relative to that of the higher valuation customers. This phenomenon is illustrated in Figure 3.1.

In this paper, we show that when a firm sells to customers who have heterogeneously decreasing valuations, the firm can achieve significant benefits by incorporating dynamic pricing even in the absence of demand uncertainty. This would not be the case if customers were homogeneous in their valuation decay rate. In that case, in the absence of demand uncertainty, the firm's optimal pricing strategy would be to post a fixed price, and dynamic pricing would have no benefit. When customer valuations decrease at different rates, the ranking of customers (in terms of their valuations) changes over time (as in Figure 3.1). This allows a firm to generate more revenue by revising its initial price to target customers who currently have higher valuations even though they initially had lower valuations.

Figure 3.1: At time 0, customer 1 has a higher value than customer 2. But her value decreases faster than that of customer 2, and beyond $t_0$, customer 2 has a higher value. (Plot: valuation versus time for Customer 1 and Customer 2, crossing at $t_0$.)
Formally, we characterize a revenue-optimal selling mechanism for a firm with customers who have heterogeneous valuations that decrease in a heterogeneous fashion. We assume that the firm knows the total demand and the customer valuation distribution, but does not know the precise valuation of each individual customer. We also include the firm's production costs and product holding costs. That is, the firm incurs a constant cost to produce each product and does so at time 0, and it incurs a cost to hold these products in inventory until the sale. In our setting, the firm commits to a price trajectory, and the customers are strategic in selecting the best time to purchase so as to maximize their individual net utility. We assume that customers with higher initial valuation also have a higher rate of valuation decrease. To the best of our knowledge, this setting has not been studied in the literature.

We next describe the main characteristics of this optimal mechanism, ignoring the production and holding costs. The optimal mechanism consists of the firm posting a series of decreasing prices, which essentially divides the customers into three groups based on their initial valuations. The first group comprises all customers with initial valuations above a threshold (high type customers) who purchase the product immediately. The second group consists of all customers with initial valuations below a smaller second threshold (low type customers). For these customers, the posted prices are designed in a way to extract their entire surplus. Finally, the third group consists of customers with valuations between the two thresholds (medium type customers) who do not purchase the product immediately but purchase before the low valuation customers, and obtain a positive net utility, or surplus.

The low type customers in our mechanism play an important role in contrast to what occurs in fixed pricing.
In a fixed pricing policy, all customers with valuations above the price would immediately purchase, and those with valuations below the price would not purchase. However, in our optimal mechanism, the low type customers purchase the product after some delay and the firm is able to extract their entire surplus. In the absence of production and holding costs, the firm sells the product to all customers in this fashion. Selling to all the customers not only can increase social welfare but also can generate significant additional revenue. For instance, we show that the firm can increase its revenue by approximately 23% by employing the optimal mechanism (relative to fixed pricing) when the initial valuation distribution is uniform and the valuation decay rates are proportional to initial valuations; in fact, more than three quarters of this increase is obtained by selling to the low type customers.

We next investigate the impact of production and holding costs on the optimal selling mechanism. Both of these costs motivate the firm to reduce the length of the selling period, but in different ways. The presence of production costs motivates the firm to sell fewer units by targeting customers with higher initial valuation. Interestingly, we find that the optimal mechanism here can be obtained from the baseline optimal mechanism (obtained without these costs) in a simple manner. The optimal mechanism prices in such a way that customers with initial valuations above a certain cut-off purchase the item at the same time as in the baseline setting (but at a higher price), while those with initial valuations lower than the cut-off do not purchase the product at all. That is, the production costs introduce a cut-off in customers' initial valuation.

Holding costs motivate the firm to price in a manner so that customers are incentivized to make their purchases earlier (than the baseline case).
We find that, depending on the holding cost, there are three types of optimal mechanisms. If the holding cost is larger than a threshold, then the firm determines it is too expensive to carry the product and simply posts a fixed price so that all customers who purchase the product do so immediately. If the holding cost is moderate (below the previous threshold and above another lower threshold), then the firm benefits from DP but cannot extract the entire surplus of customers with low initial valuations. If the holding cost is below the lower threshold, then the structure of the optimal mechanism is similar to that of the baseline optimal mechanism: there are three distinct groups of customers, and the firm can extract the entire surplus of customers with low valuations. Overall, the value of DP decreases with increasing holding and production costs.

Finally, we summarize our main technical contribution. In our setting, one of the hurdles in characterizing the optimal selling mechanism is the lack of a consistent customer ranking based on customer types. As a result, satisfying the individual rationality and incentive compatibility constraints is challenging.¹ Note that when there is a consistent ranking of customers, individual rationality constraints are binding for the lowest customer type that the firm would like to sell the product to, and the mechanism is incentive compatible if the allocation rule is monotone in the type of the customers. In contrast, in our setting, the individual rationality constraint is binding for a group of customers with low initial valuation. Furthermore, the monotonicity of the time of purchase in the initial valuation does not guarantee that the mechanism is incentive compatible.

To characterize the optimal mechanism, we first establish necessary and sufficient conditions for a mechanism to be incentive compatible.
One of these conditions resembles the traditional envelope condition (Myerson 1981), which ensures that the mechanism is locally incentive compatible. The other condition, called the interval condition, ensures that the mechanism is globally incentive compatible. We first relax the problem by ignoring the interval condition and characterizing a revenue-optimal mechanism that satisfies the individual rationality constraints and the envelope condition. Then, by establishing several additional properties of this mechanism, we show that the mechanism indeed satisfies the interval condition, and thus is optimal.

Related Work

Our work is related to the growing literature on pricing mechanisms for customers who strategically time their purchases. There is also an extensive literature on dynamic pricing with myopic customers (see, for example, Lazear (1984), Wang (1993), Gallego & Van Ryzin (1994b), Feng & Gallego (1995), Bitran & Mondschein (1997), Federgruen & Heching (1999), and Talluri & Van Ryzin (2004a)). We do not provide a summary of this line of literature here, but we refer the reader to the excellent surveys by Bitran & Caldentey (2003), Chan et al. (2004), and Shen & Su (2007).

Coase (1972) is one of the first papers to study pricing for strategic customers. Coase conjectured that when a firm sells a durable good to patient and strategic customers and cannot commit to a sequence of posted prices, then the prices would converge to the production cost. Later, Stokey (1979), Gul et al. (1986), and Besanko & Winston (1990) found that with commitment, posting a decreasing sequence of prices is revenue-optimal. In particular, Stokey (1979) showed that when production cost declines over time, posting
In particular, Stokey (1979) showed that when production cost declines over time, posting 1 To characterize the optimal mechanism, using the revelation principle, it suffices to focus only on mechanisms in which customers have an incentive to participate, that is, the individual rationality constraints hold, and customers are willing to reveal their private information to the mechanism designer, that is, the incentive compatibility constraints hold (see Myerson 1981). 67 a decreasing sequence of prices results in higher revenue for the firm. However, when the production cost is zero, DP is not beneficial. In contrast, we show that with heterogeneous decay rates, DP can improve revenue even if the production cost is zero. In the context of revenue management, several works show that DP can increase firm revenue when demand is uncertain (Su 2007, Aviv & Pazgal 2008, Elmaghraby et al. 2008, Araman & Caldentey 2009, Cachon & Swinney 2011, Aviv et al. 2015, and Yu et al. 2015). Specifically, Aviv & Pazgal (2008) studied a model in which a firm sells a limited inventory of a product in two periods to an unknown number of strategic customers who are heterogeneous in their valuations and time of arrival. They showed that when the level of heterogeneity in customers’ valuation increases, the benefit of customer segmentation using pricing decreases. Conversely, in this work we show that as the level of heterogeneity in customers’ decay rates increases, the firm can better differentiate customers and generate more revenue. One important factor that differentiates our work from the aforementioned research is that in our work, demand uncertainty is not a key driver of DP. That is, even in the absence of demand uncertainty, DP increases revenue significantly. 
Furthermore, in the aforementioned papers, the firm uses the customers' fear of rationing to extract more revenue from strategic customers (see also Liu & Van Ryzin 2008b and Bansal & Maglaras 2009), but in our work, customers do not face such a risk. In fact, when the holding and production costs are zero, all the customers purchase the item.

Other works examined intertemporal pricing with new consumers arriving in every period. Conlisk et al. (1984) and Besbes & Lobel (2015) showed that when customers arrive over time, the firm's optimal strategy is to use a cyclic pricing policy. Borgs et al. (2014) studied how to set prices to extract revenue while guaranteeing service availability to all paying customers arriving and departing at different times. Chen & Farias (2015) proposed a robust pricing policy for a setting where customers arrive over time and the distribution of customers' waiting costs and valuations is unknown. See Board & Skrzypacz (2016) and Garrett (2011) for other papers that study pricing with heterogeneous arrivals. In these papers, the firm can gain from DP since it can differentiate customers based on their arrival times. In contrast, in our work, all of the customers are in the market when the sale starts and they strategically optimize their time of purchase. That is, we attempt to isolate and capture the impact of heterogeneity of valuation decay rates on the optimal DP policy, absent any other considerations.

Our work also relates to the growing body of research on dynamic mechanism design; see Bergemann & Said (2011a) for a survey. There, the firm offers a direct mechanism that allocates the items over time as a function of customers' reports of their private valuations. See Akan et al. (2009), Kakade et al. (2013), Pavan et al. (2014), Battaglini & Lamba (2012), Boleslavsky & Said (2013), Golrezaei & Nazerzadeh (2016), and Lobel & Xiao (2013) for recent results on designing optimal dynamic mechanisms. In particular, Akan et al.
(2009) studied a setting where customers are heterogeneous in their valuation distribution and in how fast they learn their true value. They show that when high type customers (such as business travelers) learn their valuation more slowly than low type customers (such as leisure travelers), in the optimal mechanism, the firm sequentially screens customers by offering them a menu of expiring refund contracts.

The remainder of this paper is organized as follows. In Section 3.1, we formally define our model. Section 3.2 presents the direct mechanisms. Section 3.3 develops our optimal mechanism when production and holding costs are zero. In Section 3.4, we study the optimal mechanism with positive production and holding costs. We discuss the impact of the degree of customer heterogeneity on the optimal mechanism in Section 3.5. We conclude our paper in Section 3.6.

3.1 Model

We consider a firm that sells multiple units of an item (product) to a mass of customers over a time horizon of length $T = \infty$. We assume that the firm produces and stores all units just prior to the start of the sale period. The cost for producing each unit is $c$, and the holding cost to store each unit is $h$ per unit time. The firm's goal is to implement a selling mechanism to maximize its revenue. At time 0, it declares and commits to a price trajectory $p(t)$, $t \ge 0$. Given the pre-announced prices, customers decide whether and when to purchase the item.

Each customer is assumed to be infinitesimal and demands a single unit of the item. The valuation of a customer at time $t$ is $V(\theta, t)$, where $V : \mathbb{R}^2 \to \mathbb{R}$ and $\theta$ is the customer type. The valuation function $V$ is known to the firm and customers. However, the customer type $\theta$ is the customer's private information, and these types are independently drawn from a known distribution $F$ with probability density function (p.d.f.) $f$, where $F : [\underline{\theta}, \bar{\theta}] \to [0, 1]$ and $\underline{\theta} \ge 0$. The negative inverse hazard rate associated with distribution $F$ is denoted by $\alpha : [\underline{\theta}, \bar{\theta}] \to \mathbb{R}$, and is defined as $\alpha(x) = -\frac{1 - F(x)}{f(x)}$.
Throughout the paper, we make the following assumption, which implies that $\alpha(\cdot)$ is non-decreasing. This assumption is standard in the optimal mechanism design literature (cf. Myerson (1981)).

Assumption 3 (Monotone Hazard Rate). The type distribution $F$ has a non-decreasing hazard rate.

We assume that all the customers are present in the market at time 0 and exit after making a purchase. That is, customers can make a purchase at any time $t \ge 0$. Furthermore, they are fully strategic about whether and when they purchase the item from the firm. Specifically, each customer either does not purchase the item, or purchases a unit at the time at which her utility is maximized. Customers are risk neutral, and the utility of a customer with type $\theta$ that purchases the item at time $t$ at price $p$ is $V(\theta, t) - p$. Furthermore, all customers are present in the entire time horizon. Then, given prices $p = \{p_t : t \ge 0\}$, the customer with type $\theta$ purchases a unit of the item at time

$$t^*(\theta) := \arg\max_{\tau \ge 0} \{V(\theta, \tau) - p_\tau\}$$

if $V(\theta, t^*(\theta)) - p_{t^*(\theta)} \ge 0$, and she does not purchase otherwise. Here, $p_t$ is the price for the item at time $t$.

Here, we consider a deterministic baseline model where the firm knows the total mass of customers. The assumption of deterministic demand is justified when the number of customers is large and fairly predictable. This modeling choice allows us to study the impact of strategic customers and decay in customer valuation, but it deliberately removes the element of uncertainty from the model. That is, we seek to understand if the firm gains from DP when there is no demand uncertainty.

3.2 Direct Mechanisms and Optimality

To characterize a revenue-maximizing (optimal) selling mechanism, by the revelation principle, we focus on direct incentive-compatible and individually rational mechanisms where customers first report their type and then the mechanism determines their payment and time of allocation. More precisely, any direct mechanism $\mathcal{M}$ consists of a pair $(\mathbf{t}, \mathbf{p})$, where $\mathbf{p} : [\underline{\theta}, \bar{\theta}] \to$
$\mathbb{R}$ is a transfer scheme and $\mathbf{t} : [\underline{\theta}, \bar{\theta}] \to \mathbb{R}$ is an allocation rule. That is, $\mathbf{p}(\theta)$ and $\mathbf{t}(\theta)$ are, respectively, the price for a unit of the item and the time of purchase for a customer with type $\theta$.² When the customer does not purchase the item, $\mathbf{t}(\theta) = \infty$.

² Note that the allocation rule $\mathbf{t}$ for a customer is only a function of the type of that customer, and does not depend on the types of other customers, because in our model each customer is infinitesimal and there is no inventory constraint.

We start by defining incentive compatibility and individual rationality. Let $u(\theta, \hat{\theta})$ be the expected utility of a customer with type $\theta$ when she reports $\hat{\theta}$. That is,

$$u(\theta, \hat{\theta}) = V(\theta, \mathbf{t}(\hat{\theta})) - \mathbf{p}(\hat{\theta}).$$

Then, mechanism $\mathcal{M}$ is incentive compatible (IC) if for each customer with type $\theta \in [\underline{\theta}, \bar{\theta}]$, truthfulness is a best response, that is, $u(\theta, \hat{\theta}) \le u(\theta, \theta)$. Roughly speaking, in an incentive-compatible mechanism, no customer wants to deviate from the truthful strategy.

We can now define the individual rationality constraints for the mechanism. An incentive-compatible mechanism is individually rational (IR) if for each customer with type $\theta$, her utility under the truthful strategy is non-negative, i.e., for any $\theta \in [\underline{\theta}, \bar{\theta}]$, we have $u(\theta, \theta) \ge 0$.

The following lemma presents the necessary and sufficient conditions under which a mechanism is IC.

Lemma 3.2.1 (Necessary and Sufficient Conditions for IC). The mechanism $\mathcal{M}$ with allocation rule $\mathbf{t}(\cdot)$ is incentive compatible if and only if both conditions stated below are satisfied:

Envelope Condition: For any $\theta, \hat{\theta} \in [\underline{\theta}, \bar{\theta}]$,

$$u(\theta, \theta) - u(\hat{\theta}, \hat{\theta}) = \int_{z = \hat{\theta}}^{\theta} \partial_1 V(z, \mathbf{t}(z))\, dz, \tag{3.1}$$

where $\partial_1 V(\theta, t) = \frac{\partial V(\theta, t)}{\partial \theta}$.

Interval Condition: For any $\hat{\theta} < \theta$,

$$\int_{z = \hat{\theta}}^{\theta} \partial_1 V(z, \mathbf{t}(\hat{\theta}))\, dz \;\le\; \int_{z = \hat{\theta}}^{\theta} \partial_1 V(z, \mathbf{t}(z))\, dz \;\le\; \int_{z = \hat{\theta}}^{\theta} \partial_1 V(z, \mathbf{t}(\theta))\, dz. \tag{3.2}$$

All the proofs are presented in the Online Appendix.

Lemma 3.2.1 is analogous to the characterization of incentive compatibility in standard static settings, where an envelope condition and monotonicity of the allocation rule are used to characterize incentive compatibility (see Myerson (1981)).
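To see where the two inequalities in the interval condition come from, the necessity direction can be sketched in a few lines. This is only an illustrative derivation under the assumption that $V$ is differentiable in its first argument and that the envelope condition (3.1) already holds; it is not the proof from the Online Appendix.

```latex
% Type \theta considers reporting \hat{\theta} < \theta. IC requires
\begin{align*}
u(\theta,\theta) \;\ge\; u(\theta,\hat{\theta})
  &= V(\theta,\mathbf{t}(\hat{\theta})) - \mathbf{p}(\hat{\theta}) \\
  &= u(\hat{\theta},\hat{\theta})
     + V(\theta,\mathbf{t}(\hat{\theta})) - V(\hat{\theta},\mathbf{t}(\hat{\theta})) \\
  &= u(\hat{\theta},\hat{\theta})
     + \int_{\hat{\theta}}^{\theta} \partial_1 V(z,\mathbf{t}(\hat{\theta}))\,dz .
\end{align*}
% Substituting (3.1) for u(\theta,\theta) - u(\hat{\theta},\hat{\theta})
% gives the left inequality of (3.2):
\[
\int_{\hat{\theta}}^{\theta} \partial_1 V(z,\mathbf{t}(\hat{\theta}))\,dz
\;\le\; \int_{\hat{\theta}}^{\theta} \partial_1 V(z,\mathbf{t}(z))\,dz .
\]
% Symmetrically, type \hat{\theta} must not gain by reporting \theta:
\[
u(\hat{\theta},\hat{\theta}) \;\ge\; u(\hat{\theta},\theta)
  = u(\theta,\theta) - \int_{\hat{\theta}}^{\theta} \partial_1 V(z,\mathbf{t}(\theta))\,dz ,
\]
% which, again using (3.1), gives the right inequality of (3.2):
\[
\int_{\hat{\theta}}^{\theta} \partial_1 V(z,\mathbf{t}(z))\,dz
\;\le\; \int_{\hat{\theta}}^{\theta} \partial_1 V(z,\mathbf{t}(\theta))\,dz .
\]
```

Read this way, the left inequality rules out a higher type mimicking a lower one, and the right inequality rules out the reverse deviation.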
The envelope condition above is a standard one, but the interval conditions replace the monotonicity conditions. The interval conditions compare the utility obtained by the truthful strategy (middle term in Eq. (3.2)) with untruthful strategies.

We are now ready to characterize the firm revenue under any IC mechanism. Note that the expected revenue of a mechanism $\mathcal{M}$ from selling one unit of the item is the customer's payment minus the production and holding costs, that is, $\mathbb{E}[\mathbf{p}(\theta) - c - h\,\mathbf{t}(\theta)]$, where the expectation is with respect to the type of customer $\theta$. Then, the total revenue of the mechanism is the market size times the expected revenue from selling one unit of the item. Considering that the market size is constant, that is, demand is deterministic, the total revenue of the mechanism is maximized if we maximize the expected revenue from selling one unit of the item. An incentive-compatible and individually rational mechanism is optimal if it maximizes the expected revenue among all incentive-compatible and individually rational mechanisms.

The following lemma characterizes firm revenue in any incentive-compatible mechanism $\mathcal{M}$.

Lemma 3.2.2 (Revenue of IC Mechanisms). In any incentive-compatible mechanism, the expected firm revenue from selling one unit of the item is given by

$$\mathbb{E}\Big[V(\theta, \mathbf{t}(\theta)) + \alpha(\theta)\,\partial_1 V(\theta, \mathbf{t}(\theta)) - h\,\mathbf{t}(\theta) - c - u(\underline{\theta}, \underline{\theta})\Big], \tag{3.3}$$

where the expectation is taken with respect to the type of customer $\theta$.

Lemma 3.2.2 suggests that in order to optimize revenue, the optimal mechanism should maximize virtual revenue, that is, $\mathbb{E}\big[\big(V(\theta, \mathbf{t}(\theta)) + \alpha(\theta)\,\partial_1 V(\theta, \mathbf{t}(\theta))\big) - u(\underline{\theta}, \underline{\theta})\big]$, and pick a transfer scheme that makes it both IC and IR. Throughout the paper, we refer to $V(\theta, \mathbf{t}(\theta)) + \alpha(\theta)\,\partial_1 V(\theta, \mathbf{t}(\theta)) - h\,\mathbf{t}(\theta) - c$ as the virtual value/revenue of a customer with type $\theta$ at time $t$.

The results in this section hold for any generic customer valuation function $V(\theta, t)$. In the next section, we specialize our generic customer valuation function to model the main scenario we are interested in.
In this scenario, customers lose interest in the product over time, and customers who initially have higher interest in the product lose interest at a faster rate as well. Specifically, we focus on the case of the exponential valuation function $V(\theta, t) = \theta e^{-\lambda \theta t}$, which has these properties. We discuss general valuation functions $V$ that satisfy these properties in Section 3.5.

3.3 Optimal Mechanism with Exponential Valuation Functions

Here, we present the optimal mechanism when the valuation function is exponential, that is, $V(\theta, t) = \theta e^{-\lambda \theta t}$. Under exponential valuation functions, one cannot define a persistent ranking for customers over time; see Figure 3.1. The absence of a persistent ranking makes the problem of designing an optimal mechanism challenging. At the same time, it allows the firm to extract more revenue by revising its prices over time.

We begin by presenting an optimal mechanism for the case when both production and holding costs are zero. Then, we devote Section 3.4 to the study of the optimal mechanism with positive production and holding costs.

By Lemma 3.2.2, an optimal mechanism should maximize virtual revenue subject to IC and IR constraints. Note that when $V(\theta, t) = \theta e^{-\lambda \theta t}$ and $c = h = 0$, the expected virtual revenue is given by

$$\mathbb{E}\big[V(\theta, \mathbf{t}(\theta)) + \alpha(\theta)\,\partial_1 V(\theta, \mathbf{t}(\theta)) - u(\underline{\theta}, \underline{\theta})\big] = \mathbb{E}\Big[e^{-\lambda \theta \mathbf{t}(\theta)}\Big(\theta + \alpha(\theta)\big(1 - \lambda \theta \mathbf{t}(\theta)\big)\Big) - u(\underline{\theta}, \underline{\theta})\Big] := \mathbb{E}\big[R(\theta, \mathbf{t}(\theta)) - u(\underline{\theta}, \underline{\theta})\big], \tag{3.4}$$

where $R(\theta, t) = e^{-\lambda \theta t}\big(\theta + \alpha(\theta)(1 - \lambda \theta t)\big)$ is the virtual value of a customer of type $\theta$ at time $t$. Note that the initial virtual value, i.e., $R(\theta, 0) = \theta + \alpha(\theta)$, is equal to the virtual value in a standard static setting (cf. Myerson 1981).

To characterize the optimal mechanism, we need to solve the following optimization problem:

$$\max_{\{\mathbf{t}(\theta),\, \mathbf{p}(\theta) :\, \theta \in [\underline{\theta}, \bar{\theta}]\},\; u(\underline{\theta}, \underline{\theta}) \ge 0} \; \mathbb{E}\Big[\max\big\{R(\theta, \mathbf{t}(\theta)) - u(\underline{\theta}, \underline{\theta}),\, 0\big\}\Big] \quad \text{s.t. IC and IR constraints} \tag{OPT}$$

Solving the above optimization problem is rather involved because we are maximizing over the allocation and payment functions $\mathbf{t}(\cdot)$ and $\mathbf{p}(\cdot)$. For this reason, we characterize the optimal solution of the above problem in two steps.
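In the first step, we relax the problem by ignoring the interval conditions. Recall that by Lemma 3.2.1, satisfying the IC constraints is equivalent to satisfying the envelope and interval conditions. That is, in the first step, we only focus on satisfying the envelope conditions and IR constraints. In particular, we consider the following relaxed problem:

$$\max_{\{\mathbf{t}(\theta) :\, \theta \in [\underline{\theta}, \bar{\theta}]\},\; u(\underline{\theta}, \underline{\theta}) \ge 0} \; \mathbb{E}\Big[\max\big\{R(\theta, \mathbf{t}(\theta)) - u(\underline{\theta}, \underline{\theta}),\, 0\big\}\Big]$$
$$\text{s.t.} \quad u(\theta, \theta) = u(\underline{\theta}, \underline{\theta}) + \int_{\underline{\theta}}^{\theta} e^{-\lambda z \mathbf{t}(z)}\big(1 - \lambda z \mathbf{t}(z)\big)\, dz \;\ge\; 0 \quad \text{for } \theta \in [\underline{\theta}, \bar{\theta}] \quad \text{(IR)} \tag{RELAXED}$$

Note that the equation $u(\theta, \theta) = u(\underline{\theta}, \underline{\theta}) + \int_{\underline{\theta}}^{\theta} e^{-\lambda z \mathbf{t}(z)}(1 - \lambda z \mathbf{t}(z))\, dz$ follows from the envelope conditions.

In the second step, we show that the solution to the relaxed problem satisfies the interval conditions. This implies that the optimal solution of Problem RELAXED is also an optimal solution of Problem OPT. More details about the proof are presented in Section 3.4.

The following lemma characterizes the optimal solution of the relaxed problem.

Lemma 3.3.1 (Optimal Solution of Problem RELAXED). Given that Assumption 3 holds, in an optimal solution of Problem RELAXED, $u(\underline{\theta}, \underline{\theta}) = 0$ and the allocation rule, denoted by $\mathbf{t}_R$, is given by

$$\mathbf{t}_R(\theta) = \begin{cases} 0 & \text{if } \theta \ge \theta_H \quad \text{(High-Type)}; \\[4pt] \dfrac{\theta + 2\alpha(\theta)}{\lambda\, \theta\, \alpha(\theta)} & \text{if } \theta \in [\theta_L, \theta_H] \quad \text{(Medium-Type)}; \\[8pt] \dfrac{1}{\lambda \theta} & \text{if } \theta \in [\underline{\theta}, \theta_L] \quad \text{(Low-Type)}, \end{cases} \tag{3.5}$$

where $\theta_H$ solves $\theta_H + 2\alpha(\theta_H) = 0$ and $\theta_L$ solves $\theta_L + \alpha(\theta_L) = 0$.

The following theorem shows that the optimal solution of the relaxed problem, given in Lemma 3.3.1, fulfills the interval conditions. This implies that a mechanism with time of allocation $\mathbf{t}_R(\cdot)$ is revenue-optimal.

Theorem 3.3.2 (Optimal Mechanism). If Assumption 3 holds, the customer valuation function is given by $V(\theta, t) = \theta e^{-\lambda \theta t}$, and the production and holding costs are zero, then in the optimal mechanism a customer of type $\theta \in [\underline{\theta}, \bar{\theta}]$ purchases one unit of the item at time $\mathbf{t}_R(\theta)$, given in Eq. (3.5), and at price $\mathbf{p}(\theta) = V(\theta, \mathbf{t}_R(\theta)) - \int_{\underline{\theta}}^{\theta} e^{-\lambda \mathbf{t}_R(z) z}\big(1 - \lambda \mathbf{t}_R(z) z\big)\, dz$.

Note that $\lambda \mathbf{t}_R(\theta)$ does not depend on $\lambda$. Thus, if $\lambda$ gets doubled, the purchase time of all customers decreases by a factor of 2.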
Then, considering the fact that $u(\theta, \theta)$, $\mathbf{p}(\theta)$, and the revenue of the firm depend only on $\lambda \mathbf{t}_R(\theta)$, we can conclude that $u(\theta, \theta)$ and $\mathbf{p}(\theta)$ stay the same when $\lambda$ changes.

In the following, we discuss the main insights of Theorem 3.3.2. First, we note that the firm sells the item to all customers. In addition, the purchase time of customers $\mathbf{t}_R(\theta)$ is decreasing in customer type, that is, customers with lower initial valuation purchase the item later than customers with higher initial valuation.

Theorem 3.3.2 shows that the optimal mechanism divides the customers into three groups: high-type, medium-type, and low-type. The high-type customers, who have high initial valuation ($\theta \ge \theta_H$), purchase the item immediately. The low-type customers, who have low initial valuation ($\theta \le \theta_L$), delay their time of purchase. We note that the purchase time of high- and low-type customers does not depend directly on the distribution of the customer type, $F$. The time of purchase of these customers depends on $F$ only through the thresholds $\theta_H$ and $\theta_L$.

Observe that low-type customers with type $\theta$ purchase the item at time $\frac{1}{\lambda \theta}$ and, more importantly, get zero utility. To understand why, note that these customers pay

$$\mathbf{p}(\theta) = V(\theta, \mathbf{t}_R(\theta)) - \int_{\underline{\theta}}^{\theta} e^{-\lambda \mathbf{t}_R(z) z}\big(1 - \lambda \mathbf{t}_R(z) z\big)\, dz = V(\theta, \mathbf{t}_R(\theta)),$$

where the second equality holds because $\mathbf{t}_R(z) = \frac{1}{\lambda z}$ for any $z \le \theta_L$. Note that this is in contrast with traditional static mechanism design. In static mechanism design, customers whose "type" is high enough get the product and enjoy a positive surplus, whereas low type customers do not get the product at all. In fact, there is typically one customer type on the boundary that gets zero utility after purchasing the product. However, in our setting, we show that there exists a group of customers who purchase the item and obtain zero utility.

Note that low-type customers have negative virtual values at time zero, that is, $R(\theta, 0) = \theta + \alpha(\theta) \le 0$ for any $\theta \le \theta_L$.
However, the virtual value of these customers eventually becomes positive over time, as their valuations do not decay very quickly. Therefore, the firm gains from selling the item to them later in time.

The medium-type customers, who have medium initial valuation $\theta \in [\theta_L, \theta_H]$, do not purchase the item immediately. However, unlike the low-type customers, these customers enjoy a positive utility.

The optimal mechanism presented in Theorem 3.3.2 highlights the fact that the firm benefits from heterogeneity in the valuation decay rate by adopting DP and delaying allocation. In fact, the extra revenue that the firm makes comes partly from the low-type customers, from whom the firm extracts their entire surplus.

Comparison to a Model with Homogeneous Valuation Decay Rate: When the customer valuation decay rate is homogeneous, that is, $V(\theta,t) = \theta e^{-\alpha t}$, the optimal mechanism posts a fixed price $\theta_L$. Then, customers with a type greater than $\theta_L$ purchase the item at time zero. Note that here the low-type customers do not purchase the item at all. The reason is that, with homogeneous decay rates, these customers have negative virtual value throughout the time horizon. Thus, the firm is not willing to sell the item to these customers. In addition, we observe that medium-type customers, who delay their time of purchase under heterogeneous decay rates, get the item immediately with homogeneous valuation decay rates.

To shed more light on Theorem 3.3.2, we study the following example.

Example 3.3.3 (Optimal Mechanism). Assume that $\theta$ follows the uniform distribution on $[0, 1]$; that is, $\theta \sim U(0, 1)$. Then, the time of purchase in the optimal mechanism is given by

$$t_R(\theta) = \begin{cases} 0 & \text{if } \theta \ge \tfrac{2}{3}; \\[4pt] \dfrac{3\theta - 2}{\alpha\theta(\theta - 1)} & \text{if } \theta \in [\tfrac{1}{2}, \tfrac{2}{3}]; \\[8pt] \dfrac{1}{\alpha\theta} & \text{if } \theta \in [0, \tfrac{1}{2}]. \end{cases} \tag{3.6}$$

The allocation rule, $t_R(\theta)$, the payment rule, $p(\theta)$, and the utility of customers, $u(\theta,\theta)$, are depicted in Figures 3.2, 3.3, and 3.4 for $\alpha = 0.1$. These figures compare the DP policy of Theorem 3.3.2 to the FP policy.
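The purchase-time rule of Eq. (3.6) can be sketched in a few lines of Python. This is only an illustration of the uniform example with $\alpha = 0.1$; the function name `purchase_time` and the constant `ALPHA` are hypothetical and do not appear in the thesis.

```python
ALPHA = 0.1  # decay-rate scale used in the example's figures


def purchase_time(theta: float, alpha: float = ALPHA) -> float:
    """Purchase time t_R(theta) of Eq. (3.6) for theta ~ U(0, 1)."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    if theta >= 2.0 / 3.0:
        # High-type: immediate purchase.
        return 0.0
    if theta >= 0.5:
        # Medium-type: interior solution of the first-order condition.
        return (3.0 * theta - 2.0) / (alpha * theta * (theta - 1.0))
    # Low-type: purchase at 1 / (alpha * theta).
    return 1.0 / (alpha * theta)


if __name__ == "__main__":
    for theta in (0.1, 0.25, 0.5, 0.6, 0.8):
        print(f"theta = {theta:.2f} -> t_R = {purchase_time(theta):.2f}")
```

The medium-type and low-type branches agree at $\theta = 1/2$, and the medium-type branch tends to zero as $\theta$ approaches $2/3$, so the rule is continuous and decreasing in the type.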
Note that the revenue-maximizing FP policy posts a price of $\theta_L = \frac{1}{2}$ at time zero. We observe that under FP, there is a one-time sale where only customers with a type greater than $\frac{1}{2}$ purchase the item immediately. Under DP, in contrast, customers with type $\theta \ge \theta_H = \frac{2}{3}$ purchase the item at time zero, and the rest of the customers delay their purchase. Note that the purchase time under DP, $t_R(\theta)$, is decreasing in the type of customer.

In Figure 3.3, we observe that the payment of customers under DP increases as their type grows. High-type customers pay more when the firm uses DP rather than FP. However, the medium-type customers have a lower payment under DP. This group of customers delay their purchase, and their valuation at the purchase time is not as high as their initial valuation. Therefore, the firm needs to reduce their payments.

Furthermore, the DP policy enables the firm to extract revenue from low-type customers. Selling the item to this group alone increases the firm's revenue by 18%. Note that under dynamic pricing the firm earns an expected revenue of 0.31 from each customer, whereas it only earns an expected revenue of 0.25 from each customer by using FP. This implies that by adopting DP, the firm increases its revenue by more than 23%.

Figure 3.4 shows that under both dynamic and fixed pricing policies the utility of customers, $u(\theta,\theta)$, is an increasing function of $\theta$. However, customers earn higher utility under the FP policy. Nevertheless, as we show in Example 3.4.3, the social welfare of the customers and the firm under DP is higher than that under FP.

Example 3.3.3 shows that the DP policy earns significantly more revenue than the FP policy. Motivated by that, we present a lower bound on the revenue gain of DP over FP in Appendix C.1.2. We derive the lower bound on the revenue gain by characterizing the extra revenue the firm extracts from the low-type customers.
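The revenue figures quoted for Example 3.3.3 can be checked numerically. The sketch below rests on two assumptions that are not spelled out in this section: that the virtual value takes the form $R(\theta,t) = e^{-\alpha\theta t}\,(\theta + \rho(\theta)(1 - \alpha\theta t))$ with $\rho(\theta) = -(1 - F(\theta))/f(\theta)$ (a form consistent with the thresholds $1/2$ and $2/3$ of the example), and that the expected revenue equals $\mathbb{E}[R(\theta, t(\theta))]$ when the lowest type gets zero utility. All function names are hypothetical.

```python
import math

ALPHA = 0.1  # value used in the example's figures


def rho(theta: float) -> float:
    """Assumed form rho = -(1 - F)/f, which is -(1 - theta) for U(0, 1)."""
    return -(1.0 - theta)


def purchase_time(theta: float, alpha: float = ALPHA) -> float:
    """Purchase time t_R(theta) of Eq. (3.6)."""
    if theta >= 2.0 / 3.0:
        return 0.0
    if theta >= 0.5:
        return (3.0 * theta - 2.0) / (alpha * theta * (theta - 1.0))
    return 1.0 / (alpha * theta)


def virtual_value(theta: float, t: float, alpha: float = ALPHA) -> float:
    """Assumed virtual value R(theta, t) for V = theta * exp(-alpha*theta*t)."""
    x = alpha * theta * t
    return math.exp(-x) * (theta + rho(theta) * (1.0 - x))


def integrate(fn, lo: float, hi: float, n: int = 20_000) -> float:
    """Midpoint rule; the U(0, 1) density is 1, so this is an expectation."""
    step = (hi - lo) / n
    return step * sum(fn(lo + (i + 0.5) * step) for i in range(n))


def dp_value(theta: float) -> float:
    """Virtual value at the DP purchase time."""
    return virtual_value(theta, purchase_time(theta))


dp_revenue = integrate(dp_value, 0.0, 1.0)
fp_revenue = integrate(lambda th: virtual_value(th, 0.0), 0.5, 1.0)
low_type_revenue = integrate(dp_value, 0.0, 0.5)

if __name__ == "__main__":
    print(f"DP revenue per customer: {dp_revenue:.3f}")
    print(f"FP revenue per customer: {fp_revenue:.3f}")
    print(f"Low-type share of FP revenue: {low_type_revenue / fp_revenue:.1%}")
    print(f"Relative gain of DP over FP: {dp_revenue / fp_revenue - 1.0:.1%}")
```

Under these assumptions, the sketch should land close to the 0.31 and 0.25 reported above, with the low-type contribution accounting for roughly 18% of the FP revenue.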
The bound suggests that for the setting in Example 3.3.3, the DP policy earns at least 18% more revenue than FP. Another interpretation of this result is that, if the firm ignores the heterogeneity in decay rates and follows the optimal mechanism under the homogeneous model, it suffers a revenue loss of at least 18%.

Figure 3.2: Time of purchase versus type in the fixed and dynamic pricing policies with the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and $h = c = 0$.

3.4 Optimal Mechanism with Production and Holding Costs

In this section, we present the optimal mechanism when the production and holding costs are positive. These costs have a significant impact on the optimal selling mechanism. In particular, we show in Section 3.4.1 that when the firm only faces production costs, it ends the sale sooner than when the production cost is zero. Specifically, the production cost introduces a cut-off such that customers whose type is greater than the cut-off purchase at time $t_R(\theta)$, and other customers do not purchase the item at all. That is, a positive production cost does not change the time of allocation of customers who purchase the item.

Similarly, a positive holding cost motivates the firm to end the sale sooner; see Section 3.4.2. However, with a positive holding cost, the firm also incentivizes customers to purchase the item earlier, as carrying the items in inventory is costly. This is in contrast with the optimal mechanism with a positive production cost. There, the purchase time for all the customers who make a purchase remains the same, whereas with a positive holding cost, customers are incentivized to purchase the item sooner.
When the firm faces both production and holding costs, the optimal mechanism is obtained by introducing a cut-off to the optimal mechanism with the same holding cost and zero production cost, called the baseline mechanism. Namely, customers whose initial valuation is above the cut-off purchase the item at the same time as in the baseline mechanism, and customers whose initial valuation is lower than the cut-off do not purchase the item at all. In the optimal mechanism, all customers who do not purchase the item in the baseline mechanism still do not purchase it. In addition, customers with initial valuation lower than the cut-off who purchase the item in the baseline mechanism no longer purchase it, as the firm increases the prices.

The structure of the optimal mechanism is intuitive. One can think of the holding and production costs as the dynamic and fixed costs of carrying inventory, respectively. The dynamic cost determines how the firm allocates the items to the customers over time, and the production cost determines whether the firm is willing to sell the items to the customers.

Figure 3.3: Payment versus type in the fixed and dynamic pricing policies with the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and $h = c = 0$.

To capture the impact of each cost separately, in the next section we focus on the case of a positive production cost. Then, we study the impact of the holding cost in Section 3.4.2. Finally, in Section 3.4.3, we present the optimal mechanism with positive production and holding costs.

3.4.1 Positive Production Cost

Here, we present an optimal mechanism when the production cost is $c \ge 0$ and the holding cost $h$ is zero. Defining $\theta_c$ as the smallest value that solves $R(\theta_c, t_R(\theta_c)) = c$, we present the main result of this section. Here, $R(\theta,t)$ and $t_R(\theta)$ are defined in Equations (B.14) and (3.6), respectively.
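As an illustration of the definition of $\theta_c$, the following sketch computes the cut-off by bisection for the uniform setting of Example 3.3.3. It assumes the virtual value has the form $R(\theta,t) = e^{-\alpha\theta t}\,(\theta + \rho(\theta)(1 - \alpha\theta t))$ with $\rho(\theta) = -(1-\theta)$ for $U(0,1)$, and that $R(\theta, t_R(\theta))$ is increasing in $\theta$, so that bisection is valid. The helper names (`dp_cutoff`, `fp_cutoff`) are hypothetical, and the fixed-price cut-off solves $R(\theta, 0) = 2\theta - 1 = c$.

```python
import math

ALPHA = 0.1  # value used in the example's figures


def purchase_time(theta: float, alpha: float = ALPHA) -> float:
    """Purchase time t_R(theta) of Eq. (3.6) for theta ~ U(0, 1)."""
    if theta >= 2.0 / 3.0:
        return 0.0
    if theta >= 0.5:
        return (3.0 * theta - 2.0) / (alpha * theta * (theta - 1.0))
    return 1.0 / (alpha * theta)


def virtual_value(theta: float, t: float, alpha: float = ALPHA) -> float:
    """Assumed R(theta, t) with rho(theta) = -(1 - theta) for U(0, 1)."""
    x = alpha * theta * t
    return math.exp(-x) * (theta - (1.0 - theta) * (1.0 - x))


def dp_cutoff(c: float, tol: float = 1e-10) -> float:
    """Smallest theta with R(theta, t_R(theta)) >= c, found by bisection."""
    lo, hi = 1e-12, 1.0
    if virtual_value(lo, purchase_time(lo)) >= c:
        return 0.0  # every type is served (for example, when c = 0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if virtual_value(mid, purchase_time(mid)) >= c:
            hi = mid
        else:
            lo = mid
    return hi


def fp_cutoff(c: float) -> float:
    """Fixed-price cut-off solving R(theta, 0) = 2 * theta - 1 = c."""
    return 0.5 * (1.0 + c)


if __name__ == "__main__":
    for c in (0.0, 0.1, 0.25, 0.4):
        print(f"c = {c:.2f}: DP cut-off {dp_cutoff(c):.3f}, "
              f"FP cut-off {fp_cutoff(c):.3f}")
```

In this sketch, the virtual value at the type boundaries is $R(1/2, t_R(1/2)) = 1/(2e)$ and $R(2/3, 0) = 1/3$; these are the production-cost levels at which the cut-off crosses from one customer group into the next.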
Theorem 3.4.1 (Positive Production Cost). Given that Assumption 3 holds, the customer valuation function is given by $V(\theta,t) = \theta e^{-\alpha\theta t}$, the production cost $c \in [0, \bar{\theta}]$, and the holding cost $h = 0$, the optimal mechanism only sells to customers with type $\theta \ge \theta_c$ at time $t_R(\theta)$, given in Eq. (3.6), and at price $p(\theta) = V(\theta, t_R(\theta)) - \int_{\theta_c}^{\theta} e^{-\alpha t_R(z) z}\,(1 - \alpha t_R(z) z)\,dz$. Furthermore, $p(\theta) = \infty$ for $\theta < \theta_c$.

In Theorem 3.4.1, we assume that the production cost $c$ is less than $\bar{\theta}$, as the firm has no incentive to produce and sell the items when the production cost is greater than the maximum valuation of customers, that is, $\bar{\theta}$. The main idea of the proof is to show that the virtual value of a customer with type $\theta$ at time $t_R(\theta)$, that is, $R(\theta, t_R(\theta))$, is increasing in $\theta$. Then, provided that $R(\theta_c, t_R(\theta_c)) - c = 0$, we have $R(\theta, t_R(\theta)) - c < 0$ for any $\theta < \theta_c$. This implies that the seller would rather not sell the item to customers with type $\theta < \theta_c$.

Theorem 3.4.1 suggests that the production cost does not change the allocation time of customers with type $\theta \ge \theta_c$; rather, it only changes the payment rule such that the lower-type customers are not willing to purchase the item. In other words, the payment rule is designed to enforce a cut-off, $\theta_c$, in the allocation rule. For more insight into Theorem 3.4.1, we revisit Example 3.3.3.

Figure 3.4: Utility versus type in the fixed and dynamic pricing policies with the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and $h = c = 0$.

Example 3.4.2 (Revisiting Example 3.3.3: Positive Production Costs). Consider the same setting as in Example 3.3.3. Figure 3.5 illustrates the cut-off $\theta_c$ as a function of the production cost. Observe that when the production cost is greater than 0.33, the firm only sells the product to the high-type customers with $\theta \ge \theta_c$. This implies that posting a fixed price is optimal when the production cost is high enough.
However, when the production cost is between 0.18 and 0.33, the firm sells the item to all the high-type customers and some medium-type customers, that is, those with type $\theta \in [\theta_c, \frac{2}{3}]$. Thus, with medium production costs, DP increases firm revenue. Finally, when the production cost is less than 0.18, the firm sells the item to all high-type and medium-type customers and to some of the low-type customers, namely those with $\theta \in [\theta_c, \frac{1}{2}]$. As a result, with a low production cost, the firm has an opportunity to extract the entire surplus of some of the low-type customers.

Figure 3.5 compares the threshold $\theta_c$ with that in the FP policy. Note that in the FP policy, the threshold $\theta_{c,f}$ solves $R(\theta_{c,f}, t = 0) = c$. We observe that the threshold under the DP policy is smaller than that under the FP policy, which suggests that the DP policy sells to more customers than the FP policy.

In Figure 3.6, we further compare the DP and FP policies in terms of their revenue and social welfare. Note that with allocation time $t_R(\theta)$ and cut-off $\theta_c$, the social welfare is

$$\mathbb{E}\Big[\big(\theta e^{-\alpha\theta t_R(\theta)} - c\big)\,\mathbb{1}\{\theta \ge \theta_c\}\Big],$$

where the first term, $\theta e^{-\alpha\theta t_R(\theta)}$, is the valuation of a customer with type $\theta$ who purchases the item at time $t_R(\theta)$, and the second term is the production cost. Here, $\mathbb{1}\{A\}$ is an indicator function; that is, $\mathbb{1}\{A\} = 1$ when event $A$ happens and is zero otherwise.

Figure 3.6 shows that the welfare gain of DP relative to FP is not monotone in the production cost. When the production cost is small enough, social welfare under the DP policy is greater than that under the FP policy. Under the DP policy, the firm sells to more customers than under the FP policy. However, under the DP policy, customers also purchase the item later. While the former has a positive impact on the social welfare of the DP policy, the latter impacts its social welfare negatively. We observe that when the production cost is small enough, the former factor dominates the latter, and as a result, DP outperforms FP in terms of social welfare.
But when the production cost is large, the latter factor dominates, and thus the FP policy yields more social welfare than the DP policy.

Figure 3.6 also shows that the revenue advantage of DP over FP grows as the production cost gets smaller, because with a small production cost, the firm can afford to sell at a lower price and delay the allocation time of customers in order to extract more revenue from them.

3.4.2 Positive Holding Cost

In this section, we characterize an optimal mechanism for the case in which the holding cost is positive. We will show that with a positive holding cost, the firm motivates customers to purchase earlier. Furthermore, with a positive holding cost, the firm induces more customers to purchase the item immediately. This is in contrast with a positive production cost. Recall that with a positive production cost, the purchase time for all customers who make a purchase remains the same. Before presenting the optimal mechanism with a positive holding cost, to build intuition on the impact of the holding cost, we revisit Example 3.3.3 with $h > 0$.

Example 3.4.3 (Revisiting Example 3.3.3: Positive Holding Costs). Here, we present the optimal mechanism for the setting in Example 3.3.3 when the holding cost $h > 0$ and the production cost $c = 0$. Figure 3.7 shows how the optimal mechanism divides customers into different regions. A precise definition of the boundaries of these regions will be given later in Equations (3.7) and (3.8). Observe that when the holding cost is small ($h \le H_l := 0.004$), there are four regions: high-type, medium-type, low-type, and no-allocation. We later define $H_l$ in Eq. (3.7) for any type distribution $F$.

Figure 3.5: The cut-off $\theta_c$ versus the production cost $c$ for the mechanism described in Theorem 3.4.1 and the FP policy. Here, the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and the holding cost $h = 0$.
While customers in the high-type region get the item immediately, customers in the low-type and medium-type regions delay their purchase time. Moreover, customers in the no-allocation region do not purchase the item at all. These customers and customers in the low-type region get zero utility. We note that as the holding cost increases, the low-type region shrinks, whereas the other regions grow. This pattern continues until the holding cost hits $H_l$. At $h = H_l$, the low-type region vanishes and only three regions remain: high-type, medium-type, and no-allocation. As the holding cost increases from $H_l$ to $H_h := 0.025$, the high-type region gets larger while the medium-type region gets smaller; see the definition of $H_h$ in Eq. (3.7) for any type distribution $F$. In fact, at $h = H_h$, the medium-type region disappears. Finally, for $h \ge H_h$, there are only two regions: high-type and no-allocation. That is, when the holding cost is high enough, the firm posts a fixed price, which only incentivizes the high-type customers to purchase the item at time zero.

Figure 3.8 shows the social welfare and revenue gain of DP as a percentage (relative to the FP policy) versus the holding cost. We note that the FP policy does not change as the holding cost varies. For any value of the holding cost, the FP policy posts a price of $\theta_L = \frac{1}{2}$. Thus, the social welfare of the FP policy is $\mathbb{E}[\theta\,\mathbb{1}\{\theta \ge \theta_L\}]$. The social welfare of the DP policy is the expected value of customers at the time of purchase $t_h(\theta)$ minus the holding cost, that is, $\mathbb{E}\big[(\theta e^{-\alpha\theta t_h(\theta)} - h\,t_h(\theta))\,\mathbb{1}\{\theta \ge \theta^h\}\big]$, where $\theta^h$ is the lowest type that purchases the item. We formally define $t_h(\theta)$ and the cut-off $\theta^h$ in Equations (3.8) and (3.7), respectively.

Interestingly, social welfare is not monotone in the holding cost. At first glance, we would expect social welfare to decrease as the holding cost gets larger, but this is only the case when the holding cost is not too large.
For larger holding cost values, social welfare increases in $h$. To understand why, note that by increasing the holding cost, the firm incentivizes customers to purchase earlier, as holding the items is costly; see Figure 3.9. This, in turn, enhances social welfare, as customers have a higher valuation at their time of purchase. Furthermore, we observe that when the holding cost is not too large, the social welfare of the optimal DP mechanism is greater than that of FP. Thus, for small holding cost values, DP increases not only firm revenue but also social welfare.

Figure 3.6: The social welfare and revenue gain of DP (relative to FP) as the production cost varies. Here, the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and the holding cost $h = 0$.

Figure 3.8 shows that DP outperforms FP by a higher percentage when the holding cost is small, because a smaller holding cost allows the firm to lower prices and further delay the time of allocation to customers in order to extract more surplus from them.

Figures 3.9, 3.10, and 3.11, respectively, show the time of purchase of customers, their payment, and their utility as a function of the customer type for different holding cost values, in particular, $h = 0$ (no holding cost), $H_l/5$ (low holding cost), and $(H_l + H_h)/2$ (medium holding cost). We further show the curves for $H_l$ and $H_h$ because at these holding costs, the structure of the optimal mechanism changes. Recall that $H_l$ is the lowest holding cost under which the firm only sells to high- and medium-type customers, and $H_h$ is the lowest holding cost under which the firm only sells to high-type customers.

We begin by comparing the curves for $h = 0$ and $h = H_l/5$. We observe that by increasing $h$ from 0 to $H_l/5$, the purchase time of high- and low-type customers who purchase the item does not change. However, medium-type customers purchase the item earlier.
Because of this, the utility of medium- and high-type customers increases slightly. We observe a similar phenomenon when the holding cost increases from $H_l/5$ to $H_l$. This implies that increasing the holding cost can benefit higher-type customers.

Figure 3.7: The structure of the optimal mechanism (high-type, medium-type, and low-type regions, with boundaries $\theta_H^h$, $\theta_L^h$, and $\theta^h$) as a function of the holding cost $h$. Here, the customer type $\theta \sim U(0, 1)$, $\alpha = 0.1$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, and the production cost $c = 0$.

Next, we compare the cases of $h = H_l$ and $h = (H_l + H_h)/2$. As expected, customers purchase earlier when $h = (H_l + H_h)/2$. Moreover, relative to $h = H_l$, when $h = (H_l + H_h)/2$, higher-type customers pay a lower price, while lower-type customers end up paying more. As a result, by increasing $h$ from $H_l$ to $(H_l + H_h)/2$, the utility of higher-type customers increases, while the utility of lower-type customers decreases.

Example 3.4.3 illustrates how the holding cost influences the structure of the optimal mechanism. Next, we formalize these observations by presenting the optimal mechanism with a positive holding cost. We will show that the optimal mechanism only sells to customers with initial valuation $\theta \ge \theta^h$, where the cut-off $\theta^h$ depends on the holding cost, $h$, and is given by

$$\theta^h := \begin{cases} \max\{\underline{\theta}_L, \underline{\theta}\} & \text{if } h < H_l \quad \text{(Low Holding Cost);} \\ \theta_M & \text{if } h \in [H_l, H_h] \quad \text{(Medium Holding Cost);} \\ \theta_H^h & \text{if } h > H_h \quad \text{(High Holding Cost).} \end{cases} \tag{3.7}$$

Here, $H_l = \alpha(\tilde{\theta})^2 e^{-1}$ and $H_h = \alpha(\theta_L)^2$, where $\tilde{\theta}$ solves $2\tilde{\theta} + \rho(\tilde{\theta}) = 0$ and $\theta_L$ is defined in Eq. (3.6). Considering the fact that $\tilde{\theta} \le \theta_L$, it is easy to observe that $H_l < H_h$. We say the holding cost is low when $h < H_l$ and medium when $h \in [H_l, H_h]$. In addition, when $h > H_h$, we say the holding cost is high.

For low holding costs ($h \le H_l$), the cut-off $\theta^h = \underline{\theta}_L$, where $\underline{\theta}_L$ solves $\alpha e^{-1}(\underline{\theta}_L)^2 = h$.
Observe that at $h = H_l$, the cut-off $\underline{\theta}_L = \tilde{\theta}$, and at $h = 0$, we have $\underline{\theta}_L = 0$.

Figure 3.8: The social welfare and the revenue gain of the optimal mechanism (relative to the FP policy) as a percentage versus the holding cost, across the low, medium, and high holding cost regimes. Here, the customer type $\theta \sim U(0, 1)$, $\alpha = 0.1$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, and the production cost $c = 0$.

When the holding cost is medium, only customers with initial valuation $\theta \ge \theta_M$ purchase the item, where $\theta_M$ solves $R(\theta_M, t_f(\theta_M)) - h\,t_f(\theta_M) = 0$ and $R(\theta,t)$ is defined in Eq. (B.14). Here, $t_f(\theta)$, called the FOC solution, is a solution that satisfies the first-order condition (FOC). Precisely, $t_f(\theta)$ solves $\frac{\partial (R(\theta,t) - ht)}{\partial t}\Big|_{t = t_f(\theta)} = 0$. Thus, one can show that when $h = H_l$, $\tilde{\theta}$ solves $R(\tilde{\theta}, t_f(\tilde{\theta})) - h\,t_f(\tilde{\theta}) = 0$.

We show that in the optimal mechanism, the time of sale to a customer with type $\theta \ge \theta^h$ is

$$t_h(\theta) := \begin{cases} 0 & \text{if } \theta \ge \theta_H^h \quad \text{(High-Type);} \\ t_f(\theta) & \text{if } \theta \in [\theta_L^h, \theta_H^h] \quad \text{(Medium-Type);} \\ \dfrac{1}{\alpha\theta} & \text{if } \theta \in [\underline{\theta}, \theta_L^h] \quad \text{(Low-Type).} \end{cases} \tag{3.8}$$

Here, for $h > H_h$, we have $\theta_H^h = \theta_L$, and for $h \le H_h$, $\theta_H^h \in [\theta_L, \theta_H]$ solves

$$\frac{\partial (R(\theta_H^h, t) - ht)}{\partial t}\bigg|_{t=0} = -\alpha\theta_H^h\big(\theta_H^h + 2\rho(\theta_H^h)\big) - h = 0.$$

That is, for $h \le H_h$, the FOC solution at $\theta_H^h$, that is, $t_f(\theta_H^h)$, is zero. We note that for any $\theta > \theta_H^h$, the FOC solution is negative, and $R(\theta,t) - ht$ is maximized at $t = 0$. We also observe that at $h = 0$ and $h = H_h$, $\theta_H^h$ is respectively $\theta_H$ and $\theta_L$. Furthermore, $\theta_H^h$ is decreasing in $h$, indicating that as the holding cost increases, more customers purchase the item at time zero.

Figure 3.9: Time of purchase versus type for $h = 0$, $H_l/5$, $H_l$, $(H_l + H_h)/2$, and $H_h$. Here, $H_l = 0.004$, $H_h = 0.025$, the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and the production cost $c = 0$.

We now define $\theta_L^h$ in Eq. (3.8). For $h \le H_l$, $\theta_L^h \in [\tilde{\theta}, \theta_L]$ solves

$$\frac{\partial (R(\theta_L^h, t) - ht)}{\partial t}\bigg|_{t = \frac{1}{\alpha\theta_L^h}} = -\alpha\theta_L^h e^{-1}\big(\theta_L^h + \rho(\theta_L^h)\big) - h = 0,$$

and for $h > H_l$, we have $\theta_L^h = \tilde{\theta}$. That is, for $h \le H_l$, we have $t_f(\theta_L^h) = \frac{1}{\alpha\theta_L^h}$. We note that at $h = 0$ and $h = H_l$, $\theta_L^h$ is respectively $\theta_L$ and $\tilde{\theta}$. Furthermore, $\theta_L^h$ is decreasing in $h$. This suggests that as the holding cost increases, the highest type that makes a purchase and gets zero utility decreases. That is, the low-type group gets smaller. Figure 3.7 in Example 3.4.3 shows how $\theta_H^h$ and $\theta_L^h$ vary as the holding cost $h$ increases.

We can now describe the optimal mechanism by consolidating the time of purchase $t_h(\theta)$ and the cut-off $\theta^h$. In the optimal mechanism, when the holding cost is low, that is, $h \le H_l$, the firm sells to high- and medium-type customers and to some low-type customers, namely those with $\theta \in [\max\{\underline{\theta}, \underline{\theta}_L\}, \theta_L^h]$. Thus, with low holding costs, the firm can extract the full surplus of some of the low-type customers. However, under medium and high holding costs, the firm has no such opportunity, as it must end the sale early due to the high cost of carrying the items. In particular, when the holding cost is medium, $h \in [H_l, H_h]$, the firm only sells to high-type and some of the medium-type customers, that is, those with $\theta \in [\theta_M, \theta_H^h]$. Finally, when the holding cost is high, the firm does not benefit from the heterogeneity of valuation decay rates, and it simply posts a fixed price of $\theta_L$. Then, only customers with a type greater than $\theta^h = \theta_L$ purchase the item at time zero. The following theorem formally characterizes the optimal mechanism.

Figure 3.10: Payment versus type for $h = 0$, $H_l/5$, $H_l$, $(H_l + H_h)/2$, and $H_h$. Here, $H_l = 0.004$, $H_h = 0.025$, the customer type $\theta \sim U(0, 1)$, $V(\theta,t) = \theta e^{-\alpha\theta t}$, $\alpha = 0.1$, and the production cost $c = 0$.

Theorem 3.4.4 (Positive Holding Cost).
If Assumption 3 holds and $\tilde{\theta}$ is the unique solution of $R(\theta, t_f(\theta)) - H_l\,t_f(\theta) = 0$, then the optimal mechanism sells to customers of type $\theta \ge \theta^h$ at time $t_h(\theta)$ and at price $p(\theta) = V(\theta, t_h(\theta)) - \int_{\theta^h}^{\theta} e^{-\alpha t_h(z) z}\,(1 - \alpha t_h(z) z)\,dz$, where $t_h(\theta)$ and $\theta^h$ are defined in Equations (3.8) and (3.7), respectively. Furthermore, for $\theta < \theta^h$, $p(\theta) = \infty$.

In Theorem 3.4.4, we assume that at $h = H_l$, the solution of $R(\theta, t_f(\theta)) - h\,t_f(\theta) = 0$ is unique. In Lemma C.1.13 in Appendix C.1.8, we show that if $R(\theta, t_f(\theta)) - H_l\,t_f(\theta) = 0$ has a unique solution, then the solution of $R(\theta, t_f(\theta)) - h\,t_f(\theta) = 0$ is also unique for any $h \in [H_l, H_h]$. We use this assumption to characterize the optimal mechanism when the holding cost is medium or large ($h \ge H_l$). This assumption ensures that $R(\theta, t_f(\theta)) - h\,t_f(\theta) \le 0$ for any $\theta < \theta^h = \theta_M$. We note that this holds when the virtual value of customers, that is, $R(\theta, t_f(\theta)) - h\,t_f(\theta)$, is increasing in the customer type. In this sense, this assumption resembles the standard assumption in the mechanism design literature that the virtual value of customers is monotone in their types. In Appendix C.1.8, we provide sufficient conditions to satisfy this assumption. We show that if, for any $\theta \ge \tilde{\theta}$, $\rho'(\theta)$ is small enough, then this assumption holds. This condition is satisfied for the Uniform, Exponential, and truncated Normal distributions.

Theorem 3.4.4 shows that a positive holding cost, similar to a positive production cost, introduces a cut-off $\theta^h$. That is, the mechanism only sells the item to customers with type greater than or equal to $\theta^h$. However, unlike the production cost, the holding cost changes both the time of sale and the price. Moreover, the thresholds that divide customers into different groups, that is, $\theta_H^h$ and $\theta_L^h$, also change with $h$.
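To make the holding-cost thresholds concrete, the sketch below evaluates them for the uniform setting of Example 3.4.3 ($\theta \sim U(0,1)$, $\alpha = 0.1$). It assumes $\rho(\theta) = -(1-\theta)$, so that $\tilde{\theta} = 1/3$ and $\theta_L = 1/2$. The closed forms for $\theta_H^h$, $\theta_L^h$, and the low-cost cut-off are obtained here by solving the defining conditions as quadratics for this specific distribution; they are not stated in the thesis, and all function names are hypothetical.

```python
import math

ALPHA = 0.1           # value used in the example's figures
THETA_TILDE = 1 / 3   # solves 2*theta + rho(theta) = 0 with rho = -(1 - theta)
THETA_L = 1 / 2       # solves theta + rho(theta) = 0

H_LOW = ALPHA * THETA_TILDE ** 2 / math.e   # H_l
H_HIGH = ALPHA * THETA_L ** 2               # H_h


def theta_high(h: float, alpha: float = ALPHA) -> float:
    """theta_H^h: root of alpha*theta*(2 - 3*theta) = h in [1/2, 2/3]."""
    if h > H_HIGH:
        return THETA_L
    return (1.0 + math.sqrt(max(0.0, 1.0 - 3.0 * h / alpha))) / 3.0


def theta_low(h: float, alpha: float = ALPHA) -> float:
    """theta_L^h: root of alpha*theta*(1 - 2*theta)/e = h in [1/3, 1/2]."""
    if h > H_LOW:
        return THETA_TILDE
    return (1.0 + math.sqrt(max(0.0, 1.0 - 8.0 * math.e * h / alpha))) / 4.0


def low_cost_cutoff(h: float, alpha: float = ALPHA) -> float:
    """Cut-off for h <= H_l: root of alpha * theta**2 / e = h."""
    return math.sqrt(math.e * h / alpha)


if __name__ == "__main__":
    print(f"H_l = {H_LOW:.4f}, H_h = {H_HIGH:.4f}")
    for h in (0.0, H_LOW / 5, H_LOW):
        print(f"h = {h:.5f}: theta_H^h = {theta_high(h):.3f}, "
              f"theta_L^h = {theta_low(h):.3f}, "
              f"cut-off = {low_cost_cutoff(h):.3f}")
```

Under these assumptions, the two constants come out near the values 0.004 and 0.025 quoted in the example, and both thresholds decrease as $h$ grows, in line with the discussion above.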
0 0.2 0.4 0.6 0.8 1 Cuetomer Type 0 0.1 0.2 0.3 0.4 0.5 Utility Dynamic Pricing for Customers with Time-Sensitive Valuations Negin Golrezaei Hamid Nazerzadeh Ramandeep Randhawa Marshall School of Business University of Southern California Los Angeles, CA 90089 {negin.golrezaei.2017,hamidnz,rrandhaw}@marshall.usc.edu A core problem in the area of revenue management is pricing goods in the presence of strategic customers. We study this problem when the customers are heterogeneous with respect to both their initial valuation for the item and their valuation decay rate, which represents the value lost due to delay in the purchase. We characterize the optimal mechanism for selling durable goods in such environments and show that delayed allocation and dynamic pricing can be e↵ ective screening tools for maximizing profit of a firm. We further investigate the impact of production and holding costs on the optimal mechanism. We show that by lowering the holding and production costs, the firm can extract more revenue from customers, as he has more flexibility to delay the allocation. Key words: 1. Introduction h=0 h=H l /5 h=H l h=(H l +H h )/2 h=H h Dynamic pricing has becoming increasingly prevalent in many industries. One of the main advantages of dynamic pricing is that it helps mitigate the risk associated with de- manduncertainty(see,forinstance,AvivandPazgal2008andCachonandSwinney2011). In this paper, we show that dynamic pricing can play an important role in di↵ erentiating between customers over time even in the absence of demand uncertainty. In many set- tings, especially in fashion and electronic gadget retail, a customer’s willingness-to-pay (or valuation) for the product is time-sensitive and decreases over time. 
Figure 3.11: Utility versus type for $h = 0$, $h = H_l/5$, $h = H_l$, $h = (H_l + H_h)/2$, and $h = H_h$. Here, $H_l = 0.004$, $H_h = 0.025$, the customer type $\theta \sim U(0,1)$, $V(\theta,t) = \theta e^{-\theta t}$, = 0.1, and production cost $c = 0$.

Observe that when $h = 0$, we recover the optimal mechanism with no holding cost, as presented in Theorem 3.3.2. To understand why, note that $\theta_H^0 = \theta_H$, $\theta_L^0 = \theta_L$, and $\underline{\theta}^0 = 0$. Furthermore, the FOC solution is
\[
t_f(\theta) = t_R(\theta) = \frac{-\theta + 2\alpha(\theta)}{\theta\,\alpha(\theta)}.
\]
When we increase the holding cost from $0$ to $h > 0$, the time of sale remains the same for low-type and high-type customers, whose time of sale does not depend on the holding cost, but medium-type customers purchase the item sooner; see Example 3.4.3. The holding cost only changes the time of sale for medium-type customers and the thresholds that separate customers into different groups; see Figure 3.9.

Next, we discuss the proof of Theorem 3.4.4 and the insights therein. In the proof, we show that for any value of $h$, $e^{-z t_h(z)}\bigl(1 - z t_h(z)\bigr) \ge 0$. Then, since $u(\theta,\theta) = \int_{\underline{\theta}^h}^{\theta} e^{-z t_h(z)}\bigl(1 - z t_h(z)\bigr)\,dz$, we conclude that the utility of customers, $u(\theta,\theta)$, is increasing in $\theta$. Thus, although lower-type customers delay their purchase to maximize their utility, their utility is still lower than that of higher-type customers. In the proof, we further show that the purchase time $t_h(\theta)$ is decreasing; thus, lower-type customers purchase the item later. Note that in the optimal mechanism, high-type customers purchase the item immediately while others delay their purchase. In Appendix C.1.3, we divide the proof of Theorem 3.4.4 into three lemmas: Lemmas C.1.2, C.1.3, and C.1.4.
In Lemmas C.1.2, C.1.3, and C.1.4, we characterize the optimal mechanism when the holding cost is low, medium, and high, respectively. Note that the proof of Lemma C.1.2 encompasses the proof of Theorem 3.3.2. In the following, we briefly describe the main idea behind the proof.

First, we explain the proof for the case of a low holding cost. To characterize the optimal mechanism, by Lemma 3.2.2, we should solve the following optimization problem.
\[
\begin{aligned}
\max_{\{t(\theta),\,p(\theta)\,:\,\theta \in [\underline{\theta}, \bar{\theta}];\; u(\underline{\theta},\underline{\theta}) \ge 0\}} \quad & \mathbb{E}\Bigl[\max\bigl\{R(\theta, t(\theta)) - h\,t(\theta),\; 0\bigr\}\Bigr] - u(\underline{\theta},\underline{\theta}) \\
\text{s.t.} \quad & u(\theta,\theta) \ge u(\theta,\hat{\theta}) \qquad \forall\, \theta, \hat{\theta} \in [\underline{\theta}, \bar{\theta}] && \text{(IC)} \\
& u(\theta,\theta) \ge 0 \qquad \forall\, \theta \in [\underline{\theta}, \bar{\theta}] && \text{(IR)}
\end{aligned}
\tag{OPT-H}
\]
Here the objective function is the virtual revenue, and $R(\theta, t(\theta))$ is defined in Eq. (B.14). The first and second sets of constraints ensure that the mechanism is IC and IR, respectively. By Lemma 3.2.1, a mechanism is IC if and only if the interval and envelope conditions hold. This implies that one can replace the IC constraints with these two conditions. However, as we noted in Section 3.3, characterizing the optimal mechanism that satisfies these two conditions is rather complicated. Thus, we relax Problem OPT-H and consider only the IR constraints and the envelope condition. That is, in the relaxed problem, we need to find an optimal mechanism that satisfies the IR constraints, that is, $u(\theta,\theta) = u(\underline{\theta},\underline{\theta}) + \int_{\underline{\theta}}^{\theta} e^{-z t(z)}\bigl(1 - z t(z)\bigr)\,dz \ge 0$ for any $\theta \in [\underline{\theta}, \bar{\theta}]$. Because the term inside the integral, that is, $e^{-z t(z)}\bigl(1 - z t(z)\bigr)$, is not necessarily positive, it is not enough to set $u(\underline{\theta},\underline{\theta}) \ge 0$ to fulfill the IR constraints. By contrast, in traditional mechanism design, the utility of a customer is an integral of the probability of allocation, which is always nonnegative; there, the IR constraints are satisfied if we simply set $u(\underline{\theta},\underline{\theta})$ to zero. To characterize the optimal solution of the relaxed problem, we dualize the IR constraints to construct an upper bound on the relaxed problem. We show that the mechanism described in Theorem 3.4.4 attains the upper bound, and thus, it is optimal.
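To make concrete why the envelope integrand can be negative, consider the following short calculation for $V(\theta,t) = \theta e^{-\theta t}$. The delay factor $k$ is introduced here purely for illustration and does not appear in the thesis.
\[
\frac{\partial V}{\partial \theta}(\theta, t) = \frac{\partial}{\partial \theta}\bigl(\theta e^{-\theta t}\bigr) = e^{-\theta t}\,(1 - \theta t),
\qquad\text{so at } t = \frac{k}{\theta}: \quad e^{-k}(1 - k) < 0 \iff k > 1.
\]
In words, a type that is asked to wait longer than $1/\theta$ contributes negatively to the utility of higher types, which is why fixing the utility of the lowest type at zero does not, by itself, guarantee individual rationality.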
We then show that the solution of the relaxed problem also satisfies the interval condition. We verify this by showing that the time of allocation $t_h(\theta)$ is decreasing and $1 - \theta t_h(\theta) \ge 0$ for any $\theta \ge \underline{\theta}^h$. Recall that $u(\theta,\theta) = u(\underline{\theta},\underline{\theta}) + \int_{\underline{\theta}}^{\theta} e^{-z t_h(z)}\bigl(1 - z t_h(z)\bigr)\,dz$. That the interval condition holds implies that the mechanism given in Theorem 3.4.4 is revenue-optimal. The proof of Theorem 3.4.4 is similar for medium holding costs. One of the main challenges there is to show that the mechanism described in Theorem 3.4.4 is IR; that is, we need to verify that $1 - \theta t_h(\theta) \ge 0$ for any $\theta \ge \underline{\theta}^h$. To this end, we first show that when $h \ge H_l$ and $\theta \ge \tilde{\theta}$, we have $1 - \theta t_h(\theta) \ge 0$. Then, by showing $\theta_M \ge \tilde{\theta}$, we obtain $1 - \theta t_h(\theta) \ge 0$ for any $\theta \ge \theta_M$.

3.4.3 Positive Production and Holding Costs

Up to here, we presented optimal mechanisms when either the holding cost $h$ or the production cost $c$ is zero. In this section, we characterize an optimal mechanism when both production and holding costs are positive. The following is the main result of this section. In this theorem, with a slight abuse of notation, we define $\theta_c$ as the smallest value that solves $R(\theta_c, t_h(\theta_c)) - h\,t_h(\theta_c) = c$, where $R(\theta, t)$, $t_h(\theta)$, and $\underline{\theta}^h$ are defined in Equations (B.14), (3.8), and (3.7), respectively. We show that a positive production cost does not change the allocation time of customers with type $\theta \ge \theta_c$. That is, the firm can price such that customers with type $\theta \ge \theta_c$ purchase the item at time $t_h(\theta)$, and other customers do not purchase the item at all.

Theorem 3.4.5 (Positive Production and Holding Costs). If Assumption 3 holds, $\tilde{\theta}$ is the unique solution of $R(\theta, t_f(\theta)) - H_l\, t_f(\theta) = 0$, the holding cost $h \ge 0$, and production cost $c \in [0, \bar{\theta}]$, then the optimal mechanism sells to customers of type $\theta \ge \theta_c$ at time $t_h(\theta)$, given in Eq. (3.8), and at price
\[
p(\theta) = V(\theta, t_h(\theta)) - \int_{\theta_c}^{\theta} e^{-t_h(z) z}\,\bigl(1 - t_h(z) z\bigr)\,dz,
\]
where $\theta_c$ solves $R(\theta_c, t_h(\theta_c)) - h\,t_h(\theta_c) = c$. Furthermore, for $\theta < \theta_c$, $p(\theta) = \infty$.
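The effect of the holding cost on the first-order condition can be sketched numerically. The Python snippet below assumes that the virtual revenue takes the standard form $R(\theta,t) = V(\theta,t) - \alpha(\theta)\,\partial V(\theta,t)/\partial\theta$ (Eq. (B.14) is not reproduced in this section, so this form is an assumption), with $V(\theta,t) = \theta e^{-\theta t}$ and $\theta \sim U(0,1)$, so that the inverse hazard rate is $\alpha(\theta) = 1 - \theta$. It locates by bisection the time at which the derivative of $R(\theta,t) - ht$ with respect to $t$ vanishes. The function names are hypothetical, and the snippet covers only the unconstrained first-order condition, not the full allocation rule of Eq. (3.8).

```python
import math


def alpha(theta):
    """Inverse hazard rate of U(0, 1); an assumption of this sketch."""
    return 1.0 - theta


def marginal_virtual_profit(theta, t, h):
    """d/dt of [R(theta, t) - h * t] with R = exp(-theta t) (theta - alpha (1 - theta t))."""
    a = alpha(theta)
    return theta * math.exp(-theta * t) * (2.0 * a - theta - a * theta * t) - h


def foc_time(theta, h, iterations=200):
    """Time in [0, t_R] where the marginal virtual profit is zero, or 0 if delay never pays.

    Intended for theta in (0, 1). Bisection is valid because the marginal
    virtual profit is strictly decreasing in t on [0, t_R], where t_R is the
    zero-holding-cost solution (-theta + 2 alpha) / (theta alpha).
    """
    a = alpha(theta)
    t_r = (2.0 * a - theta) / (theta * a)
    if t_r <= 0.0 or marginal_virtual_profit(theta, 0.0, h) <= 0.0:
        return 0.0
    lo, hi = 0.0, t_r
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if marginal_virtual_profit(theta, mid, h) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With $h = 0$ the routine returns the closed-form time $(-\theta + 2\alpha(\theta))/(\theta\alpha(\theta))$, and any $h > 0$ moves the stationary point earlier, in line with the observation that medium-type customers purchase sooner when holding is costly.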
3.5 General Valuation Functions

In this section, we present an optimal selling mechanism under the general valuation function $V(\theta,t) = \theta e^{-g(\theta)t}$, where $g(\theta) \ge 0$ is log-concave and increasing. Log-concavity implies that $g'(x)/g(x)$ is decreasing. Every positive concave function is log-concave; however, the reverse does not necessarily hold. We show that the main insights of Section 3.3 still hold under general valuation functions. For convenience, we assume that the production and holding costs are zero. We show that in the optimal mechanism, customers with type $\theta$ purchase the item at time $t_g(\theta)$, defined below.
\[
t_g(\theta) :=
\begin{cases}
0 & \text{if } \theta \ge \theta_H^g \quad \text{(High-Type)}; \\[4pt]
\dfrac{-\theta g(\theta) + \alpha(\theta)\bigl(\theta g'(\theta) + g(\theta)\bigr)}{\theta\,\alpha(\theta)\,g(\theta)\,g'(\theta)} & \text{if } \theta \in [\theta_L^g, \theta_H^g] \quad \text{(Medium-Type)}; \\[8pt]
\dfrac{1}{\theta g'(\theta)} & \text{if } \theta \in [\underline{\theta}, \theta_L^g] \quad \text{(Low-Type)};
\end{cases}
\tag{3.9}
\]
where $\theta_H^g$ solves $-\theta_H^g + \alpha(\theta_H^g) + \dfrac{\alpha(\theta_H^g)\,g'(\theta_H^g)\,\theta_H^g}{g(\theta_H^g)} = 0$, and $\theta_L^g$ solves $-g(\theta_L^g) + \alpha(\theta_L^g)\,g'(\theta_L^g) = 0$. The following theorem presents the optimal mechanism under a general valuation function.

Theorem 3.5.1 (General Valuation Functions). Suppose that Assumption 3 holds and the customer valuation function is given by $V(\theta,t) = \theta e^{-g(\theta)t}$, where $g(\cdot) \ge 0$ is non-decreasing and log-concave. Then, given that the time of allocation $t_g(\theta)$, defined in Eq. (3.9), is non-increasing, the optimal mechanism sells to customers of type $\theta$ at time $t_g(\theta)$ and at price
\[
p(\theta) = V(\theta, t_g(\theta)) - \int_{\underline{\theta}}^{\theta} e^{-g(z) t_g(z)}\,\bigl(1 - t_g(z)\,g'(z)\,z\bigr)\,dz.
\]

The assumption that $g(\cdot)$ is non-decreasing indicates that customers with higher initial valuations lose interest in the item at least as fast as customers with lower initial valuations. The log-concavity of $g(\cdot)$ implies that $1 - \theta\,t_g(\theta)\,g'(\theta) \ge 0$ for any $\theta$; see the proof of Theorem 3.5.1. Given this inequality, we can conclude that the described mechanism is IR; that is, $u(\theta,\theta) \ge 0$. The same inequality leads to monotonicity of $u(\theta,\theta)$ in $\theta$. The monotonicity of the time of allocation, $t_g(\theta)$, is a standard assumption in the mechanism design literature (Myerson 1981). We note that $t_g(\theta)$ is decreasing if $\theta g'(\theta)$ and $-\alpha(\theta) g'(\theta)$ are increasing.
These conditions hold when $g(\theta) = \theta^a$, $a \in [0,1]$. Later, in Example 3.5.2, we will show that when $g(\theta) = \theta^a$ with $a > 1$ and $\theta \sim U(0,1)$, the allocation time $t_g(\theta)$ is monotone. The proof of Theorem 3.5.1 is similar to that of Theorem 3.3.2. We start by relaxing the problem by ignoring the interval condition. Then, we show that the solution of the relaxed problem satisfies the interval condition, and is thus optimal.

Again, in the optimal mechanism there exist three groups of customers: high-type, medium-type, and low-type. High-type customers are customers with type $\theta \ge \theta_H^g$. These customers purchase the item at time zero. Medium-type customers are customers with type $\theta \in (\theta_L^g, \theta_H^g)$; these customers delay their purchase and enjoy a positive utility. Low-type customers are customers with type $\theta \le \theta_L^g$, who also delay their purchase but obtain zero utility. To get insight into Theorem 3.5.1, we revisit Example 3.3.3.

Example 3.5.2 (Revisiting Example 3.3.3: General Valuation Functions). Here we present the optimal mechanism for the setting described in Example 3.3.3 when $V(\theta,t) = \theta e^{-g(\theta)t}$, $g(\theta) = \theta^a$, $a \ge 0$, and $h = c = 0$. Note that as $a$ increases, the customers’ decay rates become more heterogeneous. For this valuation function, the thresholds $\theta_H^g$ and $\theta_L^g$ are $\frac{a+1}{a+2}$ and $\frac{a}{a+1}$, respectively. Note that both $\theta_H^g$ and $\theta_L^g$ are increasing in $a$ and converge to 1 as $a$ goes to infinity. Furthermore, at $a = 1$, $\theta_H^g = \theta_H = \frac{2}{3}$ and $\theta_L^g = \theta_L = \frac{1}{2}$. Thresholds $\theta_H^g$ and $\theta_L^g$ are depicted in Figure 3.12. We observe that as $a$ increases, the high-type and medium-type regions shrink while the low-type region expands. Recall that the low-type customers get zero utility. Thus, when heterogeneity among customers increases, i.e., $a$ increases, the firm can extract the entire surplus of more customers. As a result, as depicted in Figure 3.13, the revenue of the firm increases when the decay rates of customers get more heterogeneous.
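The closed-form thresholds in Example 3.5.2 can be checked against the general rule in Eq. (3.9). The Python sketch below assumes $\theta \sim U(0,1)$, so that the inverse hazard rate is $\alpha(\theta) = 1 - \theta$, and $g(\theta) = \theta^a$ with $a > 0$. The function names are illustrative only, and the code is meant for types in $(0, 1]$.

```python
def alpha(theta):
    """Inverse hazard rate of U(0, 1); an assumption of this sketch."""
    return 1.0 - theta


def thresholds(a):
    """Return (theta_L, theta_H) = (a / (a + 1), (a + 1) / (a + 2))."""
    return a / (a + 1.0), (a + 1.0) / (a + 2.0)


def medium_time(theta, a):
    """Medium-type branch of Eq. (3.9) with g(theta) = theta ** a."""
    g = theta ** a
    dg = a * theta ** (a - 1.0)          # g'(theta)
    al = alpha(theta)
    return (-theta * g + al * (theta * dg + g)) / (theta * al * g * dg)


def low_time(theta, a):
    """Low-type branch of Eq. (3.9): 1 / (theta * g'(theta))."""
    return 1.0 / (theta * a * theta ** (a - 1.0))


def purchase_time(theta, a):
    """Piecewise purchase time t_g(theta) for theta in (0, 1]."""
    theta_low, theta_high = thresholds(a)
    if theta >= theta_high:
        return 0.0
    if theta <= theta_low:
        return low_time(theta, a)
    return medium_time(theta, a)
```

At the upper threshold the medium-type branch reaches zero, and at the lower threshold it meets the low-type branch, so the resulting purchase time is continuous in the type.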
Figure 3.13 shows the revenue gain of the optimal mechanism relative to the FP policy. We observe that as $a$ increases, the revenue gain from employing DP grows to more than 90% (at $a = 10$). The reason is that as $a$ increases, the valuation decay rates get more heterogeneous, which in turn increases the value of differentiating customers via DP. Figure 3.13 also shows that the social welfare of the customers and the firm increases when $a$ increases. Furthermore, for any value of $a$, DP outperforms FP in terms of social welfare.

Next, we discuss the time of purchase, payment, and utility of customers. Given that $V(\theta,t) = \theta e^{-g(\theta)t}$ with $g(\theta) = \theta^a$, the time of purchase in the optimal mechanism is given by
\[
t_g(\theta) =
\begin{cases}
0 & \text{if } \theta \ge \frac{a+1}{a+2}; \\[4pt]
\dfrac{-(a+2)\theta + (a+1)}{a\,(1-\theta)\,\theta^{a}} & \text{if } \theta \in \bigl[\frac{a}{a+1}, \frac{a+1}{a+2}\bigr]; \\[8pt]
\dfrac{1}{a\,\theta^{a}} & \text{if } \theta \in \bigl[0, \frac{a}{a+1}\bigr].
\end{cases}
\tag{3.10}
\]
The time of purchase is shown in Figure 3.14 for $a = 0.2$, $0.6$, $1$, and $1.4$. The figure shows that when $a$ increases, customers with lower type and higher type purchase the item later. Specifically, the time of purchase of low-type customers increases significantly when we increase $a$ from $0.2$ to $1.4$, while the time of purchase of other customers does not vary remarkably. Figures 3.15 and 3.16, respectively, show the payment and utility of customers as a function of their types in the optimal mechanism. The utility of customers decreases when $a$ increases, because when $a$ is large, the firm can better differentiate customers. Furthermore, Figure 3.15 shows that by increasing $a$, only the payment of customers with lower and higher types increases.

3.6 Conclusion

Dynamic pricing is a common practice in many industries. Such practice has been shown to be an effective tool to mitigate the negative impact of demand uncertainty. This work contributes to the literature by showing that dynamic pricing can be significantly beneficial even in the absence of demand uncertainty.
Specifically, we show that when customers’ valuations are time-sensitive and decay at different rates, even with deterministic demand, firms can increase their revenue by deploying dynamic pricing. The heterogeneity in valuation decay rates enables firms to differentiate customers effectively and extract more revenue from them, relative to the case where customers’ valuations do not decay over time.

With heterogeneous valuation decay rates, the relative ranking of customers in terms of their valuations varies over time. The change in customers’ ranking motivates firms to revisit their prices as time progresses. This way, they can sell the products to customers that initially have a lower valuation or rank. Selling the products to these customers can not only increase the revenue of the firms but also improve the social welfare.

We show that dynamic pricing can still be beneficial when producing and carrying the products are costly. However, for lower values of production and holding costs, firms have more flexibility to delay the sales, and as a result, they can gain more from dynamic pricing.

Figure 3.12: The thresholds $\theta_H^g$ and $\theta_L^g$ versus the exponent $a$ for the mechanism described in Theorem 3.5.1, separating the high-type, medium-type, and low-type regions. Here, the customer type $\theta \sim U(0,1)$, $V(\theta,t) = \theta e^{-g(\theta)t}$, $g(\theta) = \theta^a$, and $h = c = 0$.

Figure 3.13: The social welfare and revenue gain of DP (relative to the FP policy) in percentage as a function of the exponent $a$. Here, the customer type $\theta \sim U(0,1)$, $V(\theta,t) = \theta e^{-g(\theta)t}$, $g(\theta) = \theta^a$, and $h = c = 0$.

Figure 3.14: Time of purchase versus type in the optimal mechanism described in Theorem 3.5.1, for $a = 0.2$, $0.6$, $1.0$, and $1.4$. Here, the customer type $\theta \sim U(0,1)$, $V(\theta,t) = \theta e^{-g(\theta)t}$, $g(\theta) = \theta^a$, and $h = c = 0$.
0.2 0.4 0.6 0.8 1 Customer Type 0 0.2 0.4 0.6 0.8 Payment a = 0.2 a = 0.6 a = 1.0 a = 1.4 Figure 3.15: Payment versus type in the optimal mechanism described in Theorem 3.5.1. Here, the customer typeU(0; 1),V (;t) =e g()t ,g() = a , andh = c = 0. 94 0 0.5 1 Customer Type 0 0.1 0.2 0.3 0.4 0.5 0.6 Utility a = 0.2 a = 0.6 a = 1.0 a = 1.4 Figure 3.16: Utility versus type in the optimal mechanism described in Theorem 3.5.1. Here, the customer typeU(0; 1),V (;t) =e g()t ,g() = a , andh = c = 0. 95 References Abraham, I., Athey, S., Babaioff, M. & Grubb, M. (2011), Peaches, lemons, and cookies: Designing auction markets with dispersed information, Technical report, Tech. Rep. MSRTR-2011-68, Microsoft Research. January. Acimovic, J. & Graves, S. (2011), Making better fulfillment decisions on the fly in an online retail environ- ment. Working Paper, MIT. Agrawal, S., Wang, Z. & Ye, Y . (2009), A dynamic near-optimal algorithm for online linear programming. Working paper, Stanford University. Akan, M., Ata, B. & Dana, J. (2009), ‘Revenue management by sequential screening’, Unpublished manuscript, Carnegie Mellon University pp. 1792–1811. Amazon’s Recommendation Systems (2012), ‘How recommendations work?’, http://www.amazon. com/gp/help/customer/display.html/ref=pd_ys_help_iyr?ie=UTF8&nodeId= 13316081 . Araman, V . F. & Caldentey, R. (2009), ‘Dynamic pricing for nonperishable products with demand learning’, Operations research 57(5), 1169–1188. Athey, S. (2001), ‘Single crossing properties and the existence of pure strategy equilibria in games of incom- plete information’, Econometrica 69(4), 861–889. Athey, S. & Levin, J. (1999), Information and competition in us forest service timber auctions, Technical report, National Bureau of Economic Research. Aviv, Y . & Pazgal, A. (2008), ‘Optimal pricing of seasonal products in the presence of forward-looking consumers’, Manufacturing & Service Operations Management 10(3), 339–359. 96 Aviv, Y ., Wei, M. M. & Zhang, F. 
(2015), Responsive pricing of fashion products: The effects of demand learning and strategic consumer behavior, Technical report, Working Paper, Washington University. Azar, Y ., Birnbaum, B. E., Karlin, A. R. & Nguyen, C. T. (2009), On revenue maximization in second- price ad auctions, in A. Fiat & P. Sanders, eds, ‘ESA’, V ol. 5757 of Lecture Notes in Computer Science, Springer, pp. 155–166. Ball, M. O. & Queyranne, M. (2009), ‘Toward robust revenue management: Competitive analysis of online booking’, Operations Research 57(4), 950–963. Bansal, M. & Maglaras, C. (2009), ‘Dynamic pricing when customers strategically time their purchase: Asymptotic optimality of a two-price policy’, Journal of Revenue and Pricing management 8(1), 42–66. Battaglini, M. & Lamba, R. (2012), Optimal dynamic contracting. Economic Theory Center Working Paper. Beil, D. R., Chen, Q., Duenyas, I. & See, B. D. (2015), ‘When to deploy test auctions in sourcing’, Working Paper . Bergemann, D. & Bonatti, A. (2011), ‘Targeting in advertising markets: implications for offline versus online media’, The RAND Journal of Economics 42(3), 417–443. Bergemann, D. & Bonatti, A. (2013), Selling cookies. Working Paper. Bergemann, D. & Said, M. (2011a), ‘Dynamic auctions’, Wiley Encyclopedia of Operations Research and Management Science . Bergemann, D. & Said, M. (2011b), ‘Dynamic auctions: A survey’, Wiley Encyclopedia of Operations Research and Management Science . Bergemann, D. & V¨ alim¨ aki, J. (2002), ‘Information acquisition and efficient mechanism design’, Econo- metrica 70(3), 1007–1033. Bernstein, F., K¨ ok, A. G. & Xie, L. (2011), Dynamic assortment customization with limited inventories. Working Paper, Duke University. Besanko, D. & Winston, W. L. (1990), ‘Optimal price skimming by a monopolist facing rational consumers’, Management Science 36(5), 555–567. 97 Besbes, O. & Lobel, I. 
(2015), ‘Intertemporal price discrimination: Structure and computation of optimal policies’, Management Science 61(1), 92–110. Besbes, O. & Saur´ e, D. (2012), Dynamic pricing strategies in the presence of demand shocks. Working Paper. Besbes, O. & Zeevi, A. (2011), ‘On the minimax complexity of pricing in a changing environment’, Oper- ations Research 59(1), 66–79. Bhat, C. R. (2002), ‘Recent methodological advances relevant to activity and travel behavior analysis’, In Perpetual Motion: Travel Behavior Research Opportunities and Application Challenges, edited by H.S. Mahmassani, Pergamon pp. 381–414. Bhawalkar, K., Hummel, P. & Vassilvitskii, S. (2014), Value of targeting, in ‘Algorithmic Game Theory’, Springer, pp. 194–205. Bitran, G. & Caldentey, R. (2003), ‘An overview of pricing models for revenue management’, Manufactur- ing & Service Operations Management 5(3), 203–229. Bitran, G. R. & Mondschein, S. V . (1997), ‘Periodic pricing of seasonal products in retailing’, Management Science 43(1), 64–79. Board, S. & Skrzypacz, A. (2016), ‘Revenue management with forward-looking buyers’, Journal of Political Economy 124(4), 1046–1087. Boleslavsky, R. & Said, M. (2013), ‘Progressive screening: Long-term contracting with a privately known stochastic process’, Review of Economic Studies 80(1), 1–34. Borgs, C., Candogan, O., Chayes, J., Lobel, I. & Nazerzadeh, H. (2014), ‘Optimal multiperiod pricing with service guarantees’, Management Science 60(7), 1792–1811. Buchbinder, N., Jain, K. & Naor, J. (2007), Online primal-dual algorithms for maximizing ad-auctions revenue, in L. Arge, M. Hoffmann & E. Welzl, eds, ‘ESA’, V ol. 4698 of Lecture Notes in Computer Science, Springer, pp. 253–264. Buchbinder, N. & Naor, J. (2007), ‘The design of competitive online algorithms via a primal-dual approach’, Foundations and Trends in Theoretical Computer Science 3(2–3), 93–263. 98 Cachon, G. P. & Swinney, R. 
(2011), ‘The value of fast fashion: Quick response, enhanced design, and strategic consumer behavior’, Management Science 57(4), 778–795. Chan, C. W. & Farias, V . F. (2009), ‘Stochastic depletion problems: Effective myopic policies for a class of dynamic optimization problems’, Mathematics of Operations Research 34(2), 333–350. Chan, L. M., Shen, Z. M., Simchi-Levi, D. & Swann, J. L. (2004), Coordination of pricing and inventory decisions: A survey and classification, in ‘Handbook of quantitative supply chain analysis’, Springer, pp. 335–392. Chen, Y . & Farias, V . F. (2013), ‘Simple policies for dynamic pricing with imperfect forecasts’, Operations Research 61(3), 612–624. Chen, Y . & Farias, V . F. (2015), Robust dynamic pricing with strategic customers., in ‘EC’, p. 777. Ciocan, D. F. & Farias, V . F. (2013), Dynamic allocation problems with volatile demand. To appear in Mathematics of Operations Research. Clifford, S. (2012), ‘Shopper alert: Price may drop for you alone’, The New York Times, August 9 . Coase, R. H. (1972), ‘Durability and monopoly’, The Journal of Law and Economics 15(1), 143–149. Conlisk, J., Gerstner, E. & Sobel, J. (1984), ‘Cyclic pricing by a durable goods monopolist’, The Quarterly Journal of Economics 99(3), 489–505. Cremer, J. & Khalil, F. (1992), ‘Gathering information before signing a contract’, The American Economic Review pp. 566–578. Cr´ emer, J., Spiegel, Y . & Zheng, C. Z. (2009), ‘Auctions with costly information acquisition’, Economic Theory 38(1), 41–72. Davis, J., Gallego, G. & Topaloglu, H. (2011), Assortment optimization under variants of the nested logit model. Working Paper, Cornell University. Devenur, N. R. & Hayes, T. P. (2009), The adwords problem: online keyword matching with budgeted bid- ders under random permutations, in ‘Proceedings of the 10th ACM conference on Electronic commerce’, EC ’09, ACM, New York, NY , USA, pp. 71–78. 99 Elmaghraby, W., G¨ ulc¨ u, A. & Keskinocak, P. 
(2008), ‘Designing optimal preannounced markdowns in the presence of rational customers with multiunit demands’, Manufacturing & Service Operations Management 10(1), 126–148.
Emek, Y., Feldman, M., Gamzu, I., Leme, R. P. & Tennenholtz, M. (2012), Signaling schemes for revenue maximization, in B. Faltings, K. Leyton-Brown & P. Ipeirotis, eds, ‘ACM Conference on Electronic Commerce’, ACM, pp. 514–531.
Eső, P. & Szentes, B. (2007), ‘Optimal information disclosure in auctions and the handicap auction’, Review of Economic Studies 74(3), 705–731.
Farias, V. F., Jagabathula, S. & Shah, D. (2011), Assortment optimization under a general choice model. Working Paper, MIT.
Farias, V. F., Jagabathula, S. & Shah, D. (2013), ‘A nonparametric approach to modeling choice with limited data’, Management Science 59(2), 305–322.
Federgruen, A. & Heching, A. (1999), ‘Combined pricing and inventory control under uncertainty’, Operations Research 47(3), 454–475.
Feldman, J., Henzinger, M., Korula, N., Mirrokni, V. & Stein, C. (2010), ‘Online stochastic packing applied to display ad allocation’, Algorithms–ESA 2010, Springer pp. 182–194.
Feng, Y. & Gallego, G. (1995), ‘Optimal starting times for end-of-season sales and optimal stopping times for promotional fares’, Management Science 41(8), 1371–1391.
Gallego, G., Iyengar, G., Phillips, R. & Dubey, A. (2004), Managing flexible products on a network. Working Paper, Columbia University.
Gallego, G. & Topaloglu, H. (2012), Constrained assortment optimization for the nested logit model. Working Paper, Cornell University.
Gallego, G. & van Ryzin, G. (1994a), ‘On the relationship between inventory costs and variety benefits in retail assortments’, Management Science 40(8), 999–1020.
Gallego, G. & van Ryzin, G. (1994b), ‘Optimal dynamic pricing of inventories with stochastic demand over finite horizons’, Management Science 40(8), 999–1020.
Garrett, D.
(2011), ‘Durable goods sales with dynamic arrivals and changing values’, Unpublished manuscript, Northwestern University.
Gaur, V. & Honhon, D. (2006), ‘Assortment planning and inventory decisions under a locational choice model’, Management Science 52(10), 1528–1543.
Ghosh, A., Nazerzadeh, H. & Sundararajan, M. (2007), Computing optimal bundles for sponsored search, in ‘Internet and Network Economics, Third International Workshop (WINE)’, pp. 576–583.
Goel, A., Guha, S. & Munagala, K. (2010), ‘How to probe for an extreme value’, ACM Transactions on Algorithms 7(1), 12.
Goel, A., Mahdian, M., Nazerzadeh, H. & Saberi, A. (2010), ‘Advertisement allocation for generalized second-pricing schemes’, Operations Research Letters 38(6), 571–576.
Golrezaei, N. & Nazerzadeh, H. (2016), ‘Auctions with dynamic costly information acquisition’, Operations Research.
Golrezaei, N., Nazerzadeh, H. & Rusmevichientong, P. (2014), ‘Real-time optimization of personalized assortments’, Management Science 60(6), 1532–1551.
Google AdX Documentation (2015), ‘Private auctions overview’, https://support.google.com/adxbuyer/answer/2839853?hl=en. Accessed: 2015-04-1.
Google DoubleClick Documentation (2016), ‘Distinguish between the open auction, private auctions, and preferred deals’, https://support.google.com/dfp_premium/answer/2710459?hl=en. Accessed: 2016-04-1.
Goyal, V., Levi, R. & Segev, D. (2011), Near-optimal algorithms for the assortment planning problem under dynamic substitution and stochastic demand. Working Paper, Columbia University.
Guha, S., Munagala, K. & Sarkar, S. (2006), Optimizing transmission rate in wireless channels using adaptive probes, in ‘Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS/Performance’, pp. 381–382.
Gul, F., Sonnenschein, H. & Wilson, R. (1986), ‘Foundations of dynamic monopoly and the Coase conjecture’, Journal of Economic Theory 39(1), 155–190.
Hatfield, J. W., Kojima, F.
& Kominers, S. D. (2015), ‘Strategy-proofness, investment efficiency, and marginal returns: an equivalence’, Working paper.
Helft, M. & Vega, T. (2010), ‘Retargeting ads follow surfers to other sites’, The New York Times. URL: http://www.nytimes.com/2010/08/30/technology/30adstalk.html
Hendricks, K., Pinkse, J. & Porter, R. H. (2003), ‘Empirical implications of equilibrium bidding in first-price, symmetric, common value auctions’, The Review of Economic Studies 70(1), 115–145.
Honhon, D., Gaur, V. & Seshadri, S. (2010), ‘Assortment planning and inventory decisions under stock-out based substitution’, Operations Research 58(5), 1364–1379.
Hummel, P. & McAfee, P. (2012), ‘When does improved targeting increase revenue?’.
Jaillet, P. & Lu, X. (2012), Near-optimal online algorithms for dynamic resource allocations. Working Paper, MIT.
Jasin, S. & Kumar, S. (2012), ‘A re-solving heuristic with bounded revenue loss for network revenue management with customer choice’, Mathematics of Operations Research 37(2), 313–345.
Jing, B. (2011), ‘Exogenous learning, seller-induced learning, and marketing of durable goods’, Management Science 57(10), 1788–1801.
Kakade, S. M., Lobel, I. & Nazerzadeh, H. (2013), ‘Optimal dynamic mechanism design and the virtual pivot mechanism’, Operations Research 61(4), 837–854.
Kalyanasundaram, B. & Pruhs, K. (2000), ‘An optimal deterministic algorithm for online b-matching’, Theoretical Computer Science 233(1-2), 319–325.
Kök, A. G., Fisher, M. & Vaidyanathan, R. (2008), Assortment planning: Review of literature and industry practice, in ‘Retail Supply Chain Management’, Springer.
Kristol, D. M. (2001), ‘HTTP cookies: Standards, privacy, and politics’, ACM Transactions on Internet Technology (TOIT) 1(2), 151–198.
Larson, K. & Sandholm, T. (2001), Costly valuation computation in auctions, in ‘Proceedings of the 8th Conference on Theoretical Aspects of Rationality and Knowledge’, TARK ’01, pp. 169–182.
Lazear, E. P.
(1984), ‘Retail pricing and clearance sales’.
Lewis, G. (2011), ‘Asymmetric information, adverse selection and online disclosure: The case of eBay Motors’, The American Economic Review pp. 1535–1546.
Li, G., Rusmevichientong, P. & Topaloglu, H. (2013), The d-level nested logit model: Assortment and price optimization problems. Working Paper, Marshall School of Business.
Liu, Q. & van Ryzin, G. J. (2008a), ‘On the choice-based linear programming model for network revenue management’, Manufacturing and Service Operations Management 10(2), 288–310.
Liu, Q. & van Ryzin, G. J. (2008b), ‘Strategic capacity rationing to induce early purchases’, Management Science 54(6), 1115–1131.
Lobel, I. & Xiao, W. (2013), Optimal long-term supply contracts with asymmetric demand information. Working Paper.
Lu, J. & Ye, L. (2014), ‘Optimal two-stage auctions with costly information acquisition’, Working Paper.
Mahajan, S. & van Ryzin, G. J. (2001), ‘Stocking retail assortments under dynamic consumer substitution’, Operations Research 49(3), 334–351.
Mahdian, M., Nazerzadeh, H. & Saberi, A. (2007), Allocating online advertisement space with unreliable estimates, in J. K. MacKie-Mason, D. C. Parkes & P. Resnick, eds, ‘ACM Conference on Electronic Commerce’, ACM, pp. 288–294.
Mahdian, M., Nazerzadeh, H. & Saberi, A. (2012), ‘Online optimization with uncertain information’, ACM Transactions on Algorithms 8(1), 2.
Mattioli, D. (2012), ‘On Orbitz, Mac users steered to pricier hotels’, The Wall Street Journal, June 26.
McAfee, R. P. & McMillan, J. (1988), ‘Search mechanisms’, Journal of Economic Theory 44(1), 99–123.
Mehta, A., Saberi, A., Vazirani, U. V. & Vazirani, V. V. (2007), ‘Adwords and generalized online matching’, Journal of the ACM 54(5).
Milgrom, P. & Segal, I. (2002), ‘Envelope theorems for arbitrary choice sets’, Econometrica 70(2), 583–601.
Mirrokni, V. S., Gharan, S. O. & Zadimoghaddam, M.
(2012), Simultaneous approximations for adversarial and stochastic online budgeted allocation, in ‘Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms’, pp. 1690–1701.
Muthukrishnan, S. (2009), Ad exchanges: Research issues, in ‘Internet and network economics’, Springer, pp. 1–12.
Myerson, R. (1986), ‘Multistage games with communications’, Econometrica 54(2), 323–358.
Myerson, R. B. (1981), ‘Optimal auction design’, Mathematics of Operations Research 6(1), 58–73.
Pancs, R. (2013), ‘Sequential negotiations with costly information acquisition’, Games and Economic Behavior 82, 522–543.
Pavan, A., Segal, I. & Toikka, J. (2014), ‘Dynamic mechanism design: A Myersonian approach’, Econometrica.
Quint, D. & Hendricks, K. (2012), ‘Selecting bidders via non-binding bids when entry is costly’, Working Paper.
Rayo, L. & Segal, I. (2010), ‘Optimal information disclosure’, Journal of Political Economy 118(5), 949–987.
RFC6265 (2011), ‘HTTP state management mechanism’, http://tools.ietf.org/html/rfc6265.
Roberts, J. W. & Sweeting, A. (2010), Entry and selection in auctions, Technical report, National Bureau of Economic Research.
Rusmevichientong, P., Shen, Z.-J. M. & Shmoys, D. B. (2009), ‘A PTAS for capacitated sum-of-ratios optimization’, Operations Research Letters 37(4), 230–238.
Rusmevichientong, P., Shen, Z.-J. M. & Shmoys, D. B. (2010), ‘Dynamic assortment optimization with a multinomial logit choice model and capacity constraint’, Operations Research 58(6), 1666–1680.
Shen, Z.-J. M. & Su, X. (2007), ‘Customer behavior modeling in revenue management and auctions: A review and new research opportunities’, Production and Operations Management 16(6), 713–728.
Shi, X. (2012), ‘Optimal auctions with information acquisition’, Games and Economic Behavior 74(2), 666–686.
Singer, N. (2012), ‘Mapping, and sharing, the consumer genome’, The New York Times.
URL: http://www.nytimes.com/2012/06/17/technology/acxiom-the-quiet-giant-of-consumer-database-marketing.html
Smith, S. A. & Agrawal, N. (2000), ‘Management of multi-item retail inventories systems with demand substitution’, Operations Research 48, 50–64.
Steel, E. & Angwin, J. (2010), ‘The web’s cutting edge, anonymity in name only’, The Wall Street Journal, August 3.
Stokey, N. L. (1979), ‘Intertemporal price discrimination’, The Quarterly Journal of Economics pp. 355–371.
Su, X. (2007), ‘Intertemporal pricing with strategic customer behavior’, Management Science 53(5), 726–741.
Syrgkanis, V., Kempe, D. & Tardos, E. (2013), Information asymmetries in common-value auctions with discrete signals. Working Paper.
Szalay, D. (2009), ‘Contracts with endogenous information’, Games and Economic Behavior 65(2), 586–625.
Talluri, K. & van Ryzin, G. (2004a), ‘Revenue management under a general discrete choice model of consumer behavior’, Management Science 50(1), 15–33.
Talluri, K. & van Ryzin, G. J. (2004b), ‘Revenue management under a general discrete choice model of consumer behavior’, Management Science 50(1), 15–33.
Talluri, K. & van Ryzin, G. J. (2004c), The Theory and Practice of Revenue Management, Springer, New York.
Thompson, D. (2012), ‘The 11 ways that consumers are hopeless at math’, The Atlantic.
Topaloglu, H. (2013), ‘Joint stocking and product offer decisions under the multinomial logit model’, Production and Operations Management 22(5).
Vallen, M. A. & Bullinger, C. D. (1999), ‘The due diligence process for acquiring and building power plants’, The Electricity Journal 12(8), 28–37.
van Ryzin, G. J. & Mahajan, S. (1999), ‘On the relationship between inventory costs and variety benefits in retail assortments’, Management Science 45, 1496–1509.
Wang, R. (1993), ‘Auctions versus posted-price selling’, The American Economic Review pp. 838–851.
Wortham, J.
(2012), ‘Rather than share your location, Foursquare wants to suggest one’, The New York Times, June 7.
Yao, A. (1977), ‘Probabilistic computations: Toward a unified measure of complexity’, Proceedings of the 18th IEEE Symposium on Foundations of Computer Science (FOCS) pp. 222–227.
Ye, L. (2007), ‘Indicative bidding and a theory of two-stage auctions’, Games and Economic Behavior 58(1), 181–207.
Yu, M., Debo, L. & Kapuscinski, R. (2015), ‘Strategic waiting for consumer-generated quality information: Dynamic pricing of new experience goods’, Management Science 62(2), 410–435.

Appendix A
Technical Appendix to Chapter 1

A.1
Here, we present the proofs of Lemma 1.2.1, Theorem 1.3.3, and Proposition 1.4.2.

A.1.1 Proof of Lemma 1.2.1

Fix an arbitrary sequence $\{z_t\}_{t=1}^{T}$ of customers and an algorithm $A$. Let the random variables $S_1, \ldots, S_T$ denote the sequence of assortments offered by the algorithm $A$, and let $\xi_1, \ldots, \xi_T$ be the random variables corresponding to the customers' choices. Note that $S_t$ may depend on $S_1, \ldots, S_{t-1}$ and $\xi_1, \ldots, \xi_{t-1}$. The expected revenue of the algorithm $A$ is given by

$$\mathbb{E}\left[\sum_{t=1}^{T}\sum_{S\in\mathcal{S}}\sum_{i\in S} r_i\,\mathbb{1}[S_t = S,\ \xi_t = i]\right] = \sum_{t=1}^{T}\sum_{S\in\mathcal{S}}\sum_{i\in S} r_i\,\mathbb{E}\left[\phi_i^{z_t}(S_t)\,\mathbb{1}[S_t = S]\right] = \sum_{t=1}^{T}\sum_{S\in\mathcal{S}}\sum_{i=1}^{n} r_i\,\phi_i^{z_t}(S)\,\Pr\{S_t = S\},$$

where the first equality follows from the tower property of conditional expectation and the fact that $\mathbb{E}\left[\mathbb{1}[\xi_t = i] \mid S_t\right] = \phi_i^{z_t}(S_t)$, and the last equality follows from the fact that $\mathbb{E}\left[\mathbb{1}[S_t = S]\right] = \Pr\{S_t = S\}$.

For $t = 1, \ldots, T$ and $S \in \mathcal{S}$, let $y^t(S) = \Pr\{S_t = S\}$. To complete the proof, it suffices to show that $y^t(S)$ satisfies the constraints of $\mathrm{Primal}\{z_t\}_{t=1}^{T}$. Clearly, $y^t(S) \ge 0$. Also, by definition, $\sum_{S\in\mathcal{S}} y^t(S) = 1$. Finally, since the algorithm cannot sell more than the initial inventory of each product, we have that for every product $i$, with probability one,

$$\sum_{t=1}^{T}\sum_{S\in\mathcal{S}} \mathbb{1}[S_t = S,\ \xi_t = i] \le c_i,$$

and by taking expectation on both sides, we have $\sum_{t=1}^{T}\sum_{S\in\mathcal{S}} \phi_i^{z_t}(S)\,y^t(S) \le c_i$.
This shows that $\{y^t(S) : S \in \mathcal{S}\}$ is a feasible solution of $\mathrm{Primal}\{z_t\}_{t=1}^{T}$, which is the desired result.

A.1.2 Proof of Theorem 1.3.3

In this section, we prove Theorem 1.3.3. We start with the following lemma.

Lemma A.1.1. For any increasing, concave, twice-differentiable penalty function $\Psi : [0,1] \to [0,1]$, the function $x \mapsto \frac{1-x}{2-x-\Psi(x)}$ is increasing on $[0,1]$, and for any $a \in [0,1]$, the function $C \mapsto \frac{1}{C} + \int_{a+\frac{1}{C}}^{1} \Psi(y)\,dy$ is decreasing on $[1/(1-a), \infty)$.

Proof. The derivative of $x \mapsto \frac{1-x}{2-x-\Psi(x)}$ is equal to

$$\frac{-1 + \Psi(x) + (1-x)\Psi'(x)}{(2-x-\Psi(x))^2}.$$

The numerator $-1 + \Psi(x) + (1-x)\Psi'(x)$ is equal to 0 at $x = 1$ (because $\Psi(1) = 1$), and its derivative is equal to $(1-x)\Psi''(x) \le 0$ because $\Psi$ is concave. The numerator is therefore decreasing in $x$ and vanishes at $x = 1$, so $-1 + \Psi(x) + (1-x)\Psi'(x) \ge 0$ for all $x \in [0,1]$. To complete the proof, note that the derivative of $C \mapsto \frac{1}{C} + \int_{a+\frac{1}{C}}^{1} \Psi(y)\,dy$ is equal to $-(1/C)^2\left(1 - \Psi\left(a + \frac{1}{C}\right)\right) \le 0$.

Let $\{z_t\}_{t=1}^{T}$ be an arbitrary sequence of customers. Note that by Lemma 1.3.2, the Inventory-Balancing algorithm respects the capacity constraints of the problem. However, its solution may not correspond to a feasible solution of $\mathrm{Primal}\{z_t\}_{t=1}^{T}$. In order to compare the expected revenue of our algorithm with the upper bound given by $\mathrm{Primal}\{z_t\}_{t=1}^{T}$, we construct a sequence of feasible dual solutions. The dual of $\mathrm{Primal}\{z_t\}_{t=1}^{T}$ is given below:

$$\begin{aligned}
\text{MINIMIZE}\quad & \sum_{t=1}^{T} \lambda^t + \sum_{i=1}^{n} \theta_i c_i \\
\text{SUBJECT TO:}\quad & \lambda^t \ge \sum_{i=1}^{n} \phi_i^{z_t}(S)\,(r_i - \theta_i), && 1 \le t \le T,\ S \in \mathcal{S}, \\
& \theta_i \ge 0, && 1 \le i \le n.
\end{aligned} \qquad (\mathrm{Dual}\{z_t\}_{t=1}^{T})$$

Based on the realization of customers' choices, we construct a feasible solution for the linear program $\mathrm{Dual}\{z_t\}_{t=1}^{T}$ as follows:

$$\theta_i = r_i\left(1 - \Psi\left(I_i^T/c_i\right)\right), \quad i = 1, 2, \ldots, n; \qquad \lambda^t = \sum_{i\in S_t} r_i\,\Psi\left(I_i^{t-1}/c_i\right)\phi_i^{z_t}(S_t), \quad t = 1, 2, \ldots, T.$$

Note that $\theta_i$ and $\lambda^t$ are random variables because they depend on the inventory levels, which are random.
However, they form a feasible solution for the dual, with probability one, because

$$\lambda^t = \sum_{i\in S_t} r_i\,\Psi\left(I_i^{t-1}/c_i\right)\phi_i^{z_t}(S_t) \ge \sum_{i\in S_t} r_i\,\Psi\left(I_i^{T}/c_i\right)\phi_i^{z_t}(S_t) = \sum_{i\in S_t} (r_i - \theta_i)\,\phi_i^{z_t}(S_t),$$

where the inequality follows from the fact that $\Psi$ is increasing and $I_i^{t-1} \ge I_i^{T}$, and the equality follows from the definition of $\theta_i$, that is, $r_i\,\Psi(I_i^T/c_i) = (r_i - \theta_i)$.

We now calculate the expected value of this dual solution, which will provide an upper bound on the value of $\mathrm{Primal}\{z_t\}_{t=1}^{T}$ by the Weak Duality Theorem. Since the sequence of the customers $\{z_t\}_{t=1}^{T}$ is fixed, the expectation is with respect to the realization of each customer's choice. Recall that $Q_i^t$ is a binary random variable that is equal to 1 if the customer chooses product $i$ in period $t$, and 0 otherwise. Thus,

$$\begin{aligned}
\mathbb{E}\left[\sum_{t=1}^{T} \lambda^t\right]
&= \mathbb{E}\left[\sum_{t=1}^{T}\sum_{S' \subseteq S_t}\sum_{i\in S'} r_i\,\Psi\left(I_i^{t-1}/c_i\right)\phi^{z_t}(S', S_t)\right]
= \mathbb{E}\left[\sum_{t=1}^{T}\sum_{i=1}^{n} r_i\,\Psi\left(I_i^{t-1}/c_i\right) Q_i^t\right] \\
&= \mathbb{E}\left[\sum_{t=1}^{T}\sum_{i=1}^{n} r_i\,\Psi\left(I_i^{t-1}/c_i\right)\left(I_i^{t-1} - I_i^{t}\right)\right]
= \mathbb{E}\left[\sum_{i=1}^{n} r_i \sum_{t=I_i^T+1}^{c_i} \Psi(t/c_i)\right],
\end{aligned}$$

where the second equality follows from the tower property of conditional expectation and the fact that $\mathbb{E}\left[Q_i^t \mid I_1^{t-1}, \ldots, I_n^{t-1}\right] = \phi_i^{z_t}(S_t)$, since $S_t$ is a function of $I_1^{t-1}, \ldots, I_n^{t-1}$. The third equality follows from the observation that $I_i^{t-1} - I_i^{t} = Q_i^t$. The final equality follows because the $k$-th sold unit of product $i$ contributes an amount of $\Psi\left((c_i - k + 1)/c_i\right)$ to the summation. Since $\theta_i$ and $\lambda^t$ are dual feasible, it follows from the Weak Duality Theorem that

$$\mathbb{E}\left[\sum_{t=1}^{T} \lambda^t + \sum_{i=1}^{n} c_i\theta_i\right] = \mathbb{E}\left[\sum_{i=1}^{n} r_i\left(\sum_{t=I_i^T+1}^{c_i} \Psi(t/c_i) + c_i\left(1 - \Psi\left(I_i^T/c_i\right)\right)\right)\right] \ge \mathrm{Primal}\{z_t\}_{t=1}^{T}.$$

On the other hand, the expected revenue of the Inventory-Balancing algorithm is equal to $\mathbb{E}\left[\sum_{i=1}^{n} r_i\,(c_i - I_i^T)\right]$.
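Therefore, the competitive ratio is at least

$$\frac{\mathbb{E}\left[\sum_{i=1}^{n} r_i\,(c_i - I_i^T)\right]}{\mathrm{Primal}\{z_t\}_{t=1}^{T}} \ge \frac{\mathbb{E}\left[\sum_{i=1}^{n} r_i\,(c_i - I_i^T)\right]}{\mathbb{E}\left[\sum_{i=1}^{n} r_i\left(\sum_{t=I_i^T+1}^{c_i} \Psi(t/c_i) + c_i\left(1 - \Psi\left(I_i^T/c_i\right)\right)\right)\right]}.$$

Note that if $I_i^T = c_i$, then the contribution of product $i$ to both the revenue of our algorithm and the constructed dual solution is zero. Therefore, the competitive ratio of the algorithm is at least

$$\min_{(c_i, I_i^T)\,:\, I_i^T \le c_i - 1} \frac{c_i - I_i^T}{\sum_{t=I_i^T+1}^{c_i} \Psi(t/c_i) + c_i\left(1 - \Psi\left(I_i^T/c_i\right)\right)} = \min_{(c_i, x)\,:\, x \le 1 - \frac{1}{c_i}} \frac{1 - x}{\frac{1}{c_i}\sum_{t=I_i^T+1}^{c_i} \Psi(t/c_i) + \left(1 - \Psi(x)\right)},$$

where the equality follows from the variable transformation $x = I_i^T/c_i$. Because $\Psi(1) = 1$ and $\Psi$ is increasing, we have

$$\frac{1}{c_i}\sum_{t=I_i^T+1}^{c_i} \Psi(t/c_i) = \frac{1}{c_i}\left(1 + \sum_{t=I_i^T+1}^{c_i - 1} \Psi(t/c_i)\right) \le \frac{1}{c_i} + \int_{\frac{I_i^T+1}{c_i}}^{1} \Psi(y)\,dy.$$

Putting everything together, we have the following lower bound on the competitive ratio:

$$\min_{(c_i, x)\,\in\,\mathbb{R}_+ \times \left[0,\,1-\frac{1}{c_i}\right]} \left\{\frac{1 - x}{\frac{1}{c_i} + 1 - \Psi(x) + \int_{x+\frac{1}{c_i}}^{1} \Psi(y)\,dy}\right\}.$$

To complete the proof, it suffices to show that the above ratio is lower bounded by

$$\alpha_{c_{\mathrm{MIN}}}(\Psi) := \min_{x \in \left[0,\,1-\frac{1}{c_{\mathrm{MIN}}}\right]} \frac{1 - x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \int_{x+\frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\,dy}$$

defined in Theorem 1.3.3. Consider an arbitrary $(c_i, x) \in \mathbb{R}_+ \times \left[0, 1 - \frac{1}{c_i}\right]$. There are two cases to consider: $x \le 1 - \frac{1}{c_{\mathrm{MIN}}}$ and $x > 1 - \frac{1}{c_{\mathrm{MIN}}}$. In the first case, since the function $C \mapsto \frac{1}{C} + \int_{x+\frac{1}{C}}^{1} \Psi(y)\,dy$ is decreasing by Lemma A.1.1 and $c_i \ge c_{\mathrm{MIN}}$, we have

$$\frac{1 - x}{\frac{1}{c_i} + 1 - \Psi(x) + \int_{x+\frac{1}{c_i}}^{1} \Psi(y)\,dy} \ge \frac{1 - x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \int_{x+\frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\,dy} \ge \alpha_{c_{\mathrm{MIN}}}(\Psi).$$

In the second case, we have $x > 1 - \frac{1}{c_{\mathrm{MIN}}}$. Recall that to compute the minimum competitive ratio, we need to consider only $I_i^T < c_i$; thus, $(c_i, x) \in \mathbb{R}_+ \times \left[0, 1 - \frac{1}{c_i}\right]$, and as a result we have $x \le 1 - \frac{1}{c_i}$, or equivalently, $c_i \ge 1/(1-x)$.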
Applying Lemma A.1.1 once again with $\frac{1}{1-x}$ as a lower bound for $c_i$, we get

$$\frac{1 - x}{\frac{1}{c_i} + 1 - \Psi(x) + \int_{x+\frac{1}{c_i}}^{1} \Psi(y)\,dy} \ge \frac{1 - x}{(1-x) + 1 - \Psi(x) + \int_{x+(1-x)}^{1} \Psi(y)\,dy} = \frac{1 - x}{2 - x - \Psi(x)} \ge \frac{1 - \left(1 - \frac{1}{c_{\mathrm{MIN}}}\right)}{2 - \left(1 - \frac{1}{c_{\mathrm{MIN}}}\right) - \Psi\left(1 - \frac{1}{c_{\mathrm{MIN}}}\right)} \ge \alpha_{c_{\mathrm{MIN}}}(\Psi),$$

where the second inequality follows from the fact that $x > 1 - \frac{1}{c_{\mathrm{MIN}}}$ and Lemma A.1.1, which shows that $\frac{1-x}{2-x-\Psi(x)}$ is increasing in $x$. The last inequality holds because the middle expression is the ratio in the definition of $\alpha_{c_{\mathrm{MIN}}}(\Psi)$ evaluated at $x = 1 - \frac{1}{c_{\mathrm{MIN}}}$. This completes the proof.

A.1.3 Proof of Proposition 1.4.2

In this section, we characterize the performance guarantee of the EIB algorithm in the I.I.D. arrival model. Precisely, we look at a random arrival model, which encompasses the I.I.D. model (footnote 1). In the random arrival model, the total number of customers $T$ and the number of customers of each type are chosen by an adversary, but the arrival order is chosen uniformly at random. The performance guarantee of the EIB algorithm in the random arrival model is stated in the following proposition.

Proposition A.1.2 (Performance Guarantee in the Random Arrival Model). Suppose the penalty function is exponential (EIB), $\Psi(x) = \frac{e}{e-1}\left(1 - e^{-x}\right)$, $x \in [0,1]$. If $c_{\mathrm{MIN}} \to \infty$ and $\frac{1}{\varepsilon\, c_{\mathrm{MIN}}} = O(1)$ (footnote 2), the ratio of the expected revenue of the EIB algorithm, $\mathbb{E}_{\{z_t\}_{t=1}^{T}}\left[\mathrm{Rev}^{\mathrm{EIB}}\{z_t\}_{t=1}^{T}\right]$, to the expected revenue of the optimal solution, Primal-S, is bounded below by the solution of the following linear program.

MINIMIZE ; ; ( 1 ) e e1 P 1 1 j=0 j; 1 j e e j1 +e 1 SUBJECT TO: P 1 1 j=0 j;k = 1 0<k 1 ; j;k j;k 0j 1 ; 0<k 1 ; P 1 1 l=j l;k+1 P 1 1 l=j l;k 0j 1 ; 0<k 1 ; P 1 1 j=0 j;k R 1 y=1j (y)dy (k) 0<k 1 ; P 1 1 j=0 j;k+1 1 (j + 1) (k + 1)(k) 0<k 1 1: (FRLP)

Footnote 1. Consider the sample space of sequences of customer types in the I.I.D. model and divide it into groups such that in each group the number of customers of each type is the same for every sequence. Since each group includes all the equally likely permutations of some sequence, every sequence of customers (and their types) in the I.I.D. model can be mapped to a sequence of customers in the random arrival model.
Footnote 2. We use the standard Landau notation: $f(n) = O(g(n))$ denotes $|f(n)| \le d\,|g(n)|$, where $d$ is some constant.

where $\varepsilon$ is a discretization parameter. For the linear penalty function (LIB), $\Psi(x) = x$, the ratio of the expected revenue of the Inventory-Balancing algorithm, $\mathbb{E}_{\{z_t\}_{t=1}^{T}}\left[\mathrm{Rev}^{\mathrm{LIB}}\{z_t\}_{t=1}^{T}\right]$, to Primal-S is at least equal to the solution of the linear program FRLP with objective function ( 1 ) + 1 2 P 1 1 j=0 j; 1 (j) 2 .

Proposition A.1.2 shows that the expected revenue of the EIB and LIB algorithms is at least equal to Primal-S times the solution of the linear program FRLP. The proof is given in Section A.4.2. The main idea comes from the factor-revealing linear programs of Mirrokni et al. (2012), where the competitive ratio of an algorithm is bounded by the solution of a linear program. In particular, we show that the objective function corresponds to the revenue of the IB algorithms when the value of Primal-S is normalized to one, and that any solution of the IB algorithms corresponds to a feasible solution of FRLP. The revenue of the IB algorithms is evaluated by a potential function, which is an integral of the penalty function; see the precise definition in Section A.1.2. That is, the IB algorithms perform well if they can maximize the final value of the potential function, which is the first term of the FRLP objective. Thus, we need to monitor the increase in the potential function during the run of the algorithms (footnote 3). Interestingly, the last set of constraints implies that the potential function has a significant increase over time, with high probability. Hence, we can lower bound the final value of the potential function, or equivalently the overall performance of the IB algorithms, as a function of the optimal solution, which is measured by the LP variables indexed by $(j, k)$.

As stated, $\varepsilon$ is a discretization parameter; that is, the horizon $T$ is divided into $\frac{1}{\varepsilon}$ time slots, each with an integer length $\varepsilon T$. Because $\varepsilon$ is a discretization parameter, we prefer to choose it as small as possible. In particular, as $\varepsilon \to 0$, $T$ should approach infinity ($T \to \infty$).
Note that we have assumed $\frac{1}{\varepsilon\, c_{\mathrm{MIN}}} = O(1)$. Thus, depending on how $T$ scales, we can determine how fast $T$ increases toward infinity; e.g., given that $\varepsilon T$ is a finite integer, $T = O(c_{\mathrm{MIN}})$.

The optimal values of the linear program FRLP for the LIB algorithm with $\varepsilon = \frac{1}{20}$, $\frac{1}{30}$, and $\frac{1}{50}$ are 0.69, 0.71, and 0.72, respectively. The corresponding values for the EIB algorithm are 0.72, 0.74, and 0.75, respectively. We observe that as $\varepsilon$ gets smaller, the optimal value of FRLP becomes more accurate and converges to 0.72 and 0.75 for the LIB and EIB algorithms, respectively. Because the random arrival model encompasses the I.I.D. model, we obtain the performance guarantee of the LIB and EIB algorithms in the I.I.D. model given in Proposition 1.4.2.

Footnote 3. Note that due to discretization, we only measure the amount of increase in the potential function at $\frac{1}{\varepsilon}$ points.

A.1.4 Learning the Customer Types under the Multinomial Logit Model

Suppose the choice model of each customer type is described by a multinomial logit. If we show all products to $m$ independent customers of type $z$ and compute the maximum likelihood estimates $(\hat{V}_0^z, \ldots, \hat{V}_n^z)$, then it is a standard result that

$$\Pr\left\{\max_{i,S}\left|\hat{\phi}_i^z(S) - \phi_i^z(S)\right| > \delta\right\} \le d\,e^{-\delta^2 m}.$$

Here, $\hat{\phi}_i^z(S) = \hat{V}_i^z / \left(\hat{V}_0^z + \sum_{\ell\in S} \hat{V}_\ell^z\right)$ and $d$ is a constant; see, for example, Rusmevichientong et al. (2010). Therefore, $\mathbb{E}\left[\max_{i,S}\left|\hat{\phi}_i^z(S) - \phi_i^z(S)\right|\right] \le \delta + d\,e^{-\delta^2 m}$.

Now consider the following variation of the IB algorithm. Upon the arrival of the customer in period $1 \le t \le T$, of type $z_t$, with probability $0 < \gamma < 1$ we explore, i.e., we show all the products to the customer, and with probability $1 - \gamma$ we offer an assortment $S_t = \arg\max_{S\in\mathcal{S}} \sum_{i\in S} \Psi\left(I_i^{t-1}/c_i\right) r_i\,\hat{\phi}_i^{z_t}(S)$, where $\hat{\phi}_i^{z_t}(S)$ is the estimated selection probability, as described above, computed from previous sales data. Note that the number of observations up to period $t$, with high probability, is approximately $\Theta(\gamma t)$.
Hence, by setting $\delta = \frac{1}{t^{(1-\epsilon_1)/2}}$ and $m = \gamma t$, where $0 < \epsilon_1 < 1$, we have

$$\mathbb{E}\left[\max_{i,S}\left|\hat{\phi}_i^{z_t}(S) - \phi_i^{z_t}(S)\right|\right] = O\left(\frac{1}{t^{(1-\epsilon_1)/2}} + e^{-\gamma t^{\epsilon_1}}\right),$$

which implies that $\mathbb{E}\left[\sum_{t=1}^{T} \eta_t\right] = o(T)$ (footnote 4), where $\eta_t$ denotes the estimation error in period $t$. Observe that as $c_{\mathrm{MIN}}$ and $T$ grow proportionally, $\mathbb{E}\left[\sum_{t=1}^{T} \eta_t\right]/c_{\mathrm{MIN}}$ approaches 0. By Proposition 1.5.1 and the fact that the algorithm loses at most a $\gamma$ fraction of its revenue during explorations (footnote 5), the competitive ratio of the modified algorithm, as $c_{\mathrm{MIN}}$ and $T$ proportionally tend to infinity, would be equal to $(1-\gamma)\,\alpha(\Psi)$. Note that the modified algorithm, because of the constant rate of sampling, would still perform well if the choice models change slowly over time.

Footnote 4. $f(x) = o(g(x))$ if $\lim_{x\to\infty} \frac{f(x)}{g(x)} = 0$.

Footnote 5. We allocate a $\gamma$ fraction of the inventory of each product for “exploration”.

A.2 Computational Complexity of Inventory-Balancing Algorithms

The complexity of the Inventory-Balancing algorithm is measured by the complexity of the following combinatorial optimization problem: $\max_{S\in\mathcal{S}} \sum_{i\in S} r_i\,\Psi\left(I_i^{t-1}/c_i\right)\phi_i^{z_t}(S)$, where $r_i\,\Psi\left(I_i^{t-1}/c_i\right)$ is the discounted revenue of product $i$ in period $t$. As shown in the following examples, for a broad class of choice models this problem can be solved efficiently. Also, all of the choice models in these examples satisfy Assumption 1. We note that in Example A.2.1, no constraints on the assortments are included.

Example A.2.1 (Multinomial and Nested Logit). Under the multinomial logit (MNL) choice model,

$$\phi_i^z(S) = \begin{cases} v_i^z \Big/ \left(v_0^z + \sum_{\ell\in S} v_\ell^z\right) & \text{if } i \in S \cup \{0\}, \\ 0 & \text{otherwise}, \end{cases}$$

where for each $i \in \{0, 1, 2, \ldots, n\}$, $v_i^z \in \mathbb{R}_+$ denotes the preference weight parameter associated with product $i$ for a customer of type $z$. As shown in Talluri & van Ryzin (2004b), the assortment optimization problem $\max_S \sum_{i\in S} w_i\,\phi_i^z(S)$ can be solved efficiently by simply sorting the products in descending order of the marginal revenue $w_i$, and the optimal assortment can be found among the revenue-ordered assortments $\{1\}, \{1,2\}, \ldots, \{1,2,\ldots,n\}$. Li et al.
(2013) showed that the above assortment optimization problem can also be solved in $O(dn\log n)$ operations for a $d$-level nested logit choice model, which generalizes the MNL to allow for a product taxonomy that is described by a tree with $d$ levels (the MNL corresponds to the special case where $d = 1$).

Example A.2.2 (Choice Models with Constraints). Our formulation automatically allows for constraints on the assortments, through the specification of the set $\mathcal{S}$ of feasible assortments. For example, if we have a budget or shelf-space constraint, we can define $\mathcal{S} = \left\{S : \sum_{i\in S} d_i \le B\right\}$, where $d_i$ is the cost (or space) associated with showing product $i$ and $B$ is the budget. When the underlying choice model is an MNL or a nested logit model, the assortment optimization problem can still be solved efficiently; see, for example, Rusmevichientong et al. (2009, 2010), Gallego & Topaloglu (2012).

Example A.2.3 (General Choice Models). Farias et al. (2013) have pioneered a novel algorithm that learns a general nonparametric choice model from transaction data. Under their framework, for each customer of type $z$, the choice model corresponds to a probability distribution $\mu^z$ over the set of all permutations $\sigma$ of $\{1, \ldots, n\}$, where for each $i$, $\sigma(i) \in \{1, 2, \ldots, n\}$ denotes the rank of product $i$, with 1 being the highest rank (most preferred). They assume that the customer chooses the product with the highest rank. In this case,

$$\phi_i^z(S) = \sum_{\sigma} \mu^z(\sigma)\,\mathbb{1}\left[\sigma(i) \le \min_{k\in S\cup\{0\}} \sigma(k)\right].$$

It is easy to verify that $\phi_i^z(S) \ge \phi_i^z(S \cup \{j\})$ for $j \ne i$. So, this choice model satisfies Assumption 1. Moreover, Farias et al. (2011) described an algorithm for solving the assortment optimization problem $\max_S \sum_{i\in S} w_i\,\phi_i^z(S)$ under their general choice model framework. This algorithm, which can be used as a subroutine in our Inventory-Balancing algorithm, is efficient for the MNL choice model.
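The per-period problem of Section A.2 can be illustrated for the unconstrained MNL case of Example A.2.1. The following is a minimal sketch, not the implementation used in the thesis: it assumes 0-indexed products, a no-purchase weight `v0`, and hypothetical helper names (`eib_penalty`, `lib_penalty`, `ib_assortment`, `brute_force`). It discounts each revenue by the penalty of the remaining inventory fraction and then scans the revenue-ordered assortments; the exhaustive search is included only to check the result on small instances.

```python
import math
from itertools import combinations


def eib_penalty(x):
    """Exponential penalty Psi(x) = e/(e-1) * (1 - exp(-x)) for x in [0, 1]."""
    return math.e / (math.e - 1.0) * (1.0 - math.exp(-x))


def lib_penalty(x):
    """Linear penalty Psi(x) = x for x in [0, 1]."""
    return x


def expected_value(assortment, w, v, v0=1.0):
    """Sum over i in the assortment of w[i] * phi_i(assortment) under MNL."""
    denom = v0 + sum(v[i] for i in assortment)
    return sum(w[i] * v[i] for i in assortment) / denom


def best_revenue_ordered(w, v, v0=1.0):
    """Scan the nested assortments obtained by sorting products by w, descending."""
    # Products with zero discounted revenue (e.g. sold out) are never offered.
    order = sorted((i for i in range(len(w)) if w[i] > 0), key=lambda i: -w[i])
    best, best_val = frozenset(), 0.0
    for k in range(1, len(order) + 1):
        cand = frozenset(order[:k])
        val = expected_value(cand, w, v, v0)
        if val > best_val:
            best, best_val = cand, val
    return best, best_val


def ib_assortment(r, inventory, capacity, v, penalty, v0=1.0):
    """One Inventory-Balancing step: discount each revenue by Psi(I_i / c_i)."""
    w = [r[i] * penalty(inventory[i] / capacity[i]) for i in range(len(r))]
    return best_revenue_ordered(w, v, v0)


def brute_force(w, v, v0=1.0):
    """Exhaustive search over all non-empty assortments (small n only)."""
    items = range(len(w))
    subsets = (frozenset(c) for k in items for c in combinations(items, k + 1))
    return max(expected_value(s, w, v, v0) for s in subsets)


if __name__ == "__main__":
    r, v = [10.0, 8.0, 5.0], [1.0, 2.0, 3.0]  # illustrative numbers only
    print(ib_assortment(r, [10, 10, 10], [10, 10, 10], v, eib_penalty))
    print(ib_assortment(r, [0, 5, 10], [10, 10, 10], v, lib_penalty))
```

With full inventory both penalties equal one, so the step reduces to the usual revenue-ordered search; once a product is depleted its discounted revenue is zero and it drops out of the candidate list. A constrained feasible set as in Example A.2.2 would need a different subroutine.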
A.3 Numerical Experiments: Appendix to Section 1.6

A.3.1 Known Length of the Horizon

In this section, we compare the performance of the EIB, LIB, and Hybrid algorithms to the Myopic policy and the LP-based heuristics when the length of the horizon is known in advance. We set the initial inventory levels to 100, i.e., $c_i = 100$, $i = 1, 2, \ldots, 73$.

Performance Evaluation: In Table A.1, we present the average revenue of each algorithm as a percentage of the upper bound, averaged over all 250 problem instances, for loading factors (LF) of 1.4, 1.6, and 1.8 and for coefficients of variation (CV) of 0.1, 1, and 2. As the table shows, when the number of customers is known in advance, the LPR 500 algorithm obtains more than 99% of the optimal solution for all the considered problem classes, which implies that having more resolving periods is not necessary.

| LF | CV | Upper Bound (in $1000) | EIB | LIB | Myopic Policy | LPO | ALPO | LPR 500 | H1.5 | H2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.4 | 2.0 | 173 | 97.3 | 97.2 | 96.2 | 69.3 | 77.9 | 99.3 | 98.6 | 98.9 |
| 1.4 | 1.0 | 179 | 97.6 | 97.8 | 95.4 | 83.5 | 89.2 | 99.3 | 98.5 | 98.8 |
| 1.4 | 0.1 | 182 | 98.1 | 98.5 | 95.8 | 95.8 | 98.0 | 99.5 | 98.9 | 99.1 |
| 1.6 | 2.0 | 175 | 98.2 | 98.3 | 97.2 | 68.4 | 75.4 | 99.4 | 98.9 | 99.0 |
| 1.6 | 1.0 | 181 | 98.7 | 98.9 | 97.4 | 83.6 | 88.3 | 99.6 | 99.0 | 99.0 |
| 1.6 | 0.1 | 183 | 99.3 | 99.3 | 97.8 | 95.8 | 97.8 | 99.7 | 99.4 | 99.5 |
| 1.8 | 2.0 | 177 | 98.8 | 98.9 | 98.0 | 65.5 | 72.1 | 99.4 | 99.1 | 99.1 |
| 1.8 | 1.0 | 182 | 99.2 | 99.3 | 98.4 | 79.8 | 84.6 | 99.5 | 99.3 | 99.4 |
| 1.8 | 0.1 | 183 | 99.7 | 99.8 | 99.4 | 95.5 | 97.6 | 99.8 | 99.8 | 99.8 |

Table A.1: Revenue comparison when the length of the horizon is known in advance. Columns EIB through H2 report the average revenue under each policy as a percentage of the upper bound: EIB and LIB are the Inventory-Balancing policies, LPO and ALPO are the One-shot LP heuristics, LPR 500 is the LP Resolving heuristic, and H1.5 and H2 are the Hybrid policies. The standard errors of all numbers are less than 0.1%.

We note that both the LIB and EIB algorithms outperform the Myopic and LPO algorithms. Moreover, the revenue of the EIB and LIB algorithms is within 2% of that of the resolving heuristics. Comparing the performance of LPR 500 in Tables A.1 and 1.3 implies that the LPR heuristics are sensitive to uncertainty in the number of customers.
Precisely, the performance of the LPR heuristic decreases significantly when it does not know the exact length of the horizon. In all problem classes, the Hybrid algorithms yield more revenue than the IB policies since they incorporate additional information about the arrival sequence by using the LP resolving heuristic.

Again, in all cases, the LPO algorithm has the lowest revenue, and its performance decreases as the CV and the loading factor increase. Observe that when CV = 0.1, the One-shot LP heuristics obtain more than 95% of the optimal clairvoyant solution. For small values of CV, the number of customers of each type is highly concentrated around its average. Therefore, these heuristics do not suffer from fixing their strategies at the beginning of the horizon. Note that even for small values of CV, our IB algorithms perform better than the One-shot LP heuristics.

Transient Behavior: Figure A.1 shows the cumulative revenue over time for the Myopic, LIB, LPR 500, LPO, and ALPO algorithms with LF = 1.8 and CV = 2. We observe that the Myopic policy and the LIB algorithm are very aggressive during the initial periods, resulting in higher cumulative revenues than the One-shot LP and resolving heuristics. Since the resolving heuristics know the exact number of customers in advance, they manage to earn revenue linearly over time. This implies that knowing the true length of the horizon (number of customers) is essential for the resolving heuristics; that is, if the number of customers is less than its estimated value, these heuristics will suffer from significant revenue loss (see Section 1.6.2).

A.3.2 Worst-Case Performance

In Section 1.6.2, we compared different policies in terms of their average performance. Here, we investigate the worst-case performance of different policies. To this aim, we consider 250 random arrival sequences. For each of them, we compute the ratio of the revenue collected by each policy to the corresponding optimal clairvoyant solution.
Then, the worst-case performance of any policy is defined as the minimum of these ratios. Table A.2 presents the worst-case performance of all policies for LF = 1.4, 1.6, and 1.8 and CV = 0.1, 1, and 2 when the length of the horizon is drawn from the uniform distribution with TT = E[T]. Our IB policies outperform the other policies in terms of worst-case performance; that is, they obtain at least 91% of the optimal clairvoyant solution, which is much higher than the theoretical bounds, i.e., 63% for the EIB policy and 50% for the LIB policy. We observe that the LP resolving heuristics perform poorly compared to the IB and Hybrid algorithms. Furthermore, the One-shot LP heuristics are very sensitive to uncertainty in the arrival sequence (large CV). For instance, when LF = 1.8 and CV = 2, there is an arrival sequence in which they obtain only 3.8% of the optimal clairvoyant solution.

[Figure A.1: The cumulative revenue over time for LF = 1.8 and CV = 2 when the length of the horizon is known in advance. Horizontal axis: Period; vertical axis: Revenue.]

Column groups: Inventory-Balancing (EIB, LIB); Myopic policy; One-shot LP (LPO, ALPO); LP Resolving (LPR 500, LPR 50); Hybrid (H1.5, H2). All entries are worst-case revenues as a percentage of the upper bound.

| LF  | CV  | EIB  | LIB  | Myopic | LPO  | ALPO | LPR 500 | LPR 50 | H1.5 | H2   |
|-----|-----|------|------|--------|------|------|---------|--------|------|------|
| 1.4 | 2.0 | 91.8 | 91.4 | 91.0   | 13.2 | 14.4 | 75.8    | 76.7   | 88.4 | 83.2 |
| 1.4 | 1.0 | 92.2 | 92.0 | 91.9   | 16.8 | 17.8 | 76.7    | 77.4   | 88.8 | 84.7 |
| 1.4 | 0.1 | 92.2 | 91.8 | 91.4   | 72.6 | 73.3 | 78.9    | 79.1   | 89.5 | 85.4 |
| 1.6 | 2.0 | 92.5 | 92.0 | 92.0   | 8.6  | 8.8  | 70.9    | 73.3   | 90.0 | 86.9 |
| 1.6 | 1.0 | 93.2 | 91.7 | 91.2   | 41.8 | 41.9 | 72.7    | 73.1   | 89.8 | 87.3 |
| 1.6 | 0.1 | 92.7 | 92.8 | 91.3   | 66.5 | 67.4 | 73.4    | 74.5   | 91.1 | 87.3 |
| 1.8 | 2.0 | 92.4 | 92.3 | 91.2   | 3.8  | 3.8  | 66.2    | 67.4   | 90.5 | 87.1 |
| 1.8 | 1.0 | 92.8 | 92.5 | 92.0   | 20.9 | 20.3 | 68.4    | 69.0   | 91.8 | 88.9 |
| 1.8 | 0.1 | 93.1 | 93.2 | 91.6   | 60.3 | 60.5 | 67.8    | 68.2   | 92.6 | 90.3 |

Table A.2: Worst-case performance comparison when the length of horizon is unknown.

A.3.3 Learning the Customer Types

Here we investigate the performance of the IB algorithms when we do not know the exact value of the selection probability $\phi^{z_t}_i(S)$.
Rather, we only have an estimate $\hat{\phi}^{z_t}_i(S)$ based on data collected in the previous periods. Since we assume the multinomial logit choice model for each customer type, we maintain an estimate $V^z(t) = (V^z_0(t), V^z_1(t), \ldots, V^z_n(t))$ of the preference weight parameters, where for each product $i$, we set $V^z_i(t)$ to be proportional to the number of times that a customer of type $z$ purchases DVD $i$ during the previous $t-1$ periods, and we normalize $V^z(t)$ so that $V^z_0(t) = 1$. Similar to the previous section, we have 10 customer types and 73 products with an initial inventory of $c_i = 30$.

| Loading Factor (LF) | Coefficient of Variation (CV) | Upper Bound (in hundreds of dollars) | LIB (% of upper bound) | EIB (% of upper bound) |
|---------------------|-------------------------------|--------------------------------------|------------------------|------------------------|
| 1.2                 | 0.2                           | 526                                  | 91.3                   | 91.4                   |
| 1.2                 | 0.5                           | 524                                  | 89.6                   | 89.6                   |
| 1.2                 | 0.8                           | 522                                  | 86.2                   | 86.4                   |
| 1.4                 | 0.2                           | 546                                  | 95.4                   | 95.7                   |
| 1.4                 | 0.5                           | 544                                  | 92.9                   | 93.2                   |
| 1.4                 | 0.8                           | 541                                  | 89.9                   | 90.6                   |
| 1.6                 | 0.2                           | 549                                  | 98.6                   | 98.5                   |
| 1.6                 | 0.5                           | 548                                  | 96.7                   | 96.9                   |
| 1.6                 | 0.8                           | 546                                  | 93.8                   | 93.9                   |

Table A.3: The average revenue for the LIB and EIB algorithms when the underlying parameters are unknown, and each algorithm uses the estimated parameters based on data collected in the previous periods.

Table A.3 shows the revenue of the IB algorithms when these algorithms only have estimates of the preference weight parameters. In absolute terms, the IB algorithms perform well despite not knowing the true parameter values; they obtain 83% to 98% of the upper bound, depending on the coefficient of variation and the loading factor. We observe better performance for a loading factor of 1.6 than for smaller loading factors. The reason is that a larger loading factor, or a longer horizon, allows the algorithms to obtain better estimates of the unknown parameters. Note that the IB algorithms perform well even with few observations. One of the reasons is that, in the setting above, we do not impose any constraint on the size of the assortment that the policies can offer to each customer. This could compensate for the inaccuracy in the estimation of the choice model since the algorithm can offer large assortments.
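The count-based estimate of the preference weights described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: the function names, the additive `smoothing` term, and the choice to normalize by the number of observed no-purchase events are hypothetical and are not taken from the thesis; the selection probabilities use the standard multinomial logit formula.

```python
from collections import Counter

def estimate_mnl_weights(purchases, n_products, smoothing=1.0):
    """Estimate MNL preference weights for one customer type.

    purchases: observed choices so far, where 0 means "no purchase" and
               1..n_products are product indices (hypothetical encoding).
    smoothing: additive count (an assumption) that avoids division by zero.
    Returns a list w with w[0] == 1 (no-purchase weight normalized to one).
    """
    counts = Counter(purchases)
    base = counts[0] + smoothing
    if base == 0:
        raise ValueError("need a no-purchase observation or smoothing > 0")
    return [(counts[i] + smoothing) / base for i in range(n_products + 1)]

def selection_probabilities(weights, assortment):
    """Standard MNL: P(i | S) = w_i / (w_0 + sum of w_k over k in S)."""
    denom = weights[0] + sum(weights[i] for i in assortment)
    probs = {i: weights[i] / denom for i in assortment}
    probs[0] = weights[0] / denom  # probability of leaving without a purchase
    return probs
```

With few observations the smoothing term dominates, so the estimated weights start close to uniform and move toward the empirical purchase frequencies as data accumulates.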
Furthermore, by Proposition 1.5.1, we expect the IB algorithms to be robust with respect to the preference weight parameters.

A.4 Relegated Proofs

Proof of Lemma 1.3.2: When the customer arrives in period $t$, if product $j$ has no remaining inventory, then $I^{t-1}_j = 0$, which implies that $r_j \Psi(I^{t-1}_j / c_j) = 0$. By Assumption 1, under our choice model, adding product $j$ to an assortment does not increase the probability that a customer will select other products. Recall that in the case that both sets $S$ and $S \cup \{j\}$ have the maximum discounted revenue, we choose the set with the smaller number of products. Therefore, product $j$ will never be included as a part of the optimal assortment.

Proof of Corollary 1.3.4: First, observe that $\Psi(x) \ge x$. This is because $\Psi$ is increasing and concave, and we have $\Psi(0) = 0$ and $\Psi(1) = 1$. By this observation, we have

\[
\begin{aligned}
\alpha(c_{\mathrm{MIN}})
&= \min_{x \in \left[0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]} \left\{ \frac{1-x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\,dy} \right\} \\
&\ge \min_{x \in \left[0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]} \left\{ \frac{1-x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - x + \int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} dy} \right\} \\
&= \min_{x \in \left[0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]} \left\{ \frac{1-x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - x + 1 - x - \frac{1}{c_{\mathrm{MIN}}}} \right\}
= \frac{1}{2}.
\end{aligned}
\]

The inequality follows from $\Psi(x) \ge x$ and from the fact that, for any $x \in \left[0, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]$, the lower limit of the integral, $x + \frac{1}{c_{\mathrm{MIN}}}$, does not exceed the upper limit of the integral, 1, and $\Psi(y) \le 1$.

In the following, we show that the competitive ratio of the IB algorithm with an increasing, strictly concave, and differentiable penalty function is strictly greater than $\frac{1}{2}$. First note that for $x = 0$, the lower bound of the competitive ratio,

\[
\frac{1-x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\,dy},
\]

is greater than $\frac{1}{2}$. This is because for a differentiable penalty function $\Psi$, $\int_{\frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\,dy$ is strictly less than $1 - \frac{1}{c_{\mathrm{MIN}}}$ and $\Psi(0) = 0$. Thus, the result holds because

\[
\min_{x \in \left(0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]} \left\{ \frac{1-x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - \Psi(x) + \int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} \Psi(y)\,dy} \right\}
> \min_{x \in \left(0,\, 1 - \frac{1}{c_{\mathrm{MIN}}}\right]} \left\{ \frac{1-x}{\frac{1}{c_{\mathrm{MIN}}} + 1 - x + \int_{x + \frac{1}{c_{\mathrm{MIN}}}}^{1} dy} \right\}
= \frac{1}{2}.
\]

The inequality holds because for any differentiable and strictly concave penalty function $\Psi(x)$, we have $\Psi(x) > x$ for all $x \in (0, 1)$.
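As a numerical sanity check on the lower bound used in the proof of Corollary 1.3.4, the minimization over $x$ can be evaluated on a grid. The sketch below is illustrative only: the function names, the grid sizes, and the midpoint quadrature are assumptions, not part of the original analysis. It evaluates the bound for the linear penalty $\Psi(x) = x$ and the exponential penalty $\Psi(x) = \frac{e}{e-1}(1 - e^{-x})$.

```python
import math

def ratio_bound(psi, c_min, grid=2000, quad=400):
    """Grid approximation of
    min over x in [0, 1 - 1/c_min] of
        (1 - x) / (1/c_min + 1 - psi(x) + integral of psi over [x + 1/c_min, 1]).
    """
    inv_c = 1.0 / c_min
    best = float("inf")
    for k in range(grid + 1):
        x = (1.0 - inv_c) * k / grid
        lo = x + inv_c
        # Midpoint rule for the integral of psi over [lo, 1].
        h = (1.0 - lo) / quad
        integral = sum(psi(lo + (m + 0.5) * h) for m in range(quad)) * h
        denom = inv_c + 1.0 - psi(x) + integral
        best = min(best, (1.0 - x) / denom)
    return best

def linear_penalty(x):
    return x

def exponential_penalty(x):
    return math.e / (math.e - 1.0) * (1.0 - math.exp(-x))
```

For the linear penalty the grid minimum sits at $x = 1 - 1/c_{\mathrm{MIN}}$ and equals one half, while for the exponential penalty the value stays close to $1 - 1/e$ when $c_{\mathrm{MIN}}$ is large, in line with the 50% and 63% guarantees quoted for the LIB and EIB policies.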
A.4.1 Appendix to Section 1.3.1

Proof of Theorem 1.3.6: From Theorem 1 we have:

\[
\begin{aligned}
\alpha(\infty)
&= \min_{x \in [0,1]} \left\{ \frac{1-x}{1 - \frac{e}{e-1}\left( 1 - e^{-x} - \int_{x}^{1} (1 - e^{-y})\,dy \right)} \right\} \\
&= \min_{x \in [0,1]} \left\{ \frac{1-x}{1 - \frac{e}{e-1}\left( 1 - e^{-x} - 1 + x - e^{-1} + e^{-x} \right)} \right\} \\
&= \min_{x \in [0,1]} \left\{ \frac{1-x}{1 - \frac{e}{e-1}\left( x - e^{-1} \right)} \right\}
= \frac{e-1}{e} = 1 - \frac{1}{e}.
\end{aligned}
\]

The second part of the theorem follows from Lemma 1.3.5.

Proof of Lemma 1.3.5 [6]: Consider a setting with $n$ products, indexed by $1, \ldots, n$, all with revenue equal to 1 and initial inventory of $\frac{1}{n}T$. Think of $T$, the length of the horizon, as a very large number (that would tend to infinity) and a multiple of $n$. The number of types is equal to $2^n - 1$. Each type corresponds to a set $\zeta \neq \emptyset$ of products that a customer of that type equally likes; the “no-purchase” probability for all types is equal to zero.

The arrival process is defined as follows: customers arrive in $n$ phases of equal length; that is, the number of customers in each phase is $\frac{T}{n}$. All the customers in each phase have the same type. We denote the type of the customers in phase $j$ by $\zeta_j$. We have $\zeta_1 = \{1, 2, \ldots, n\}$ and, for $2 \le j \le n$, $\zeta_j = \zeta_{j-1} \setminus \{\pi_{j-1}\}$, where $\pi_{j-1}$ is a randomly chosen element of $\zeta_{j-1}$. In other words, the set of products of interest to customers during phase $j$ is the set of products of interest to customers in phase $j-1$ minus one of those products, and $\pi_n$ is the only product of interest to customers in phase $n$; i.e., customers in phase $j$ randomly lose interest in one of the products of interest in phase $j-1$. An example of a sequence of customer types in $n$ phases is $\{\{1, 2, \ldots, n\}, \{1, 2, \ldots, n-1\}, \ldots, \{1, 2\}, \{1\}\}$. Therefore, there are $n!$ sequences of customer arrivals, each with equal probability.

[6] The proof builds upon ideas from Mehta et al. (2007). Our analysis is different, more rigorous, and applies to a smaller number of products. For instance, theirs omits the corresponding proof of Lemma A.4.1, which we establish via induction, using the dynamic programming formulation of the problem.
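The arrival process just described can be generated with a few lines of code. The following is a minimal sketch, assuming products are labeled 1 through n and that the horizon T is a multiple of n; the function names are hypothetical and serve only to illustrate the construction.

```python
import random

def nested_phase_types(n, rng):
    """Return the product sets of interest in phases 1..n.

    Phase 1 is interested in all n products; each later phase drops one
    product chosen uniformly at random from the previous phase's set.
    """
    current = set(range(1, n + 1))
    phases = []
    for _ in range(n):
        phases.append(frozenset(current))
        if len(current) > 1:
            dropped = rng.choice(sorted(current))
            current.remove(dropped)
    return phases

def arrival_sequence(n, horizon, rng):
    """Expand the n phases into one customer type per period (horizon/n each)."""
    if horizon % n != 0:
        raise ValueError("horizon must be a multiple of n")
    per_phase = horizon // n
    return [phase for phase in nested_phase_types(n, rng) for _ in range(per_phase)]
```

Because the dropped product is drawn uniformly at each step, every one of the n! orderings of dropped products is equally likely, which matches the count of arrival sequences in the proof.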
In Lemma A.4.1 in Section A.4, we show that the following Inventory-Balancing policy is optimal among all deterministic policies: offer to each customer all the products with the highest (positive) remaining inventory that are of interest to her. [7] Each customer purchases one of the products (if any) offered to her because the no-purchase probability is zero. Hence, in each phase, the policy described above sells an equal portion of the remaining inventory of each product that is of interest to the customers in that phase (who are all of the same type). For instance, in the first phase, a $\frac{1}{n}$ fraction of the inventory of every product is sold. Note that the rounding error is negligible since $T$ is large. Recall that $\pi_i$ denotes the product that will be of no more interest to the customers arriving after and including phase $i+1$. Let $q_{i,j}$ be the fraction of customers in phase $j$ that bought product $\pi_i$. We have

\[
q_{i,j} =
\begin{cases}
\dfrac{1}{n-j+1} & j \le i, \\[2mm]
0 & j > i,
\end{cases}
\tag{A.1}
\]

where $n - j + 1$ is the number of products of interest to customers in phase $j$. Therefore, the revenue obtained from product $\pi_i$ is $\frac{1}{n} T \min\left\{ \sum_{j=1}^{i} \frac{1}{n-j+1},\, 1 \right\}$, and consequently the total revenue of the policy above is equal to $\frac{1}{n} T \sum_{i=1}^{n} \min\left\{ \sum_{j=1}^{i} \frac{1}{n-j+1},\, 1 \right\}$. On the other hand, the optimal clairvoyant solution that knows the customer types in advance sells all units of product $\pi_i$ to customers in phase $i$ and obtains a total revenue of $T$. This completes the proof.

Lemma A.4.1. For the arrival process described in the proof of Lemma 1.3.5, the following inventory-balancing algorithm is optimal among all deterministic policies: offer to the customer all the products with the highest (positive) remaining inventory that are of interest to her.

Proof: Since the revenue from all the products is the same and the no-purchase probability is zero, by Eq. (A.7), to prove the lemma, it suffices to show the following claim.
[7] A randomized algorithm might be able to outperform the aforementioned policy; we do not study that question in this work.

Claim 1: Consider any two products $i$ and $j$ and remaining inventory levels $(x_1, \ldots, x_n)$ such that $x_i > x_j$. If, with $t$ periods remaining, the type of the arriving customer is $z$ such that both $i$ and $j$ are of interest to type $z$, then we have

\[
V(t, x_1, \ldots, x_i, \ldots, x_j, \ldots, x_n \mid z) \le V(t, x_1, \ldots, x_i - 1, \ldots, x_j + 1, \ldots, x_n \mid z).
\]

This claim implies that it would be better to equalize the inventory levels. Namely, if the inventory for product $i$ is higher than that of product $j$, the value (i.e., expected revenue) of the DP policy would increase if instead we had one additional unit of product $j$ and one less unit of product $i$.

We prove the claim using induction on the inventory levels, fixing (any) two products $i$ and $j$. The induction basis is when $x_i = x_j + 1$ (with no restriction on the inventory of other products). In this case, because of the symmetry in the problem, the value function does not change if we replace one unit of product $i$ with one unit of product $j$. The reason is that the current customer is interested in both $i$ and $j$ and, because of the symmetry in the arrival process, the probability that a future customer is only interested in product $i$ but not product $j$ is the same as the probability that a future customer is only interested in product $j$ but not product $i$.

Induction Step: Consider initial inventory levels $(y_1, \ldots, y_n)$ such that $y_i > y_j + 1$. Assume Claim 1 holds for any other initial inventory levels $(x_1, \ldots, x_n)$ such that $x_k \le y_k$, $1 \le k \le n$, and at least one of these inequalities is strict. To prove the induction step, suppose the optimal dynamic program starting with inventory levels $(y_1, y_2, \ldots, y_n)$ offers set $S$ to the arriving customer.
Hence, by conditioning on the type of the customer in the next period, denoted byz 0 , we have: V (t;y 1 ;:::;y n jz) = X z 0 2Z Pr[next customer is of typez 0 jz] 1 + 1 jSj X k2S V (t 1;y k 1;y fkg jz 0 ) ! where (y k 1;y fkg ) represents the same inventory levels as before only with one less unit of productk. Now by applying the induction hypothesis, we get V (t;y 1 ;:::;y n jz) X z 0 2Z Pr[next customer is of typez 0 jz] 1 + 1 jSj X k2S V (t 1;y k 1;y i 1;y j + 1;y fi;j;kg jz 0 ) ! V (t 1;y i 1;y j + 1;y fi;jg jz) 122 The last inequality follows from the properties of the optimal dynamic program — note that the optimal policy starting with inventory levels (y i 1;y j + 1;y fi;jg ) may find a set more profitable thanS to offer to the customer. Finally, we point out ifi2S andk =i is chosen by the customer, thenV (t 1;y k 1;y i 1;y j + 1;y fi;j;kg jz 0 ) would be defined equivalent toV (t 1;y i 2;y j + 1;y fi;jg jz 0 ); note that in the induction step we assumey i y j + 2. A.4.2 Proof of Proposition A.1.2 We need to show that any solution of the Inventory-Balancing algorithm corresponds to a feasible solution of linear program FRLP, and when the value of Primal-S is normalized to one, the objective function of linear program FRLP is a lower bound for the expected revenue of the Inventory-Balancing algorithms. In the following, we only show the result for the EIB algorithm; a similar argument can be applied to the LIB algorithm. Note that we divide the horizon T into 1 time slots such that T is an integer. We only observe the remaining inventory of each producti in periodskT, 0<k 1 . Consider the solution (allocation) of the EIB algorithm for a sequence of customersfz t g T t=1 . For this solution, let j;k be the sum of revenue times capacity of any producti whose fraction of the remaining inventory I kT i c i in thek th time slot (periodkT) is between 1 (j + 1) and 1j (inclusive); that is, I kT i c i 2 (1 (j + 1); 1j] where 0j 1 1 and 0 < k 1 . 
Similarly, let j;k be the total revenue obtained from producti in the optimal solution (of Primal-S) where I kT i c i 2 (1 (j + 1); 1j], i.e., j;k = X i: I kT i c i 2(1(j+1);1j] r i c i ; j;k = X i: I kT i c i 2(1(j+1);1j] o i ; whereo i is the revenue obtained from selling producti in the optimal solution. Note thato i r i c i . For any time slotk, we define (k) = n X i=1 c i r i Z 1 y= I kT i c i (y)dy = n X i=1 r i Z c i y=I kT i (y=c i )dy: 123 Notice that(k) is an increasing function ofk. Using the fact that in any periodt the IB algorithm chooses an assortment S t that maximizes P i2S r i I t1 i =c i zt i (S), we will bound the change in function at two consecutive time slots, see the fifth sets of constraints in linear program FRLP. Next, we will show that the objective function of FRLP is less than the total revenue of the EIB algorithm. Since the penalty function is exponential, (x) = e e1 (1e x ),x2 [0; 1], the first term in the objective function is given by ( 1 ) = n X i=1 c i r i Z 1 y= I T i c i (y)dy = e e 1 n X i=1 c i r i 1 I T i c i +e 1 e I T i c i ! : By definition of j; 1 , the second term of the objective function is lower bounded as follows e e 1 1 1 X j=0 j; 1 j e e j1 +e 1 e e 1 n X i=1 c i r i 0 @ 1 I T i c i e e I T i c i +e 1 1 A : The inequality holds because x e e x1 is a decreasing function ofx. Therefore, ( 1 ) e e 1 1 1 X j=0 j; 1 j e e j1 +e 1 n X i=1 r i (c i I T i ); where the left hand side is the objective function of linear program FRLP and the right hand side is the revenue of the EIB algorithm. The next step is to show that any solution of the EIB algorithm corresponds to a feasible solution of linear program FRLP. Without loss of generality, we can normalize the revenue of the optimal solution, Primal-S, to 1 which implies the first set of constraints. The second set of constraints holds becauseo i r i c i . 
The third set of constraints follows from the definition of j;k and the fact that I kT i is a decreasing function of k. The forth set of constraints holds because of the definition of and the fact that R 1 y=x (y)dy is a decreasing function of x. In the following, we prove Lemma A.4.2 stated below, which leads us to the last set of constraints. This set of constraints gives a lower bound for difference of(k + 1) and(k) as a function of the optimal solution j;k . To prove the lower bound, we show that under uniform permutation, the revenue obtained from producti in the optimal solution during periods in [kT; (k + 1)T) denoted byo i;k is concentrated around its averageo i . Using this concentration and the fact that the IB algorithm chooses a set that maximizes discounted revenue, we will get the desired bound for the change in function. Hence, 124 any solution of the EIB algorithm, corresponds to a feasible solution of the linear program FRLP where the objective is less than the revenue of the algorithm. Considering the fact that the optimal solution, Primal-S, is normalized to one, the solution of the linear program FRLP is a number in [0; 1] and by minimizing the objective function, we obtain a lower bound for the performance of the EIB algorithm, namely the ratio of E fztg T t=1 Rev EIB fz t g T t=1 to Primal-S. Lemma A.4.2. Suppose is increasing, concave, and twice differentiable with a bounded derivative, in [0; 1] and 1 c MIN =O(1), then with high probability, for any 0<k< 1 , 1 1 X j=0 j;k+1 1 (j + 1) (k + 1)(k) +O 1 c MIN : Proof. Proof of Lemma A.4.2: Note that the contribution of producti to(k+1)(k) is equal tor i c i R I t 0 i =c i I t 1 i =c i (y)dy wheret 0 =kT andt 1 = (k + 1)T. By the assumption that the derivative of is bounded, we can substitute the integral with the sum and get r i c i Z I t 0 i =c i I t 1 i =c i (y)dy =r i Z I t 0 i I t 1 i (y=c i )dyr i I t 0 i X z=I t 1 i (z=c i )O 1 c MIN (A.2) Now let us consider the optimal solution. 
LetS t opt denote the set that is shown to the customer in period t by the optimal solution. Since the Inventory-Balancing algorithm shows to each customer an assortment that maximizes P i2S r i I t i c i t i (S), we have n X i=1 r i I t 0 i X z=I t 1 i (z=c i ) t 1 1 X t=t 0 X i2S t opt r i I t i =c i t i (S t opt ) t 1 1 X t=t 0 X i2S t opt r i (I t 1 i =c i ) t i (S t opt ) (A.3) The last inequality holds because is an increasing function. Recall thato i;k is the revenue obtained by the optimal solution from producti during periods in [kT; (k + 1)T). By this definition and the inequality above, we have n X i=1 r i I t 0 i X z=I t 1 i (z=c i ) n X i=1 (I t 1 i =c i )o i;k (A.4) 125 In Lemma A.4.3 which is borrowed from Mirrokni et al. (2012) (with some modifications), we show that under uniform permutation, with high probability, o i;k is concentrated around its average o i where 1 is the number of time slots. Note that this concentration holds under the uniform permutation and as we will discuss below it is useful when 1 c MIN =O(1). By this lemma and the above equation, n X i=1 r i I t 0 i X z=I t 1 i (z=c i ) n X i=1 (I t 1 i =c i )o i (A.5) with high probability. Therefore, by Equation (A.5) and the definition of , we have n X i=1 r i I t 0 i X z=I t 1 i (z=c i )O 1 c MIN 1 2 X j=0 1 (j + 1) j;k+1 O 1 c MIN (A.6) Since the left hand side is less that(k + 1)(k), the proof is completed. Lemma A.4.3. If the customers arrive according to a random order (i.e., a permutation chosen uniformly at random), P n i=1 o i = 1, then for any> 0 and 1 T k 1 , Pr " n X i=1 jo i;k o i j> 5 c MIN # < 1: The assumption that P n i=1 o i = 1 implies that we have normalized Primal-S to 1. In this lemma, we need 5 c MIN to be either constant or go to 0 which justifies the assumption 1 c MIN =O(1) in Lemma A.4.2. A.4.3 Appendix to Section 1.5.2 Proof. Proof of Proposition 1.5.1: We prove the claim by revisiting the steps of the proof of Theorem 1.3.3. 
Letfz t g T t=1 be the sequence of the customers. By Lemma 1.3.2, we never offer any product that has no inventory. We now construct a solution forDual fz t g T t=1 , with the true selection probabilities, as follows: i = r i (1 I T i =c i ) t = X n i=1 r i I t1 i =c i zt i (S t ) + 2 t : 126 whereS t = arg max S2S P i2S r i I t1 i =c i zt i (S) is the assortment offered by the IB algorithm. Note that we add the error term t to the value of t because the assortmentS t is computed using the estimated selection probability zt . This construction gives us a feasible dual solution because t n X i=1 r i I T i =c i zt i (S t ) t + 2 n X i=1 r i t n X i=1 r i I T i =c i zt i (S t ) + n X i=1 r i t where the first inequality follows from the fact that for all i = 1; 2;:::;n and S 2 S, j zt i (S) zt i (S)j t . The second inequality holds because is increasing and I t1 i I T i . By definition ofS t , t max S ( X i2S r i I T i =c i zt i (S) ) + n X i=1 r i t max S ( X i2S r i I T i =c i zt i (S) t ) + n X i=1 r i t max S X i2S (r i i ) zt i (S); where the second inequality follows from the fact that for alli andS2S,j zt i (S) zt i (S)j t . The third inequality follows from the definition of i and the fact that I T i =c i is less than or equal to 1. It follows from the Weak Duality Theorem that Primal fz t g T t=1 E " T X t=1 t + n X i=1 c i i # E 2 4 n X i=1 r i 0 @ c i 1 I T i =c i + c i X t=I T i +1 (t=c i ) + 2 T X t=1 t 1 A 3 5 Hence, E P n i=1 r i (c i I T i ) Primal fz t g T t=1 E P n i=1 r i (c i I T i ) E h P n i=1 r i c i 1 I T i =c i + P c i t=I T i +1 (t=c i ) + 2 P T t=1 t i : where E P n i=1 r i (c i I T i ) is the revenue of the Inventory-Balancing algorithm. Note that the contribution of any product i that is not sold by the optimal solution to Primal fz t g T t=1 is zero. Thus, to find the competitive ratio, we only consider products that are sold by the optimal solution. 
Since it is assumed that 127 the IB algorithm sells at least one unit of any product that is sold by the optimal solution, the competitive ratio of the algorithm is at least min (c i ;x):x 1 1 c i 1x 1 c i P c i t=I T i +1 (t=c i ) + (1 (x)) + 2 c i E h P T t=1 t i ; where i is a product sold by the optimal solution. Therefore, by the same argument as in the proof of Theorem 1.3.3, the competitive ratio is at least min x2 h 0;1 1 c MIN i 8 > < > : 1x 1 c MIN + 1 (x) + R 1 x+ 1 c MIN (y)dy + 2 c MIN E h P T t=1 t i 9 > = > ; : A.5 Asymptotic Optimality of the Dynamic Programming Policy In this section, we show asymptotic optimality of the dynamic programming (DP) policy when the type of customers is drawn independently from a known distribution. Namely, we show that the value obtained by the DP policy approaches Primal-S asymptotically when both the capacities and the horizon scale propor- tionally. Let z > 0, z 2 Z, be the probability that in each period a customer of type z arrives. 8 Let V (t;x 1 ;:::;x n jz) denote the maximum expected revenue witht periods remaining, given that a customer of type z2Z arrives, and the remaining inventories are (x 1 ;:::;x n ). Then, the dynamic programming formulation of this problem is given by V (t;x 1 ;:::;x n jz) (A.7) = max S2S :x i 18i2S ( X i2S z i (S) [r i +V (t 1;x 1 ;:::;x i 1;:::;x n )] + z 0 (S)V (t 1;x 1 ;:::;x n ) ) whereV (t;x 1 ;:::;x n ) = P z2Z z V (t;x 1 ;:::;x n jz). Also, the terminal condition is given byV (0;) = 0. We denote the optimal revenue under the dynamic programming formulation by V (T;c) wherec is 8 This is with abuse of notation and done for the sake of economy of notation, we previously used z as the expected number of customers of typez, not the probability. 128 the vector of initial inventories. We note that in computing V (T;c), we take expectation with respect to sequence of customers and the customers’ choices. 
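The recursion in (A.7) can be sketched for very small instances as follows. This is an illustrative brute-force implementation, not the author's code: it assumes a multinomial logit choice model (the formulation itself allows general selection probabilities), enumerates every assortment of in-stock products, and uses hypothetical names such as `dp_value`, `type_probs`, and `type_weights`.

```python
from functools import lru_cache
from itertools import combinations

def dp_value(horizon, inventory, revenues, type_probs, type_weights):
    """Expected revenue of the optimal dynamic program on a tiny instance.

    inventory:    initial units of products 0..n-1
    revenues:     revenue r_i of each product
    type_probs:   arrival probability of each customer type (sums to one)
    type_weights: per type, MNL weights [w_0, w_1, ..., w_n], where w_0 is
                  the no-purchase weight (MNL is an assumption made here)
    """
    n = len(revenues)

    @lru_cache(maxsize=None)
    def value(t, x):
        # V(t, x): average over the type of the arriving customer.
        if t == 0:
            return 0.0
        return sum(p * value_given_type(t, x, z) for z, p in enumerate(type_probs))

    def value_given_type(t, x, z):
        # V(t, x | z): best assortment among products with at least one unit.
        w = type_weights[z]
        available = [i for i in range(n) if x[i] >= 1]
        best = value(t - 1, x)  # the empty assortment is always allowed
        for size in range(1, len(available) + 1):
            for S in combinations(available, size):
                denom = w[0] + sum(w[i + 1] for i in S)
                total = (w[0] / denom) * value(t - 1, x)
                for i in S:
                    x_next = x[:i] + (x[i] - 1,) + x[i + 1:]
                    total += (w[i + 1] / denom) * (revenues[i] + value(t - 1, x_next))
                best = max(best, total)
        return best

    return value(horizon, tuple(inventory))
```

The enumeration is exponential in the number of products and the state space grows with the inventory vector, so the sketch is only meant to make the structure of the recursion concrete on toy inputs.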
For simplicity, we assume that the policy can always offer an “empty” assortment withS = ?. Thus, the maximum in the dynamic programming equation is always well-defined. The asymptotic optimality result is stated in the following Proposition. Proposition A.5.1 (Asymptotic Optimality of DP). Given that the type of customers is drawn independently from a known distribution such that the probability of arriving a customer of typez2Z in any periodt is z , then lim !1 V (T;c) Primal-S(T;c) = 1; wherePrimal-S(T;c) is the linear programmingPrimal-S with initial inventoriesc and the length of the horizonT . In the above proposition, we scale both the horizon and initial inventory with a scalar . The corre- sponding problem is called-scaled stochastic problem. Then, to see the asymptotic behavior of dynamic programming, we let go to infinity. We note that Proposition A.5.1 does not imply that dynamic pro- gramming policy is asymptotically optimal for every sequence of customer types. Instead it shows that it is asymptotically optimal only when take average over all sequences. Proof. Proof of Proposition A.5.1: By Lemma 1.4.1, V (T;c) Primal-S(T;c) for all T , c and > 0. Now, let f y z (S) :S2S ; z2Zg denote an optimal solution for the (unscaled) Primal-S(T;c) for all T , c and > 0. Then, it is easy to verify thatf y z (S) :S2S ; z2Zg is an optimal solution toPrimal-S(T;c). To show that lim !1 V (T;c) Primal-S(T;c) = 1, we construct a deterministic policy for the -scaled stochastic problem whose expected revenue approaches Primal-S(T;c) as increases toward infinity. We show that this policy is admissible, that is the total sales of producti is less than its initial inventory. Therefore,V (T;c) also approachesPrimal-S(T;c) as!1. The policy operates as follows: Offer a setS2S to customers of typez for up to z y z (S) times. The order in which the sets are offered is arbitrary. Under this policy, we will NOT accept all of the demands generated by offeringS. 
Rather, we will limit the sales of producti from offeringS to customers of typez to at most z z i (S) y z (S). 129 LetN(T ) = (N z (T ) :z2Z) be a multinomial random vector, where N z (T ) denotes the total number of customers of typez overT periods. Note thatN z (T ) has a binomial distribution with param- eterT and z . We define the random variableD z i (S;q) as the total number of customers of typez who select product i when S is offered under the policy , given that there are q customers of type z. Since under the policy we do not accept all the demands, the total sales of producti from customers of typez generated from offeringS under the policy is given by Sale ;z i (S) = minfD z i (S;N z (T )); z z i (S) y z (S)g : We point out that Sale ;z i (S) is a random variable because D z i (S;N z (T )) is a random variable. Since y z (S) is a feasible solution of linear program Primal-S(T;c), we have X z2Z X S2S z z i (S) y z (S)c i ; i = 1;:::;n; which implies that, with probability one, X z2Z X S2S Sale ;z i (S) X z2Z X S2S z z i (S) y z (S) c i ; i = 1;:::;n: Therefore, the policy is admissible because the total sales of product i does not exceed its ini- tial inventory. The total revenue over T periods under policy is given by a random variable P n i=1 r i P S2S P z2Z Sale ;z i (S). Then, lim !1 1 n X i=1 r i X S2S X z2Z Sale ;z i (S) = lim !1 1 n X i=1 r i X S2S X z2Z minfD z i (S;N z (T )); z y z (S) z i (S)g = n X i=1 r i X S2S X z2Z min lim !1 1 D z i (S;N z (T )); z y z (S) z i (S) = n X i=1 r i X S2S X z2Z z y z (S) z i (S) = Primal-S(T;c): To establish the third equality above, note that 1 D z i (S;N z (T )) = 1 M z X t=1 1 l fB i t;z (S)=1g = M z 1 M z M z X t=1 1 l fB i t;z (S)=1g ; 130 where M z := minfN z (T ) ; z y z (S)g and B i t;z (S) = 1 denotes the event that the t th customer of type z selects product i when S is offered, with E[1 l fB i t;z (S)=1g ] = z i (S). By SLLN, we know that lim !1 N z (T )= = z T almost surely (a.s.). 
Since under the policy, we only offerS up to z y z (S) customers of typez, lim !1 M z = lim !1 minfN z (T ); z y z (S)g = z y z (S) a:s: By a similar argument, 1 M z P M z t=1 1 l fB i t;z (S)=1g = z i (S). Thus, with probability one, lim !1 1 D z i (S;N z (T )) = z y z (S) z i (S) ; which gives us the desired result. Then, by the Dominated Convergence Theorem, it follows that lim !1 1 E P n i=1 r i P S2S P z2Z Sale ;z i (S) = Primal-S(T;c). Since the policy is admissible, 1 lim !1 V (T;c) Primal-S(T;c) lim !1 1 E P n i=1 r i P S2S P z2Z Sale ;z i (S) 1 Primal-S(T;c) = 1; which completes proof. 131 Appendix B Technical Appendix to Chapter 2 B.1 Sequential Weighted Second-Price Mechanism & Proofs of Theo- rems 2.4.1 and 2.5.1 In this section, we present a parameterized class of mechanisms called Sequential Weighted Second-Price (SWSP), which is denoted byM(;). Weight function :R!R connects the bids in the first and the second rounds by manipulating the final allocation and payments in the favor of agents with higher initial valuations. Parameter2R specifies a lower bound on weighted bids the seller is willing to accept for the item. As it will be more clear later, this class of mechanism includes the efficient and optimal mechanisms. We will show that the SWSP mechanisms are incentive compatible and individually rational. As a corollary of this result, we will conclude that the efficient and optimal mechanisms are IC and IR; that is, Theorems 2.4.1 and 2.5.1 hold. We make the following assumption on function. Assumption 4. Weight function is non-decreasing and differentiable with bounded derivatives, that is, sup z f 0 (z)g<1,z2 [v;v]. Note that as it will be more clear later, the non-decreasing function alters the social welfare and the seller’s revenue by distorting the allocation via favoring agents with higher valuations. The weight function and parameter can be used to adjust the social welfare and the revenue of the mechanism. 
For instance, as we show later, for the efficient mechanism, function() and are equal to 0. Let us start with the description of the mechanism. Sequential Weighted Second-Price MechanismM(;): The selection, allocation, and payment rules are defined as follows: 132 Selection: Select the following set of agents S ; (b 0 )2 arg max Sf1;;ng ; (b 0 ;S) ; (B.1) where ; (b 0 ;S), the weighted surplus, is defined as follows ; (b 0 ;S) = E S h max n max i2S fb i;0 +(b i;0 ) + i g; max i= 2S fb i;0 +(b i;0 )g; oi X i2S c i : (B.2) The expectation is with respect to the second signals of the selected agents; in case of ties, we will choose one of the sets at random. Each selected agenti payst i (b 0 ) to the seller; see Eq. (2.2). Allocation and Payments: Agents participate in a “weighted second-price” auction with a reserve price , where the mechanism allocates the item to the agent with the highest weighted bid as along as it exceeds . More precisely, consider an agent i ? 2 argmax i fb i;1 + (b i;0 )g. If b i ? ;1 +(b i ? ;0 ) , then the item is allocated to agent i ? . If agent i ? was a selected agent, then he paysp i ? = maxfmax i6=i ?fb i;1 +(b i;0 )g;g(b i ? ;0 ). If agenti ? is not a selected agent, then he paysp i ? = maxfmax i6=i ?fb i;1 +(b i;0 )g;rg(b i ? ;0 ), wherer :R n !R will be defined below. Note that by letting = 0 and() = 0, we can implement mechanismM EFF . Furthermore, by setting = 0 and() =(), we have mechanismM OPT . Let` be an unselected agent with the highest weighted bid, i.e.,`2 arg max j= 2S ; (b 0 ) fb j;0 +(b j;0 )g. Then, ifb `;0 +(b `;0 )< or all agents are selected,r =. Otherwise,r solves the following equation Z b `;0 +(b `;0 ) maxfr;g Pr h z max j2S ; (b 0 ) b j;0 + j +(b j;0 ) i dz = Z b `;0 v E h q ` v `;0 =z;v `;0 =b `;0 i dz: (B.3) The next lemma shows that there exists anr2 [;b `;0 +(b `;0 )] that solves the above equation. Lemma B.1.1. Consider agent `2 arg max j= 2S ; (b 0 ) fb j;0 +(b j;0 )g in mechanismM(;). 
If b `;0 + (b `;0 ), then there existsr2 [;b `;0 +(b `;0 )] that satisfies Eq. (B.3). The following results are immediate corollaries of Lemma B.1.1. Corollary B.1.2. Let`2 arg max j= 2S EFF (b 0 ) fb j;0 g be an unselected agent` with the highest initial bid in mechanismM EFF . Then, ifb `;0 0, then there existsr2 [0;b `;0 ] that satisfies Eq. (2.3) 133 Corollary B.1.3. Let ` 2 arg max j= 2S OPT (b 0 ) fb j;0 + (b j;0 )g be an unselected agent ` with the highest weighted bid in mechanismM OPT . Then, ifb `;0 +(b `;0 ) 0, then there existsr2 [0;b `;0 +(b `;0 )] that satisfies Eq. (2.5). We now present the main result of this section, which shows that the proposed mechanism is IC and IR. Theorem B.1.4 (Incentive Compatibility). Suppose 0 and function satisfies Assumption 4. Then, the Sequential Weighted Second-Price mechanismM(;) is incentive compatible and individually rational. The proof of the theorem is given in Appendix B.1.1. If mechanismM(;) is incentive compatible, it maximizes the weighted surplus defined below as ; (v 0 ) = arg max Sf1;:::;ng f ; (v 0 ;S)g The assumption that the derivatives of the function are bounded will ensure us that the weighted sur- plus, ; (v 0 ), is absolutely continuous in the initial valuations of the agents. Note that =0;=0 (v 0 ) and =0;= (v 0 ) are, respectively, the maximum social welfare and virtual revenue. Therefore, Theorem B.1.4 implies mechanismsM EFF andM OPT are efficient and optimal, respectively. B.1.1 Proof of Theorem B.1.4 In this section, we prove Theorem B.1.4. We start with incentive compatibility and show that no agents would prefer to deviate from the truthful strategy, as long as all other agents are truthful. We prove this by going over the strategy of an agent in a backward manner. First, using Lemma B.1.5, we show that agents bid truthfully in the second round. Then, we prove that a selected agent obtains the additional information (Lemma B.1.6). 
Finally, in Lemma B.1.9 we show that agents will be better off by being truthful in the first round. We present the proof of Lemma B.1.9 in Section B.1.2 since it is our key technical lemma. The proofs of the other lemmas are relegated to Section B.4.

The key challenge is to show that agents bid truthfully in the first round. The reason is that the effects of initial bids are twofold. First, they determine the set of selected agents. Second, they influence the final allocation of the item.

The following lemma shows that agents who can bid in the second round will be truthful even if they were untruthful in the first round. Precisely, we will show that
$$v_{i,1} = \arg\max_{b_{i,1}} \big\{ q_i v_{i,1} - p_i - t_i - c_i e_i \big\}. \tag{B.4}$$
Note that unselected agents do not bid in the second round; that is, their initial bids are considered as their final bids.

Lemma B.1.5 (Truthfulness in the Second Round). Under mechanism $\mathcal{M}(\rho,\gamma)$, for any agent that is allowed to update his bid in the second round of bidding, truthfulness is a weakly dominant strategy, even if the agent has not been truthful in the first round.

From a technical perspective, one of the aspects that differentiates our work from the previous work on dynamic mechanism design, in particular Esö & Szentes (2007), is that the deviation strategies of an agent, in addition to misreporting his valuations, include the decision on obtaining information. In the following lemma, we show that a selected agent $i$ will acquire the additional information when he bids truthfully in the first round, and all other agents follow the truthful strategy, i.e.,
$$1 = \arg\max_{e_i\in\{0,1\}} \Big\{ q_i v_{i,1} - p_i - t_i - e_i c_i \ \Big|\ b_0 = v_0,\ b_{i,1} = v_{i,1},\ e_j = 1 \text{ if } s_j = 1,\ j\ne i \Big\}. \tag{B.5}$$
The conditions in the above equation imply that agent $i$ bids truthfully in the first round, and all other agents follow the truthful strategy; that is, they bid truthfully in both rounds and, if they get selected, they obtain information.

Lemma B.1.6 (Obtaining Additional Information).
Consider a selected agent $i$ who bids truthfully in the first round. Assuming all other agents are truthful, agent $i$ would incur cost $c_i$ to obtain signal $\delta_i$.

Lemma B.1.6 implies that a selected agent $i$ "who bids truthfully in the first round" will obtain information. However, if agent $i$ bids untruthfully in the first round and is selected, he will not necessarily obtain information. We will show in Lemma B.1.9 that agent $i$ will not gain from bidding untruthfully in the first round regardless of his decision to obtain information.

The proof of Lemma B.1.6 is provided in Section B.4.3. To obtain the result, we show that the incentive of the selected agent $i$ gets aligned with the selection rule. Thus, the selected agent prefers to incur cost $c_i$ and acquire information; that is, $e_i = 1$.

The final step is to show that an agent $i$ will bid truthfully in the first round. Let $U_i(x_i,\hat{x}_i)$ be the utility of agent $i$ with initial valuation $x_i$ when he bids $\hat{x}_i$ in the first round and follows the "optimal strategy" afterwards (assuming other agents are truthful). More precisely,
$$U_i(x_i,\hat{x}_i) = \max_{b_{i,1},\,e_i} \mathbb{E}\Big[ q_i\big((\hat{x}_i,v_{-i,0}),(b_{i,1},v_{-i,1})\big)\,v_{i,1} - e_i c_i s_i - t_i(\hat{x}_i,v_{-i,0}) - p_i\big((\hat{x}_i,v_{-i,0}),(b_{i,1},v_{-i,1})\big) \Big], \tag{B.6}$$
where the expectation is taken assuming that all agents except for agent $i$ are truthful. Then, considering the fact that the initial valuation is $v_0 = x$, for any $j\ne i$, we have $v_{j,1} = x_j+\delta_j$ if agent $j$ is selected and $v_{j,1} = x_j$ otherwise. Note that after the initial bidding, agent $i$ optimizes over $(e_i,b_{i,1})$ to obtain his best (utility-maximizing) strategy. We start with characterizing $U_i(x_i,x_i)$.

Lemma B.1.7. If the vector of initial valuations is given by $x$, and all agents except for agent $i$ are truthful, then the expected utility of agent $i$ who bids truthfully in the first round, denoted by $U_i(x_i,x_i)$, is equal to
$$U_i(x_i,x_i) = \int_{\underline{v}}^{x_i} \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big]\,dz. \tag{B.7}$$
In addition, $U_i(x_i,x_i)$ is non-decreasing in $x_i$.
In the proof of Lemma B.1.7, we use Lemmas B.1.5 and B.1.6, where we show that if agent $i$ bids truthfully in the first round, he will prefer to follow the truthful strategy afterwards. Lemma B.1.7 implies that the utility of a truthful agent $i$ is fully determined by his allocation probability for different initial valuations. Furthermore, the higher his probability of allocation is, the more utility he earns. In fact, Eq. (B.7) is analogous to the utility of an agent in standard static incentive compatible mechanisms (see Myerson (1981)).

We now consider the utility of agent $i$ when he bids untruthfully in the first round and follows his optimal strategy thereafter, i.e., $U_i(x_i,\hat{x}_i)$ defined in Eq. (B.6). By Lemma B.1.5, if agent $i$ with initial bid $\hat{x}_i\ne x_i$ gets selected, he bids truthfully in the second round, i.e., $b_{i,1}=v_{i,1}$. But an untruthful agent $i$ will not necessarily obtain information if the mechanism selects him. In the following, we denote his best investing strategy by $e_i(v_{i,0}=x_i,\ b_{i,0}=\hat{x}_i,\ t_i)$. The next lemma establishes an upper bound on $U_i(x_i,\hat{x}_i)$.

Lemma B.1.8. Suppose the vector of initial valuations is given by $x$, and all agents except agent $i$ are truthful. We have
$$U_i(x_i,\hat{x}_i) \le \max\Big\{ U_i(\hat{x}_i,\hat{x}_i) + \int_{\hat{x}_i}^{x_i} \Pr\big[z + e_i\delta_i + \gamma(\hat{x}_i) \ge \omega_i\big]\,dz,\ 0 \Big\},$$
where $\omega_i$ is the maximum weighted bid of all agents but agent $i$ when he misreports $\hat{x}_i$ in the first round and other agents are truthful, i.e.,
$$\omega_i = \max\Big\{ \max_{j\in S_{\rho,\gamma}(\hat{x}_i,x_{-i}),\ j\ne i}\{x_j+\gamma(x_j)+\delta_j\},\ \max_{j\notin S_{\rho,\gamma}(\hat{x}_i,x_{-i}),\ j\ne i}\{x_j+\gamma(x_j)\},\ \rho \Big\}, \tag{B.8}$$
and $e_i = e_i(v_{i,0}=x_i,\ b_{i,0}=\hat{x}_i,\ t_i)$.

The term inside the integral, i.e., $\Pr[z + e_i\delta_i + \gamma(\hat{x}_i) \ge \omega_i]$, is the probability that agent $i$ with final weighted bid $z + e_i\delta_i + \gamma(\hat{x}_i)$ wins the item when agents in set $S_{\rho,\gamma}(\hat{x}_i,x_{-i})\setminus\{i\}$ obtain information and agent $i$ follows the investing strategy associated with $e_i = e_i(v_{i,0}=x_i,\ b_{i,0}=\hat{x}_i,\ t_i)$.
Next, we show that $U_i(x_i,x_i) \ge U_i(x_i,\hat{x}_i)$; that is, an agent $i$ prefers to bid truthfully in the first round. In Lemma B.1.8, we find an upper bound for $U_i(x_i,\hat{x}_i)$. Then, when $U_i(x_i,\hat{x}_i) = 0$, we immediately have $U_i(x_i,x_i) \ge U_i(x_i,\hat{x}_i) = 0$. Now we show that even the upper bound of $U_i(x_i,\hat{x}_i)$, i.e., $U_i(\hat{x}_i,\hat{x}_i) + \int_{\hat{x}_i}^{x_i} \Pr[z + e_i\delta_i + \gamma(\hat{x}_i) \ge \omega_i]\,dz$, is smaller than $U_i(x_i,x_i)$, where $e_i = e_i(v_{i,0}=x_i,\ b_{i,0}=\hat{x}_i,\ t_i)$.

We start with defining a suboptimal selection rule $\hat{S}_{y_1,y_2}(z,x_{-i})$ for any nonzero measure interval $[y_1,y_2]$ such that $y_1 < y_2$ and $[y_1,y_2] \subseteq [\min\{x_i,\hat{x}_i\},\ \max\{x_i,\hat{x}_i\}]$, where
$$\hat{S}_{y_1,y_2}(z,x_{-i}) = \begin{cases} S_{\rho,\gamma}(\hat{x}_i,x_{-i})\setminus\{i\} & \text{if } e_i = 0 \text{ and } z\in[y_1,y_2],\\ S_{\rho,\gamma}(\hat{x}_i,x_{-i}) & \text{if } e_i = 1 \text{ and } z\in[y_1,y_2],\\ S_{\rho,\gamma}(z,x_{-i}) & \text{otherwise,} \end{cases}$$
where $e_i = e_i(v_{i,0}=x_i,\ b_{i,0}=\hat{x}_i,\ t_i)$. Note that the suboptimal selection rule $\hat{S}_{y_1,y_2}$ follows the optimal selection rule everywhere except on the interval $[y_1,y_2]$. In the interval $[y_1,y_2]$, the set of selected agents is the set of agents that update their valuations when all agents except for agent $i$ follow the truthful strategy, and agent $i$ with initial valuation $x_i$ misreports $\hat{x}_i$ in the first round and follows his best strategy afterward with regard to obtaining the additional information. The suboptimal selection rule captures the untruthful behavior of agent $i$ in the first round while other agents are truthful.

In the next lemma, by characterizing the difference between the weighted surplus under the selection rule of the SWSP mechanism, $S_{\rho,\gamma}(x)$, and the suboptimal selection rule $\hat{S}_{y_1,y_2}(x)$, we will show that agent $i$ prefers to bid truthfully in the first round; that is, $U_i(x_i,x_i) \ge U_i(x_i,\hat{x}_i)$.

Lemma B.1.9. For any interval $[y_1,y_2]$, consider the suboptimal selection rule $\hat{S}_{y_1,y_2}$ described above.
Then, for any $x\in[\underline{v},\bar{v}]^n$, we have $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x)) - \Gamma_{\rho,\gamma}(x,S_{\rho,\gamma}(x)) \le 0$, and as a result, agent $i$ prefers to bid truthfully in the first round; that is, $U_i(x_i,x_i) \ge U_i(x_i,\hat{x}_i)$.

Note that by Lemma B.1.9, an agent $i$ who misreports his initial valuation will be worse off regardless of his decision to acquire information.

B.1.2 Proof of Lemma B.1.9

Throughout the proof, to simplify our notation, we denote $S_{\rho,\gamma}$ by $S$. Furthermore, without loss of generality, we assume that $\hat{x}_i < x_i$. A similar argument can be applied when $\hat{x}_i > x_i$.

We first show that the weighted surplus under the selection rule of the SWSP mechanism, $S(x)$, i.e., $\Gamma_{\rho,\gamma}(x,S(x))$, is at least the weighted surplus under the selection rule $\hat{S}_{y_1,y_2}(x)$, i.e., $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x))$. Then, by characterizing the difference between $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x))$ and $\Gamma_{\rho,\gamma}(x,S(x))$ as a function of allocation probabilities, and using the fact that $\gamma(\cdot)$ is a non-decreasing function, we show that $U_i(x_i,x_i) \ge U_i(x_i,\hat{x}_i)$.

By definition, $\Gamma_{\rho,\gamma}(x,S(x)) = \Gamma_{\rho,\gamma}(x)$. Then, since the selection rule $S(x)$ maximizes the weighted surplus, we have $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x)) - \Gamma_{\rho,\gamma}(x) \le 0$. Next, in Lemma B.1.10, we characterize $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x)) - \Gamma_{\rho,\gamma}(x)$. The proof, which uses the Envelope Theorem (cf. Milgrom & Segal (2002)), is provided in Section B.4.4.

Lemma B.1.10. For any interval $[y_1,y_2]$, consider the suboptimal selection rule $\hat{S}_{y_1,y_2}$. Then, we have
$$\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x)) - \Gamma_{\rho,\gamma}(x) = \int_{y_1}^{y_2} \big(1+\gamma'(z)\big)\Big( \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i},\ \hat{S}_{y_1,y_2}(z,x_{-i})\big] - \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i},\ S(z,x_{-i})\big] \Big)\,dz,$$
where $\mathbb{E}[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i},\ S]$ is the probability that a truthful agent $i$ with initial valuation $z$ receives the item when other agents bid truthfully and agents in set $S$ update their valuations.

In the following, using Lemma B.1.10 and the fact that $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x)) \le \Gamma_{\rho,\gamma}(x)$, we will show that agent $i$ prefers to bid truthfully in the first round.
Precisely, we will show that
$$\int_{z=\hat{x}_i}^{x_i} \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big]\,dz \ \ge\ \int_{z=\hat{x}_i}^{x_i} \Pr\big[z + e_i\delta_i + \gamma(\hat{x}_i) \ge \omega_i\big]\,dz, \tag{B.9}$$
where by Lemma B.1.7, the l.h.s. is $U_i(x_i,x_i) - U_i(\hat{x}_i,\hat{x}_i)$. In addition, by Lemma B.1.8, the r.h.s. is an upper bound of $U_i(x_i,\hat{x}_i) - U_i(\hat{x}_i,\hat{x}_i)$. Thus the above inequality implies that agent $i$ does not have any incentive to bid untruthfully in the first round.

By definition, $\mathbb{E}[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i},\ S(z,x_{-i})] = \mathbb{E}[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i}]$. Then, by Lemma B.1.10 and the fact that $\Gamma_{\rho,\gamma}(x,\hat{S}_{y_1,y_2}(x)) \le \Gamma_{\rho,\gamma}(x)$, we have
$$\int_{y_1}^{y_2} \big(1+\gamma'(z)\big)\Big( \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big] - \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i},\ \hat{S}_{y_1,y_2}(z,x_{-i})\big] \Big)\,dz \ \ge\ 0. \tag{B.10}$$
Note that for any $y_1 \le z \le y_2$, we have
$$\mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i},\ \hat{S}_{y_1,y_2}(z,x_{-i})\big] = \Pr\big[z + e_i\delta_i + \gamma(z) \ge \omega_i\big] \ \ge\ \Pr\big[z + e_i\delta_i + \gamma(\hat{x}_i) \ge \omega_i\big]. \tag{B.11}$$
Here, again, $e_i = e_i(v_{i,0}=x_i,\ b_{i,0}=\hat{x}_i,\ t_i)$, and the equality follows from the construction of the suboptimal selection rule and the definition of $\omega_i$ in Eq. (B.8). In addition, the inequality holds because the weight function $\gamma(\cdot)$ is non-decreasing and $z \ge \hat{x}_i$. Applying Eq. (B.11) in Eq. (B.10), we obtain
$$\int_{y_1}^{y_2} \big(1+\gamma'(z)\big)\Big( \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big] - \Pr\big[z + e_i\delta_i + \gamma(\hat{x}_i) \ge \omega_i\big] \Big)\,dz \ \ge\ 0.$$
Then, Eq. (B.9) follows from the fact that the above inequality holds for any nonzero measure interval $[y_1,y_2] \subseteq [\hat{x}_i,x_i]$, and the weight function $\gamma(\cdot)$ is non-decreasing. Therefore, agent $i$ prefers to bid truthfully in the first round.

B.2 Proof of Theorem 2.6.1

We show this result for any mechanism whose selection rule maximizes the weighted surplus $\Gamma_{\rho,\gamma}(b_0,S)$, where $\Gamma_{\rho,\gamma}(b_0,S)$ is defined in Eq. (B.2), $\rho\ge 0$, and $\gamma:\mathbb{R}\to\mathbb{R}$ is a non-decreasing function from an initial bid to a weight. Note that the efficient and optimal mechanisms select a set of agents that maximize $\Gamma_{\rho=0,\gamma=0}(b_0,S)$ and $\Gamma_{\rho=0,\gamma=\alpha}(b_0,S)$, respectively.

Consider two unselected agents $i,j$ such that $b_{i,0} > b_{j,0}$. Assume that agents in set $S$ are already selected.
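We will show that when the cost of information is the same for all agents, $\Gamma_{\rho,\gamma}(b_0,S\cup\{i\})$ is greater than or equal to $\Gamma_{\rho,\gamma}(b_0,S\cup\{j\})$; that is, the seller prefers to add agent $i$ to set $S$ rather than agent $j$. By definition,
$$\Gamma_{\rho,\gamma}(b_0,S\cup\{i\}) = \mathbb{E}\Big[\max\Big\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ b_{i,0}+\gamma(b_{i,0})+\delta_i,\ b_{j,0}+\gamma(b_{j,0}),\ \rho\Big\}\Big] - c\,(|S|+1),$$
where the expectation is with respect to the second signals, and $b_{k,1} = b_{k,0}+\delta_k$ if $k\in S$ and $b_{k,1} = b_{k,0}$ otherwise. For any realization of the second signals, let
$$Y_i = \max\Big\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ b_{j,0}+\gamma(b_{j,0}),\ \rho\Big\} - b_{i,0} - \gamma(b_{i,0}).$$
Then, $\Gamma_{\rho,\gamma}(b_0,S\cup\{i\})$ is given by
$$\mathbb{E}\Big[\big(b_{i,0}+\gamma(b_{i,0})+\delta_i\big)\mathbb{1}\{\delta_i\ge Y_i\} + \big(Y_i+b_{i,0}+\gamma(b_{i,0})\big)\mathbb{1}\{\delta_i<Y_i\}\Big] - c\,(|S|+1).$$
After some manipulations, it can be rewritten as
$$b_{i,0}+\gamma(b_{i,0}) + Y_i\,G(Y_i) + \int_{Y_i}^{\infty} z\,dG(z) - c\,(|S|+1),$$
where $G$ is the distribution of $\delta_i$. Likewise,
$$\Gamma_{\rho,\gamma}(b_0,S\cup\{j\}) = b_{j,0}+\gamma(b_{j,0}) + Y_j\,G(Y_j) + \int_{Y_j}^{\infty} z\,dG(z) - c\,(|S|+1),$$
where $Y_j = \max\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ b_{i,0}+\gamma(b_{i,0}),\ \rho\} - b_{j,0} - \gamma(b_{j,0})$. Using integration by parts, one can easily show that
$$\Gamma_{\rho,\gamma}(b_0,S\cup\{i\}) - \Gamma_{\rho,\gamma}(b_0,S\cup\{j\}) = b_{i,0}+\gamma(b_{i,0}) - b_{j,0} - \gamma(b_{j,0}) - \int_{Y_i}^{Y_j} G(z)\,dz.$$
To show the result, we need to consider the following two cases.

Case $Y_i - Y_j = b_{j,0}+\gamma(b_{j,0}) - b_{i,0} - \gamma(b_{i,0})$: In this case, $\max\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ \rho\}$ is greater than $b_{j,0}+\gamma(b_{j,0})$ and $b_{i,0}+\gamma(b_{i,0})$. That is, $Y_i = \max\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ \rho\} - b_{i,0} - \gamma(b_{i,0})$ and $Y_j = \max\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ \rho\} - b_{j,0} - \gamma(b_{j,0})$. Thus,
$$\Gamma_{\rho,\gamma}(b_0,S\cup\{i\}) - \Gamma_{\rho,\gamma}(b_0,S\cup\{j\}) \ \ge\ b_{i,0}+\gamma(b_{i,0}) - b_{j,0} - \gamma(b_{j,0}) - \big(b_{i,0}+\gamma(b_{i,0}) - b_{j,0} - \gamma(b_{j,0})\big) = 0,$$
where the inequality follows from the fact that for any $z$, $G(z)\le 1$.

Case $Y_i - Y_j \ne b_{j,0}+\gamma(b_{j,0}) - b_{i,0} - \gamma(b_{i,0})$: In this case, $\max\{\max_{k\ne i,j}\{b_{k,1}+\gamma(b_{k,0})\},\ \rho\}$ is less than $b_{i,0}+\gamma(b_{i,0})$. That is, $Y_j = b_{i,0}+\gamma(b_{i,0}) - b_{j,0} - \gamma(b_{j,0})$ and $Y_i \ge b_{j,0}+\gamma(b_{j,0}) - b_{i,0} - \gamma(b_{i,0})$.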
Then,
$$\Gamma_{\rho,\gamma}(b_0,S\cup\{i\}) - \Gamma_{\rho,\gamma}(b_0,S\cup\{j\}) \ \ge\ b_{i,0}+\gamma(b_{i,0}) - b_{j,0} - \gamma(b_{j,0}) - \int_{b_{j,0}+\gamma(b_{j,0})-b_{i,0}-\gamma(b_{i,0})}^{\,b_{i,0}+\gamma(b_{i,0})-b_{j,0}-\gamma(b_{j,0})} G(z)\,dz.$$
Note that the upper limit of the integral equals the negative of the lower limit. Thus, by the fact that $G(-z) = 1 - G(z)$, we have
$$\int_{b_{j,0}+\gamma(b_{j,0})-b_{i,0}-\gamma(b_{i,0})}^{\,b_{i,0}+\gamma(b_{i,0})-b_{j,0}-\gamma(b_{j,0})} G(z)\,dz = b_{i,0}+\gamma(b_{i,0}) - b_{j,0} - \gamma(b_{j,0});$$
that is, $\Gamma_{\rho,\gamma}(b_0,S\cup\{i\}) - \Gamma_{\rho,\gamma}(b_0,S\cup\{j\})$ is at least zero, which is the desired result.

B.3 Appendix to Sections 2.4.1 and 2.5.3

In this section, we first present the proof of Theorem 2.4.2. Then, we show that the single crossing conditions might not hold in the All-Access mechanism. We then provide the proof of Theorems 2.4.4 and 2.5.5.

B.3.1 Proof of Theorem 2.4.2

To show the result, we consider the following procedure: At each round, one agent presumes that other agents are playing stationary strategies. That is, he assumes that the maximum bid of other agents, $B = \max\{\max_{j\ne i}\{v_{j,0}+\delta_j\tilde{e}_j(v_{j,0})\},\ 0\}$, is drawn from a stationary distribution. Then, he best responds to the strategies of other agents. We show that this procedure terminates in an equilibrium. To this aim, we establish that when an agent updates his strategy in any round, he increases a bounded potential function $\Phi:\{(\tilde{e}_1,\tilde{e}_2,\dots,\tilde{e}_n)\}\to\mathbb{R}$, defined below:
$$\Phi(\tilde{e}_1,\tilde{e}_2,\dots,\tilde{e}_n) = \mathbb{E}\Big[\max\Big\{\max_{j=1,2,\dots,n}\{v_{j,0}+\delta_j\tilde{e}_j(v_{j,0})\},\ 0\Big\} - \sum_{j=1}^{n} c_j\,\tilde{e}_j(v_{j,0})\Big],$$
where $\tilde{e}_j = \tilde{e}_j(\cdot)$, $j = 1,2,\dots,n$, is the investment strategy of agent $j$, and the expectation is with respect to $v_{j,0}$ and $\delta_j$ for $j = 1,2,\dots,n$. Note that the potential function is the expected social welfare of the auctioneer and agents when agents follow investment strategies $\langle\tilde{e}_1,\tilde{e}_2,\dots,\tilde{e}_n\rangle$. Then, considering the fact that $\Phi(\tilde{e}_1,\tilde{e}_2,\dots,\tilde{e}_n) \le \mathbb{E}\big[\max_{S\subseteq\{1,2,\dots,n\}}\{\Gamma(v_0,S)\}\big] < \infty$, the potential function is bounded, and as a result, the process of updating strategies will eventually result in an equilibrium.
For any initial valuation $v_{i,0}$, agent $i$ selects $\tilde{e}_i(v_{i,0})\in\{0,1\}$ that maximizes his utility. Let $\tilde{e}_{-i}$ be the investment strategies of all agents except agent $i$. Given $\tilde{e}_{-i}$, the utility of agent $i$ with initial valuation $v_{i,0}$ when he does not obtain information ($\tilde{e}_i(v_{i,0}) = 0$) can be written as
$$u_i\big(v_{i,0},\ \tilde{e}_i(v_{i,0})=0,\ \tilde{e}_{-i}\big) = \mathbb{E}_{v_{-i,0},\tilde{e}_{-i}}\big[\max\{v_{i,0}-B,\ 0\}\big] = \mathbb{E}_{v_{-i,0},\tilde{e}_{-i}}\Big[\max\Big\{v_{i,0} - \max\big\{\max_{j\ne i}\{v_{j,0}+\delta_j\tilde{e}_j(v_{j,0})\},\ 0\big\},\ 0\Big\}\Big], \tag{B.12}$$
where $\mathbb{E}_{v_{-i,0},\tilde{e}_{-i}}$ denotes the expectation with respect to the initial valuations and second signals of all agents except for agent $i$ while taking into account their investment strategies, i.e., $\tilde{e}_{-i}$. Recall that the agents bid truthfully in the second price auction. Similarly, when agent $i$ obtains information, his expected utility is given by
$$u_i\big(v_{i,0},\ \tilde{e}_i(v_{i,0})=1,\ \tilde{e}_{-i}\big) = \mathbb{E}_{v_{-i,0},\tilde{e}_{-i}}\Big[\mathbb{E}\Big[\max\Big\{(v_{i,0}+\delta_i) - \max\big\{\max_{j\ne i}\{v_{j,0}+\tilde{e}_j(v_{j,0})\delta_j\},\ 0\big\},\ 0\Big\}\Big]\Big] - c_i, \tag{B.13}$$
where the inner expectation is with respect to $\delta_i$. In the rest of the proof, we denote all the expectations by $\mathbb{E}$. Then, for any initial valuation $v_{i,0}$, we have
$$\begin{aligned}
\tilde{e}_i(v_{i,0}) &= \arg\max_{e\in\{0,1\}} \big\{u_i(v_{i,0},e,\tilde{e}_{-i})\big\} \\
&= \arg\max_{e\in\{0,1\}} \big\{\mathbb{E}[\max\{v_{i,0}+\delta_i e-B,\ 0\}] - c_i e\big\} \\
&= \arg\max_{e\in\{0,1\}} \big\{\mathbb{E}[\max\{v_{i,0}+\delta_i e,\ B\} - B - c_i e]\big\} \\
&= \arg\max_{e\in\{0,1\}} \Big\{\mathbb{E}\Big[\max\{v_{i,0}+\delta_i e,\ B\} - c_i e - \sum_{j\ne i} c_j\tilde{e}_j(v_{j,0})\Big]\Big\} \\
&= \arg\max_{e\in\{0,1\}} \Big\{\mathbb{E}\Big[\max\Big\{v_{i,0}+\delta_i e,\ \max\big\{\max_{j\ne i}\{v_{j,0}+\delta_j\tilde{e}_j(v_{j,0})\},\ 0\big\}\Big\} - c_i e - \sum_{j\ne i} c_j\tilde{e}_j(v_{j,0})\Big]\Big\},
\end{aligned}$$
where the last equation follows from the definition of $B$. The third and fourth equalities hold because $B$ and $\sum_{j\ne i} c_j\tilde{e}_j(v_{j,0})$ do not depend on $e$. It is easy to observe that at any round, when an agent $i$ updates his strategy, the potential function is increased. Then, by the fact that the potential function is bounded, the process of updating strategies will eventually terminate in an equilibrium point.
B.3.2 The All-Access Mechanism and Single Crossing Conditions

The following proposition shows that in the setting of Example 2.4.3, the All-Access mechanism does not admit any equilibrium with increasing investment decisions.

Proposition B.3.1. Consider the All-Access mechanism with no reserve price and the setting in Example 2.4.3. Then, there exists no equilibrium such that both agents follow increasing investment decisions. That is, if an agent $i = 1,2$ acquires information only when his initial valuation is greater than $\tau_i\in(0,1)$, i.e., $\tilde{e}_i(v_{i,0}) = 1$ for $v_{i,0}\ge\tau_i$, and $\tilde{e}_i(v_{i,0}) = 0$ for $v_{i,0}<\tau_i$, then the other agent $j\ne i$ will not choose the following increasing investment decision: $\tilde{e}_j(v_{j,0}) = 1$ for $v_{j,0}\ge\tau_j$, and $\tilde{e}_j(v_{j,0}) = 0$ for $v_{j,0}<\tau_j$, where $\tau_j\in(0,1)$.

Proof of Proposition B.3.1. To show the result, we will assume that an agent $i = 1,2$ follows an increasing investment decision. That is, he only obtains the additional information when his initial valuation is greater than $\tau_i\in(0,1)$. Then, we will establish that agent $j\ne i$ will not follow an increasing investment decision.

Throughout the proof, for simplicity, we denote $\tau_i$ by $\tau$. We define $W(v_{j,0})$ as the difference between the utility of agent $j$ when he obtains information and the utility of agent $j$ when he does not obtain information, given that agent $i\ne j$ only obtains information when his initial valuation is greater than $\tau$, i.e.,
$$W(v_{j,0}) = u_j\big(v_{j,0},\ \tilde{e}_j(v_{j,0})=1,\ \tilde{e}_i\big) - u_j\big(v_{j,0},\ \tilde{e}_j(v_{j,0})=0,\ \tilde{e}_i\big),$$
where $u_j(v_{j,0},\ \tilde{e}_j(v_{j,0})=1,\ \tilde{e}_i)$ and $u_j(v_{j,0},\ \tilde{e}_j(v_{j,0})=0,\ \tilde{e}_i)$ are defined in Equations (B.13) and (B.12). Here, $\tilde{e}_i(v) = 1$ for any $v\in[\tau,1]$, and $\tilde{e}_i(v) = 0$ for $v\in[0,\tau)$.

We will show that $W(\cdot)$ is a unimodal function and obtains its unique maximum at $\hat{v}\in(0,\tau)$. Precisely, we will show that $\frac{\partial W(v_{j,0})}{\partial v_{j,0}} \ge 0$ for $v_{j,0}\in[0,\hat{v}]$ and $\frac{\partial W(v_{j,0})}{\partial v_{j,0}} \le 0$ for $v_{j,0}\in[\hat{v},1]$.
This implies that for any value of the cost, agent $j$ will not follow an increasing investment decision. To see why, note that agent $j$ with initial valuation $v_{j,0}$ updates his valuation if $W(v_{j,0})\ge 0$. Then, considering the fact that $W(\cdot)$ is unimodal, $\{v_{j,0}: W(v_{j,0})\ge 0\}$ cannot be of the form $[\tau_j,1]$ where $\tau_j\in(0,1)$.

Let $H(\cdot)$ be the distribution of the maximum bid that agent $j$ competes against, i.e., $B = \max\{v_{i,0}+\delta_i\tilde{e}_i(v_{i,0}),\ 0\}$, where $\tilde{e}_i(v_{i,0}) = 1$ for $v_{i,0}\in[\tau,1]$, and is zero otherwise. Then
$$W(v_{j,0}) = \mathbb{E}\Big[\int_{x=0}^{\max\{v_{j,0}+\delta_j,\,0\}} (v_{j,0}+\delta_j-x)\,dH(x)\Big] - \int_{x=0}^{v_{j,0}} (v_{j,0}-x)\,dH(x) - c.$$
By Leibniz's integral rule, $\frac{\partial W(v_{j,0})}{\partial v_{j,0}}$ can be written as
$$\frac{\partial W(v_{j,0})}{\partial v_{j,0}} = \mathbb{E}\big[H(\max\{v_{j,0}+\delta_j,\ 0\})\big] - H(v_{j,0}) = \mathbb{E}\big[H(v_{j,0}+\delta_j)\big] - H(v_{j,0}).$$
The second equality holds because $B\ge 0$ and, as a result, $H(x) = 0$ for any $x<0$. The next lemma characterizes the distribution $H(\cdot)$.

Lemma B.3.2. Suppose that $v_{i,0}\sim U(0,1)$, $\delta_i\sim U(-1,1)$, $\tilde{e}_i(v_{i,0}) = 1$ for $v_{i,0}\in[\tau,1]$, and $\tilde{e}_i(v_{i,0}) = 0$ for $v_{i,0}\in[0,\tau)$. Then, the distribution of $B = \max\{v_{i,0}+\tilde{e}_i(v_{i,0})\delta_i,\ 0\}$, denoted by $H$, is given by
$$H(x) = \begin{cases} 0 & x<0,\\[2pt] \dfrac{3-\tau}{2}\,x + \dfrac{(1-\tau)^2}{4} & 0\le x<\tau,\\[6pt] \dfrac{1-\tau}{2}\,x + \tau + \dfrac{(1-\tau)^2}{4} & \tau\le x<\tau+1,\\[6pt] -\dfrac{x^2}{4} + x & \tau+1\le x<2,\\[6pt] 1 & x\ge 2. \end{cases}$$
The proof is straightforward. Thus, it is omitted.

To simplify our notation, in the rest of the proof, we denote $\delta_j$ and $v_{j,0}$ by $\delta$ and $v$, respectively. In the following, using Lemma B.3.2, we will show that there exists $\hat{v}\in[0,\tau]$ such that $W(v)$ is increasing in $v$ for $v\in[0,\hat{v})$, and is non-increasing otherwise. By Lemma B.3.2, for any $v\in[0,\tau)$, we have
$$\begin{aligned}
\frac{\partial W(v)}{\partial v} &= \mathbb{E}[H(v+\delta)] - H(v) \\
&= \frac{1}{2}\int_{-v}^{\tau-v}\Big[\frac{3-\tau}{2}(v+\delta) + \frac{(1-\tau)^2}{4}\Big]d\delta + \frac{1}{2}\int_{\tau-v}^{1}\Big[\frac{1-\tau}{2}(v+\delta) + \tau + \frac{(1-\tau)^2}{4}\Big]d\delta - \Big[\frac{3-\tau}{2}\,v + \frac{(1-\tau)^2}{4}\Big] \\
&= \frac{v^2(1-\tau) + (\tau^2+4\tau-9)\,v - 3\tau^2 + 5\tau}{8}.
\end{aligned}$$
Since $\frac{\partial W(v)}{\partial v}$ is quadratic in $v$, it is easy to verify that $\frac{\partial W(v)}{\partial v}$ is decreasing in $v\in[0,\tau]$.
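Since the proof of Lemma B.3.2 is omitted, a quick numerical sanity check may be useful. The following is a minimal sketch, not part of the original argument: it assumes $v_{i,0}\sim U(0,1)$, $\delta_i\sim U(-1,1)$, writes the investment threshold as `tau`, and compares the stated closed forms against simulation and numerical integration. The function names and the sample sizes are illustrative choices.

```python
import random

def H(x, tau):
    """Closed-form CDF of B = max(v + delta * 1[v >= tau], 0), as stated in Lemma B.3.2."""
    if x < 0:
        return 0.0
    if x < tau:
        return (3 - tau) / 2 * x + (1 - tau) ** 2 / 4
    if x < tau + 1:
        return (1 - tau) / 2 * x + tau + (1 - tau) ** 2 / 4
    if x < 2:
        return x - x * x / 4
    return 1.0

def empirical_H(x, tau, n=200_000, seed=0):
    """Monte Carlo estimate of Pr[B <= x] under the assumed uniform distributions."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        v = rng.random()
        delta = rng.uniform(-1.0, 1.0)
        # The opponent only adds the second signal when v >= tau.
        b = max(v + delta, 0.0) if v >= tau else v
        hits += b <= x
    return hits / n

def dW_numeric(v, tau, n=20_000):
    """E[H(v + delta)] - H(v) with delta ~ U(-1, 1), via the midpoint rule."""
    h = 2.0 / n
    mean = sum(H(v - 1 + (k + 0.5) * h, tau) for k in range(n)) * h / 2
    return mean - H(v, tau)

def dW_closed_below_tau(v, tau):
    """Closed form of dW/dv stated for v in [0, tau)."""
    return (v * v * (1 - tau) + (tau * tau + 4 * tau - 9) * v
            - 3 * tau * tau + 5 * tau) / 8

if __name__ == "__main__":
    tau = 0.4
    for x in (0.2, 0.7, 1.6):
        print(x, H(x, tau), empirical_H(x, tau))
    print(dW_numeric(0.2, tau), dW_closed_below_tau(0.2, tau))
```

Under these assumptions, the derivative should be positive near $v=0$ and negative for $v$ above the threshold, which is the unimodal shape the proof relies on.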
Then, by the fact that $\frac{\partial W(v)}{\partial v}\big|_{v=0} = \frac{-3\tau^2+5\tau}{8} > 0$ and $\frac{\partial W(v)}{\partial v}\big|_{v=\tau} = \frac{\tau(\tau-2)}{4} < 0$, we can conclude that there exists $\hat{v}\in(0,\tau)$ such that $\frac{\partial W(v)}{\partial v}\ge 0$ for $v\in[0,\hat{v}]$, and $\frac{\partial W(v)}{\partial v}<0$ for $v\in(\hat{v},\tau)$.

The last step of the proof is to show that $W(v)$ is decreasing in $v$ when $v>\tau$. Lemma B.3.2 implies that for any $v>\tau$, we have
$$\begin{aligned}
\frac{\partial W(v)}{\partial v} &= \mathbb{E}[H(v+\delta)] - H(v) \\
&= \frac{1}{2}\int_{-v}^{\tau-v}\Big[\frac{3-\tau}{2}(v+\delta) + \frac{(1-\tau)^2}{4}\Big]d\delta + \frac{1}{2}\int_{\tau-v}^{1+\tau-v}\Big[\frac{1-\tau}{2}(v+\delta) + \tau + \frac{(1-\tau)^2}{4}\Big]d\delta \\
&\quad + \frac{1}{2}\int_{1+\tau-v}^{1}\Big[-\frac{(v+\delta)^2}{4} + (v+\delta)\Big]d\delta - \Big[\frac{1-\tau}{2}\,v + \tau + \frac{(1-\tau)^2}{4}\Big] \\
&= \frac{\tau^3 + 12v\tau - 9\tau^2 - 9\tau}{24} + \frac{3v^2 - 3v - v^3}{24}.
\end{aligned}$$
By the fact that $3v^2-3v-v^3$ is decreasing in $v\in(\tau,1]$ and $v>\tau$, we have
$$\frac{\partial W(v)}{\partial v} \le \frac{\tau^3 + 12v\tau - 9\tau^2 - 9\tau}{24} + \frac{3\tau^2 - 3\tau - \tau^3}{24} = \frac{12v\tau - 6\tau^2 - 12\tau}{24} \le \frac{-6\tau^2}{24} < 0,$$
where the second inequality follows from the fact that $v\le 1$.

B.3.3 Proof of Theorems 2.4.4 and 2.5.5

Theorem 2.4.4 can be seen as a corollary of Theorem 2.5.5. Therefore, in the following, we will verify Theorem 2.5.5. The proof is naturally divided into two parts. In the first part, we show that when the cost is small, there exists an equilibrium in which both agents obtain the additional information, i.e., $\tilde{e}_i(v_{i,0}) = 1$ for $i = 1,2$ and $v_{i,0}\in[0,1]$. In the second part, we show that when the cost is large, there exists an equilibrium in which none of the agents obtain the additional information, i.e., $\tilde{e}_i(v_{i,0}) = 0$ for $i = 1,2$ and $v_{i,0}\in[0,1]$.

Part 1: We will show that when an agent $i = 1,2$ obtains the additional information for any $v_{i,0}\in[0,1]$, then agent $j\ne i$ also has an incentive to obtain information for any $v_{j,0}\in[0,1]$ as long as the cost satisfies $c \le \min\big\{\frac{4r^3-3r^2-6r+5}{48},\ \frac{8r^3+6r^2+7}{96}\big\}$. With a slight abuse of notation, we define $W_r(v_{j,0})$ as the difference between the utility of agent $j = 1,2$ with initial valuation $v_{j,0}$ when he obtains information and his utility when he does not obtain information, given that the reserve price is $r$, and agent $i\ne j$ obtains the additional information for any initial valuation.
That is,
$$W_r(v_{j,0}) = u_j\big(v_{j,0},\ \tilde{e}_j(v_{j,0})=1,\ \tilde{e}_i\big) - u_j\big(v_{j,0},\ \tilde{e}_j(v_{j,0})=0,\ \tilde{e}_i\big),$$
where $u_j(v_{j,0},\ \tilde{e}_j(v_{j,0})=1,\ \tilde{e}_i)$ and $u_j(v_{j,0},\ \tilde{e}_j(v_{j,0})=0,\ \tilde{e}_i)$ are defined in Equations (B.13) and (B.12). Here, $\tilde{e}_i(v) = 1$ for any $v\in[0,1]$. We will show that $W_r(v_{j,0})\ge 0$ for any $v_{j,0}\in[0,1]$; that is, for any initial valuation, agent $j$ prefers to update his valuation. To this end, we will verify that $W_r(\cdot)$ achieves its minimum at either $v_{j,0} = 0$ or $v_{j,0} = 1$. Then, by showing that $W_r(1), W_r(0)\ge 0$, we can conclude that $W_r(v_{j,0})\ge 0$ for any $v_{j,0}\in[0,1]$.

Similar to the proof of Proposition B.3.1, we write $W_r(\cdot)$ as a function of the distribution of the competing bid $B$, where $B = \max\{v_{i,0}+\tilde{e}_i(v_{i,0})\delta_i,\ r\} = \max\{v_{i,0}+\delta_i,\ r\}$ (see footnote 1). This follows because a second price auction is a truthful mechanism, and agent $i\ne j$ obtains the additional information for any value of $v_{i,0}$. We denote the distribution of $B$ by $H_r$. Then
$$W_r(v_{j,0}) = \mathbb{E}\Big[\int_{x=r}^{\max\{v_{j,0}+\delta_j,\,r\}} (v_{j,0}+\delta_j-x)\,dH_r(x)\Big] - \int_{x=r}^{v_{j,0}} (v_{j,0}-x)\,dH_r(x) - c,$$
where the expectation is with respect to $\delta_j$. We will show that the derivative of $W_r(v_{j,0})$ with respect to $v_{j,0}$ is positive for small values of $v_{j,0}$ and is non-positive for large values of $v_{j,0}$. This implies that $W_r(v_{j,0})\ge\min\{W_r(0),W_r(1)\}$ for any $v_{j,0}\in[0,1]$. By Leibniz's integral rule, $\frac{\partial W_r(v_{j,0})}{\partial v_{j,0}}$ can be written as $\frac{\partial W_r(v_{j,0})}{\partial v_{j,0}} = \mathbb{E}[H_r(v_{j,0}+\delta_j)] - H_r(v_{j,0})$. The next lemma characterizes the distribution $H_r$.

Lemma B.3.3. If $v_{i,0}\sim U(0,1)$ and $\delta_i\sim U(-1,1)$, the distribution of $B = \max\{v_{i,0}+\delta_i,\ r\}$, denoted by $H_r$, is given by
$$H_r(x) = \begin{cases} 0 & x<r,\\[2pt] \dfrac{1}{4} + \dfrac{x}{2} & r\le x\le 1,\\[6pt] x - \dfrac{x^2}{4} & 1<x\le 2,\\[6pt] 1 & x>2. \end{cases}$$
The proof is straightforward. Thus, it is omitted.

In the following, using Lemma B.3.3, we will show that $\arg\min_{v\in[0,1]}\{W_r(v)\}$ is either 0 or 1. To this aim, we will show that $W_r(v)$ is increasing in $v$ given that $v<r$, and is decreasing in $v$ otherwise.
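As the proof of Lemma B.3.3 is omitted, the closed form can be checked against a direct numerical convolution. This is a minimal sketch under the lemma's own assumptions ($v_{i,0}\sim U(0,1)$, $\delta_i\sim U(-1,1)$, reserve price $r\in[0,1]$); the function names and the grid size are illustrative.

```python
def H_r(x, r):
    """Closed-form CDF of B = max(v + delta, r), as stated in Lemma B.3.3."""
    if x < r:
        return 0.0
    if x <= 1:
        return 0.25 + x / 2
    if x <= 2:
        return x - x * x / 4
    return 1.0

def H_r_numeric(x, r, n=20_000):
    """Pr[max(v + delta, r) <= x], integrating over v with the midpoint rule."""
    if x < r:
        return 0.0  # B is never below the reserve price
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h
        # Pr[delta <= x - v] for delta ~ U(-1, 1), clipped to [0, 1].
        total += min(max((x - v + 1) / 2, 0.0), 1.0)
    return total * h

if __name__ == "__main__":
    r = 0.3
    for x in (0.1, 0.3, 0.8, 1.5, 2.5):
        print(x, H_r(x, r), H_r_numeric(x, r))
```

Note that $H_r$ has an atom at $x=r$ of mass $\frac{1}{4}+\frac{r}{2}$, which is the term $\Pr[B=r]$ used in the computations that follow.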
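Observe that for any $v\in[0,r)$, $\frac{\partial W_r(v)}{\partial v} = \mathbb{E}[H_r(v+\delta)]\ge 0$. Furthermore, by Lemma B.3.3, for any $v\in(r,1]$, we have
$$\begin{aligned}
\frac{\partial W_r(v)}{\partial v} &= \mathbb{E}[H_r(v+\delta)] - H_r(v) \\
&= \frac{1}{2}\int_{r-v}^{1-v}\Big(\frac{1}{4} + \frac{v+x}{2}\Big)dx + \frac{1}{2}\int_{1-v}^{1}\Big((v+x) - \frac{(v+x)^2}{4}\Big)dx - \Big(\frac{1}{4} + \frac{v}{2}\Big) \\
&= \frac{3v^2 - v^3 - 3v}{24} + \frac{-3r^2 - 3r}{24} \ \le\ \frac{3r^2 - r^3 - 3r}{24} + \frac{-3r^2 - 3r}{24} = \frac{-r^3 - 6r}{24} \le 0.
\end{aligned}$$
The first inequality holds because $3v^2-v^3-3v$ is decreasing in $v$ and $v>r$. (Footnote 1: We can assume that the seller is one of the opponents with a submitted bid of $r$.)

We established that $W_r(\cdot)$ gets minimized either at $v = 0$ or $v = 1$. Then, in the last step, we will verify that $W_r(0), W_r(1)\ge 0$. By definition,
$$W_r(0) = \mathbb{E}\big[(\delta-B)^+\big] - c = \Pr[B=r]\int_{r}^{1}\frac{1}{2}(\delta-r)\,d\delta + \frac{1}{4}\int_{r}^{1}\int_{r}^{\delta}(\delta-x)\,dx\,d\delta - c = \frac{4r^3-3r^2-6r+5}{48} - c.$$
Similarly,
$$\begin{aligned}
W_r(1) &= \mathbb{E}\big[(1+\delta-B)^+\big] - \mathbb{E}\big[(1-B)^+\big] - c \\
&= \Pr[B=r]\,\mathbb{E}\big[(1+\delta-r)^+\big] + \frac{1}{4}\int_{\delta=r-1}^{1}\int_{x=r}^{\min\{1+\delta,\,1\}}(1+\delta-x)\,dx\,d\delta + \frac{1}{2}\int_{\delta=0}^{1}\int_{x=1}^{1+\delta}(1+\delta-x)\Big(1-\frac{x}{2}\Big)dx\,d\delta \\
&\quad - \Pr[B=r]\,(1-r) - \frac{1}{2}\int_{x=r}^{1}(1-x)\,dx - c \\
&= \frac{8r^3+6r^2+7}{96} - c,
\end{aligned}$$
where the third term accounts for competing bids $B\in(1,2]$, on which $H_r$ has density $1-\frac{x}{2}$. Therefore, when $c\le\min\big\{\frac{4r^3-3r^2-6r+5}{48},\ \frac{8r^3+6r^2+7}{96}\big\}$, $W_r(v)\ge 0$ for any $v\in[0,1]$. As a result, the agent is willing to obtain information regardless of his initial valuation.

Part 2: In this part, we will show that when the reserve price $r\le\sqrt{2}-1$ and an agent $i = 1,2$ does not obtain the additional information regardless of his initial valuation, then agent $j\ne i$ also does not have any incentive to obtain information for any $v_{j,0}\in[0,1]$ as long as the cost satisfies $c\ge\frac{3r^4+8r^3+6r^2+7}{48}$. Similarly, when $r>\sqrt{2}-1$ and the cost is greater than $\frac{3r-r^3+1}{12}$, there exists an equilibrium such that none of the agents obtain the additional information.

With a slight abuse of notation, we define $W_r(v_{j,0})$ as the difference between the utility of agent $j$ when he obtains information and the utility of agent $j$ when he does not obtain information, given that agent $i\ne j$ does not acquire the additional information for any initial valuation, i.e., $W_r(v_{j,0}) = u_j(v_{j,0},\ \tilde{e}_j(v_{j,0})=1,\ \tilde{e}_i) - u_j(v_{j,0},\ \tilde{e}_j(v_{j,0})=0,\ \tilde{e}_i)$, where $\tilde{e}_i(v) = 0$ for any $v\in[0,1]$.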
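The two cost thresholds of Part 1 can be sanity-checked by simulating the second-price payoffs directly. The sketch below is illustrative only: it assumes $v_{i,0}\sim U(0,1)$, $\delta_i,\delta_j\sim U(-1,1)$, an opponent who always acquires information, and a reserve price `r`; it estimates the gain from acquiring information before subtracting the cost $c$ (that is, $W_r(v)+c$) at $v=0$ and $v=1$. The function names, seed, and sample size are arbitrary choices.

```python
import random

def gain_mc(v, r, n=400_000, seed=1):
    """Monte Carlo estimate of E[(v + delta_j - B)^+] - E[(v - B)^+],
    where B = max(v_i + delta_i, r) is the competing bid."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        b = max(rng.random() + rng.uniform(-1.0, 1.0), r)
        d = rng.uniform(-1.0, 1.0)
        total += max(v + d - b, 0.0) - max(v - b, 0.0)
    return total / n

def gain_at_zero(r):
    """Closed form stated for W_r(0) + c."""
    return (4 * r**3 - 3 * r**2 - 6 * r + 5) / 48

def gain_at_one(r):
    """Closed form stated for W_r(1) + c."""
    return (8 * r**3 + 6 * r**2 + 7) / 96

if __name__ == "__main__":
    r = 0.3
    print(gain_mc(0.0, r), gain_at_zero(r))
    print(gain_mc(1.0, r), gain_at_one(r))
```

For costs below both closed-form values, the simulated gain net of cost stays non-negative at the two endpoints, which is what Part 1 requires.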
We will show that $W_r(v_{j,0})\le 0$ for any $v_{j,0}\in[0,1]$ when (i) $c\ge\frac{3r^4+8r^3+6r^2+7}{48}$ and $r\le\sqrt{2}-1$, or (ii) $c\ge\frac{3r-r^3+1}{12}$ and $r>\sqrt{2}-1$. To this aim, we will verify that $\max_{v\in[0,1]}\{W_r(v)\} = \frac{3r^4+8r^3+6r^2+7}{48} - c\le 0$ when $r\le\sqrt{2}-1$, and $\max_{v\in[0,1]}\{W_r(v)\} = \frac{3r-r^3+1}{12} - c\le 0$ when $r>\sqrt{2}-1$. This implies that the agent does not have any incentive to update his valuation for any initial valuation.

Specifically, we will show that $W_r(v)$ is increasing in $v$ when $v\in[0,\ \max\{r,\frac{1-r^2}{2}\}]$, and is decreasing otherwise. That is, $\arg\max_{v\in[0,1]}\{W_r(v)\} = \frac{1-r^2}{2}$ if $r\le\sqrt{2}-1$, and $\arg\max_{v\in[0,1]}\{W_r(v)\} = r$ if $r>\sqrt{2}-1$. We show these statements by characterizing $\frac{\partial W_r(v)}{\partial v}$.

By Part 1, $\frac{\partial W_r(v)}{\partial v} = \mathbb{E}[H_r(v+\delta)] - H_r(v)$, where $H_r(\cdot)$ is the distribution of the bid that agent $j$ competes against. Then, considering the fact that $v_{i,0}\sim U(0,1)$, and the other agent does not obtain information, i.e., $B = \max\{v_{i,0},\ r\}$, we have
$$H_r(x) = \begin{cases} 0 & x<r,\\ x & x\in[r,1],\\ 1 & x>1. \end{cases}$$
This implies that for any $v\in[0,1]$,
$$\mathbb{E}[H_r(v+\delta)] = \Pr[\delta+v\ge 1] + \frac{1}{2}\int_{r-v}^{1-v}(v+x)\,dx = \frac{v}{2} + \frac{1-r^2}{4}.$$
Thus, for any $v<r$, we have $\frac{\partial W_r(v)}{\partial v} = \mathbb{E}[H_r(v+\delta)] = \frac{v}{2} + \frac{1-r^2}{4}\ge 0$, and for any $v>r$,
$$\frac{\partial W_r(v)}{\partial v} = \mathbb{E}[H_r(v+\delta)] - H_r(v) = -\frac{v}{2} + \frac{1-r^2}{4}.$$
We point out that for any $v\ge r$, we have $\frac{\partial W_r(v)}{\partial v}\le 0$ as long as $r>\sqrt{2}-1$. To see why, note that
$$\frac{\partial W_r(v)}{\partial v} = -\frac{v}{2} + \frac{1-r^2}{4} \le -\frac{r}{2} + \frac{1-r^2}{4} \le 0,$$
where the last inequality holds because $r>\sqrt{2}-1$. On the other hand, when $r\le\sqrt{2}-1$, $\frac{\partial W_r(v)}{\partial v}\ge 0$ for $v\in[r,\frac{1-r^2}{2}]$ and $\frac{\partial W_r(v)}{\partial v}\le 0$ for $v\ge\frac{1-r^2}{2}$. This implies that we have $\arg\max_{v\in[0,1]}\{W_r(v)\} = r$ when $r>\sqrt{2}-1$, and $\arg\max_{v\in[0,1]}\{W_r(v)\} = \frac{1-r^2}{2}$ when $r\le\sqrt{2}-1$.

The proof will be completed by showing that (i) $W_r(\frac{1-r^2}{2})\le 0$ when $c\ge\frac{3r^4+8r^3+6r^2+7}{48}$ and $r\le\sqrt{2}-1$, and (ii) $W_r(r)\le 0$ when $c\ge\frac{3r-r^3+1}{12}$ and $r>\sqrt{2}-1$.
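The location and value of the maximum claimed in Part 2 can be checked numerically. The following is a minimal sketch under the stated assumptions ($v_{i,0}\sim U(0,1)$, $\delta_j\sim U(-1,1)$, an opponent who never acquires information, so $B=\max\{v_{i,0},r\}$). The helper `g` uses the closed form of $\mathbb{E}[(y-B)^+]$ for this $B$, which is our own short derivation and not taken from the text; all names and grid sizes are illustrative.

```python
import math

def g(y, r):
    """E[(y - B)^+] for B = max(v_i, r), v_i ~ U(0, 1) (derived for this sketch)."""
    if y <= r:
        return 0.0
    if y <= 1:
        return (y * y - r * r) / 2      # atom at r plus uniform part on (r, y)
    return y - (1 + r * r) / 2          # B <= 1 < y, so this is y - E[B]

def gain(v, r, n=2000):
    """W_r(v) + c = E[g(v + delta)] - g(v), delta ~ U(-1, 1), midpoint rule."""
    h = 2.0 / n
    mean = sum(g(v - 1 + (k + 0.5) * h, r) for k in range(n)) * h / 2
    return mean - g(v, r)

def best_v(r, grid=200):
    """Grid point in [0, 1] that maximizes the gain from acquiring information."""
    return max((k / grid for k in range(grid + 1)), key=lambda v: gain(v, r))

def max_gain_closed(r):
    """Closed-form maximum of W_r(v) + c stated in Part 2."""
    if r <= math.sqrt(2) - 1:
        return (3 * r**4 + 8 * r**3 + 6 * r**2 + 7) / 48
    return (3 * r - r**3 + 1) / 12

if __name__ == "__main__":
    for r in (0.2, 0.6):
        v_star = best_v(r)
        print(r, v_star, gain(v_star, r), max_gain_closed(r))
```

With $r=0.2$ the grid maximizer should sit near $\frac{1-r^2}{2}=0.48$, and with $r=0.6$ at $v=r$, matching the two regimes separated by $r=\sqrt{2}-1$.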
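For $r\le\sqrt{2}-1$, we have
$$\begin{aligned}
\max_{v\in[0,1]}\{W_r(v)\} &= W_r\Big(\frac{1-r^2}{2}\Big) = \mathbb{E}\Big[\Big(\frac{1-r^2}{2}+\delta-B\Big)^+\Big] - \mathbb{E}\Big[\Big(\frac{1-r^2}{2}-B\Big)^+\Big] - c \\
&= \Pr[B=r]\,\mathbb{E}\Big[\Big(\frac{1-r^2}{2}+\delta-r\Big)^+\Big] + \frac{1}{2}\int_{\delta=r-\frac{1-r^2}{2}}^{1}\int_{x=r}^{\min\{\frac{1-r^2}{2}+\delta,\,1\}}\Big(\frac{1-r^2}{2}+\delta-x\Big)dx\,d\delta \\
&\quad - \Pr[B=r]\Big(\frac{1-r^2}{2}-r\Big) - \int_{x=r}^{\frac{1-r^2}{2}}\Big(\frac{1-r^2}{2}-x\Big)dx - c \\
&= \frac{3r^4+8r^3+6r^2+7}{48} - c \le 0,
\end{aligned}$$
where the inequality holds because $c\ge\frac{3r^4+8r^3+6r^2+7}{48}$. Similarly, when $r>\sqrt{2}-1$, we have
$$\begin{aligned}
\max_{v\in[0,1]}\{W_r(v)\} &= W_r(r) = \mathbb{E}\big[(r+\delta-B)^+\big] - \mathbb{E}\big[(r-B)^+\big] - c \\
&= \Pr[B=r]\,\mathbb{E}\big[(\delta)^+\big] + \frac{1}{2}\int_{\delta=0}^{1}\int_{x=r}^{\min\{r+\delta,\,1\}}(r+\delta-x)\,dx\,d\delta - c \\
&= \frac{3r-r^3+1}{12} - c \le 0,
\end{aligned}$$
where the inequality follows from the fact that $c\ge\frac{3r-r^3+1}{12}$.

B.4 Delegated Proofs

B.4.1 Proof from Section 2.5

Proof of Lemma 2.5.2. Let $x$ be the vector of the initial valuations. Given that mechanism $\mathcal{M}^{OPT}$ is incentive compatible, the revenue of the seller is given by
$$\mathbb{E}\Big[\sum_{i=1}^{n} t_i+p_i\Big] = \mathbb{E}\Big[\sum_{i=1}^{n} v_{i,1}q_i - c_i s_i - U_i(x_i,x_i) \ \Big|\ v_0=b_0=x\Big], \tag{B.14}$$
where $v_{i,1} = x_i+\delta_i$ if $i\in S^{OPT}(x)$ and $v_{i,1} = x_i$ otherwise, and the expectations are with respect to initial valuations and second signals. Note that the sum of the first and second terms is the social welfare of the agents and the seller. In the following, we compute the last term in the r.h.s. of the above equation, that is, $\mathbb{E}[U_i(x_i,x_i)]$, where the expectation is with respect to initial valuations. By Lemma B.1.7, we have
$$\mathbb{E}[U_i(x_i,x_i)] = \int_{x_i=\underline{v}}^{\bar{v}}\int_{z=\underline{v}}^{x_i} \mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big]\,dz\,dF(x_i),$$
where the expectation inside the integral is with respect to $v_{-i,0}$. Changing the order of the integrals, we get
$$\int_{z=\underline{v}}^{\bar{v}}\int_{x_i=z}^{\bar{v}} dF(x_i)\,\mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big]\,dz = \int_{z=\underline{v}}^{\bar{v}} \big(1-F(z)\big)\,\mathbb{E}\big[q_i \,\big|\, v_{i,0}=z,\ v_{-i,0}=x_{-i}\big]\,dz.$$
By multiplying and dividing the r.h.s. of the equation above by the probability density $f(z)$, we obtain $\mathbb{E}[U_i(x_i,x_i)] = \mathbb{E}\big[\frac{1-F(x_i)}{f(x_i)}\,q_i \,\big|\, v_0=x\big] = \mathbb{E}\big[-\alpha(x_i)\,q_i \,\big|\, v_0=x\big]$. Substituting $\mathbb{E}[U_i(x_i,x_i)]$ in Eq. (B.14), the expected revenue of the seller is given by $\mathbb{E}\big[\sum_{i=1}^{n}\big(v_{i,1}+\alpha(x_i)\big)q_i - c_i s_i \,\big|\, v_0=x\big]$.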
Finally, the result follows from applying the selection and allocation rules.

Proof of Lemma 2.5.3. To find an upper bound, we consider a relaxed environment in which the seller observes the additional information of selected agents, and she can force agents to update their valuations. It is easy to see that the maximum achievable revenue in this environment is an upper bound on the revenue of the seller in the original environment.

By the revelation principle, we focus on direct incentive compatible mechanisms that consist of a transfer scheme $t_i:\mathbb{R}^n\to\mathbb{R}$, a selection rule $s_i:\mathbb{R}^n\to\{0,1\}$, and an allocation rule $q_i:\mathbb{R}^{2n}\to\mathbb{R}_+$, where $s_i$ is 1 when agent $i$ is selected and is 0 otherwise. Note that the payment and selection rules are only functions of initial bids, and the allocation rule is a function of the initial bids and the second signals observed by the seller.

To compute the upper bound on the revenue of any incentive compatible mechanism in the relaxed environment, we first need to characterize the utility of each agent $i$. Assume that agent $i$ with initial valuation $x_i$ reports $\hat{x}_i$, and other agents report truthfully. Then, his utility is given by
$$U_i(x_i,\hat{x}_i) = \mathbb{E}\big[q_i\,(x_i+\delta_i s_i) - t_i - c_i s_i \,\big|\, b_{i,0}=\hat{x}_i,\ v_{i,0}=x_i,\ v_{-i,0}=b_{-i,0}=x_{-i}\big],$$
where the expectation is with respect to the second signals. Incentive compatibility implies that
$$U_i(x_i,x_i) - U_i(\hat{x}_i,\hat{x}_i) \ \le\ U_i(x_i,x_i) - U_i(\hat{x}_i,x_i) = (x_i-\hat{x}_i)\,\mathbb{E}\big[q_i \,\big|\, b_{i,0}=x_i,\ b_{-i,0}=x_{-i}\big],$$
and similarly, $U_i(x_i,x_i) - U_i(\hat{x}_i,\hat{x}_i) \ge (x_i-\hat{x}_i)\,\mathbb{E}[q_i \mid b_{i,0}=\hat{x}_i,\ b_{-i,0}=x_{-i}]$. Without loss of generality, we assume that $x_i>\hat{x}_i$. Then, using the above inequalities,
$$\mathbb{E}\big[q_i \,\big|\, b_{i,0}=\hat{x}_i,\ b_{-i,0}=x_{-i}\big] \ \le\ \frac{U_i(x_i,x_i) - U_i(\hat{x}_i,\hat{x}_i)}{x_i-\hat{x}_i} \ \le\ \mathbb{E}\big[q_i \,\big|\, b_{i,0}=x_i,\ b_{-i,0}=x_{-i}\big].$$
Finally, by taking the limit as $\hat{x}_i\to x_i$, we get $U_i(x_i,x_i) = U_i(\underline{v},\underline{v}) + \int_{\underline{v}}^{x_i}\mathbb{E}[q_i \mid v_{i,0}=z,\ b_{-i,0}=x_{-i}]\,dz$.

We are now ready to compute the upper bound of the revenue.
By using the same arguments as in Lemma 2.5.2, it can be shown that for any selection rule s i and allocation rule q i revenue of the seller when agents are truthful is given by E h X i: s i (x)=1 x i +(x i ) + i q i + X i: s i (x)=0 x i +(x i ) q i n X i=1 c i s i n X i=1 U i (v;v) b 0 =x i ; (B.15) where the expectation is with respect to the first and second signals. Because the mechanism should be individually rational, we set U i (v;v) = 0 for all i. Then, to maximize the revenue, the item should be allocated to the agent with the highest non-negative virtual valuation, that is, q i = 1 if i2 arg max j x j +(x j ) + j s j (x) 2 and 0 otherwise. Therefore, the expected revenue of the seller can be written as E " max max i: s i (x)=1 fx i +(x i ) + i g; max i: s i (x)=0 fx i +(x i )g; 0 n X i=1 c i s i b 0 =x # : So, if agents in setS relaxed , defined below, obtain the additional information, the revenue gets maximized: S relaxed (x) = arg max Sf1;;ng E max n max i2S fx i + i +(x i )g; max i= 2S fx i +(x i )g; 0 o X i2S c i : Finally, the result follows by plugging the selection ruleS relaxed (x) and allocation rule q i in Eq. (B.15). B.4.2 Proofs from Section B.1 Proof. Proof of Lemma B.1.1 2 In case of ties, we choose one of them randomly. 152 The basic idea is to establish an upper bound on the r.h.s. of Eq. (B.3). We will show the upper bound is larger than the l.h.s. of Eq. (B.3) atr =. This will imply that Eq. (B.3) is satisfied at somer. We first show that the r.h.s. of Eq. (B.3), i.e., R b `;0 v E q ` v `;0 =z;v `;0 =b `;0 dz, is less than or equal to R b `;0 v Pr z +(b `;0 ) ! ` dz, where! ` = max max j2S ; (b 0 ) b j;0 + j +(b j;0 ) ; . Because agent ` has the largest weighted bid among unselected agents, ! ` = max max j2S ; (b 0 ) b j;0 + j + (b j;0 ) ; max j= 2S ; (b 0 );j6=` b j;0 +(b j;0 ) ; . Then, by Lemma B.1.8, U ` (v;b `;0 ) U ` (b `;0 ;b `;0 ) Z b `;0 v Pr z +(b `;0 )! 
` dz U ` (v;v) = U ` (b `;0 ;b `;0 ) Z b `;0 v E q ` v `;0 =z; v `;0 =b `;0 dz; where the second inequality follows from Lemma B.1.9, and the equality follows from Lemma B.1.7. By the above equation, we can conclude that R b `;0 v E q ` v `;0 =z; v `;0 =b `;0 dz is less than R b `;0 v Pr z + (b `;0 )! ` dz. Next, we will show that the l.h.s. of Eq. (B.3) atr = is greater than the upper bound, i.e., R b `;0 v Pr z + (b `;0 ) ! ` dz. Then, considering the fact that the upper bound is not a function of r, the l.h.s. is a non-increasing function of r, and is zero at r = b `;0 +(b `;0 ), we conclude that there exists an r 2 [;b `;0 +(b `;0 )] that satisfies Eq. (B.3). By changing variable, the l.h.s. atr = can be written as Z b `;0 (b `;0 ) Pr h z +(b `;0 ) max n max j2S ; (b 0 ) b j;0 + j +(b j;0 ) ; oi dz = Z b `;0 (b `;0 ) Pr z +(b `;0 )! ` dz; Then, becauseb `;0 +(b `;0 ), we have Z b `;0 (b `;0 ) Pr z +(b `;0 )! ` dz Z b `;0 maxf(b `;0 );vg Pr z +(b `;0 )! ` dz = Z b `;0 v Pr z +(b `;0 )! ` dz Z maxf(b `;0 );vg v Pr z +(b `;0 )! ` dz = Z b `;0 v Pr z +(b `;0 )! ` dz; where the last equality holds because! ` and for anyz (b `;0 ), Pr z +(b `;0 ) ! ` = 0. The last equation shows that the l.h.s. of Eq. (B.3) atr = is greater than the r.h.s. of Eq. (B.3). 153 B.4.3 Proofs from Section B.1.1 Proof. Proof of Lemma B.1.5 Consider a selected agenti with initial bidb i;0 . If agenti wins the item, his payment in the second round,p i , would be equal to maxfmax j6=i fb j;1 +(b j;0 )g;g(b i;0 ). Note that p i is independent ofb i;1 . Therefore, an agent cannot change his price for the item. However, the agent can change the probability of the allocation. It is easy to see that underbidding may only result in losing the item. On the other hand, over bidding may yield a negative utility. 
Note that overbidding can make a difference only when v i;1 +(b i;0 ) maxfmax j6=i fb j;1 +(b j;0 )g;g b i;1 +(b i;0 ): In this case, the utility of agenti would be non-positive: v i;1 p i = v i;1 +(b i;0 ) maxfmax j6=i fb j;1 +(b j;0 )g;g v i;1 +(b i;0 ) (v i;1 +(b i;0 )) = 0: Therefore, a weekly dominate strategy of agenti is to be truthful. Proof. Proof of Lemma B.1.6 Consider agenti2S ; (x) who bids truthfully in the first round, wherex is the initial valuations of agents. In the proof, to simplify the notations, we denoteS ; (x) byS(x). We will show that agenti will learn his second signal, i.e., e i = 1, given that other agents are truthful. To this aim, we will prove that for agenti, the marginal value of changing his decision to obtain information is identical to the change in the weighted surplus ; . More precisely, the difference between the utility of agent i when he obtains information and that when he does not is equal to ; (x;S(x)) ; (x;S(x)nfig). Then, since the SWSP mechanism maximizes the weighted surplus, i.e., ; (x;S(x)) ; (x;S(x)nfig), we conclude that agenti prefers to update his valuation. We first characterize the utility of agent i when he does not obtain information. Let Y be the ran- dom variable corresponding to the maximum weighted bid of all agents except for agent i, i.e., Y = 154 max max j2S(x);j6=i x j + j +(x j ) ; max j= 2S(x) x j +(x j ) ; o . Then, by Lemma B.1.5, when agenti does not update his valuation, his utility is given by E max x i +(x i )Y; 0 t i = E maxfY;x i +(x i )gY t i ; = ; (x;S(x)nfig) + X j2S(x)nfig c j E[Y ]t i (B.16) Note that in computing utility of agent i, we use the fact that other agents are truthful. 
In addition, the second equality holds because ; (x;S(x)nfig) = E max n max j2S(x);j6=i fx j +(x j ) + j g; max j= 2S(x) fx j +(x j )g;x i +(x i ); o X j2S(x)nfig c j = E [maxfY;x i +(x i )g] X j2S(x)nfig c j ; Similarly, when agenti obtains the additional information, his utility is equal to E max x i + i +(x i )Y; 0 t i c i = E maxfY;x i + i +(x i )gY t i c i = ; (x;S(x)) + X j2S(x)nfig c j E[Y ]t i : (B.17) The first expression follows from Lemma B.1.5 where we show that agenti bids truthfully in the second round. The second equality holds because ; (x;S(x)) =E max n max j2S(x);j6=i fx j +(x j ) + j g; max j= 2S(x) fx j +(x j )g;x i + i +(x i ); o X j2S(x) c j = E maxfY;x i + i +(x i )g X j2S(x) c j : In addition, note that t i in Eq. (B.17) is the same as t i in Eq. (B.16) since agent i’s decision to obtain information, i.e., e i , is not observable by the mechanism. By Equations (B.16) and (B.17), the differ- ence between the utility of agent i when he updates his valuation and his utility when he does not is ; (x;S(x)) ; (x;S(x)nfig). Then, considering the fact that the SWSP mechanism maximizes the 155 weighted surplus, i.e., ; (x;S(x)) ; (x;S(x)nfig), we conclude that agent i prefers to learn his second signal. Proof. Proof of Lemma B.1.7 We first show that for any x i 2 [v; v] n1 and any i = 1; 2;:::;n, E [q i jv i;0 =x i ; v i;0 =x i ] is a non-decreasing function ofx i . Observe that the weighted surplus is the maximum of affine functions ofv i;0 +(v i;0 ). Thus, it is a convex function ofv i;0 +(v i;0 ). Furthermore, the weighted surplus is a continuos function ofv i;0 +(v i;0 ), and its derivative with respect tov i;0 +(v i;0 ) at v i;0 =x i , if exists, is equal to E [q i jv i;0 =x i ; v i;0 =x i ]. 3 This implies that E [q i jv i;0 =x i ; v i;0 =x i ] is a non-decreasing function ofv i;0 +(v i;0 ). Finally, considering the fact function is non-decreasing, we can conclude that E [q i jv i;0 =x i;0 ; v i;0 =x i ] is a non-decreasing function ofv i;0 . 
Next we show that the utility of an agenti that bids truthfully in the first round and follows the optimal strategy afterwards follows from Eq. (B.7) when all other agents are truthful. By Lemma B.1.6, agent i that bids truthfully in the first round will obtain information if he gets selected. Furthermore, Lemma B.1.5 implies that agenti will bid truthfully in the second round if he is allowed to update his bid in the second round. Therefore, agenti that bids truthfully in the first round stays truthful. Now, we are ready to show the result. We consider the following cases. Throughout this proof, for simplicity, we drop the subscript ofS ; (x), and denote it byS(x). i)i2S(x): By lemmas B.1.5 and B.1.6, selected agenti learns his second signal and reports it truthfully in the second round. Thus, his utility is given by E q i (x i + i )p i t i c i v 0 =x 0 . The claim follows from pluggingt i from Eq. (2.2). ii) i = 2S(x) and x i +(x i ) < max max j= 2S(x);j6=i fx j +(x j )g; : In this case, the utility of agent i and his allocation probability is 0. By the fact that E q i v i;0 =z; v i;0 =x i is an increas- ing function of z and E q i v i;0 =x i ; v i;0 =x i = 0, we can write the utility of agent i as R x i v E q i v i;0 =z; v i;0 =x i dz = 0. iii)i = 2S(x) andx i +(x i ) max max j= 2S(x);j6=i fx j +(x j )g; : In this case, when unselected agenti wins the item, he has to pay maximum ofr and the second highest weighted bid. Therefore, 3 To see that note E [qijvi;0 =xi; vi;0 =xi] is equal to Pr[vi;1 + (xi) maxfmax j2S(x) fxj + j + (xj;0)g; max j= 2S(x) fxj +(xj;0)g;g], wherevi;1 =xi +i ifi2S(x) and it isxi otherwise. 156 U i (x i ;x i ) = E h q i x i +(x i ) max n max j2S(x) x j + j +(x j ) ;r oi , where U i (x i ;x i ) is defined in Eq. (B.6). LetY = max j2S(x) fx j + j +(x j )g, and letH be the distribution ofY . 
Then,U i (x i ;x i ) can be written as E h x i +(x i )Y 1 l x i +(x i )Y r + x i +(x i )r 1 l n x i +(x i )rY oi = (x i +(x i )) H(x i +(x i ))H(r) R x i +(x i ) r zdH(z) + (x i +(x i )r)H(r) = x i +(x i ) H(x i +(x i ))rH(r) R x i +(x i ) r zdH(z) = R x i +(x i ) r H(z)dz; where in the first equation, the expectation is with respect to the second signals. The last equality is followed from the integration by part. Therefore, using Eq. (B.3), we get U i (x i ;x i ) = Z x i +(x i ) r Pr h z max j2S(x) x j + j +(x j ) i dz = Z x i v E h q i v i;0 =z; v i;0 =x i i dz: Proof. Proof of Lemma B.1.8 Throughout the proof, all the expectations are with respect to the second signals. Consider an untruthful agenti with initial valuationx i who bids ^ x i in the first round. We establish an upper bound on his utility. We consider the following two cases,s i = 1 ands i = 0. s i = 1: When agenti is selected,s i = 1, he can either obtain information or not. Given his investing decisione i =e i (v i;0 =x i ;b i;0 = ^ x i ;t i ), by Lemma B.1.5, his utility can be written as U i (x i ; ^ x i ) = E h (x i +e i i )p i t i e i c i v 0 =x; b i;0 = ^ x i ; b i;0 =x i i (B.18) = E [maxfx i +(^ x i ) +e i i ! i ; 0gt i e i c i ] ; where! i is defined in Eq. (B.8), and the expectation is taken assuming the all agents except for agenti are truthful. Note that for abbreviation, we omit the condition in the second equation and in the rest of the 157 proof. Then, by adding and subtracting E [maxf^ x i +(^ x i ) +e i i ! i ; 0g], the utility can be rewritten as U i (x i ; ^ x i ) = E h maxf^ x i +(^ x i ) +e i i ! i ; 0gt i e i c i maxf^ x i +(^ x i ) +e i i ! i ; 0g maxfx i +(^ x i ) +e i i ! i ; 0g i = E h maxf^ x i +(^ x i ) +e i i ! i ; 0gt i e i c i i + Z x i ^ x i Pr [z +(^ x i ) +e i i ! i ]dz Whene i = 1 the first term in the last line isU i (^ x i ; ^ x i ). 
Otherwise, it is the utility of selected agenti with initial ^ x i who bids truthfully, gets selected, but does not learn his second signal, which is by Lemma B.1.6 is less than or equal toU i (^ x i ; ^ x i ). Thus, the utility is at mostU i (^ x i ; ^ x i ) + R x i ^ x i Pr [z +(^ x i ) +e i i ! i ]dz, which is the desired result. s i = 0: Note that s i = 0 means agent i is not selected. Then, if ^ x i + (^ x i ) < max n max j= 2S ; (^ x i ;x i ) fx j +(x j )g; o , his utility is zero. If not, the utility of agent i given that he stays in the game can be written as E h q i x i +(^ x i ) max ! i ;r i : By adding and subtracting E [q i ^ x i ], and by the fact that agenti receives the item if ^ x i +(^ x i ) is greater than! i , we have E max (^ x i +(^ x i ) max ! i ;r ; 0 + (x i ^ x i ) 1 l ^ x i +(^ x i )! i : The first term isU i (^ x i ; ^ x i ). SinceU i (^ x i ; ^ x i ) 0 and agenti can exit the game if his utility gets negative, he can at most yield maxfU i (^ x i ; ^ x i )+ R x i ^ x i Pr ^ x i +(^ x i )! i dz; 0g, which is less than maxfU i (^ x i ; ^ x i )+ R x i ^ x i Pr z +(^ x i )! i dz; 0g. B.4.4 Proofs from Section B.1.2 Proof. Proof of Lemma B.1.10 We establish the following two claims. 158 Claim 1: For any setSf1; 2;:::;ng ; ((x i ;x i );S) ; ((^ x i ;x i );S) = Z x i ^ x i (1 + 0 (z))E h q i v i;0 =z; v i;0 =x i ; S i dz: Claim 2: Suppose Assumption 4 holds. Then weighted surplus is an absolutely continuous and convex function ofv i;0 +(v i;0 ), and it is given by ; (x) = ; ((^ x i ;x i )) + R x i ^ x i (1 + 0 (z)) E h q i v i;0 =z; v i;0 =x i i dz: The proof of claims follows from Theorems 1 and 2 in Milgrom & Segal (2002). Thus, we do not repeat it here. 
By Claims 1 and 2 and the fact that $E[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i},\ \hat S_{y_1,y_2}(z,x_{-i})] = E[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i}]$ for $z<y_1$ and $z>y_2$, we have
$$\Gamma_{\beta,\rho}(x,\hat S_{y_1,y_2}(x)) = \Gamma_{\beta,\rho}\big((\hat x_i,x_{-i}),\hat S_{y_1,y_2}(\hat x_i,x_{-i})\big) + \int_{\hat x_i}^{y_1}\big(1+\beta'(z)\big)E[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i}]\,dz + \int_{y_1}^{y_2}\big(1+\beta'(z)\big)E[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i},\ \hat S_{y_1,y_2}(z,x_{-i})]\,dz + \int_{y_2}^{x_i}\big(1+\beta'(z)\big)E[q_i \mid v_{i,0}=z,\ v_{-i,0}=x_{-i}]\,dz.$$
Then, the result follows from Claim 2 and the fact that, by construction, $\Gamma_{\beta,\rho}(\hat x_i,x_{-i}) = \Gamma_{\beta,\rho}\big((\hat x_i,x_{-i}),\hat S_{y_1,y_2}(\hat x_i,x_{-i})\big)$.

B.5 Numerical Experiments

In Section B.5.1, we depict the initial payments. Section B.5.2 compares the SSP mechanism with the optimal mechanism in terms of the revenue of the seller. In Section B.5.3, we study the impact of the number of agents, $n$.

B.5.1 Payments in the First Round

Recall that the initial payment $t_i$ incentivizes agents to be truthful. In this section, we investigate how much the OPT and EFF mechanisms charge each agent $i$ upfront for different realizations of the initial valuations. As usual, $n = 2$, $F = N(0.5, 0.5)$, $G_i = N(0, 0.5)$, and $c_i = 0.05$ for $i = 1, 2$.

The initial payment of the first agent, $t_1$, in the OPT and EFF mechanisms for all realizations of $v_{1,0}$ and $v_{2,0}$ in the range $[-1.5, 2.5]$ is shown in Figures B.1 and B.2, respectively. The x-axis is $v_{2,0}$, and the y-axis is $v_{1,0}$. Different shades of gray correspond to different initial payments, as defined in the color bars next to the figures. By construction, the initial payment of the first agent is zero if he is not selected. Furthermore, when he is selected, $t_1$ is an increasing function of $v_{1,0}$.

Figure B.1: The payment of agent 1 in the first round, $t_1$, in the optimal mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n = 2$, $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.

B.5.2 The SSP Mechanism versus the Optimal Mechanism

In this section, we seek to understand how the SSP mechanism performs in comparison with mechanism $M^{\mathrm{OPT}}$.
To this end, we report the revenue of the SSP mechanism under four problem classes, corresponding to costs of 0.02 and 0.05 and to 2 and 3 agents. Here, $F = N(0.5, 0.5)$ and $G_i = N(0, \sigma^2)$, where $\sigma^2 \in \{0.5, 1, 1.5, 2\}$. For each problem class, Table B.1 presents the revenue of the SSP mechanism with the revenue-maximizing $r$ as a percentage of the optimal revenue, averaged over 2000 instances. We observe that the SSP mechanism performs better as the number of agents gets larger, as $\sigma^2$ becomes smaller, and as the additional information gets more costly. In addition, the SSP mechanism yields at least 84% of the optimal revenue in every problem class.

Figure B.2: The payment of agent 1 in the first round, $t_1$, in the efficient mechanism for different realizations of $v_{1,0}$ and $v_{2,0}$ with $n = 2$, $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$.

Problem class      Revenue (% of optimal) under G_i = N(0, σ²)
n    cost      σ² = 0.5    σ² = 1    σ² = 1.5    σ² = 2
2    0.02      94          90        87          84
2    0.05      95          92        90          87
3    0.02      95          93        89          88
3    0.05      96          94        92          90

Table B.1: Revenue of the SSP mechanism (with revenue-maximizing $r$) as a percentage of the optimal revenue with $F = N(0.5, 0.5)$. The standard errors of all numbers are less than 1%.

B.5.3 More Agents

In this section, we investigate how the number of agents affects the outcome of the OPT and EFF mechanisms. Again, $F = N(0.5, 0.5)$, $G_i = N(0, 0.5)$, and $c_i = 0.05$. Figures B.3, B.4, and B.5 show the revenue, social welfare, and average number of selected agents versus the number of agents, $n$. As the number of agents increases, the revenue, the social welfare, and the average number of selected agents rise in all considered mechanisms. However, even in the EFF mechanism, the average number of selected agents is sub-linear (concave) in $n$.

Figure B.3: Revenue in the optimal and efficient mechanisms versus the number of agents, $n$, with $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$. [Plot omitted: revenue (y-axis) against number of agents from 2 to 5 (x-axis) for OPT and EFF.]
Figure B.4: Social welfare in the optimal and efficient mechanisms versus the number of agents, $n$, with $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$. [Plot omitted: welfare (y-axis) against number of agents from 2 to 5 (x-axis) for OPT and EFF.]

Figure B.5: The average number of selected agents of the optimal and efficient mechanisms versus the number of agents, $n$, with $c = 0.05$, $F = N(0.5, 0.5)$, and $G_i = N(0, 0.5)$. [Plot omitted: average number of selected agents (y-axis) against number of agents from 2 to 5 (x-axis) for OPT and EFF.]

Appendix C
Technical Appendix to Chapter 3

C.1 Proof of Main Results

C.1.1 Appendix to Section 3.2

Proof of Lemma 3.2.1. The proof falls naturally into two parts. In the first part, we show that in an incentive compatible mechanism the conditions in Equations (3.1) and (3.2) hold. In the second part, we show that if Equations (3.1) and (3.2) hold, the mechanism is incentive compatible.

First Part: Consider a customer with type $\theta$ that reports $\hat\theta$. Without loss of generality, we assume that $\theta \ge \hat\theta$. Then, the utility of the customer is given by $u(\theta,\hat\theta) = V(\theta,t(\hat\theta)) - p(\hat\theta)$. Incentive compatibility implies that
$$u(\theta,\theta)-u(\hat\theta,\hat\theta) \le u(\theta,\theta)-u(\hat\theta,\theta) = V(\theta,t(\theta))-V(\hat\theta,t(\theta)) = \int_{z=\hat\theta}^{\theta}\partial_1 V(z,t(\theta))\,dz, \tag{C.1}$$
and
$$u(\theta,\theta)-u(\hat\theta,\hat\theta) \ge u(\theta,\hat\theta)-u(\hat\theta,\hat\theta) = V(\theta,t(\hat\theta))-V(\hat\theta,t(\hat\theta)) = \int_{z=\hat\theta}^{\theta}\partial_1 V(z,t(\hat\theta))\,dz, \tag{C.2}$$
where $\partial_1 V(\theta,t) = \frac{\partial V(\theta,t)}{\partial\theta}$. Then, using the above equations, we have
$$\frac{V(\theta,t(\hat\theta))-V(\hat\theta,t(\hat\theta))}{\theta-\hat\theta} \le \frac{u(\theta,\theta)-u(\hat\theta,\hat\theta)}{\theta-\hat\theta}, \qquad \frac{V(\theta,t(\theta))-V(\hat\theta,t(\theta))}{\theta-\hat\theta} \ge \frac{u(\theta,\theta)-u(\hat\theta,\hat\theta)}{\theta-\hat\theta}.$$
Finally, by taking the limit as $\hat\theta \to \theta$, we get Eq. (3.1). Then, by Equations (3.1), (C.1), and (C.2), we get the second condition, given in Eq. (3.2).

Second Part: Here, we show that if Equations (3.1) and (3.2) hold in a mechanism, the mechanism is incentive compatible. By Eq. (3.1),
$$u(\theta,\theta)-u(\hat\theta,\hat\theta) = \int_{z=\hat\theta}^{\theta}\partial_1 V(z,t(z))\,dz \ge \int_{z=\hat\theta}^{\theta}\partial_1 V(z,t(\hat\theta))\,dz = V(\theta,t(\hat\theta))-V(\hat\theta,t(\hat\theta)) = u(\theta,\hat\theta)-u(\hat\theta,\hat\theta), \tag{C.3}$$
where the inequality follows from Eq. (3.2). The final equation implies that $u(\theta,\theta) \ge u(\theta,\hat\theta)$.
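Similarly,
$$u(\theta,\theta)-u(\hat\theta,\hat\theta) = \int_{z=\hat\theta}^{\theta}\partial_1 V(z,t(z))\,dz \le \int_{z=\hat\theta}^{\theta}\partial_1 V(z,t(\theta))\,dz = V(\theta,t(\theta))-V(\hat\theta,t(\theta)) = u(\theta,\theta)-u(\hat\theta,\theta).$$
That is, $u(\hat\theta,\hat\theta) \ge u(\hat\theta,\theta)$. The above equation along with Eq. (C.3) implies that the mechanism is incentive compatible.

Proof of Lemma 3.2.2. Consider any incentive compatible mechanism. Then, the expected revenue of the firm from selling one unit of the item is given by
$$E[p(\theta)-h\,t(\theta)-c] = E[V(\theta,t(\theta))-u(\theta,\theta)-h\,t(\theta)-c], \tag{C.4}$$
where the expectation is with respect to the customer type. In the following, we compute $E[u(\theta,\theta)]$. By Lemma 3.2.1,
$$\begin{aligned}
E[u(\theta,\theta)] &= u(\underline\theta,\underline\theta) + \int_{\theta=\underline\theta}^{\bar\theta} dF(\theta)\int_{z=\underline\theta}^{\theta}\partial_1 V(z,t(z))\,dz \\
&= u(\underline\theta,\underline\theta) + \int_{z=\underline\theta}^{\bar\theta}\int_{\theta=z}^{\bar\theta} dF(\theta)\,\partial_1 V(z,t(z))\,dz \\
&= u(\underline\theta,\underline\theta) + \int_{z=\underline\theta}^{\bar\theta}(1-F(z))\,\partial_1 V(z,t(z))\,dz \\
&= u(\underline\theta,\underline\theta) + E\!\left[\frac{1-F(\theta)}{f(\theta)}\,\partial_1 V(\theta,t(\theta))\right].
\end{aligned} \tag{C.5}$$
By replacing Eq. (C.5) in Eq. (C.4), we get the desired result.

C.1.2 Lower Bound on the Revenue Gain of the Dynamic Pricing Policy

Here, we compare the revenue of the optimal mechanism given in Theorem 3.3.2 with that of the optimal FP policy. Recall that in the FP policy, the firm only sells to customers with type $\theta \ge \theta_L$ at time zero by posting a fixed price of $\theta_L$.

Lemma C.1.1 (Lower Bound on the Revenue Gain of DP). Let $R_f$ and $R_{opt}$ be the expected revenue of the firm under the FP policy and the optimal DP policy, respectively. Then,
$$\frac{R_{opt}-R_f}{R_f} \ge \frac{e^{-1}E[\theta\,\mathbb{1}\{\theta\le\theta_L\}]}{\theta_L(1-F(\theta_L))}.$$
The proof of Lemma C.1.1 is given at the end of this section. Assume that the customer type is drawn from the uniform distribution on $[0,1]$, that is, $U(0,1)$. Then, $\theta_L = 0.5$ and $R_f = \frac{1}{4}$. Lemma C.1.1 implies that the firm can increase its revenue by at least $100 \times \frac{e^{-1}E[\theta\,\mathbb{1}\{\theta\le\theta_L\}]}{\theta_L(1-F(\theta_L))} \approx 18\%$ by using DP.

Proof of Lemma C.1.1. By Lemma 3.2.2, under the FP policy,
$$R_f = E[V(\theta,t(\theta)) + \alpha(\theta)\,\partial_1 V(\theta,t(\theta)) - u(\underline\theta,\underline\theta)] = E\big[e^{-\theta t(\theta)}\big(\theta + \alpha(\theta)(1-\theta t(\theta))\big)\big] = E[(\theta+\alpha(\theta))\,\mathbb{1}\{\theta\ge\theta_L\}], \tag{C.6}$$
with $\alpha(\theta) = -\frac{1-F(\theta)}{f(\theta)}$ as implied by Eq. (C.5), where the last equality holds because in the FP policy, $t(\theta) = 0$ for $\theta \ge \theta_L$.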
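As a sanity check on Eq. (C.6) and on the bound of Lemma C.1.1, the uniform example can be reproduced numerically. The sketch below is only an illustration: it assumes types drawn from $U(0,1)$, writes the virtual-value term as `alpha(x) = -(1 - F(x)) / f(x)`, and uses a hand-rolled midpoint rule (`integrate` is a helper introduced here, not something from the thesis).

```python
import math

def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (k + 0.5) * step) for k in range(n))

# Assumed setting: theta ~ Uniform(0, 1), so F(x) = x and f(x) = 1.
F = lambda x: x
f = lambda x: 1.0
alpha = lambda x: -(1.0 - F(x)) / f(x)  # virtual-value term (assumed name)

theta_L = 0.5  # root of theta + alpha(theta) = 2*theta - 1

# Fixed-price revenue as in Eq. (C.6): E[(theta + alpha(theta)) 1{theta >= theta_L}].
R_f = integrate(lambda x: (x + alpha(x)) * f(x), theta_L, 1.0)

# Posted-price form of the same quantity: theta_L * (1 - F(theta_L)).
posted_price_revenue = theta_L * (1.0 - F(theta_L))

# Lower bound on the relative gain: e^{-1} E[theta 1{theta <= theta_L}] / R_f.
gain_bound = math.exp(-1) * integrate(lambda x: x * f(x), 0.0, theta_L) / posted_price_revenue

print(f"R_f = {R_f:.4f}, posted-price form = {posted_price_revenue:.4f}")
print(f"lower bound on relative gain = {100 * gain_bound:.1f}%")
```

Both expressions for the fixed-price revenue agree at $1/4$, and the bound evaluates to $e^{-1}/2$, in line with the 18% figure quoted for the uniform case.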
Similarly, under the mechanism described in Theorem 3.3.2, we have
$$\begin{aligned}
R_{opt} &= E\big[e^{-\theta t_R(\theta)}\big(\theta + \alpha(\theta)(1-\theta t_R(\theta))\big)\big] \\
&= E\big[(\theta+\alpha(\theta))\,\mathbb{1}\{\theta\ge\theta_H\} + e^{-\theta t_R(\theta)}\big(\theta + \alpha(\theta)(1-\theta t_R(\theta))\big)\,\mathbb{1}\{\theta\in(\theta_L,\theta_H)\} + \theta e^{-1}\,\mathbb{1}\{\theta\le\theta_L\}\big],
\end{aligned} \tag{C.7}$$
where the first equation holds because the time of purchase in the optimal DP policy is $t_R(\theta)$, given in Eq. (3.6). Then, considering the fact that for $\theta\in(\theta_L,\theta_H)$, $t_R(\theta)$ is the FOC solution, i.e., $t_R(\theta) = \arg\max_{t\ge 0}\{R(\theta,t)\}$, we get
$$R_{opt} \ge E\big[(\theta+\alpha(\theta))\,\mathbb{1}\{\theta\ge\theta_L\} + \theta e^{-1}\,\mathbb{1}\{\theta\le\theta_L\}\big].$$
By the above equation and Eq. (C.6), we get $R_{opt}-R_f \ge e^{-1}E[\theta\,\mathbb{1}\{\theta\le\theta_L\}]$. Then the result follows because $R_f = \theta_L(1-F(\theta_L))$.

C.1.3 Proof of Theorem 3.4.4

We have divided the proof into a sequence of lemmas. Lemma C.1.2 characterizes the optimal mechanism when the holding cost is low. Lemmas C.1.3 and C.1.4 characterize the optimal mechanism when the holding cost is medium and high, respectively.

Lemma C.1.2 (Low Holding Cost). If Assumption 3 holds, the valuation function is $V(\theta,t) = \theta e^{-\theta t}$, and the holding cost $h \le H_l$, then the optimal mechanism sells to the customer of type $\theta \ge \max\{\theta_L,\underline\theta\}$ at time $t_h(\theta)$ and at price $p(\theta) = V(\theta,t_h(\theta)) - \int_{\max\{\theta_L,\underline\theta\}}^{\theta} e^{-t_h(z)z}(1-t_h(z)z)\,dz$, where $H_l$, $t_h(\theta)$, and $\theta_L$ are defined in Equations (3.7) and (3.8). For $\theta < \max\{\theta_L,\underline\theta\}$, $p(\theta) = \infty$.

The proof of Lemma C.1.2 is given in Section C.1.4.

Lemma C.1.3 (Medium Holding Cost). If Assumption 3 holds, the valuation function is $V(\theta,t) = \theta e^{-\theta t}$, the holding cost $h \in [H_l,H_h]$, and $R(\theta,t_f(\theta)) - h\,t_f(\theta) = 0$ has a unique solution, then the optimal mechanism sells to the customer of type $\theta \ge \theta_M$ at time $t_h(\theta)$ and at price $p(\theta) = V(\theta,t_h(\theta)) - \int_{\theta_M}^{\theta} e^{-t_h(z)z}(1-t_h(z)z)\,dz$, where $H_l$, $H_h$, $\theta_M$, $t_h(\theta)$, and the FOC solution $t_f(\theta)$ are defined in Equations (3.7) and (3.8). For $\theta < \theta_M$, $p(\theta) = \infty$.

The assumption in Lemma C.1.3 is discussed in Section C.1.8, and the proof of Lemma C.1.3 is provided in Section C.1.5.

Lemma C.1.4 (High Holding Cost).
If Assumption 3 holds, the valuation function is $V(\theta,t) = \theta e^{-\theta t}$, the holding cost $h \ge H_h$, and $R(\theta,t_f(\theta)) - H_h\,t_f(\theta) = 0$ has a unique solution, then the optimal mechanism sells to customers with type $\theta \ge \theta_L$ at time zero and at price $p(\theta) = \theta_L$, where $\theta_L$ solves $\theta_L + \alpha(\theta_L) = 0$ and $H_l$, $H_h$, and the FOC solution $t_f(\theta)$ are defined in Eq. (3.7). For customers with type $\theta < \theta_L$, $p(\theta) = \infty$.

The proof is given in Section C.1.6.

C.1.4 Optimal Mechanism for a Low Holding Cost

In this section, we present the proof of Lemma C.1.2. Throughout the proof, for convenience, we assume that $\underline\theta \le \theta_L$. We need to show that the time of allocation in the optimal mechanism is given by
$$t^*(\theta) := \begin{cases} t_h(\theta) & \text{if } \theta \ge \max\{\theta_L,\underline\theta\}, \\ \infty & \text{if } \theta < \max\{\theta_L,\underline\theta\} \end{cases} \;=\; \begin{cases} 0 & \text{if } \theta \ge \theta^h_H, \\ t_f(\theta) & \text{if } \theta \in [\theta^h_L,\theta^h_H], \\ \frac{1}{\theta} & \text{if } \theta \in [\theta_L,\theta^h_L], \\ \infty & \text{if } \theta < \theta_L. \end{cases} \tag{C.8}$$
Note that $t^*(\theta) = \infty$ implies that the mechanism does not allocate the item to customers with type $\theta$.

To characterize the optimal mechanism, by Lemma 3.2.2, we need to solve the optimization Problem OPT-H. That is, we need to maximize the expected virtual revenue subject to the IR and IC constraints. Lemma 3.2.1 shows that a mechanism is IC if and only if the interval and envelope conditions hold. This implies that one can replace the IC constraints with these two conditions. However, as we noted in Section 3.3, characterizing the optimal mechanism that satisfies both conditions is rather complicated. Thus, in the following, we relax Problem OPT-H and only consider the IR and envelope conditions. We then show that the solution of the relaxed problem also satisfies the interval condition. Thus, it is optimal.

The relaxed problem can be formulated as follows.
$$\begin{aligned}
\max_{\{t(\theta):\,\theta\in[\underline\theta,\bar\theta]\},\ u(\underline\theta,\underline\theta)\ge 0}\quad & E\big[\max\{R(\theta,t(\theta)) - h\,t(\theta),\ 0\}\big] \\
\text{s.t.}\quad & u(\theta,\theta) = u(\underline\theta,\underline\theta) + \int_{\underline\theta}^{\theta} e^{-t(z)z}(1-t(z)z)\,dz \ge 0 \quad \text{for } \theta\in[\underline\theta,\bar\theta], \quad \text{(IR)}
\end{aligned} \tag{OPT-H-R}$$
where the maximization is taken over the time of sales $t(\cdot)$ and the utility of a customer with type $\underline\theta$, i.e., $u(\underline\theta,\underline\theta)$. Here, $R(\theta,t)$ is the virtual value of a customer of type $\theta$ at time $t$, and is defined in Eq. (B.14).
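To make the objective of the relaxed problem concrete, the following sketch evaluates $R(\theta,t) - h\,t$ for a single type and locates its maximizer over $t \ge 0$ by grid search. It is an illustration under stated assumptions, not the thesis's procedure: types are taken to be $U(0,1)$, the virtual value is taken in the form $R(\theta,t) = e^{-\theta t}(\theta + \alpha(\theta)(1-\theta t))$ with $\alpha(\theta) = -(1-F(\theta))/f(\theta)$, and the function names, the type $\theta = 0.6$, and the cost $h = 0.05$ are hypothetical choices. For $h = 0$, setting the derivative of this $R$ with respect to $t$ to zero gives $t = (\theta + 2\alpha(\theta))/(\theta\,\alpha(\theta))$, which the grid search is compared against.

```python
import math

def alpha(theta):
    """Assumed virtual-value term for Uniform(0, 1): -(1 - F)/f = theta - 1."""
    return theta - 1.0

def virtual_revenue(theta, t):
    """Assumed form R(theta, t) = e^{-theta t} (theta + alpha(theta) (1 - theta t))."""
    return math.exp(-theta * t) * (theta + alpha(theta) * (1.0 - theta * t))

def best_time(theta, h, t_max=10.0, steps=100000):
    """Grid search for the t in [0, t_max] maximizing R(theta, t) - h * t."""
    best_t, best_val = 0.0, virtual_revenue(theta, 0.0)
    for k in range(1, steps + 1):
        t = t_max * k / steps
        val = virtual_revenue(theta, t) - h * t
        if val > best_val:
            best_t, best_val = t, val
    return best_t, best_val

theta = 0.6  # hypothetical type

# No holding cost: compare with the first-order condition of the assumed R.
t_no_cost, _ = best_time(theta, h=0.0)
t_foc = (theta + 2.0 * alpha(theta)) / (theta * alpha(theta))

# Positive holding cost (hypothetical h): the sale is moved earlier.
t_with_cost, value_with_cost = best_time(theta, h=0.05)

print(f"h = 0:    grid maximizer {t_no_cost:.4f}, FOC value {t_foc:.4f}")
print(f"h = 0.05: grid maximizer {t_with_cost:.4f}, objective {value_with_cost:.4f}")
```

In this example the objective is single-peaked in $t$, and raising the holding cost pulls the maximizer toward zero, which is the qualitative behavior the case distinction in Eq. (C.8) encodes.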
The following lemma characterizes the optimal solution of the relaxed problem. This lemma is the main technical contribution of this section.

Lemma C.1.5. If Assumption 3 holds and the holding cost $h \le H_l$, then in an optimal solution of Problem OPT-H-R, the time of sales is $t^*(\theta)$ and $u(\underline\theta,\underline\theta) = 0$, where $t^*(\theta)$ is defined in Eq. (C.8).

The proof is provided in Section C.1.4. In the proof, we first show that $t^*(\theta)$ is a feasible solution of the relaxed problem. Then, we show that it is optimal.

To verify that $t^*(\theta)$ is an optimal solution of Problem OPT-H, we show that the interval condition specified in Lemma 3.2.1 is fulfilled. That is, for any $\hat\theta,\theta\in[\underline\theta,\bar\theta]$ such that $\theta \ge \hat\theta$,
$$\int_{\hat\theta}^{\theta} A(z,t^*(\hat\theta))\,dz \le \int_{\hat\theta}^{\theta} A(z,t^*(z))\,dz, \qquad \int_{\hat\theta}^{\theta} A(z,t^*(z))\,dz \le \int_{\hat\theta}^{\theta} A(z,t^*(\theta))\,dz,$$
where $A(z,t) = \partial_1 V(z,t) = e^{-tz}(1-tz)$ and $\int_{\hat\theta}^{\theta} A(z,t^*(z))\,dz = u(\theta,\theta)-u(\hat\theta,\hat\theta)$. To this aim, we show that for any $z \ge \hat\theta$, $A(z,t^*(\hat\theta)) \le A(z,t^*(z))$, and for any $z \le \theta$, $A(z,t^*(z)) \le A(z,t^*(\theta))$. We will make use of the following preliminary result.

Lemma C.1.6. The FOC solution $t_f(\theta)$, defined in Eq. (3.7), is a decreasing function of $\theta$ as long as $R(\theta,t_f(\theta)) - h\,t_f(\theta) \ge 0$. In addition, for any $\theta\in[\theta_L,\bar\theta]$, $0 \le A(\theta,t_h(\theta)) \le 1$, where $A(z,t) = e^{-tz}(1-tz)$.

Unless stated otherwise, the proofs of all technical lemmas are given in Section C.2.

Later, in Lemma C.1.12, we will show that $R(\theta,t_f(\theta)) - h\,t_f(\theta) \ge 0$ for any $\theta\in[\theta^h_L,\theta^h_H]$. This result and the result in Lemma C.1.6 imply that $t_h(\theta) = t_f(\theta)$ is decreasing for any $\theta\in[\theta^h_L,\theta^h_H]$. Then, considering the fact that $t_h(\theta) = 0$ for $\theta \ge \theta^h_H$, $t_h(\theta) = \frac{1}{\theta}$ for $\theta\in[\theta_L,\theta^h_L]$, $t_h(\theta^h_H) = 0$, and $t_h(\theta^h_L) = \frac{1}{\theta^h_L}$, we can conclude that $t_h(\theta)$ is decreasing in $\theta \ge \theta_L$.

Now, we are ready to show that the interval conditions are satisfied. We first note that when $\hat\theta < \theta_L$, it is easy to show that for any $z \ge \hat\theta$, $A(z,t^*(\hat\theta)) \le A(z,t^*(z))$. This holds because $A(z,t^*(\hat\theta)) = 0$ and, as shown in Lemma C.1.6, $A(z,t^*(z)) \ge 0$. Also, when $\theta < \theta_L$, we have $A(z,t^*(z)) \le A(z,t^*(\theta))$ for any $z \le \theta$. This follows from the fact that both $A(z,t^*(z))$ and $A(z,t^*(\theta))$ are zero.
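The interval conditions can also be checked numerically for a given schedule. The sketch below is a minimal illustration, not the mechanism of Lemma C.1.2: it assumes $V(\theta,t) = \theta e^{-\theta t}$, and it uses a hypothetical decreasing schedule $t(\theta) = (1-\theta)/\theta$ on the type range $[0.2, 1]$, chosen only because it satisfies $1 - t(\theta)\theta \ge 0$. Prices are built from the envelope condition with zero utility for the lowest type, and the code then searches a grid of (true type, reported type) pairs for a profitable misreport.

```python
import math

def V(theta, t):
    """Assumed valuation: V(theta, t) = theta * exp(-theta * t)."""
    return theta * math.exp(-theta * t)

def A(z, t):
    """Partial derivative of V in its first argument: e^{-tz} (1 - tz)."""
    return math.exp(-t * z) * (1.0 - t * z)

def t_sched(theta):
    """Hypothetical decreasing schedule with 1 - t(theta) * theta = theta >= 0."""
    return (1.0 - theta) / theta

lo, hi, n = 0.2, 1.0, 400  # hypothetical type range and grid size
grid = [lo + (hi - lo) * k / n for k in range(n + 1)]

# Envelope condition: u(theta) = integral of A(z, t(z)) from lo to theta, u(lo) = 0.
u = [0.0]
for a, b in zip(grid, grid[1:]):
    u.append(u[-1] + 0.5 * (b - a) * (A(a, t_sched(a)) + A(b, t_sched(b))))

# Price that implements the envelope utility: p(theta) = V(theta, t(theta)) - u(theta).
price = [V(th, t_sched(th)) - ui for th, ui in zip(grid, u)]

# Largest gain from reporting grid[j] when the true type is grid[i].
worst_gain = max(
    V(th, t_sched(rep)) - price[j] - u[i]
    for i, th in enumerate(grid)
    for j, rep in enumerate(grid)
)
print(f"largest gain from misreporting on the grid: {worst_gain:.2e}")
```

For this schedule no misreport improves on truthful reporting (the largest gain is zero up to rounding, attained at truthful reports), which matches the argument that a decreasing schedule with $A(z,t(z)) \ge 0$ satisfies both interval inequalities.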
Next, we assume that both $\theta$ and $\hat\theta$ are greater than $\theta_L$. Recall that for $\theta \ge \theta_L$, $t^*(\theta) = t_h(\theta)$. We start by showing that $A(z,t_h(\hat\theta)) \le A(z,t_h(z))$ for $z \ge \hat\theta$. We consider two cases: (1) $1-t_h(\hat\theta)z \le 0$ and (2) $1-t_h(\hat\theta)z > 0$.

Assume that $1-t_h(\hat\theta)z \le 0$. Then, we have
$$e^{-z t_h(\hat\theta)}(1-t_h(\hat\theta)z) \le 0 \le e^{-z t_h(z)}(1-t_h(z)z),$$
where the second inequality follows from Lemma C.1.6, where we show that $A(z,t_h(z)) = e^{-z t_h(z)}(1-t_h(z)z) \ge 0$. By the above equation, we get $A(z,t_h(\hat\theta)) \le A(z,t_h(z))$.

Now, assume that $1-t_h(\hat\theta)z > 0$. Then, considering the fact that $t_h(\theta)$ is decreasing, for any $z \ge \hat\theta$, we have $1-t_h(z)z \ge 1-t_h(\hat\theta)z$ and $e^{-t_h(z)z} \ge e^{-t_h(\hat\theta)z}$. By multiplying these two inequalities, we get $A(z,t_h(\hat\theta)) \le A(z,t_h(z))$.

Next, we verify that $A(z,t_h(z)) \le A(z,t_h(\theta))$. Given that $t_h(\theta)$ is decreasing, for any $z \le \theta$, we have
$$0 \le 1-t_h(z)z \le 1-t_h(\theta)z, \qquad \text{and} \qquad e^{-t_h(z)z} \le e^{-t_h(\theta)z},$$
where the first inequality follows from Lemma C.1.6, where we show that $A(z,t_h(z)) = e^{-z t_h(z)}(1-t_h(z)z) \ge 0$. By multiplying these two inequalities, we have $A(z,t_h(z)) \le A(z,t_h(\theta))$.

Proof of Lemma C.1.5

Here, with some abuse of notation, we denote $t^*(\theta)$ by $t_h(\theta)$. Recall that $t^*(\theta) = t_h(\theta)$ when $\theta \ge \theta_L$ and is $\infty$ otherwise. Also, for simplicity, we denote $u(\theta,\theta)$ by $u(\theta)$.

The proof has two parts. In the first part, we show that the solution given in Lemma C.1.5 is a feasible solution of Problem OPT-H-R. In the second part, we verify that this solution is an optimal solution of this problem.

Feasibility: To show that $t_h(\theta)$ is a feasible solution of Problem OPT-H-R, we verify that $u(\theta) \ge 0$ for any $\theta\in[\underline\theta,\bar\theta]$. For any $\theta \le \theta^h_L$, it is easy to verify that $u(\theta) = u(\underline\theta) = 0$. Thus, we only need to show that $u(\theta) \ge 0$ for any $\theta \ge \theta^h_L$. To prove this, we make use of Lemma C.1.6, where we show that $e^{-t_h(\theta)\theta}(1-t_h(\theta)\theta) \ge 0$. This implies that $u(\theta) = \int_{\underline\theta}^{\theta} e^{-t_h(z)z}(1-t_h(z)z)\,dz \ge 0$.

Optimality: Here, we show that the solution given in Lemma C.1.5 is an optimal solution of Problem OPT-H-R.
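To this end, we find an upper bound for the optimal value of Problem OPT-H-R by dualizing the IR constraints. Then, we show that the solution given in Lemma C.1.5 achieves the upper bound and thus is optimal.

Upper Bound of OPT-H-R: For any time of sales $t(\cdot)$ and Lagrangian function $\lambda: [\underline\theta,\bar\theta] \to \mathbb{R}_+$, we define the following function.
$$L_h(t(\cdot),\lambda(\cdot),u(\underline\theta)) = E[R(\theta,t(\theta)) - h\,t(\theta) - u(\underline\theta)] + \int_{\underline\theta}^{\bar\theta}\lambda(z)u(z)\,dz,$$
where $u(z) = \int_{\underline\theta}^{z} e^{-\theta t(\theta)}(1-\theta t(\theta))\,d\theta + u(\underline\theta)$, and $R$ is defined in Eq. (B.14). Then, considering the fact that $\lambda(\theta) \ge 0$, for any $(t(\cdot),u(\underline\theta))$ such that $u(\theta) = u(\underline\theta) + \int_{\underline\theta}^{\theta} e^{-z t(z)}(1-z t(z))\,dz \ge 0$, we have
$$E[R(\theta,t(\theta)) - h\,t(\theta) - u(\underline\theta)] \le L_h(t(\cdot),\lambda(\cdot),u(\underline\theta)).$$
One can think of $\lambda(\theta)$ as a dual variable for the IR constraints. Therefore, for any $\lambda: [\underline\theta,\bar\theta] \to \mathbb{R}_+$,
$$\max_{(t(\cdot),u(\underline\theta))\in T}\{E[R(\theta,t(\theta)) - h\,t(\theta) - u(\underline\theta)]\} \le \max_{(t(\cdot),u(\underline\theta))\in T}\{L_h(t(\cdot),\lambda(\cdot),u(\underline\theta))\}, \tag{C.9}$$
where $T = \big\{(t(\cdot),u(\underline\theta)) : t(\theta) \ge 0,\ u(\underline\theta) + \int_{\underline\theta}^{\theta} e^{-z t(z)}(1-z t(z))\,dz \ge 0 \text{ for any } \theta\in[\underline\theta,\bar\theta]\big\}$ is the set of feasible solutions. In the following, we characterize an upper bound for $\max_{(t(\cdot),u(\underline\theta))\in T}\{E[R(\theta,t(\theta)) - h\,t(\theta) - u(\underline\theta)]\}$ by considering a specific Lagrangian function, defined below.
$$\lambda_h(\theta) = \begin{cases} 0 & \text{if } \theta > \theta^h_L, \\ \Big(f(\theta)\big(\theta + \alpha(\theta) + \frac{h}{\theta e^{-1}}\big)\Big)' & \text{if } \theta\in[\theta_L,\theta^h_L], \\ \big(f(\theta)(2\theta + \alpha(\theta))\big)' & \text{if } \theta\in[\underline\theta,\theta_L], \end{cases} \tag{C.10}$$
where $\big(f(\theta)(2\theta+\alpha(\theta))\big)'$ and $\big(f(\theta)(\theta+\alpha(\theta)+\frac{h}{\theta e^{-1}})\big)'$ are, respectively, the derivatives of $f(\theta)(2\theta+\alpha(\theta))$ and $f(\theta)(\theta+\alpha(\theta)+\frac{h}{\theta e^{-1}})$ with respect to $\theta$. The following lemma establishes that $\lambda_h(\theta) \ge 0$.

Lemma C.1.7. When $h \le H_l$, for any $\theta\in[\underline\theta,\bar\theta]$, $\lambda_h(\theta)$, defined in Eq. (C.10), is non-negative.

The following claim shows that $(t_h(\cdot),u(\underline\theta) = 0)$ is an optimal solution of Problem OPT-H-R.

Claim: With a slight abuse of notation, let
$$(t^*(\cdot),u^*) = \arg\max_{(t(\cdot),u(\underline\theta))\in T}\{L_h(t(\cdot),\lambda_h(\cdot),u(\underline\theta))\}.$$
Then, $t^*(\theta) = t_h(\theta)$ for any $\theta\in[\underline\theta,\bar\theta]$ and $u^* = 0$. Furthermore, $L_h(t_h(\cdot),\lambda_h(\cdot),u^*) = E[R(\theta,t_h(\theta)) - h\,t_h(\theta) - u^*]$.

Proof of the Claim: By definition, $\lambda_h(\theta) = 0$ for $\theta > \theta^h_L$. Thus, we get
$$L_h(t(\cdot),\lambda_h(\cdot),u(\underline\theta)) = \int_{z=\theta^h_L}^{\bar\theta}\big(R(z,t(z)) - h\,t(z)\big)f(z)\,dz + \int_{z=\underline\theta}^{\theta^h_L}\Big(\big(R(z,t(z)) - h\,t(z)\big)f(z) + \lambda_h(z)u(z)\Big)dz - u(\underline\theta). \tag{C.11}$$
From the definition of $\lambda_h(z)$, the last two terms of Eq.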
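(C.11) can be written as
$$\int_{\theta_L}^{\theta^h_L}\big(R(z,t(z)) - h\,t(z)\big)f(z)\,dz + \int_{\theta_L}^{\theta^h_L} u(z)\,d\Big(f(z)\big(z+\alpha(z)+\tfrac{h}{z e^{-1}}\big)\Big) + \int_{\underline\theta}^{\theta_L}\big(R(z,t(z)) - h\,t(z)\big)f(z)\,dz + \int_{\underline\theta}^{\theta_L} u(z)\,d\big(f(z)(2z+\alpha(z))\big) - u(\underline\theta). \tag{C.12}$$
We first focus on the first two terms, where $z\in[\theta_L,\theta^h_L]$. By integrating by parts and using the definition of $R$, the first two terms can be rewritten as
$$\int_{\theta_L}^{\theta^h_L}\Big(e^{-t(z)z}\big(z+\alpha(z)(1-t(z)z)\big) - h\,t(z)\Big)f(z)\,dz + u(z)f(z)\big(z+\alpha(z)+\tfrac{h}{z e^{-1}}\big)\Big|_{\theta_L}^{\theta^h_L} - \int_{\theta_L}^{\theta^h_L} e^{-t(z)z}\big(1-t(z)z\big)f(z)\big(z+\alpha(z)+\tfrac{h}{z e^{-1}}\big)\,dz.$$
In the above equation, we use the fact that $\frac{du(z)}{dz} = e^{-t(z)z}(1-t(z)z)$. Then, by the definition of $\theta^h_L$, i.e., the fact that $\theta^h_L + \alpha(\theta^h_L) + \frac{h}{\theta^h_L e^{-1}} = 0$, the above equation simplifies to
$$-u(\theta_L)f(\theta_L)\big(\theta_L+\alpha(\theta_L)+\tfrac{h}{\theta_L e^{-1}}\big) + \int_{\theta_L}^{\theta^h_L} f(z)\Big(e^{-t(z)z}z^2 t(z) - h\,t(z) - e^{-t(z)z}\big(1-t(z)z\big)\tfrac{h}{z e^{-1}}\Big)dz. \tag{C.13}$$
Now, we focus on the last three terms of Eq. (C.12). Again, by integrating by parts and using the definition of $R$, the last three terms of Eq. (C.12) can be rewritten as
$$\begin{aligned}
&\int_{\underline\theta}^{\theta_L} f(z)\Big(e^{-t(z)z}\big(z+\alpha(z)(1-t(z)z)\big) - h\,t(z)\Big)dz + u(z)f(z)(2z+\alpha(z))\Big|_{\underline\theta}^{\theta_L} - \int_{\underline\theta}^{\theta_L} e^{-t(z)z}\big(1-t(z)z\big)f(z)(2z+\alpha(z))\,dz - u(\underline\theta) \\
&\quad = u(\theta_L)f(\theta_L)(2\theta_L+\alpha(\theta_L)) + u(\underline\theta)\big(-1-f(\underline\theta)(2\underline\theta+\alpha(\underline\theta))\big) + \int_{\underline\theta}^{\theta_L} f(z)\Big(z e^{-t(z)z}\big(-1+2t(z)z\big) - h\,t(z)\Big)dz.
\end{aligned} \tag{C.14}$$
Note that the coefficient of $u(\underline\theta)$, i.e., $-1-f(\underline\theta)(2\underline\theta+\alpha(\underline\theta))$, can be simplified to $-2\underline\theta f(\underline\theta) \le 0$, because $f(\underline\theta)\alpha(\underline\theta) = -(1-F(\underline\theta)) = -1$. By plugging Equations (C.13) and (C.14) into Eq. (C.11), and by using the definition of $\theta_L$, we get
$$L_h(t(\cdot),\lambda_h(\cdot),u(\underline\theta)) = \int_{\theta^h_L}^{\bar\theta}\big(R(z,t(z)) - h\,t(z)\big)f(z)\,dz + \int_{\theta_L}^{\theta^h_L} f(z)\Big(e^{-t(z)z}z^2 t(z) - h\,t(z) - e^{-t(z)z}\big(1-t(z)z\big)\tfrac{h}{z e^{-1}}\Big)dz + \int_{\underline\theta}^{\theta_L} f(z)\Big(z e^{-t(z)z}\big(-1+2t(z)z\big) - h\,t(z)\Big)dz - 2\underline\theta f(\underline\theta)u(\underline\theta).$$
First of all, since the coefficient of $u(\underline\theta)$ is negative, to maximize the above expression we need to set $u(\underline\theta)$ to zero. That is, $u^* = 0$. Then, $\max_{(t(\cdot),u(\underline\theta))\in T}\{L_h(t(\cdot),\lambda_h(\cdot),u(\underline\theta))\}$ can be upper-bounded as follows.
$$\max_{(t(\cdot),u(\underline\theta))\in T}\{L_h(t(\cdot),\lambda_h(\cdot),u(\underline\theta)=0)\} \le \int_{\theta^h_L}^{\bar\theta} f(z)\max_{t\ge 0}\{R(z,t) - h\,t\}\,dz + \int_{\theta_L}^{\theta^h_L} f(z)\max_{t\ge 0}\Big\{e^{-tz}z^2 t - h\,t - e^{-tz}(1-tz)\tfrac{h}{z e^{-1}}\Big\}dz + \int_{\underline\theta}^{\theta_L} f(z)\max\Big\{\max_{t\ge 0}\big\{z e^{-tz}(-1+2tz) - h\,t\big\},\ 0\Big\}dz. \tag{C.15}$$
We take advantage of the following lemma to simplify the first term of the above equation.

Lemma C.1.8.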
If Assumption 3 holds and the holding cost h H l , then for any z h L , we have arg max t0 R(z;t)ht = t h (z). Note that the optimal solution characterized in Lemma C.1.8 is the maximum of the FOC solution and zero. We now simplify the second term of Eq. (C.15). It is easy to verify that for anyz2 [ L ; h L ], we have arg max t0 e tz z 2 thte tz 1tz h ze 1 = 1 z = t h (z): (C.16) Finally, the following lemma characterizes an optimal solution of the third term of Eq. (C.15). Lemma C.1.9. If Assumption 3 holds and the holding costhH l , for anyz L , we have max t0 n ze tz (1 + 2tz)ht o 0: Lemmas C.1.8, C.1.9, and Eq. (C.16) show thatt () = t h () andu = 0. Then, the proof is completed by observing thatL h (t h (); h (); 0) = E[R(;t h ())ht h ()]. 173 C.1.5 Optimal Mechanism for a Medium Holding Cost Here, we present the proof for Lemma C.1.3. We show that in the optimal solution of Problem OPT-H, the time of purchase is given by t () = 8 > < > : t h () if M ; 1 if< M = 8 > > > > > < > > > > > : 0 if h H ; t f () if2 [ M ; h H ]; 1 if< M ; (C.17) andu(;) = R M e zt h (z) (1zt h (z))dz. Here, t () =1 implies that customer with type does not purchase the item. The proof has three main steps. In the first step, we relax the problem by ignoring both IC and IR constraints and we find an allocation rule that maximizes the virtual revenue. Then, we show that the solution of this relaxed problem can construct a mechanism that satisfy the IR and envelope conditions. Finally, we show that the aforementioned solution also satisfies the interval conditions, as a result, it is optimal. Maximizing virtual revenue without IC and IR constraints: Consider that following optimization prob- lem. max ft()0:2[; ]g E max R(;t())ht(); 0 (OPT-H-1) The following lemma shows that t (), given in Eq. (C.17), is an optimal solution of Problem OPT- H-1. Lemma C.1.10. The optimal solution of Problem OPT-H-1 is given by t () where t () is defined in Eq. (C.17). 
The proof is similar to the proof of Lemma C.1.8; thus, it is omitted. The main idea of the proof is to show that $R(\theta,t) - ht$, as a function of $t$, has an inverted u-shape. Thus, it attains its maximum at $\max\{0, t_f(\theta)\}$, where $t_f(\theta)$ is the FOC solution. Note that to show Lemma C.1.10, we need the assumption that $\theta_M$, i.e., the solution of $R(\theta,t_f(\theta)) - h\,t_f(\theta) = 0$, is unique. By this assumption, we get
\[ \max_{t \ge 0} \{R(\theta,t) - ht\} = R(\theta,t_f(\theta)) - h\,t_f(\theta) < 0 \quad \text{for } \theta < \theta_M. \]
This implies that it is optimal not to allocate the item to customers with type $\theta < \theta_M$.

Maximizing virtual revenue with IR and envelope constraints: Here, we show that the time of purchase $t^*(\theta)$ is an optimal solution of Problem OPT-H-R. (Footnote 1: It is easy to observe that in an optimal solution of Problem OPT-H-R, we need to set $u(\underline{\theta},\underline{\theta})$ to zero.) To this aim, we verify that
\[ u(\theta,\theta) = \int_{\theta_M}^{\theta} e^{-t_h(z)z}\,(1 - t_h(z)z)\,dz \ge 0. \]
Particularly, we show that for any $\theta \ge \theta_M$, $(1 - \theta t_h(\theta)) \ge 0$. Since $t_h(\theta) = 0$ for $\theta \ge \theta^h_H$, it suffices to show that $(1 - \theta t_h(\theta)) \ge 0$ for any $\theta \in [\theta_M, \theta^h_H]$.

Lemma C.1.11. For any $h \in [H_l, H_h]$ and $\theta \in [\theta_M, \theta^h_H]$, we have $1 - \theta t_h(\theta) \ge 0$.

In the proof, we show that when $h \ge H_l$ and $\theta \ge \tilde{\theta}$, we have $1 - \theta t_f(\theta) \ge 0$. Then, we show that $\theta_M \ge \tilde{\theta}$. This implies that $1 - \theta t_f(\theta) \ge 0$ for any $\theta \in [\theta_M, \theta^h_H]$, which is the desired result.

Maximizing virtual revenue with IR and IC constraints: Here, we need to show that the time of purchase $t^*(\theta)$ and its associated payment, given in Lemma C.1.3, satisfy the interval conditions presented in Lemma 3.2.1. This part of the proof is very similar to that of Lemma C.1.2. Thus, we do not repeat it here.

C.1.6 Optimal Mechanism for a High Holding Cost

In this section, we present the proof of Lemma C.1.4. In the following, we show that $\max_{t \ge 0}\{R(\theta,t) - ht\} = R(\theta,0) = \theta + \alpha(\theta) \ge 0$ for $\theta \ge \theta_L$, and that for any $\theta < \theta_L$, $\max_{t \ge 0}\{R(\theta,t) - ht\} < 0$, where $R$ is defined in Eq. (B.14). This implies that in the optimal mechanism, the firm only sells to customers with type $\theta \ge \theta_L$.

We first show that for any $\theta \ge \theta_L$, $\arg\max_{t \ge 0}\{R(\theta,t) - ht\} = 0$. To this aim, we will verify that $\frac{\partial (R(\theta,t) - ht)}{\partial t}\big|_{t=0} \le 0$.
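This will give us the desired result because, as we show in Lemma C.1.8, $R(\theta,t) - ht$ as a function of $t$ has an inverted u-shape. Therefore, if $\frac{\partial (R(\theta,t) - ht)}{\partial t}\big|_{t=0} \le 0$, we have $\arg\max_{t \ge 0}\{R(\theta,t) - ht\} = 0$. By definition,
\[ \frac{\partial (R(\theta,t) - ht)}{\partial t} = -\theta e^{-\theta t}\big(\theta + \alpha(\theta)(2 - \theta t)\big) - h, \]
and at $t = 0$ and for any $\theta \ge \theta_L$, we have
\[ \frac{\partial (R(\theta,t) - ht)}{\partial t}\Big|_{t=0} = -\theta\big(\theta + 2\alpha(\theta)\big) - h \le 0, \tag{C.18} \]
where the inequality holds because
\[ h \ge H_h = \theta_L^2 = -\theta_L\big(\theta_L + 2\alpha(\theta_L)\big) = \max_{\theta \in [\theta_L, \bar{\theta}]} \{-\theta(\theta + 2\alpha(\theta))\}. \]
The first equality follows because $\theta_L + \alpha(\theta_L) = 0$, and the last equality holds because $\arg\max_{\theta \in [\theta_L, \bar{\theta}]} \{-\theta(\theta + 2\alpha(\theta))\} = \theta_L$. To see why the latter holds, note that
\[ \big(-\theta(\theta + 2\alpha(\theta))\big)' = -2\big(\theta + \alpha(\theta)\big) - 2\theta\alpha'(\theta) \le 0, \]
where the inequality follows because for any $\theta \ge \theta_L$, we have $(\theta + \alpha(\theta)) \ge 0$.

Next, we will verify that for any $\theta < \theta_L$, $\max_{t \ge 0}\{R(\theta,t) - ht\} < 0$. Note that it suffices to show that $\max_{t \ge 0}\{R(\theta,t) - H_h t\} < 0$, considering the fact that $R(\theta,t) - ht$ is decreasing in $h$. By Eq. (C.18), at $h = H_h$ we have $t_f(\theta_L) = 0$, and more importantly
\[ R(\theta_L, t_f(\theta_L)) - H_h\, t_f(\theta_L) = \theta_L + \alpha(\theta_L) = 0. \]
Then, considering the fact that $R(\theta,t_f(\theta)) - H_h\, t_f(\theta) = 0$ has a unique solution, we have
\[ R(\theta,t_f(\theta)) - H_h\, t_f(\theta) = \max_{t \ge 0}\{R(\theta,t) - H_h t\} < 0 \quad \text{for any } \theta < \theta_L. \]

C.1.7 Proof of Theorems 3.4.1 and 3.4.5

Here, we show the results in Theorem 3.4.5. Note that Theorem 3.4.1 is a special case of Theorem 3.4.5 with $h = 0$. The proof is similar to the proof of Theorem 3.4.4; thus we do not repeat it here. We only present a proof sketch. We need to find an optimal solution of Problem OPT-H where the objective function is replaced with $E\big[\max\{R(\theta,t(\theta)) - h\,t(\theta) - u(\underline{\theta},\underline{\theta}) - c,\ 0\}\big]$. We first relax the problem by ignoring the interval conditions. We will show that an optimal solution of the relaxed problem is $t_h(\theta)$ for $\theta \ge \theta_c$ and is $\infty$ for $\theta < \theta_c$, where $\theta_c$ solves $R(\theta_c, t_h(\theta_c)) - h\,t_h(\theta_c) = c$. This follows from Lemma C.1.12, stated at the end of this section, where we show that $R(\theta,t_h(\theta)) - h\,t_h(\theta)$ is an increasing function of $\theta \ge \theta^h$. Then, to complete the proof, we will show that the optimal solution of the relaxed problem satisfies the envelope conditions.

Lemma C.1.12.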
For anyh 0,R(;t h ())ht h () is increasing in h . Furthermore,R(;t h ()) ht h () 0 for any h . Proof. Proof of Lemma C.1.12We show the result forh H l whereH l is defined in Eq. (3.7). A similar argument holds forh>H l . By definition, for anyhH l , we have R(;t h ())ht h () = 8 > > > > > < > > > > > : +() if h H ; R(;t f ())ht f () if2 [ h L ; h H ]; e 1 h if2 [ L ; h L ]; R(;t h ())ht h () is obviously increasing when and h H and h L . Furthermore,R(;t h ()) ht h () is a continuous function of because t h () is continuous. Thus, it suffices to show thatR(;t h ()) ht h () is increasing in2 [ h L ; h H ]. 177 Recall that t h () = t f () for2 [ h L ; h H ]. That is, t h () is the FOC solution. Thus, by the Envelope theorem, the derivative ofR(;t f ())ht f () w.r.t. is given by @ R(;t f ())ht f () @ = t f ()e t f () ( +()(2t f ())) + e t f () (1 + 0 ()(1t f ())) = h t f () +e t f () (1 + 0 ()(1t f ())) 0; where the inequality holds because, as we show in Lemma C.1.6, 1t f () 0 for any2 [ h L ; h H ], and the second equality follows from the FOC, i.e., by the fact that @R(;t) @t t=t f () h = e t f () ( +()(2t f ()))h = 0: Finally, sinceR( L ;t h ( L ))ht h ( L ) = 0, we haveR(;t h ())ht h () 0 for L ; see definition of L in Eq. (3.7). C.1.8 Discussing the Assumption in Theorem 3.4.4 In this section, we discuss the assumption in Theorem 3.4.4. This assumption requires that the solution of equationR(;t f ())ht f () = 0 to be unique. The following lemma shows that for anyh2 [H l ;H h ],R(;t f ())ht f () = 0 has a unique solution if the solution ofR(;t f ())H l t f () = 0 is unique. In addition, it shows thatR(;t f ())H l t f () = 0 has a unique solution when 0 () is small enough. Lemma C.1.13. If the solution of R(;t f ()) H l t f () = 0 is unique, then, for any h 2 [H l ;H h ], R(;t f ())ht f () = 0 has a unique solution. 
Furthermore, the solution ofR(;t f ())H l t f () = 0 is unique if 0 () ( p 5+1) 2 2 5:2 for any ~ where ~ solves 2 ~ +( ~ ) = 0 andH l and the FOC solution t f () are defined in Eq. (3.7). The proof of Lemma C.1.13 is given at the end of this section. Note that for the Uniform and Exponential distributions, we have 0 () 5:2. In fact, for the Uniform distributionU(a;b), we have 0 () = 1 for any2 [a;b] wherea < b anda;b2R. For the Exponential distribution with rate 0, 0 () = 0 for 178 any 0. Furthermore, for a truncated Normal distribution with mean, standard deviation, and cut-off greater than , we have 0 () 4:48 for any (). Note that the domain of the truncated Normal distribution with cut-offC is [C;1). Proof. Proof of Lemma C.1.13 First, we show that if the solution of Eq. (C.19) is unique ath = H l , then this equation has a unique solution for anyh2 [H l ;H h ]. R(;t f ())ht f () = 0: (C.19) By Lemma C.1.3, ~ solvesR( ~ ;t f ( ~ ))H l t f ( ~ ) = 0 where 2 ~ +( ~ ) = 0 and 1t f ( ~ ) ~ = 0. Then, by our assumption, ~ is the unique solution of Eq. (C.19) at h = H l . This assumption and the proof of Lemma C.1.11 imply that for any h > H l , any solutions of Eq. (C.19) satisfy the following property: 1t f () 0. Next, we use this property to show that for anyh2 [H l ;H h ], there is only one solution to Eq. (C.19). Let 0 solve Eq. (C.19). By the Envelope theorem, the derivative ofR(;t f ())ht f () w.r.t. at 0 is given by @ R(;t f ())ht f () @ = 0 = t f ( 0 )e t f ( 0 ) 0 ( 0 +( 0 )(2t f ( 0 ) 0 )) + e t f ( 0 ) 0 (1 + 0 ( 0 )(1t f ( 0 ) 0 )) = h t f ( 0 ) 0 +e t f ( 0 ) 0 (1 + 0 ( 0 )(1t f ( 0 ) 0 )) > 0; (C.20) where the second equality follows from the FOC, i.e., @(R( 0 ;t)ht) @t t f ( 0 ) = 0 and the inequality holds because (1t f ( 0 ) 0 ) 0. By the above equation the derivative ofR( 0 ;t f ( 0 ))ht f ( 0 ) w.r.t. 0 is always positive. This implies that Eq. (C.19) has a unique solution. Next, we show that at h = H l , the solution of Eq. 
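(C.19) is unique if, for any $\theta \le \tilde{\theta}$, $\alpha'(\theta) \le \frac{(\sqrt{5}+1)^2}{2} \approx 5.2$. We first argue that any $\theta > \tilde{\theta}$ cannot solve Eq. (C.19). To this end, we use the proof of Lemma C.1.3, where we show $1 - \theta t_f(\theta) \ge 0$ for any $\theta \ge \tilde{\theta}$. The fact that $1 - \theta t_f(\theta) \ge 0$ for any $\theta \ge \tilde{\theta}$ implies that $\frac{\partial (R(\theta,t_f(\theta)) - h\,t_f(\theta))}{\partial \theta} > 0$; see Eq. (C.20). Then, considering the fact that $R(\tilde{\theta},t_f(\tilde{\theta})) - H_l\,t_f(\tilde{\theta}) = 0$, we have $R(\theta,t_f(\theta)) - H_l\,t_f(\theta) > 0$ for any $\theta > \tilde{\theta}$.

Next, we show that any $\theta < \tilde{\theta}$ cannot solve Eq. (C.19). Let $\rho = \theta\,t_f(\theta)$. Then, Eq. (C.19) at $h = H_l$ can be written as
\[ G(\theta,\rho) := \theta e^{-\rho}\big(\theta + \alpha(\theta)(1 - \rho)\big) - H_l\,\rho = 0. \]
We assume, contrary to our result, that there exists $\theta_0 < \tilde{\theta}$ that solves Eq. (C.19). Then, we show that we have $\frac{\partial G}{\partial \theta}\big|_{\theta = \theta_0} > 0$ and $\frac{\partial G}{\partial \theta}\big|_{\theta = \tilde{\theta}} > 0$. This implies that there cannot exist $\theta_0 < \tilde{\theta}$ that solves Eq. (C.19). We consider the following two cases: (i) $1 - \rho \ge 0$ and (ii) $1 - \rho < 0$.

Case i: By the FOC, we have $\frac{\partial G}{\partial \rho} = 0$. This implies that
\[ \frac{\partial G}{\partial \theta}\Big|_{\theta = \theta_0} = e^{-\rho}\big(\theta_0 + \alpha(\theta_0)(1 - \rho)\big) + \theta_0 e^{-\rho}\big(1 + \alpha'(\theta_0)(1 - \rho)\big) = \frac{H_l\,\rho}{\theta_0} + \theta_0 e^{-\rho}\big(1 + \alpha'(\theta_0)(1 - \rho)\big) \ge 0, \tag{C.21} \]
where the second equation holds because $G(\theta_0,\rho) = 0$ and the inequality holds because $1 - \rho \ge 0$. Note that the above equation also implies that $\frac{\partial G(\theta,\rho)}{\partial \theta}\big|_{\theta = \tilde{\theta}} > 0$, considering the fact that at $\theta = \tilde{\theta}$, we have $1 - \rho = 1 - \tilde{\theta}\,t_f(\tilde{\theta}) = 0$.

Case ii: Next we focus on the case of $1 - \rho < 0$. In the following, we show that when $1 - \rho < 0$ and $\alpha'(\theta) \le \frac{(\sqrt{5}+1)^2}{2} \approx 5.2$ for any $\theta \le \tilde{\theta}$, we get $\frac{\partial G}{\partial \theta}\big|_{\theta = \theta_0} > 0$. By definition,
\[ \frac{\partial G}{\partial \theta}\Big|_{\theta = \theta_0} = e^{-\rho}\Big(2\theta_0 + \big(\theta_0\alpha'(\theta_0) + \alpha(\theta_0)\big)(1 - \rho)\Big) \ge e^{-\rho}\Big(2\theta_0 + \big(\theta_0\alpha'(\theta_0) - 2\theta_0\big)(1 - \rho)\Big) = e^{-\rho}\,\theta_0\big(\alpha'(\theta_0)(1 - \rho) + 2\rho\big). \tag{C.22} \]
The inequality holds because $1 - \rho < 0$ and $\theta_0 \le \tilde{\theta}$. Note that for any $\theta_0 \le \tilde{\theta}$, $\alpha(\theta_0) \le -2\theta_0$. To complete the proof, we show that $\big(\alpha'(\theta_0)(1 - \rho) + 2\rho\big) \ge 0$ when $\alpha'(\theta_0) \le \frac{(\sqrt{5}+1)^2}{2}$. First assume that $\alpha'(\theta_0) \le 2$. Then, we get
\[ \alpha'(\theta_0)(1 - \rho) + 2\rho \ge 2(1 - \rho) + 2\rho = 2 > 0, \]
where the first inequality holds because $1 - \rho < 0$. Now, assume that $\alpha'(\theta_0) \in \big[2, \frac{(\sqrt{5}+1)^2}{2}\big]$. We make use of the following claim.

Claim: Let $\theta_0 < \tilde{\theta}$ solve Eq. (C.19). Then, $\rho = t_f(\theta_0)\,\theta_0 \le \frac{1+\sqrt{5}}{2}$.

The proof of the claim is given at the end of the proof of this lemma. Given that $\alpha'(\theta_0) \in \big[2, \frac{(\sqrt{5}+1)^2}{2}\big]$, the map $\rho \mapsto \big(\alpha'(\theta)(1 - \rho) + 2\rho\big)$ is decreasing. Then, by the claim, we get
\[ \alpha'(\theta)(1 - \rho) + 2\rho \ge \alpha'(\theta)\Big(1 - \frac{1+\sqrt{5}}{2}\Big) + 2\Big(\frac{1+\sqrt{5}}{2}\Big) \ge 0, \]
where the second inequality holds because $\alpha'(\theta) \le \frac{(\sqrt{5}+1)^2}{2} \approx 5.2$.

Proof of Claim: Since $\theta_0$ solves Eq. (C.19) and $t_f(\theta_0)$ is the FOC solution, we get
\[ \theta_0 e^{-\rho}\big(\theta_0 + \alpha(\theta_0)(1 - \rho)\big) = H_l\,\rho \quad \text{and} \quad -\theta_0 e^{-\rho}\big(\theta_0 + \alpha(\theta_0)(2 - \rho)\big) = H_l, \]
where the second equation implies that $\rho \le 2$. By dividing these two equations, we get $\big(\theta_0 + \alpha(\theta_0)(1 - \rho)\big) + \rho\big(\theta_0 + \alpha(\theta_0)(2 - \rho)\big) = 0$. This can be simplified as
\[ \theta_0 + \alpha(\theta_0)\,\frac{1 + \rho - \rho^2}{1 + \rho} = 0, \]
where for any $\rho \in [1, 2]$, $\rho \mapsto \frac{1 + \rho - \rho^2}{1 + \rho}$ is decreasing, $\frac{1 + \rho - \rho^2}{1 + \rho}$ crosses zero at $\frac{1+\sqrt{5}}{2}$, and $\frac{1 + \rho - \rho^2}{1 + \rho}\big|_{\rho = 1} = \frac{1}{2}$. We note that $\theta_0 + \alpha(\theta_0)\,\frac{1 + \rho - \rho^2}{1 + \rho}\big|_{\rho = 1} = \theta_0 + \frac{1}{2}\alpha(\theta_0) < 0$ for any $\theta_0 < \tilde{\theta}$, and $\theta_0 + \alpha(\theta_0)\,\frac{1 + \rho - \rho^2}{1 + \rho}\big|_{\rho = \frac{1+\sqrt{5}}{2}} = \theta_0 \ge 0$. Then, we can conclude that the $\rho$ that solves $\theta_0 + \alpha(\theta_0)\,\frac{1 + \rho - \rho^2}{1 + \rho} = 0$ should be less than $\frac{1+\sqrt{5}}{2}$.

C.1.9 Proof of Theorem 3.5.1

Throughout the proof, for simplicity, we denote $u(\theta,\theta)$ by $u(\theta)$. Here, we characterize the optimal mechanism under the valuation function $V(\theta,t) = \theta e^{-g(\theta)t}$, where $g(\theta)$ satisfies the following assumption.

Assumption 5. For any $\theta \in [\underline{\theta}, \bar{\theta}]$, $g(\theta) \ge 0$ and $g'(\theta) \ge 0$. Furthermore, $\frac{g'(\theta)}{g(\theta)}$ and $t_g(\theta)$, defined in Eq. (3.9), are decreasing in $\theta \in [\underline{\theta}, \bar{\theta}]$.

Under the valuation function $V(\theta,t) = \theta e^{-g(\theta)t}$, the expected virtual revenue can be written as $E[R_g(\theta,t)] - u(\underline{\theta})$, where
\[ R_g(\theta,t) = e^{-g(\theta)t}\Big(\theta + \alpha(\theta)\big(1 - g'(\theta)\,t\,\theta\big)\Big). \tag{C.23} \]
Then, to characterize the optimal mechanism, one needs to find an optimal solution of the following optimization problem.
\[ \max_{\{t(\theta),\,p(\theta)\,:\,\theta \in [\underline{\theta}, \bar{\theta}]\},\ u(\underline{\theta}) \ge 0}\ E\big[\max\{R_g(\theta,t(\theta)) - u(\underline{\theta}),\ 0\}\big] \quad \text{s.t. IC and IR constraints,} \tag{OPT-G} \]
where by Lemma 3.2.1, the IC constraints hold if the envelope and interval conditions are satisfied. Similar to the proof of Theorem 3.3.2, we first ignore the interval conditions and consider the following relaxed problem.
\[ \max_{\{t(\theta)\,:\,\theta \in [\underline{\theta}, \bar{\theta}]\},\ u(\underline{\theta}) \ge 0}\ E\big[\max\{R_g(\theta,t(\theta)) - u(\underline{\theta}),\ 0\}\big] \quad \text{s.t. } u(\theta) \ge 0,\ \theta \in [\underline{\theta}, \bar{\theta}], \tag{OPT-G-R} \]
where by the envelope condition given in Lemma 3.2.1, $u(\theta) = u(\underline{\theta}) + \int_{\underline{\theta}}^{\theta} e^{-t(z)g(z)}\big(1 - t(z)\,g'(z)\,z\big)\,dz$. In the following, we first show that $t_g(\theta)$, given in Eq.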
(3.9), andu() = 0 solve the above optimization problem. Then, we show that this solution satisfies the interval condition. Therefore, it is an optimal solution of Problem OPT-G. Lemma C.1.14. If Assumptions 3 and 5 hold, in an optimal solution of the Problem OPT-G-R, time of sales is t g () andu() = 0 where t g () is given in Eq. (3.9). We provide the proof of Lemma C.1.14 in Section C.1.14. The proof has two parts. In the first part, we need to show that t g () andu() = 0 construct a feasible solution of Problem OPT-G-R. To this aim, we use our assumption thatg() is log-concave; that is, g 0 () g() is decreasing in. In the second part, we show that the solution is optimal. To this end, similar to the proof of Theorem 3.3.2, we find an upper bound on the relaxed problem by dualizing the IR constraints. Then we show that (t g ();u() = 0) achieves the upper bound. 182 The last step is to show the optimal solution of the relaxed problem satisfies the interval conditions. That is, for any ^ ;2 [; ] such that ^ : Z ^ A g (z;t g ( ^ ))dz Z ^ A g (z;t g (z))dz; Z ^ A g (z;t g (z))dz Z ^ A g (z;t g ())dz; whereA g (z;t) =e g(z)t (1g 0 (z)tz). To this aim, we show that for anyz ^ ,A g (z;t g ( ^ ))A g (z;t g (z)) and for anyz<,A g (z;t g (z))A g (z;t g ()). We start by showing that A g (z;t g ( ^ )) A g (z;t g (z)) for any z ^ . We consider two cases: 1- (1g 0 (z)t g ( ^ )z) 0 and 2- (1g 0 (z)t g ( ^ )z)> 0. Assume that (1g 0 (z)t g ( ^ )z) 0. Then, e g(z)tg ( ^ ) (1g 0 (z)t g ( ^ )z) 0 e g(z)tg (z) (1g 0 (z)t(z)z); where the second inequality follows from Lemma C.1.14. There, we show that for anyz2 [; ], we have e g(z)tg (z) (1g 0 (z)t(z)z) 0. The above equation implies thatA g (z;t g ( ^ ))A g (z;t g (z)). Now, assume that (1g 0 (z)t g ( ^ )z)> 0. By Assumption 5, t g () is decreasing. This leads to (1g 0 (z)t g (z)z) (1g 0 (z)t g ( ^ )z) 0; and e g(z)tg (z) e g(z)tg ( ^ ) : By the above Equations, we haveA g (z;t g ( ^ )) A g (z;t g (z)). 
Next, we will verify thatA g (z;t g (z)) A g (z;t g ()), forz < . Since t g () is decreasing, for any z<, we get 0 (1g 0 (z)t g (z)z) (1g 0 (z)t g ()z); and e g(z)tg (z) e g(z)tg () ; where the first inequality holds because as we show in Lemma C.1.14, for any z 2 [; ], we have e g(z)tg (z) (1g 0 (z)t(z)z) 0. The above Equations show thatA g (z;t g (z))A g (z;t g ()). 183 Proof of Lemma C.1.14 We first show that the allocation t g () andu() = 0 construct a feasible solution of Problem OPT-G-R. To this aim, we show thatu() = R e g(z)tg (z) (1g 0 (z)t g (z)z)dz = 0 for any g L , andu() > 0 otherwise. The former is easy to verify as t g () = 1 g 0 () for g L . To verify thatu()> 0 for any> g L , we make use of the following lemma. Lemma C.1.15. For any2 [; ], we have (1g 0 ()t g ()) 0. To show Lemma C.1.15, we use assumption 5 that g 0 () g() is decreasing. Next, we show that (t g ();u() = 0) is an optimal solution of Problem OPT-G-R. To this aim, we find an upper bound for the optimal value of Problem OPT-G-R using the weak duality theorem. Then, we will show that (t g ();u() = 0) achieves the upper bound, thus it is optimal. In the proof, with a slight abuse of notations, we denote the optimal value of Problem OPT-G-R with OPT-G-R. Upper Bound of OPT-G-R: For any allocation time t(), and Lagrangian function : [; ]!R + , we define the following function. L g (t();();u()) = E[R g (;t())] + Z (z)u(z)dzu(); whereu() = u() + R e g(z)t(z) (1g 0 (z)t(z)z)dz. Then, considering the fact that() 0, for any (t();u()) such thatu() =u() + R e g(z)t(z) (1g 0 (z)t(z)z)dz 0, we have we have E[R g (;t())u()] L g (t();();u()) Therefore, for any : [; ]!R + , max (t();u())2T fE[R g (;t())]u()g max (t();u())2T fL g (t();();u())g; (C.24) where T = n (t();u()) : t() 0; u() + R e g(z)t(z) (1g 0 (z)t(z)z)dz 0 for any2 [; ] o 184 is the set of feasible solution.In the following, we will characterize an upper bound for OPT-G-R by evalu- ating the r.h.s. 
of the above equation for a specific Lagrangian function(), defined below. g () = 8 > < > : f()( g() g 0 () +()) 0 if g L ; 0 otherwise; (C.25) Lemma C.1.16. If Assumptions 3 and 5 hold, for any2 [; ], g () 0. In the proof of Lemma C.1.16, we use our assumption thatg 0 () 0 and g() g 0 () is increasing in. Then, the result follows by the following claim. Claim: With a slight abuse of notations, let (t ();u ) = arg max (t();u())2T fL g (t(); g ();u())g. Then, t () = t g () for any 2 [; ] and u = 0. Furthermore, L(t g (); g ();u() = 0) = E[R g (;t g ())], where the expectation is taken w.r.t.. Proof of the Claim: By definition, L g (t(); g ();u()) = Z z= g L R g (z;t(z))f(z)dz + Z g L R g (z;t(z))f(z) + g (z)u(z) dzu(); (C.26) where the second term can be rewritten as Z g L R g (z;t(z))f(z)dz + Z g L u(z)d f(z) g(z) g 0 (z) +(z) : = Z g L f(z)e g(z)z z +(z)(1g 0 (z)t(z)z) dz +u(z)f(z) g(z) g 0 (z) +(z) g L Z g L e g(z)z 1g 0 (z)t(z)z f(z) g(z) g 0 (z) +(z) dz = Z g L f(z)e g(z)t(z) z (1g 0 (z)t(z)z) g(z) g 0 (z) dzu()f() g() g 0 () +() ; (C.27) 185 where the second equation follows from Eq. (C.23) and integrating by part, and the last equation follows from definition of g L ; that is, g( g L ) g 0 ( g L ) +( g L ) = 0. By plugging Eq. (C.27) in Eq. (C.26),we have L g (t(); g ();u()) = Z g L R g (z;t(z))f(z)dz + Z g L f(z)e g(z)t(z) z (1g 0 (z)t(z)z) g(z) g 0 (z) dzu()f() g() g 0 () : Considering the fact that the coefficient ofu(), i.e.,f() g() g 0 () 0, to maximize the above equation, we setu() = 0. That is,u = 0. Then, max (t();u()=0)2T fL g (t(); g ();u() = 0)g Z g L f(z) max t0 R g (z;t) dz + Z g L f(z) max t0 e g(z)t z (1g 0 (z)tz) g(z) g 0 (z) dz: In Lemma C.1.17, we show that for any g L , we get arg max t0 R g (z;t) = t g (z). Then the result follows because for anyz g L , we have arg max t0 e g(z)t z (1g 0 (z)tz) g(z) g 0 (z) = 1 g 0 (z)z : Lemma C.1.17. For any g L , we have arg max t0 fR(;t)g = t g (). 
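Two of the quantitative claims in Sections C.1.6 and C.1.8 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it takes $\theta \sim \text{Uniform}[0,1]$ (so that $\alpha(\theta) = (F(\theta)-1)/f(\theta) = \theta - 1$ and $\theta_L = 1/2$), reads the virtual revenue as $R(\theta,t) = e^{-\theta t}(\theta + \alpha(\theta)(1 - \theta t))$, and uses a finite time grid in place of the exact maximization. The function names are hypothetical and not part of the thesis.

```python
import numpy as np

# Assumption for illustration: theta ~ Uniform[0, 1], so alpha(theta) = theta - 1.
def alpha(theta):
    return theta - 1.0

def virtual_revenue(theta, t):
    # Assumed form: R(theta, t) = e^{-theta t} (theta + alpha(theta) (1 - theta t)).
    return np.exp(-theta * t) * (theta + alpha(theta) * (1.0 - theta * t))

theta_L = 0.5          # solves theta + alpha(theta) = 0 for Uniform[0, 1]
H_h = theta_L ** 2     # high holding-cost threshold used in Section C.1.6
t_grid = np.linspace(0.0, 50.0, 50001)  # finite grid standing in for t >= 0

def best_time_and_value(theta, h):
    """Grid maximizer and maximum of R(theta, t) - h t."""
    values = virtual_revenue(theta, t_grid) - h * t_grid
    k = int(np.argmax(values))
    return t_grid[k], values[k]

# Section C.1.6: at h = H_h, types above theta_L buy immediately,
# and types below theta_L have negative maximal virtual surplus.
sell_now = [best_time_and_value(th, H_h)[0] for th in (0.5, 0.6, 0.8, 1.0)]
low_values = [best_time_and_value(th, H_h)[1] for th in (0.1, 0.3, 0.49)]

# Section C.1.8: the bound (sqrt(5) + 1)^2 / 2 is exactly where
# alpha' * (1 - rho) + 2 * rho vanishes at rho = (1 + sqrt(5)) / 2.
phi = (1.0 + np.sqrt(5.0)) / 2.0
bound = (np.sqrt(5.0) + 1.0) ** 2 / 2.0
```

For other distributions, only `alpha` and `theta_L` need to change; the grid range is a design choice and should be widened if the maximizer could lie beyond it.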
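C.2 Technical Proofs

C.2.1 Proof of Lemma C.1.6

$t_f(\theta)$ is decreasing in $\theta$: Since $t_f(\theta)$ is the FOC solution, we have
\[ \frac{\partial (R(\theta,t) - ht)}{\partial t}\Big|_{t = t_f(\theta)} = -\theta e^{-\theta t_f(\theta)}\big(\theta + \alpha(\theta)(2 - \theta t_f(\theta))\big) - h = 0. \]
Define $W(\theta,t) := \frac{\partial (R(\theta,t) - ht)}{\partial t} = -\theta e^{-\theta t}\big(\theta + \alpha(\theta)(2 - \theta t)\big) - h$. Then, the FOC implies that $W(\theta,t_f(\theta)) = 0$. Thus,
\[ \frac{\partial t_f(\theta)}{\partial \theta} = -\frac{W_\theta(\theta,t_f(\theta))}{W_t(\theta,t_f(\theta))}, \]
where $W_\theta(\theta,t) = \frac{\partial W(\theta,t)}{\partial \theta}\big|_{t = t_f(\theta)}$ and $W_t(\theta,t_f(\theta)) = \frac{\partial W(\theta,t)}{\partial t}\big|_{t = t_f(\theta)}$. In the following, we will show that both $W_\theta(\theta,t)$ and $W_t(\theta,t)$ are non-positive. This implies that $\frac{\partial t_f(\theta)}{\partial \theta} \le 0$. Throughout the proof, for simplicity, we denote $t_f(\theta)$ by $t$. By definition, we get
\[ W_\theta(\theta,t) = -(1 - \theta t)e^{-\theta t}\big(\theta + \alpha(\theta)(2 - \theta t)\big) - \theta e^{-\theta t}\big(1 + \alpha'(\theta)(2 - \theta t) - t\alpha(\theta)\big) = (1 - \theta t)\frac{h}{\theta} - \theta e^{-\theta t}\big(1 + \alpha'(\theta)(2 - \theta t) - t\alpha(\theta)\big), \]
where the second equality follows because $W(\theta,t) = 0$. Again, by the fact that $W(\theta,t) = 0$, we can replace $e^{-\theta t}$ by $\frac{-h}{\theta(\theta + \alpha(\theta)(2 - \theta t))}$. Then,
\[ W_\theta = (1 - \theta t)\frac{h}{\theta} + \frac{h}{\theta + \alpha(\theta)(2 - \theta t)} - \theta e^{-\theta t}\big(\alpha'(\theta)(2 - \theta t) - t\alpha(\theta)\big) = \frac{h(2 - \theta t)\big(\theta + \alpha(\theta)(1 - \theta t)\big)}{\theta\big(\theta + \alpha(\theta)(2 - \theta t)\big)} - \theta e^{-\theta t}\big(\alpha'(\theta)(2 - \theta t) - t\alpha(\theta)\big) \le 0. \tag{C.28} \]
The inequality holds because by the FOC condition, i.e., $W(\theta,t) = 0$, we have $2 - \theta t \ge 0$ and $\theta + \alpha(\theta)(2 - \theta t) \le 0$, and by our assumption that $R(\theta,t) - ht \ge 0$, we have $\theta + \alpha(\theta)(1 - \theta t) \ge 0$. Note that since $R(\theta,t) - ht = e^{-\theta t}\big(\theta + \alpha(\theta)(1 - \theta t)\big) - ht \ge 0$, we get $\theta + \alpha(\theta)(1 - \theta t) \ge 0$.

Next, we show that $W_t(\theta,t) \le 0$. By definition,
\[ W_t(\theta,t) = \theta^2 e^{-\theta t}\big(\theta + \alpha(\theta)(2 - \theta t)\big) + \theta^2\alpha(\theta)e^{-\theta t} \le 0, \]
where the inequality holds because by the FOC condition, $\theta + \alpha(\theta)(2 - \theta t) \le 0$. The above equation along with Eq. (C.28) imply that $t_f(\theta)$ is decreasing.

$A(\theta,t_h(\theta)) \in [0,1]$: Note that $A(\theta,t_h(\theta)) = 0$ for $\theta \le \theta^h_L$ and is 1 for $\theta \ge \theta^h_H$. Thus, it suffices to show that $A(\theta,t_h(\theta)) \in [0,1]$ for any $\theta \in [\theta^h_L, \theta^h_H]$. To this end, in Lemma C.2.1, we will show that $1 - \theta t_h(\theta) \ge 0$ for any $\theta \in [\theta^h_L, \theta^h_H]$.

Lemma C.2.1. When $h \le H_l$, then $(1 - \theta t_h(\theta)) \ge 0$ for any $\theta \in [\theta^h_L, \theta^h_H]$.

Proof of Lemma C.2.1: Here, we show that the set $\{\theta : \theta > \theta^h_L \text{ and } 1 - \theta t_f(\theta) = 0\}$ is empty. That is, there does not exist any $\theta > \theta^h_L$ with $1 - \theta t_f(\theta) = 0$. Then, by the fact that $1 - t_f(\theta^h_H)\theta^h_H = 1$ and $1 - t_f(\theta^h_L)\theta^h_L = 0$, we have $1 - \theta t_f(\theta) \ge 0$ for any $\theta \in [\theta^h_L, \theta^h_H]$. Assume, contrary to our result, that there exists $\theta_0 > \theta^h_L$ that solves $1 - t_f(\theta_0)\theta_0 = 0$. Then, we show that this cannot happen.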
Let2f h L ; 0 g. Since t f () is the FOC solution, we have @(R(;t)ht) @t t f () = 0. This condition can be rewritten as W (;;h) := e ( +()(2))h = 0; where = t f (). In the following, we will show that for2f h L ; 0 g, we have @ @ = W W 0 , whereW := @W (;;h) @ andW := @W (;;h) @ . This implies that there does not exists 0 > h L that solves 1t f ( 0 ) 0 = 0. To show @ @ 0, we will verify thatW 0 andW 0. By definition, W = e +()(2) +()e 0; where the inequality follows from the FOC, i.e., the fact thatW (;;h) = 0. To make it more clear, by the FOC, ( +()(2))< 0 and as a result,W 0. Next, we show thatW 0 for2f h L ; 0 g. By definition, W = e 2 + () + 0 () (2) : By the fact that for2f h L ; 0 g, we have 1t f () = 0, and thus = 1. This shows that W = e 1 2 +() + 0 () 0; where the inequality holds because h L ~ and as a result 2 +() 0 for2f h L ; 0 g. Recall that 0 > h L . 188 C.2.2 Proofs of Lemmas in Sections C.1.4 and C.1.5 Proof. Proof of Lemma C.1.7 The proof is naturally divided into two parts. In the first part, we show that h () 0 for any< L and in the second part, we show that h () 0 for any2 [ L ; h L ]. First Part: By Eq. (C.10), for any L , we have h () = f 0 () (2 +()) +f() 2 + 0 () : (C.29) We note that by definition, we havee 1 ( L ) 2 =h. Thus, given thathH l = ~ 2 e 1 , we have L ~ . This implies that for any L , we have 2 +() 0. Then, iff 0 () 0, we have h () 0. Now, assume thatf 0 ()> 0. Then, h () f 0 ()() +f() 0 () = (f()()) 0 = (F () 1) 0 0 (C.30) Second Part: By definition, for any2 [ L ; h L ], we have h () = f 0 () +() + h e 1 +f() 1 + 0 () h 2 e 1 f 0 () +() + h e 1 +f() 1 + 0 () h ( L ) 2 e 1 = f 0 () +() + h e 1 +f() 0 (); where the inequality holds because L , and the last equation follows from definition of L . We consider the following two cases. Case 1:f 0 () 0: To show h () 0, we use the fact that for L , function7!+()+ h e 1 is increasing in. 
Then, considering the fact that h L +( h L )+ h h L e 1 = 0, we have +() + h e 1 0 for2 [ L ; h L ]. This implies that h () 0 whenf 0 () 0. The derivative of +() + h e 1 w.r.t. is given by 1 + 0 () h 2 e 1 1 + 0 () h ( L ) 2 e 1 = 0 () 0; where the first inequality holds because L , and the second inequality follows from the definition of L . 189 Case 1:f 0 ()> 0: In this case, we have h () f 0 ()() +f() 0 () = (f()()) 0 = (F () 1) 0 0: The last inequality completes the proof. Proof. Proof of Lemma C.1.8: Here, we will show that for any, the objective function,R(;t)ht is a unimodular function oft and achieves its maximum at the FOC solution, denoted by t f (). Then, we show that arg max t0 fR(;t) htg = maxft f (); 0g = t h (). To show that the objective function is unimodular, we will make the following observations: 1- The derivative of the objective function w.r.t. t att = +3() () is negative, att =1 is1, and att =1 is negative. 2- For anyt +3() () , the objective function is a concave function oft, and for anyt> +3() () , the objective function is a convex function oft. These two observations imply that for any given,R(;t) is a unimodular function oft, and achieves its maximum att< +3() () . First Part: The derivative of the objective function with respect tot is given by @R(;t) @t h = e t ( +()(2t))h: (C.31) Note that ast approaches1, the derivative of the objective function with respect tot converges to1. Furthermore, as t converges to1, the derivative goes toh. In addition, one can easily show that the derivative is negative att = +3() () . Second Part: The second derivative of the objective function with respect tot is given by () 2 e t ( +()(3t)): (C.32) It is easy to observe that the second derivative is negative for anyt< +3() () , and is non-negative otherwise. This implies that the objective function is concave for anyt +3() () and it is convex for anyt> +3() () . 
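The curvature argument in the two parts above can be illustrated numerically. The following is a minimal sketch, assuming $\theta \sim \text{Uniform}[0,1]$ (so $\alpha(\theta) = \theta - 1$), the virtual revenue $R(\theta,t) = e^{-\theta t}(\theta + \alpha(\theta)(1 - \theta t))$, and reading Eq. (C.32) as $\theta^2 e^{-\theta t}(\theta + \alpha(\theta)(3 - \theta t))$. It compares that closed form against a central finite difference and locates the inflection time $(\theta + 3\alpha(\theta))/(\theta\alpha(\theta))$. The helper names and the values $\theta = 0.6$, $h = 0.05$ are illustrative choices only.

```python
import numpy as np

# Assumption for illustration: theta ~ Uniform[0, 1], so alpha(theta) = theta - 1.
def alpha(theta):
    return theta - 1.0

def objective(theta, t, h):
    # Assumed objective: R(theta, t) - h t with
    # R(theta, t) = e^{-theta t} (theta + alpha(theta) (1 - theta t)).
    return np.exp(-theta * t) * (theta + alpha(theta) * (1.0 - theta * t)) - h * t

def second_derivative_closed_form(theta, t):
    # Reading of Eq. (C.32): theta^2 e^{-theta t} (theta + alpha(theta) (3 - theta t)).
    return theta ** 2 * np.exp(-theta * t) * (theta + alpha(theta) * (3.0 - theta * t))

def second_derivative_numeric(theta, t, h, eps=1e-4):
    # Central finite difference; the linear term -h t drops out.
    return (objective(theta, t + eps, h) - 2.0 * objective(theta, t, h)
            + objective(theta, t - eps, h)) / eps ** 2

def inflection_time(theta):
    # Time at which the objective switches from concave to convex in t.
    return (theta + 3.0 * alpha(theta)) / (theta * alpha(theta))

theta, h = 0.6, 0.05
t_star = inflection_time(theta)
check_points = [0.5, 1.0, 2.0, 4.0]
max_gap = max(abs(second_derivative_closed_form(theta, t)
                  - second_derivative_numeric(theta, t, h)) for t in check_points)
```

With these inputs the objective is concave before `t_star` and convex after it, which is the shape the unimodality argument relies on.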
190 So far, we established thatR(;t)ht is a unimodular function oft and achieves its maximum at the FOC solution, denoted by t f (). By Lemma C.1.6, the FOC solution is decreasing in. This and the fact that t f ( h H ) = 0 lead to maxft f (); 0g = 0 for any h H and maxft f (); 0g = t f () = t h () for any 2 [ h L ; h H ]. Proof. Proof of Lemma C.1.9 LetG(z;t) = ze tz (1 + 2tz)ht. We show that for anyz L , we have max t0 fG(z;t)g 0. First observe thatG(z;t = 0) =z 0 andG(z;t =1) =1. Then, to show that max t0 fG(z;t)g 0, we will verify thatG(z;t) 0 at the FOC solution, i.e., at the solution that solves @G(z;t) @t = e tz z 2 (3 2tz)h = 0: We denote the FOC solution byt F (z). We show thatG(z;t F (z)) 0. To this aim, we show that i- @G(z;t F (z)) @z 0 when (1 + 2t F (z)z) 0, ii-zt F (z) is increasing in z, and iii-G( L ;t F ( L )) = 0. The fact thatzt F (z) is increasing inz implies that there exists ^ z2 [; L ) such that (1 + 2t F (z)z)> 0 for anyz > ^ z and (1 + 2t F (z)z) 0. Then, sinceG( L ;t F ( L )) = 0 and @G(z;t F (z)) @z 0 when (1+2t F (z)z) 0, for anyz> ^ z we haveG(z;t F (z))G( L ;t F ( L )) = 0, which is the desired result. Furthermore, for anyz ^ z,G(z;t F (z)) 0 as for this range ofz, we have (1 + 2t F (z)z) 0. Claim i: @G(z;t F (z)) @z 0 when (1 + 2t F (z)z)> 0. By the envelope theorem, we get @G(z;t F (z)) @z = e t F (z)z 1 +t F (z)z 5 2t F (z)z 0; where the inequality holds because (1 + 2t F (z)z)> 0 and by the FOC (3 2t F (z)z) 0. To see why note thatx7!1 +x 5 2x is positive whenx2 [ 1 2 ; 3 2 ]. Claim ii: z7! (zt F (z)) is an increasing function. Define = zt F (z). By the FOC, we have W (z;) :=e z 2 (3 2)h = 0. Then, @ @z = @W (z;) @z @W (z;) @ = e z 2 (5 2) 2ze (3 2) 0; where the inequality holds because by the FOC 3 2 0. 191 Claim iii:G( L ;t F ( L )) = 0. Note thatt F ( L ) = 1 L and as a result, G( L ;t F ( L )) = L e 1 h L = 0; where the last equation follows from definition of L . Proof. Proof of Lemma C.1.11 The proof has two parts. 
In the first part, we show that whenh H l and ~ , we have 1t f () 0. Then, in the second part of the proof, we show that M ~ . This implies that 1t f () 0 for any2 [ M ; h H ], which is the desired result. First Part: Here, we show that any solution of 1t f () = 0, denoted by 0 , is less than equal to ~ . Let 0 be the maximum of such solution; that is 0 = maxf : 1t f () = 0g. Then, considering the fact that 0 ~ , 1 h H t f ( h H ) = 1, and 1 0 t f ( 0 ) = 0, we can conclude that 1t f () > 0 for any2 [ ~ ; h H ]. Suppose, contrary to our claim, that there exists 0 > ~ that solves 1t f ( 0 ) 0 = 0. By the FOC, we have @R( 0 ;t) @t t=t f ( 0 ) = 0 e 0 t f ( 0 ) ( 0 +( 0 )(2 0 t f ( 0 )))h = 0 Since 0 solves 1t f ( 0 ) 0 = 0, we get @R( 0 ;t) @t t=t f ( 0 ) = 0 e 1 ( 0 +( 0 )) = h: (C.33) We note that 0 7! 0 e 1 ( 0 +( 0 )) is decreasing in 0 . This holds because d( 0 ( 0 +( 0 ))) d 0 = (2 0 +( 0 )) 0 0 ( 0 ) 0; where the inequality follows because 0 > ~ . This implies that max 0 ~ f 0 e 1 ( 0 +( 0 ))g = ~ e 1 ( ~ +( ~ )) = ~ 2 e 1 = H l : 192 Then, by Eq. (C.33), we can conclude that when h > H l , there does not exists any 0 > ~ such that 1t f ( 0 ) 0 = 0. Second Part: Here, we show that M ~ . To this aim, we show that @ M @h 0 when 1t f ( M ) M 0. This verifies that M increases as we increaseh fromH l . The reason is that ath =H l , we have M = ~ and 1t f ( M ) M = 0. This implies ath =H l , whenh is increased, we have M ~ . Then, by the first part of the lemma, we know that 1t f ( M ) M 0 when we increaseh. This allows us to repeat this procedure to show that @ M @h 0 for anyhH l . Let = M and =t f (). Then, by definition, we have G(;;h) :=e ( +()(1))h = 0; W (;;h) :=e ( +()(2))h = 0: The first equation follows from the fact that at = M , R( M ;t f ( M ))ht f ( M ) = 0 and the second equation follows from the FOC, i.e., @(R(;t)ht) @t t=t f () = 0. In the following, we show that @ @h 0 when 1 0. 
The aforementioned Equations imply that @G @ @ @h + @G @ @ @h =; @W @ @ @h + @W d @ @h = 1: This leads to @ @h = @G @ 1 @W @ @G d @G d @W d @W d = @W @ @G @ @G @ @W @ @G @ @W @ ; It is easy to observe that @G @ = W (;;h) = 0. Thus, @ @h = @G @ . In the following, we will show that @ @h 0 by verifying @G @ 0. By definition, @G @ = e ( +()(1)) +e (1 + 0 ()(1)): 193 We note that the first term, i.e., (+()(1)), is non-negative becauseG(;;h) =e (+()(1 ))h = 0. Also, the second term is positive as 1 0. This gives us @G @ 0 and thus @ @h 0. C.2.3 Proof of Lemmas in Section C.1.9 Proof. Proof of Lemma C.1.15It is easy to verify that (1g 0 ()t g ()) = 1 for any> g H , and it is zero for any g L . Thus, it suffices to show that (1g 0 ()t g ()) 0 when2 [ g L ; g H ]. By definition, for any2 [ g L ; g H ], we have (1g 0 ()t g ()) = 1 () + g 0 () g() : Since> 0, to show (1g 0 ()t g ())> 0, it suffices to verify that 1 () + g 0 () g() 0. To that end, we show that 1 () + g 0 () g() is decreasing in. Then by the fact that 1 ( g L ) + g 0 ( g L ) g( g L ) = 0, we have 1 () + g 0 () g() 0. The derivative of 1 () + g 0 () g() w.r.t. is given by 0 () () 2 + ( g 0 () g() ) 0 0: The inequality holds because by Assumption 3, we have 0 () 0, and by Assumption 5, g 0 () g() is decreasing in. Proof. Proof of Lemma C.1.16 To show the result, we will verify that g () 0 for any g L . By Eq. (C.25), for any g L , g () = f 0 () g() g 0 () +()) +f() g() g 0 () 0 + 0 () : We consider the following cases: 1-f 0 () 0: Observe that the first term of g () is non-negative. This is the case because by Assump- tion 5, g() g 0 () + () 0 for any g L . To see why note that g() g 0 () + () is increasing in , and 194 g( g L ) g 0 ( g L ) +( g L ) = 0. Also, note that the second term of g (), i.e.,f() g() g 0 () 0 + 0 () , is greater than or equal to zero. This holds because by Assumption 5, g() g 0 () 0 0. 
1-f 0 () 0: Sinceg 0 () 0, we have g () f 0 ()() +f() 0 () = (f()()) 0 0: The last inequality holds becausef()() =F () 1 is increasing in. Proof. Proof of Lemma C.1.17 Here, we show that for any g L , arg max t0 R g (;t) = t g (). To this end, we show that the objective function, i.e.,R g (;t), has an inverted u-shape int. Then, we show that for g H ,R g (;t) achieves its maximum att = 0, and for2 [ g L ; g H ],R g (;t) gets maximized at t g (), where t g () solves the FOC. Following the steps in Lemma C.1.8, one can show that R g (;t) has an inverted u-shape and gets maximized at the FOC solution where the FOC solution solves @R g (;t) @t = e g()t g()()g()(1g 0 ()t)()g 0 () = 0: (C.34) This implies that the FOC solution, denoted by t f (), is given by g()+()g 0 ()+()g() ()g()g 0 () . Next, we show that for any g H , the FOC solution is negative. Then, by the fact thatR g (;t) has an inverted u-shape, we can conclude that for any g H ,R g (;t) gets maximized at t g () = 0. Sinceg 0 () 0, the FOC solution is negative if g 0 ()t f () = 1 () + g 0 () g() + 1 0: By Assumptions 3 and 5, 1 () + g 0 () g() + 1 is decreasing in . Then, considering this and the fact that g 0 ( g H )t f ( g H ) = 0, we haveg 0 ()t f () 0 for any g H . This implies that t f () 0 for any g H . A similar argument can be used to show that the FOC solution is positive for any< g H . 195
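The closed-form FOC solution used in the proof of Lemma C.1.17 can be sanity-checked against a brute-force maximization. The sketch below assumes $R_g(\theta,t) = e^{-g(\theta)t}(\theta + \alpha(\theta)(1 - g'(\theta)t\theta))$ and reads the FOC solution as $\frac{g(\theta)\theta + \alpha(\theta)g'(\theta)\theta + \alpha(\theta)g(\theta)}{\alpha(\theta)g(\theta)g'(\theta)\theta}$. The specific choices $g(\theta) = \theta$ and $\theta \sim \text{Uniform}[0,1]$ (so $\alpha(\theta) = \theta - 1$) are assumptions made only for this illustration, as are the function names.

```python
import numpy as np

# Assumptions for illustration: Uniform[0, 1] types and g(theta) = theta.
def alpha(theta):
    return theta - 1.0

def g(theta):
    return theta

def g_prime(theta):
    return 1.0

def virtual_revenue_g(theta, t):
    # Assumed form of Eq. (C.23).
    return np.exp(-g(theta) * t) * (
        theta + alpha(theta) * (1.0 - g_prime(theta) * t * theta))

def foc_time(theta):
    # Closed-form root of d R_g / d t = 0, as read from the proof of Lemma C.1.17.
    a, gv, gp = alpha(theta), g(theta), g_prime(theta)
    return (gv * theta + a * gp * theta + a * gv) / (a * gv * gp * theta)

def grid_maximizer(theta, t_max=20.0, n=200001):
    # Brute-force maximizer of R_g over a uniform grid on [0, t_max].
    t_grid = np.linspace(0.0, t_max, n)
    return t_grid[int(np.argmax(virtual_revenue_g(theta, t_grid)))]

interior_gap = abs(grid_maximizer(0.6) - foc_time(0.6))   # FOC root is positive here
corner_foc = foc_time(0.8)                                # FOC root is negative here
corner_argmax = grid_maximizer(0.8)                       # so the maximizer is t = 0
```

Under these assumptions the interior case matches the FOC root up to the grid spacing, and the corner case returns an immediate sale, consistent with taking the maximum of the FOC solution and zero.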
Abstract
Markets have been studied for centuries. But because of the rise of online markets in the past decade, our ability to experiment with markets and get immediate feedback has radically changed. Such opportunities exist because online platforms can collect unprecedented amounts of data. That data can be incorporated into designing optimal marketplaces and can contribute to making informed operational decisions. For instance, data enables online markets to provide personalized services by exploiting heterogeneity among their customers. My research interests lie in understanding the economic and operational properties of online markets and contributing to their design and advancement. ❧ In this dissertation, I focus on designing and developing fast, robust algorithms and mechanisms for two types of online markets that have been significantly influenced by the availability of massive amounts of data: online retailing markets and online advertising markets. ❧ In Chapter 1, I study the problem of optimizing personalized assortment planning in online retail stores. Online retail stores provide a platform that enables consumers to buy goods or services directly from a seller over the Internet. These platforms allow online retailers to store customer data and learn about customers' behavior and needs. This gives online retailers an opportunity to use such information to personalize product offerings (aka assortment planning) based on customers' preferences, interests, and other characteristics. By personalizing assortment planning, online retailers can gain substantial revenue improvements. In this chapter, we propose a family of simple and efficient algorithms, called Inventory-Balancing, for real-time personalized assortment optimization that do not require any forecasting. Our proposed algorithms have a strong performance guarantee. 
We show that our algorithms obtain at least 1 − 1/e ≈ 63% of the benchmark revenue, even when there are sudden shocks in the customers' arrival patterns, either from seasonality or other non-stationarity effects. The results in this chapter are published as the journal paper by Golrezaei et al. (2014). Furthermore, this paper was nominated for the MSOM Student Paper Competition in 2016 and the Production and Operations Management Society College of Supply Chain Management 2017 Student Paper Competition. ❧ In Chapter 2, we study the impact of data in two-sided online markets where two sets of independent self-interested agents interact with each other, and the decisions of each set of agents affect the outcomes of the other set. In two-sided platforms such as online advertising markets, data impacts both sets of agents in possibly different ways. The challenge is to design selling mechanisms that effectively disclose and/or use data, and incentivize self-interested agents to act in a globally optimal manner. In this chapter, we study the mechanism design problem for a seller (web publisher) of an indivisible good in a setting where privately informed buyers (advertisers) can acquire additional information and refine their valuations for the good at a cost. For this setting, we propose optimal (revenue-maximizing) and efficient (welfare-maximizing) mechanisms that induce the right level of investment in information acquisition. We show that because information is costly, in the optimal and even the efficient mechanisms, not all the buyers would obtain the additional information. In fact, these mechanisms incentivize buyers with higher initial valuations to acquire information. The results in this chapter are published as the journal paper by Golrezaei & Nazerzadeh (2016). ❧ In Chapter 3, we study the problem of pricing for online retailers in the presence of strategic customers. 
In online retailing, time-based pricing (dynamic pricing), a pricing strategy in which firms set flexible prices for products based on current market demand, has become increasingly prevalent. One of the main advantages of dynamic pricing is that it helps mitigate the risk associated with demand uncertainty (see, for instance, Aviv & Pazgal 2008 and Cachon & Swinney 2011). In this chapter, we show that dynamic pricing can play an important role in differentiating between customers over time even in the absence of demand uncertainty. In many settings, especially in fashion and electronic gadget retail, a customer's willingness-to-pay (or valuation) for the product is time-sensitive and decreases over time. In these situations, customers differ not only in their initial willingness-to-pay for these products when they are first introduced to the market, but also in how rapidly they lose interest in them. We characterize the optimal mechanism for selling durable products in such environments and show that delayed allocation and dynamic pricing can be an effective screening tool for maximizing the profit of a firm. This chapter is based on joint work with Professors Hamid Nazerzadeh and Ramandeep Randhawa.
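The abstract names the Inventory-Balancing family for Chapter 1 but does not spell out its decision rule. As a rough illustration only, the sketch below assumes one common form of the idea: each product's revenue is discounted by a penalty that grows as its inventory runs down, and the arriving customer is shown the assortment that maximizes the discounted expected revenue. The exponential penalty, the multinomial logit choice model, the brute-force search, and all function and parameter names are assumptions made for this example and are not taken from the record.

```python
import math
from itertools import combinations


def penalty(fraction_left):
    """Assumed exponential penalty: 0 when stock is empty, 1 when full."""
    return (math.e / (math.e - 1)) * (1 - math.exp(-fraction_left))


def mnl_probs(assortment, weights):
    """Purchase probabilities under an assumed multinomial logit model.

    The constant 1.0 in the denominator is the no-purchase weight.
    """
    denom = 1.0 + sum(weights[i] for i in assortment)
    return {i: weights[i] / denom for i in assortment}


def inventory_balancing_offer(revenues, weights, inventory, capacity):
    """Return the assortment with the highest penalty-discounted revenue.

    weights holds the arriving customer's preference weights, which is
    where personalization enters. Brute force over subsets, so this is
    only suitable for a handful of products.
    """
    in_stock = [i for i in revenues if inventory[i] > 0]
    best, best_value = (), 0.0
    for size in range(1, len(in_stock) + 1):
        for subset in combinations(in_stock, size):
            probs = mnl_probs(subset, weights)
            value = sum(
                penalty(inventory[i] / capacity[i]) * revenues[i] * probs[i]
                for i in subset
            )
            if value > best_value:
                best, best_value = subset, value
    return best
```

With full stock the rule behaves like ordinary revenue maximization for the current customer; as a product nears stock-out its discounted revenue shrinks, so the rule tends to hold it back for later arrivals. No demand forecast is used at any point, which matches the forecasting-free property claimed in the abstract.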
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Essays on revenue management with choice modeling
Modeling customer choice in assortment and transportation applications
Essays on information design for online retailers and social networks
Real-time controls in revenue management and service operations
Marketing strategies with superior information on consumer preferences
Three essays on agent’s strategic behavior on online trading market
The impacts of manufacturers' direct channels on competitive supply chains
Statistical learning in High Dimensions: Interpretability, inference and applications
Optimizing penalized and constrained loss functions with applications to large-scale internet media selection
Essays on digital platforms
Essays on online advertising markets
The smart grid network: pricing, markets and incentives
Essays on consumer returns in online retail and sustainable operations
Essays on bounded rationality and revenue management
An online cost allocation model for horizontal supply chains
Models and algorithms for pricing and routing in ride-sharing
Essays on dynamic control, queueing and pricing
Pricing strategy of monopoly platforms
Quality investment and advertising: an empirical analysis of the auto industry
Utilizing context and structure of reward functions to improve online learning in wireless networks
Asset Metadata
Creator
Golrezaei, Negin (author)
Core Title
Efficient policies and mechanisms for online platforms
School
Marshall School of Business
Degree
Doctor of Philosophy
Degree Program
Business Administration
Publication Date
06/26/2017
Defense Date
04/21/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
assortment planning, dynamic mechanism design, dynamic pricing, OAI-PMH Harvest, online advertising markets, online platforms, personalization
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Nazerzadeh, Hamid (committee chair), Rusmevichientong, Paat (committee member), Vayanos, Phebe (committee member)
Creator Email
golrezae@usc.edu,sgolrezai@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-392005
Unique identifier
UC11264127
Identifier
etd-GolrezaeiN-5455.pdf (filename),usctheses-c40-392005 (legacy record id)
Legacy Identifier
etd-GolrezaeiN-5455.pdf
Dmrecord
392005
Document Type
Dissertation
Rights
Golrezaei, Negin
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA