Essays on revenue management with choice modeling
(USC Thesis Other)
ESSAYS ON REVENUE MANAGEMENT WITH CHOICE MODELING

by Heng Zhang

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Business Administration)

August 2019

Copyright 2019 Heng Zhang

Dedication

To my parents and fiancée, for their love, support, encouragement, and most important, tolerance.

Acknowledgments

I would like to acknowledge the generous financial support from the Marshall School of Business at University of Southern California, without which this dissertation would not have been possible.

I am deeply indebted to my co-authors, who have been invaluable during my Ph.D. studies. In particular, the research presented in this dissertation was born out of my close collaboration with my advisor Prof. Leon Chu and Prof. Paat Rusmevichientong. They offered me constant support, guidance, and inspiration. I am also extremely grateful to my co-authors Prof. Hamid Nazerzadeh and Prof. Huseyin Topaloglu, whose expertise was invaluable in formulating research topics and methodologies. Beyond academic support, my co-authors provided a lot of care and wise counsel on my life as a PhD student and my career. They have been there every step of the way for me, and words cannot express how much I wholeheartedly appreciate everything they have done. I am also grateful to Prof. Sha Yang for being on my thesis committee, as well as for her encouragement and appreciation of my work.

Special thanks to the collegial faculty members at the Data Sciences and Operations Department at USC Marshall for their help and advice and, in particular, Prof. Greys Sosic, the PhD coordinator. I have received extensive personal and professional guidance from Professors Vishal Gupta, Kimon Drakopoulos, Ramandeep Randhawa, Yehuda Bassok, Wenguang Sun, Jingchi Lv, Peng Shi, Raj Rajagopalan, Song-Hee Kim, and Ashok Srinivasan. I would also like to thank my Ph.D.
friends in this department for their support as peers and staff members for their help. The department as a whole offered an incredibly supportive environment for my PhD studies. Julie Phaneuf, Michelle Silver Lee, and Prof. K.R. Subramanyam from USC Marshall have also given me a great deal of help along the way.

Nobody has been more important to me in the pursuit of my PhD degree than my family. I would like to thank my parents, whose love is always with me. They have been the ultimate role models. Finally, I wish to thank my loving and supportive fiancée, who has been my ultimate source of strength in all my pursuits in life. I dedicate my thesis to them.

Table of Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
Chapter 2: Multi-Product Pricing under Generalized Extreme Value Models with Homogeneous Price Sensitivities
  2.1 Introduction
  2.2 Generalized Extreme Value Models
  2.3 Unconstrained Pricing
  2.4 Constrained Pricing
    2.4.1 Prices as a Function of Purchase Probabilities
    2.4.2 Concavity of the Expected Revenue Function and its Gradient
  2.5 Conclusions
Chapter 3: Assortment Optimization under the Paired Combinatorial Logit Models
  3.1 Introduction
    3.1.1 Main Contributions
    3.1.2 Literature Review
  3.2 Assortment Problem
    3.2.1 Problem Formulation and Complexity
    3.2.2 Paired Combinatorial Logit Model in Assortment Problems
    3.2.3 A Small Numerical Example: Predictive Power of the PCL Model
  3.3 A General Framework for Approximation Algorithms
    3.3.1 Connection to a Fixed Point Problem
    3.3.2 Constructing an Upper Bound
    3.3.3 Computing the Fixed Point
  3.4 Applying the Approximation Framework to the Uncapacitated Problem
  3.5 Applying the Approximation Framework to the Capacitated Problem
    3.5.1 Half-Integral Solutions Through Iterative Rounding
    3.5.2 Feasible Subsets Through Coupled Randomized Rounding
    3.5.3 Proof of Theorem 3.5.2
  3.6 Computational Experiments
    3.6.1 Computational Setup
    3.6.2 Uncapacitated Problems
    3.6.3 Capacitated Problems
  3.7 Conclusions
Chapter 4: Position Ranking and Auctions for Online Marketplaces
  4.1 Introduction
    4.1.1 Literature Review
  4.2 Model and Preliminaries
    4.2.1 The Marketplace
    4.2.2 The Consumer Search Model
  4.3 Weighted Surplus Maximization and Optimal Ranking
    4.3.1 Weighted Surplus Maximization
    4.3.2 The Optimal Ranking of Sellers
  4.4 Value of Selling k Slots
    4.4.1 Inefficiency of Incomplete Information
    4.4.2 Surplus-Ordered Ranking (SOR)
    4.4.3 The Value of SOR
  4.5 Mechanism Design of Selling k Slots
  4.6 Extension: Consumer Search with List-Page
    4.6.1 Model Settings
    4.6.2 Consumer Search Strategy
    4.6.3 Ranking under the Sequential Search Model with the List-Page
  4.7 Auxiliary Results
Appendix A: Additional Technical Arguments for Chapter 2
  A.1 Finiteness of the Optimal Prices
  A.2 Proof of Corollary 2.3.3
  A.3 Optimal Markup Under Separable Generating Functions
  A.4 Proof of Lemma 2.4.2
  A.5 Product Prices to Achieve Given Market Shares
  A.6 Proof of Lemma 2.4.4
  A.7 Numerical Study for Constrained Pricing
  A.8 Concavity of Expected Revenue Under Separable Generating Functions
Appendix B: Supporting Arguments for Chapter 3
  B.1 Computational Complexity
  B.2 Simplifying the Upper Bound
  B.3 Improving the Performance Guarantee
    B.3.1 Preliminary Bounds
    B.3.2 Removing Dependence on Product Revenues and Preference Weights
    B.3.3 Uniform Bounds
  B.4 Method of Conditional Expectations for the Uncapacitated Problem
  B.5 Semidefinite Programming Relaxation
    B.5.1 Constructing an Upper Bound
    B.5.2 Randomized Rounding and Performance Guarantee
    B.5.3 Computing the Fixed Point
    B.5.4 Preliminary Bounds for De-Randomizing the Subset of Products
    B.5.5 De-Randomization Algorithm and Analysis
  B.6 Structural Properties of the Extreme Points
  B.7 Generalized Nested Logit Model with at Most Two Products per Nest
    B.7.1 Assortment Problem
    B.7.2 A Framework for Approximation Algorithms
    B.7.3 Uncapacitated Problem
    B.7.4 Capacitated Problem
      B.7.4.1 Half-Integral Solutions Through Iterative Rounding
      B.7.4.2 Feasible Subsets Through Coupled Randomized Rounding
Appendix C: Supporting Arguments for Chapter 4
  C.1 Consumer Search with Item Sorting
  C.2 The Satisficing Choice Model
  C.3 Limitation of the VCG Mechanism in Selling k Slots
  C.4 SOR Payment Function with Virtual Societies
  C.5 Selling All Slots
    C.5.1 The VCG Mechanism
    C.5.2 Generalized Second Price (GSP) Auction
  C.6 Platform Profit Maximization
  C.7 Proof of Lemma 4.2.1
  C.8 Proof of Lemma 4.7.2
  C.9 Proof of Theorem 4.3.1
  C.10 Proof of Corollary 4.7.3
  C.11 Proof of Proposition 4.4.1
  C.12 Proof of Proposition 4.7.4
  C.13 Proof of Proposition 4.4.3
  C.14 Proof of Theorem 4.4.4
  C.15 Proof of Theorem 4.5.1
  C.16 Proof of Lemma 4.6.1
  C.17 Proof of Theorem 4.6.2
  C.18 Proof of Proposition C.1.1
  C.19 Proof of Proposition C.1.2
  C.20 Proof of Proposition C.2.1
  C.21 Proof of Proposition C.4.1
  C.22 Proof of Proposition C.5.1
  C.23 Proof of Proposition C.5.2
  C.24 Proof of Proposition C.5.3
  C.25 Proof of Lemma C.6.1
  C.26 Proof of Proposition C.6.3
Bibliography

List of Tables

3.1 Computational results for the uncapacitated test problems.
3.2 Computational results for the capacitated test problems.
A.1 Numerical results for constrained pricing.

List of Figures

2.1 Relationship between well-known GEV models.
3.1 Correlation coefficient of $(\epsilon_1, \epsilon_2)$ as a function of $\gamma_{12}$.
3.2 Network for the commuting example.
3.3 MAPEs and out-of-sample log-likelihoods for the fitted PCL and MNL models.
4.1 The surplus gained from a different number of slots sold with SOR.
4.2 The trade-off contour of standard deviation and $\alpha$.
4.3 A flow chart illustration of the choice process of the sequential search model with the list-page.
A.1 A sample tree for the d-level nested logit model with n = 16 and m = 4.
B.1 Constructing a graph with all vertices having degrees divisible by four.

Abstract

Tremendous opportunities and many challenges arise in revenue management (RM) along with the evolution of retailing from brick-and-mortar stores to online marketplaces. Building upon this paradigm shift, choice models, which capture consumer substitution and demand based on statistical methods, play an increasingly important role in modern RM by providing it with the “mathematical language” to describe consumer behavior in an increasingly elaborate manner. My dissertation focuses on advancing the theory of choice modeling and its modern applications in RM.

This current paradigm shift of retailing has resulted in an explosive growth of data availability, and fundamentally changed our ability to experiment with different models. Such advancements enable the possibility of fitting a very complex choice model for an application. Therefore, it is critical for us to understand how to operationalize these complex models for decision making. In Chapter 2, we consider unconstrained and constrained multi-product pricing problems when customers choose according to an arbitrary generalized extreme value (GEV) model and the products have the same price sensitivity parameter.
The GEV family is a large class of models that subsumes infinitely many choice models, including well-known ones such as the multinomial logit and nested logit models. In the unconstrained problem, there is a unit cost associated with the sale of each product. The goal is to choose the prices for the products to maximize the expected profit obtained from each customer. We show that the optimal prices of the different products have a constant markup over their unit costs. We provide an explicit formula for the optimal markup in terms of the Lambert-W function. In the constrained problem, motivated by the applications with inventory considerations, the expected sales of the products are constrained to lie in a convex set. The goal is to choose the prices for the products to maximize the expected revenue obtained from each customer, while making sure that the constraints for the expected sales are satisfied. If we formulate the constrained problem by using the prices of the products as the decision variables, then we end up with a non-convex program. We give an equivalent market-share-based formulation, where the purchase probabilities of the products are the decision variables. We show that the market-share-based formulation is a convex program, the gradient of its objective function can be computed efficiently, and we can recover the optimal prices for the products by using the optimal purchase probabilities from the market-share-based formulation. Our results for both unconstrained and constrained problems hold for any GEV model. The content of the chapter is based on the published journal paper [153].

In Chapter 3, along a similar line, we consider uncapacitated and capacitated assortment problems under the paired combinatorial logit model, which is a quite complex choice model but has been demonstrated to be successful in transportation applications [144]. Here our goal is to find a set of products to maximize the expected revenue obtained from a customer.
In the uncapacitated setting, we can offer any set of products, whereas in the capacitated setting, there is an upper bound on the number of products that we can offer. We establish that even the uncapacitated assortment problem is strongly NP-hard. To develop an approximation framework for our assortment problems, we transform the assortment problem into an equivalent problem of finding the fixed point of a function, but computing the value of this function at any point requires solving a nonlinear integer program. Using a suitable linear programming relaxation of the nonlinear integer program and randomized rounding, we obtain a 0.6-approximation algorithm for the uncapacitated assortment problem. Using randomized rounding on a semidefinite programming relaxation, we obtain an improved 0.79-approximation algorithm, but the semidefinite programming relaxation can get difficult to solve in practice for large problem instances. Finally, using iterative rounding, we obtain a 0.25-approximation algorithm for the capacitated assortment problem. Our computational experiments on randomly generated problem instances demonstrate that our approximation algorithms, on average, yield expected revenues that are within 0.9% of an efficiently-computable upper bound on the optimal expected revenue. This part is based on the working paper [154].

Furthermore, there are many novel revenue management applications for choice models along with this paradigm change. In Chapter 4, we focus on such applications in online markets. E-commerce platforms such as Amazon and Taobao connect thousands of sellers and consumers every day. We present a choice model that considers consumers’ search costs on such platforms and the externalities sellers impose on each other, and we study how platforms should rank products displayed to consumers and utilize the top and most salient slot.
This model allows us to study a multi-objective optimization problem, whose objective includes consumer and seller surplus as well as the sales revenue, and to derive the optimal ranking decision. In addition, we propose a surplus-ordered ranking (SOR) mechanism for selling some of the top slots. This mechanism is motivated in part by Amazon’s sponsored search program. We show that the Vickrey–Clarke–Groves (VCG) mechanism would not be applicable to our setting and propose a new mechanism, which is near-optimal, performing significantly better than those that do not incentivize sellers to reveal their private information. Moreover, we generalize our model to settings where platforms can provide partial information about the products and facilitate the consumer search, and we show the robustness of our findings. This chapter is based on the published journal paper [31].

Chapter 1

Introduction

This dissertation focuses on the theory of choice modeling and its applications in revenue management (RM) in a broad sense. It discusses problems such as multi-product pricing, assortment optimization, and e-commerce platform product ranking, under different consumer choice models. This research is largely driven by the tremendous opportunities and many challenges that arise in RM along with the revolution of retailing from brick-and-mortar stores to online marketplaces. Understanding and utilizing how consumers arrive at their choices is the cornerstone of RM and can have a high impact on the design and advancement of modern marketplaces.

Specifically, this current paradigm shift of retailing has resulted in an explosive growth of data availability, and fundamentally changed our ability to experiment with different models. Such advancements enable the possibility of fitting a very complex choice model for an application.
One of my research questions is how we can tackle fundamental RM problems such as pricing or assortment optimization under these choice models based on their model structures. I apply techniques such as convex and combinatorial optimization to design fast, robust solution techniques.

Some choice models based on random utility maximization, such as the multinomial logit (MNL) and nested logit models, are well-studied in the operations literature. For instance, in [69], Huh and Li study the structural properties of the multi-product pricing problem under these two models, in which the retailer decides the optimal product prices to maximize the expected revenue. These models fall into the category of GEV models proposed by [98]. Essentially, each generating function that satisfies a few axioms defines a unique GEV model, and thus the GEV family consists of a very rich class of models. Because they relax restrictive assumptions of the nested logit and MNL models, many GEV models have been successfully applied in a large number of empirical problems. Can we help practitioners solve the operational problems under these models, and how far can we push the trade-off boundary between model richness and tractability?

Chapter 2 shows that the structural properties of MNL and nested logit models can be extended to all GEV models under slight technical assumptions. In the unconstrained pricing problem, we show that optimal prices of the different products have the same markup (i.e., price minus cost) and that the markup can be calculated efficiently using a line search. Furthermore, we show a one-to-one mapping between a vector of purchase probabilities and a vector of prices and that the objective function of the pricing problem is concave in the purchase probability space. Therefore, we can solve the pricing problem under any convex constraints in the purchase probability space efficiently using convex optimization algorithms.
This problem has several applications, and an important one is that it allows efficient derivation of the solution to a relaxed problem of dynamic pricing with inventory constraints under GEV models. The solution can be used to design efficient heuristics for dynamic pricing with known techniques as proposed in [150]. On the theory side, this work solves the pricing problem under a spectrum of choice models and bridges many existing works on pricing. More importantly, one may use data to determine the GEV model that best fits the application, and our techniques can then be used immediately for decision making as an off-the-shelf method for pricing.

In a similar vein, the paired combinatorial logit (PCL) model, applied in many empirical settings, has been recognized as an important special case of GEV models [76, 84]. A discussion on the assortment optimization problem under PCL is presented in Chapter 3. We show that even the unconstrained problem is NP-hard and provide efficient linear-programming-based approximation algorithms with provably good performance guarantees for both unconstrained and capacity-constrained problems. Our computational experiments on randomly generated problem instances demonstrate that our approximation algorithms, on average, yield expected revenues that are within 0.9% of an efficiently-computable upper bound on the optimal expected revenue.

In my research, I also seek novel applications for choice models along with this paradigm change. For example, I consider how to operationalize choice modeling in online retailing. On e-commerce platforms such as Amazon, a single consumer search keyword can bring thousands of results returned in a list, but as the consumer searches through the list, the likelihood of looking at one more item in the list quickly diminishes [48]. The way in which the platform ranks products has a significant impact on its operations, in terms of both long-term platform welfare and short-term monetization.
Empirically, we already know ranking is very important for consumer choice due to search costs, which, however, are not well studied theoretically in the e-commerce platform setting. I incorporate search costs into consumer choice models on e-commerce platforms and propose a modeling framework that adopts optimization and mechanism design for optimizing ranking-related platform operations.

In Chapter 4 we adapt the sequential search model in economics pioneered in [143] and propose a novel modeling framework to theoretically study the design of search result rankings on e-commerce platforms. In our model, the consumer maximizes her utility by balancing the trade-off between accepting a suboptimal product choice and incurring additional search costs in the search process. This modeling technique allows us to pinpoint robust solutions for a range of objectives of platform operations. For long-term welfare, we design a multi-objective optimization problem which includes sales revenue, consumer welfare, and supply-side surplus (the aggregate benefit of third-party sellers), and show that the optimal ranking has a simple sorting-like structure. Furthermore, in practice, the obstacle to implementing this solution is information asymmetry: a seller’s private benefit from a consumer purchase is, in general, unobservable. We construct a near-optimal solution by selling the platform’s top slots using auction design and thus provide a practical and robust answer to the ranking problem.

We further generalize the model to capture more complex consumer search processes and demonstrate that the results and insights are robust. For example, in practice, platforms can facilitate consumer search by displaying products and sharing partial product information through the list-page. The consumer could use such information to choose desired items, click to visit their item-pages, and learn further details.
To incorporate such information exchange, we extend our base model in two ways to accommodate different consumer search habits. Even though such extensions lead to very complex models, we can still derive heuristic ranking rules with performance guarantees by slightly modifying the simple sorting solution optimal in our base model. For instance, in one of these extensions, even writing down the explicit choice probability expression seems impossible, but we can show that our heuristic retains $1 - e^{-\Omega(n)}$ of the surplus value of the optimal ranking, where n is the number of items returned for the consumer. Given that in typical applications such n is large, our result indeed provides a practical solution. Similarly, we can also solve for platform profit maximization, which jointly considers platforms’ own product profit, commission fee, and sponsored advertisement revenue.

Chapter 2

Multi-Product Pricing under Generalized Extreme Value Models with Homogeneous Price Sensitivities

2.1 Introduction

In most revenue management settings, customers make a choice among the set of products that are offered for purchase. While making their choices, customers substitute among the products based on attributes such as price, quality, and richness of features. In these situations, increasing the price for one product may shift the demand of other products, and such substitutions create complex interactions among the demands for the different products. There is a growing body of literature pointing out that capturing the choice process of customers and the interactions among the demands for different products through discrete choice models can significantly improve operational decisions; see, for example, [50], [130], and [139]. Nevertheless, as the discrete choice models become more complex, finding the optimal prices to charge for the products becomes more difficult as well.
This challenge reflects the fundamental tradeoff between choice model complexity and operational tractability.

In this chapter, we study unconstrained and constrained multi-product pricing problems when customers choose according to an arbitrary choice model from the generalized extreme value (GEV) family. The GEV family is a rather broad family of discrete choice models, as it encapsulates many widely studied discrete choice models as special cases, including the multinomial logit [93, 97, 99], nested logit [98, 145], d-level logit [33, 69, 89], and paired combinatorial logit [25, 84, 91]. Throughout this chapter, when we refer to a GEV model, we refer to an arbitrary choice model within the GEV family. For both unconstrained and constrained multi-product pricing problems studied in this chapter, we consider the case where different products share the same price sensitivity parameter. We present results that hold simultaneously for all GEV models.

Our Contributions: In the unconstrained problem, there is a unit cost associated with the sale of each product. The goal is to set the prices for the products to maximize the expected profit from each customer. We show that the optimal prices of the different products have the same markup, which is to say that the optimal price of each product is equal to its unit cost plus a constant markup that does not depend on the product; see Theorem 2.3.1. We provide an explicit formula for the optimal markup in terms of the Lambert-W function; see Proposition 2.3.2. These results greatly simplify the computation of the optimal prices and they hold under any GEV model. We give comparative statics that describe how the optimal prices change as a function of the unit costs; see Corollary 2.3.3. In particular, if the unit cost of a product increases, then its optimal price increases and the optimal prices of the other products decrease.
If the unit costs of all products increase by the same amount, then the optimal prices of all products increase.

In the constrained problem, motivated by the applications with inventory considerations, the expected sales of the products are constrained to lie in a convex set. The goal is to set the prices for the products to maximize the expected revenue obtained from each customer while satisfying the constraints on the expected sales. A natural formulation of the constrained problem, which uses the prices of the products as the decision variables, is a non-convex program. We give an equivalent market-share-based formulation, where the purchase probabilities of the products are the decision variables. We show that the market-share-based formulation is a convex program that can be solved efficiently. In particular, for any given purchase probabilities for the products, we can recover the unique prices that achieve these purchase probabilities; see Theorem 2.4.1. Also, the objective function of the market-share-based formulation is concave in the purchase probabilities and its gradient can be computed efficiently; see Theorem 2.4.3. Thus, we can solve the market-share-based formulation and recover the optimal prices by using the optimal purchase probabilities.

Positioning our work in the related literature, the strength of our contributions derives from the fact that our results hold under any GEV model. We provide efficient solution methods for both the unconstrained and constrained problems that are applicable to any GEV model. This generality comes at the expense of requiring homogeneous price sensitivity parameters. As discussed shortly in our literature review, there is a significant amount of work that studies pricing problems for specific instances of the GEV models, such as the multinomial logit and nested logit, under the assumption that the price sensitivities for the products are the same.
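The market-share-based reformulation described above can be illustrated in the multinomial logit special case, where the map from purchase probabilities back to prices happens to be explicit. The following is only a minimal sketch under that assumption: the function names, the parameter values, and the restriction to the multinomial logit model are all chosen here for illustration and are not taken from the chapter, whose results cover arbitrary GEV models for which no such closed form is generally available.

```python
import math

def mnl_probabilities(alpha, beta, prices):
    """Multinomial logit purchase probabilities with a no-purchase option of weight 1."""
    weights = [math.exp(a - beta * p) for a, p in zip(alpha, prices)]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights]

def prices_from_shares(alpha, beta, shares):
    """Invert the MNL model: since q_i / q_0 = exp(alpha_i - beta * p_i),
    the unique price is p_i = (alpha_i - ln(q_i / q_0)) / beta."""
    q0 = 1.0 - sum(shares)
    if q0 <= 0 or any(q <= 0 for q in shares):
        raise ValueError("shares must be positive and sum to less than one")
    return [(a - math.log(q / q0)) / beta for a, q in zip(alpha, shares)]

def revenue_in_shares(alpha, beta, shares):
    """Expected revenue written as a function of the purchase probabilities."""
    prices = prices_from_shares(alpha, beta, shares)
    return sum(p * q for p, q in zip(prices, shares))

# Hypothetical illustrative data (not from the dissertation).
alpha = [1.0, 0.5, 2.0]
beta = 0.8
shares_a = [0.20, 0.10, 0.30]
shares_b = [0.05, 0.40, 0.10]
shares_mid = [(x + y) / 2.0 for x, y in zip(shares_a, shares_b)]

recovered = mnl_probabilities(alpha, beta, prices_from_shares(alpha, beta, shares_a))
print("round trip:", recovered)
print("revenue at midpoint:", revenue_in_shares(alpha, beta, shares_mid))
print("average of endpoint revenues:",
      0.5 * (revenue_in_shares(alpha, beta, shares_a)
             + revenue_in_shares(alpha, beta, shares_b)))
```

In this special case the round trip recovers the original shares, and the midpoint comparison is one numerical spot check of concavity in the purchase probabilities; it is not a substitute for the general argument.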
Furthermore, in many practical applications, customers choose among products that are in the same product category, such as flights that depart at different times of day (for the same origin-destination pair), or different brands of detergents. For airline flights, detergents, and canned tuna, researchers have shown that the products within each of these categories have similar price sensitivities [64, 80, 103].

Even when the price sensitivities of the products are the same, the GEV models can provide significant modeling flexibility, as they include many other parameters. Consider the generalized nested logit model, which is a GEV model. Let $N$ be the set of all products and $\beta$ be the price sensitivity of the products. Besides the price sensitivity $\beta$, the generalized nested logit model has the parameters $\{\alpha_i : i \in N\}$, $\{\tau_k : k \in L\}$, and $\{\sigma_{ik} : i \in N,\ k \in L\}$ for a generic index set $L$. If the prices of the products are $\mathbf{p} = (p_i : i \in N)$, then the choice probability of product $i$ is

$$\Theta_i^{\text{GenNest}}(\mathbf{p}) = \frac{\sum_{k \in L} \big(\sigma_{ik}\, e^{\alpha_i - \beta p_i}\big)^{1/\tau_k} \Big(\sum_{j \in N} \big(\sigma_{jk}\, e^{\alpha_j - \beta p_j}\big)^{1/\tau_k}\Big)^{\tau_k - 1}}{1 + \sum_{k \in L} \Big(\sum_{j \in N} \big(\sigma_{jk}\, e^{\alpha_j - \beta p_j}\big)^{1/\tau_k}\Big)^{\tau_k}}.$$

Letting $c_i$ be the unit cost for product $i$, if we charge the prices $\mathbf{p}$, then the expected profit from a customer is $\sum_{i \in N} (p_i - c_i)\, \Theta_i^{\text{GenNest}}(\mathbf{p})$. This expected profit function is rather complicated when the purchase probabilities are as above, but our results show that we can efficiently find the prices that maximize this expected profit function. Also, [129], [35] and [107] give general approaches to combine GEV models to generate new ones. The purchase probabilities under such new GEV models can be even more complicated.

Literature Review: There is a rich vein of literature on unconstrained multi-product pricing problems under specific members of the GEV family, including the multinomial logit, nested logit, and paired combinatorial logit, but these results make use of the specific form of the purchase probabilities under each specific GEV model.
[41] and [67] consider the pricing problem under the multinomial logit model, [6] and [90] consider the pricing problem under the nested logit model, and [91] consider the pricing problem under the paired combinatorial logit model. Under each of these choice models, the authors show that if the price sensitivities of the products are the same, then the optimal prices for the products have a constant markup. We extend the constant markup result established in these papers from the multinomial logit, nested logit, and paired combinatorial logit models to an arbitrary choice model within the GEV family. Furthermore, the constant markup results established in these papers often exploit the structure of the specific choice model to find an explicit formula for the price of each product as a function of the purchase probabilities. This approach fails for general GEV models, as there is no explicit formula for the prices as a function of the choice probabilities, but it turns out that we can still establish that the optimal prices have a constant markup under any GEV model.

With the exception of [90] and [91], the papers mentioned in the paragraph above exclusively assume that the price sensitivities of the products are the same. [90] also go one step beyond to study the pricing problem under the nested logit model when the products in each nest have the same price sensitivity. In this case, they show that the optimal prices for the products in each nest have a constant markup. In addition to the case with homogeneous price sensitivities for the products, [91] also consider the pricing problem under the paired combinatorial logit model with arbitrary price sensitivities. The authors establish sufficient conditions on the price sensitivities to ensure unimodality of the expected profit function and give an algorithm to compute the optimal prices.
Other work on unconstrained multi-product pricing problems under specific GEV models includes [141], where the author considers joint assortment planning and pricing problems under the multinomial logit model with arbitrary price sensitivities. [53] show that the expected profit function under the nested logit model can have multiple local maxima when the price sensitivities are arbitrary and give sufficient conditions on the price sensitivities to ensure unimodality of the expected profit function. [116] study the pricing problem under the nested logit model with arbitrary price sensitivities and provide heuristics with performance guarantees. [89] and [69] study pricing problems under the d-level nested logit model with arbitrary price sensitivities.

Our study of constrained multi-product pricing problems is motivated by applications with inventory considerations. [52] study a network revenue management model where the sale of each product consumes a combination of resources and the resources have limited inventories. The goal is to find the prices for the products to maximize the expected revenue from each customer, while making sure that the expected consumptions of the resources do not exceed their inventories. The authors use their pricing problem to give heuristics for the case where customers arrive sequentially over time to make product purchases subject to resource availability. We show that their pricing problem is tractable under GEV models with homogeneous price sensitivities. [125] and [150] show that the expected revenue function under the multinomial logit model is concave in the market shares when the products have the same price sensitivity. [78] considers pricing problems under the multinomial logit and nested logit models when there are linear constraints on the expected sales of the products. The author establishes sufficient conditions to ensure that the expected revenue is concave in the market shares.
[125] and [150] focus on the multinomial logit model with homogeneous price sensitivities for the products. Thus, our work generalizes theirs to an arbitrary GEV model. [78] works with non-homogeneous price sensitivities. In that sense, his work is more general than ours. However, [78] works with specific GEV models. In that sense, our work is more general than his.

Each GEV model is uniquely defined by a generating function. [98] gives sufficient conditions on the generating function to ensure that the corresponding GEV model is compatible with the random utility maximization principle, where each customer associates random utilities with the available alternatives and chooses the alternative that provides the largest utility. [99] discusses the connections between GEV models and other choice models. [132] cover the theory and application of GEV models. [35], [107] and [129] show how to combine generating functions from different GEV models to create a new GEV model.

The GEV family offers a rich class of choice models. As discussed above, there is work on pricing problems under the multinomial logit, nested logit, paired combinatorial logit, and d-level nested logit models, but applications in numerous areas indicate that using other members of the GEV family can provide useful modeling flexibility. In particular, [124] uses the ordered GEV model, [18] use the principles of differentiation GEV model, [138] uses the cross-nested logit model, [144] use the generalized nested logit model, [128] uses the choice set generation logit model, and [110] use the network GEV model in applications including scheduling trips, route selection, travel mode choice, and purchasing computers.

Organization: The chapter is organized as follows. In Section 2.2, we explain how we can characterize a GEV model by using a generating function. In Section 2.3, we study the unconstrained problem. In Section 2.4, we study the constrained problem. In Section 3.7, we conclude.
2.2 Generalized Extreme Value Models

A general approach to construct discrete choice models is based on the random utility maximization (RUM) principle. Under the RUM principle, each product, including the no-purchase option, has a random utility associated with it. The realizations of these random utilities are drawn from a particular probability distribution and they are known only to the customer. The customer chooses the alternative that provides the largest utility. We index the products by $N = \{1, \ldots, n\}$. We use 0 to denote the no-purchase option. For each $i \in N \cup \{0\}$, we let $U_i = \mu_i + \epsilon_i$ be the utility associated with alternative $i$, where $\mu_i$ is the deterministic utility component and $\epsilon_i$ is the random utility component. Under the RUM principle, the probability that a customer chooses alternative $i$ is given by $\Pr\{U_i > U_\ell \ \ \forall\, \ell \in N \cup \{0\},\ \ell \neq i\}$. The family of GEV models allows us to construct discrete choice models that are compatible with the RUM principle. A GEV model is characterized by a generating function $G$ that maps the vector $\mathbf{Y} = (Y_1, \ldots, Y_n) \in \mathbb{R}^n_+$ to a scalar $G(\mathbf{Y})$. The function $G$ satisfies the following four properties.

(i) $G(\mathbf{Y}) \geq 0$ for all $\mathbf{Y} \in \mathbb{R}^n_+$.

(ii) The function $G$ is homogeneous of degree one. In other words, we have $G(\lambda \mathbf{Y}) = \lambda\, G(\mathbf{Y})$ for all $\lambda \in \mathbb{R}_+$ and $\mathbf{Y} \in \mathbb{R}^n_+$.

(iii) For all $i \in N$, we have $G(\mathbf{Y}) \to \infty$ as $Y_i \to \infty$.

(iv) Using $\partial G_{i_1, \ldots, i_k}(\mathbf{Y})$ to denote the cross partial derivative of the function $G$ with respect to $Y_{i_1}, \ldots, Y_{i_k}$ evaluated at $\mathbf{Y}$, if $i_1, \ldots, i_k$ are distinct from each other, then $\partial G_{i_1, \ldots, i_k}(\mathbf{Y}) \geq 0$ when $k$ is odd, whereas $\partial G_{i_1, \ldots, i_k}(\mathbf{Y}) \leq 0$ when $k$ is even.

Then, for any fixed vector $\mathbf{Y} \in \mathbb{R}^n_+$, under the GEV model characterized by the generating function $G$, the probability that a customer chooses product $i \in N$ is given by

$$\Theta_i(\mathbf{Y}) = \frac{Y_i\, \partial G_i(\mathbf{Y})}{1 + G(\mathbf{Y})}. \tag{2.1}$$

With probability $\Theta_0(\mathbf{Y}) = 1 - \sum_{i \in N} \Theta_i(\mathbf{Y})$, a customer leaves without purchasing anything. Thus, the choice probabilities depend on the function $G$ and the fixed vector $\mathbf{Y} \in \mathbb{R}^n_+$.
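The choice probabilities in (2.1) can be evaluated for any generating function once its partial derivatives are available. The following is a minimal Python sketch, not part of the original analysis: it approximates the partial derivatives by central finite differences (an assumption made for brevity; analytic derivatives would normally be preferable), and the function names, the illustrative vector Y, and the small nested generating function are all hypothetical choices.

```python
import math

def partial_derivative(G, Y, i, rel_step=1e-6):
    """Central finite-difference estimate of dG/dY_i at Y (assumes Y[i] > 0)."""
    step = rel_step * max(1.0, abs(Y[i]))
    up, down = list(Y), list(Y)
    up[i] += step
    down[i] -= step
    return (G(up) - G(down)) / (2.0 * step)

def choice_probabilities(G, Y):
    """Theta_i(Y) = Y_i * dG/dY_i / (1 + G(Y)) for every product i, as in (2.1)."""
    denom = 1.0 + G(Y)
    return [Y[i] * partial_derivative(G, Y, i) / denom for i in range(len(Y))]

def mnl_generating_function(Y):
    """G(Y) = sum_i Y_i, the generating function of the multinomial logit model."""
    return sum(Y)

def nested_generating_function(Y, tau=0.5):
    """Illustrative nested form: products 0 and 1 share a nest, product 2 is alone."""
    return (Y[0] ** (1.0 / tau) + Y[1] ** (1.0 / tau)) ** tau + Y[2]

# Hypothetical deterministic utilities mu = (1.0, 0.5, -0.2), so Y_i = exp(mu_i).
Y = [math.exp(1.0), math.exp(0.5), math.exp(-0.2)]
mnl_probs = choice_probabilities(mnl_generating_function, Y)
nested_probs = choice_probabilities(nested_generating_function, Y)
mnl_no_purchase = 1.0 - sum(mnl_probs)
nested_no_purchase = 1.0 - sum(nested_probs)
```

For the multinomial logit generating function, the result should agree with the familiar formula $Y_i / (1 + \sum_{j \in N} Y_j)$, which gives a quick sanity check on the finite-difference step.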
[98] shows that if the function $G$ satisfies the four properties described above, then for any fixed vector $\mathbf{Y} \in \mathbb{R}^n_+$, the choice probability in (2.1) is compatible with the RUM principle, where the deterministic utility components $(\mu_1, \ldots, \mu_n)$ are given by $\mu_i = \log Y_i$ for all $i \in N$, the deterministic utility component for the no-purchase option is fixed at $\mu_0 = 0$, and the random utility components $(\epsilon_0, \epsilon_1, \ldots, \epsilon_n)$ have a generalized extreme value distribution with the cumulative distribution function $F(x_0, x_1, \ldots, x_n) = \exp\big(-e^{-x_0} - G(e^{-x_1}, \ldots, e^{-x_n})\big)$. The GEV models allow for correlated utilities and we can use different generating functions to model different correlation patterns among the random utilities. In the next example, we show that numerous choice models that are commonly used in the operations management and economics literature are specific instances of the GEV models.

Example 2.2.1 (Specific Instances of GEV Models) The multinomial logit, nested logit, and paired combinatorial logit models are all instances of the GEV models. For some generic index set $L$, consider the function $G$ given by

$$G(\mathbf{Y}) = \sum_{k \in L} \Big( \sum_{i \in N} (\sigma_{ik}\, Y_i)^{1/\tau_k} \Big)^{\tau_k},$$

where for all $i \in N$, $k \in L$, $\tau_k \in (0, 1]$, $\sigma_{ik} \geq 0$, and for all $i \in N$, $\sum_{k \in L} \sigma_{ik} = 1$. The function $G$ above satisfies the four properties described at the beginning of this section. Thus, the expression in (2.1) with this choice of the function $G$ yields a choice model that is consistent with the RUM principle. The choice model that we obtain by using the function $G$ given above is called the generalized nested logit model. [132] discusses how specialized choices of the index set $L$ and the scalars $\{\tau_k : k \in L\}$ and $\{\sigma_{ik} : i \in N,\ k \in L\}$ result in well-known choice models. If the set $L$ is the singleton $L = \{1\}$ and $\tau_1 = 1$, then $G(\mathbf{Y}) = \sum_{i \in N} Y_i$, and the expression in (2.1) yields the choice probabilities under the multinomial logit model.
If, for each product $i \in N$, there exists a unique $k_i \in L$ such that $\sigma_{i, k_i} = 1$, then $G(\mathbf{Y}) = \sum_{g \in L} \big( \sum_{i \in N_g} Y_i^{1/\tau_g} \big)^{\tau_g}$, where $N_g = \{i \in N : k_i = g\}$, in which case the expression in (2.1) yields the choice probabilities under the nested logit model, and $k_i$ is known as the nest of product $i$. If the set $L$ is given by $\{(i, j) \in N^2 : i \neq j\}$ and $\sigma_{ik} = 1/(2(n-1))$ whenever $k = (i, j)$ or $(j, i)$ for some $j \neq i$, then $G(\mathbf{Y}) = \sum_{(i, j) \in N^2 : i \neq j} \big( Y_i^{1/\tau_{(i,j)}} + Y_j^{1/\tau_{(i,j)}} \big)^{\tau_{(i,j)}} / (2(n-1))$, and the expression in (2.1) yields the choice probabilities under the paired combinatorial logit model.

The discussion in Example 2.2.1 indicates that the multinomial logit, nested logit, and paired combinatorial logit models are special cases of the generalized nested logit model. As discussed by [144], the ordered GEV, principles of differentiation GEV, and cross-nested logit are special cases of the generalized nested logit model as well. However, although the nested logit model is a special case of the generalized nested logit model, the d-level nested logit model is not a special case of the generalized nested logit model. In Figure 2.1, we show the relationship between well-known GEV models. In this figure, an arc between two GEV models indicates that the GEV model at the destination is a special case of the GEV model at the origin. In the next lemma, we give two properties of functions that are homogeneous of degree one. These properties are a consequence of a more general result, known as Euler's formula, but we provide a self-contained proof for completeness. We will use these properties extensively.

Lemma 2.2.2 (Properties of Generating Functions) If $G$ is a homogeneous function of degree one, then we have $G(\mathbf{Y}) = \sum_{i \in N} Y_i\, \partial G_i(\mathbf{Y})$ and $\sum_{j \in N} Y_j\, \partial G_{ij}(\mathbf{Y}) = 0$ for all $i \in N$.

Proof: Since the function $G$ is homogeneous of degree one, we have $G(\lambda \mathbf{Y}) = \lambda\, G(\mathbf{Y})$. Differentiating both sides of this equality with respect to $\lambda$, we obtain $\sum_{i \in N} Y_i\, \partial G_i(\lambda \mathbf{Y}) = G(\mathbf{Y})$.
Using the last equality with $\lambda = 1$, we obtain $G(\mathbf{Y}) = \sum_{i \in N} Y_i\, \partial G_i(\mathbf{Y})$, which is the first desired equality. Also, differentiating both sides of this equality with respect to $Y_j$, we obtain $\partial G_j(\mathbf{Y}) = \partial G_j(\mathbf{Y}) + \sum_{i \in N} Y_i\, \partial G_{ij}(\mathbf{Y})$, in which case, canceling $\partial G_j(\mathbf{Y})$ on both sides and noting that $\partial G_{ij}(\mathbf{Y}) = \partial G_{ji}(\mathbf{Y})$, we obtain $\sum_{i \in N} Y_i\, \partial G_{ji}(\mathbf{Y}) = 0$, which is the second desired equality.

[Figure 2.1: Relationship between well-known GEV models. Models shown: GEV (McFadden, 1978); Generalized Nested Logit (Wen & Koppelman, 2001); d-Level Nested Logit (Daganzo & Kusnic, 1993); Nested Logit (Williams, 1977); Paired Combinatorial Logit (Koppelman & Wen, 2000); Ordered GEV (Small, 1987); Principles of Differentiation GEV (Bresnahan et al., 1997); Cross-Nested Logit (Vovsha, 1997); Multinomial Logit (Luce, 1959).]

2.3 Unconstrained Pricing

We consider unconstrained pricing problems where the mean utility of a product is a linear function of its price and we want to find the product prices that maximize the expected profit obtained from a customer. For each product $i \in N$, let $p_i \in \mathbb{R}$ denote the price charged for product $i$, and $c_i$ denote its unit cost. As a function of the price of product $i$, the deterministic utility component of product $i$ is given by $\mu_i = \alpha_i - \beta p_i$, where $\alpha_i \in \mathbb{R}$ and $\beta \in \mathbb{R}_+$ are constants. [7] interpret the parameter $\alpha_i$ as a measure of the quality of product $i$, while the parameter $\beta$ is the price sensitivity that is common to all of the products. Throughout the chapter, we focus on the case where all of the products share the same price sensitivity. Noting the connection of the GEV models to the RUM principle discussed in the previous section, the deterministic utility component $\alpha_i - \beta p_i$ of product $i$ is given by $\log Y_i$. So, let $Y_i(p_i) = e^{\alpha_i - \beta p_i}$ for all $i \in N$, and let $\mathbf{Y}(\mathbf{p}) = (Y_1(p_1), \ldots, Y_n(p_n))$. If we charge the prices $\mathbf{p} = (p_1, \ldots, p_n) \in \mathbb{R}^n$, then it follows from the selection probability in (2.1) that a customer purchases product $i$ with probability $\Theta_i(\mathbf{p}) = Y_i(p_i)\, \partial G_i(\mathbf{Y}(\mathbf{p})) / (1 + G(\mathbf{Y}(\mathbf{p})))$.
Our goal is to find the prices for the products to maximize the expected profit from each customer, yielding the problem

$$\max_{\mathbf{p} \in \mathbb{R}^n} \ R(\mathbf{p}) \ \overset{\text{def}}{=} \ \sum_{i \in N} (p_i - c_i)\, \Theta_i(\mathbf{p}) = \sum_{i \in N} (p_i - c_i)\, \frac{Y_i(p_i)\, \partial G_i(\mathbf{Y}(\mathbf{p}))}{1 + G(\mathbf{Y}(\mathbf{p}))}. \tag{UNCONSTRAINED}$$

Since the function $G$ satisfies the four properties at the beginning of Section 2.2, we have $\partial G_i(\mathbf{Y}) \geq 0$ for all $\mathbf{Y} \in \mathbb{R}^n_+$. We impose a rather mild additional assumption that $\partial G_i(\mathbf{Y}) > 0$ for all $\mathbf{Y} \in \mathbb{R}^n_+$ satisfying $Y_i > 0$ for all $i \in N$; that is, the partial derivative is strictly positive whenever every entry of $\mathbf{Y}$ is positive. This assumption holds for all of the GEV models we are aware of, including the variants that are discussed in Section 2.1, Example 2.2.1 and Figure 2.1.

Let $\mathbf{p}^*$ denote the optimal solution to the UNCONSTRAINED problem. In Theorem 2.3.1, we will show that $\mathbf{p}^*$ has a constant markup, so $p_i^* - c_i = m^*$ for all $i \in N$ for some constant $m^*$. In other words, the optimal price of each product is equal to its unit cost plus a constant markup that does not depend on the product. In Proposition 2.3.2, we will also give an explicit formula for the optimal markup $m^*$ in terms of the Lambert-W function. Since the Lambert-W function is available in most mathematical computation packages, this proposition greatly simplifies the computation of the optimal prices. Recall that the Lambert-W function is defined as follows: for all $x \in \mathbb{R}_+$, $W(x)$ is the unique value such that $W(x)\, e^{W(x)} = x$. Using standard calculus, it can be verified that $W(x)$ is increasing and concave in $x \in \mathbb{R}_+$; see [32]. The starting point for our discussion is the expression for the partial derivative of the expected profit function.
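Since $Y_i(p_i) = e^{\alpha_i - \beta p_i}$, we have that $dY_i(p_i)/dp_i = -\beta\, Y_i(p_i)$, in which case, using the definition of $R(\mathbf{p})$ in the UNCONSTRAINED problem, we have

$$
\begin{aligned}
\frac{\partial R(\mathbf{p})}{\partial p_i}
&= \big\{ Y_i(p_i) - \beta\, Y_i(p_i)(p_i - c_i) \big\} \frac{\partial G_i(\mathbf{Y}(\mathbf{p}))}{1 + G(\mathbf{Y}(\mathbf{p}))}
 - \sum_{j \in N} (p_j - c_j)\, Y_j(p_j)\, \frac{\partial G_{ji}(\mathbf{Y}(\mathbf{p}))\,(1 + G(\mathbf{Y}(\mathbf{p}))) - \partial G_j(\mathbf{Y}(\mathbf{p}))\, \partial G_i(\mathbf{Y}(\mathbf{p}))}{(1 + G(\mathbf{Y}(\mathbf{p})))^2}\, \beta\, Y_i(p_i) \\
&= \big\{ Y_i(p_i) - \beta\, Y_i(p_i)(p_i - c_i) \big\} \frac{\partial G_i(\mathbf{Y}(\mathbf{p}))}{1 + G(\mathbf{Y}(\mathbf{p}))}
 - \frac{\beta\, Y_i(p_i)\, \partial G_i(\mathbf{Y}(\mathbf{p}))}{1 + G(\mathbf{Y}(\mathbf{p}))} \left\{ \sum_{j \in N} (p_j - c_j)\, \frac{Y_j(p_j)\, \partial G_{ji}(\mathbf{Y}(\mathbf{p}))}{\partial G_i(\mathbf{Y}(\mathbf{p}))} - \sum_{j \in N} (p_j - c_j)\, \frac{Y_j(p_j)\, \partial G_j(\mathbf{Y}(\mathbf{p}))}{1 + G(\mathbf{Y}(\mathbf{p}))} \right\} \\
&= \frac{\beta\, Y_i(p_i)\, \partial G_i(\mathbf{Y}(\mathbf{p}))}{1 + G(\mathbf{Y}(\mathbf{p}))} \left\{ \frac{1}{\beta} - (p_i - c_i) - \sum_{j \in N} (p_j - c_j)\, \frac{Y_j(p_j)\, \partial G_{ji}(\mathbf{Y}(\mathbf{p}))}{\partial G_i(\mathbf{Y}(\mathbf{p}))} + R(\mathbf{p}) \right\}.
\end{aligned}
$$

In the next theorem, we use the above derivative expression to show that the optimal prices for the UNCONSTRAINED problem involve a constant markup for all of the products.

Theorem 2.3.1 (Constant Markup is Optimal) For all $i \in N$, $p_i^* - c_i = \frac{1}{\beta} + R(\mathbf{p}^*)$.

Proof: Note that there exist optimal prices that are finite; the proof is straightforward but tedious, and we defer the details to Appendix A.1. Since the optimal prices are finite, they satisfy the first order conditions: $\frac{\partial R(\mathbf{p})}{\partial p_i}\big|_{\mathbf{p} = \mathbf{p}^*} = 0$ for all $i$. The finiteness also implies that $Y_i(p_i^*) = e^{\alpha_i - \beta p_i^*} > 0$ for all $i \in N$. Since $\partial G_i(\mathbf{Y}) > 0$ for all $\mathbf{Y} \in \mathbb{R}^n_+$ with $Y_i > 0$ for all $i \in N$, we have $\partial G_i(\mathbf{Y}(\mathbf{p}^*)) > 0$ as well. Thus, if the prices $\mathbf{p}^*$ satisfy the first order conditions $\frac{\partial R(\mathbf{p})}{\partial p_i}\big|_{\mathbf{p} = \mathbf{p}^*} = 0$ for all $i$, then by the expression for the partial derivative $\frac{\partial R(\mathbf{p})}{\partial p_i}$ right before the statement of the theorem,

$$p_i^* - c_i = \frac{1}{\beta} - \frac{1}{\partial G_i(\mathbf{Y}(\mathbf{p}^*))} \sum_{j \in N} (p_j^* - c_j)\, Y_j(p_j^*)\, \partial G_{ji}(\mathbf{Y}(\mathbf{p}^*)) + R(\mathbf{p}^*).$$

For notational brevity, define $m_i = p_i^* - c_i$. Without loss of generality, we index the products such that $m_1 \geq \ldots \geq m_n$. By the discussion in Section 2.2, the function $G$ satisfies the property $\partial G_{ji}(\mathbf{Y}) \leq 0$ for any $\mathbf{Y} \in \mathbb{R}^n_+$ and $i \neq j$.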
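In this case, using the equality above for $i = 1$ and noting that we have $m_1 \geq p_i^* - c_i$ for all $i \in N$, we obtain

$$m_1 \leq \frac{1}{\beta} - \frac{1}{\partial G_1(\mathbf{Y}(\mathbf{p}^*))} \sum_{j \in N} m_1\, Y_j(p_j^*)\, \partial G_{j1}(\mathbf{Y}(\mathbf{p}^*)) + R(\mathbf{p}^*) = \frac{1}{\beta} + R(\mathbf{p}^*),$$

where the inequality holds because $\partial G_{j1}(\mathbf{Y}(\mathbf{p}^*)) \leq 0$ and $m_j \leq m_1$ for all $j \neq 1$, and the equality follows from Lemma 2.2.2. Therefore, we obtain $m_1 \leq 1/\beta + R(\mathbf{p}^*)$. A similar argument also yields $m_n \geq 1/\beta + R(\mathbf{p}^*)$, in which case, we have $m_1 \leq 1/\beta + R(\mathbf{p}^*) \leq m_n$. Noting the assumption that $m_1 \geq \ldots \geq m_n$, we must have $1/\beta + R(\mathbf{p}^*) = m_1 = \ldots = m_n$ and the desired result follows by noting that $m_i = p_i^* - c_i$.

Noting Theorem 2.3.1, let $m^* = \frac{1}{\beta} + R(\mathbf{p}^*)$ denote the optimal markup. In the next proposition, we give an explicit formula for $m^*$ in terms of the Lambert-W function.

Proposition 2.3.2 (Explicit Formula for the Optimal Markup) Let the scalar $\gamma$ be defined as

$$\gamma = G(Y_1(c_1), \ldots, Y_n(c_n)) = G(e^{\alpha_1 - \beta c_1}, \ldots, e^{\alpha_n - \beta c_n}).$$

Then, it follows that

$$m^* = \frac{1 + W(\gamma\, e^{-1})}{\beta} \quad \text{and} \quad R(\mathbf{p}^*) = \frac{W(\gamma\, e^{-1})}{\beta}.$$

Proof: The optimal prices have a constant markup. So, we focus on price vectors $\mathbf{p}$ such that $p_i - c_i = m$ for all $i \in N$ for some $m \in \mathbb{R}_+$. Let $\mathbf{c} = (c_1, \ldots, c_n)$, and $\mathbf{Y}(m\mathbf{e} + \mathbf{c}) = (Y_1(m + c_1), \ldots, Y_n(m + c_n))$, where $\mathbf{e} \in \mathbb{R}^n$ is the vector of all ones. In this case, we can write the objective function of the UNCONSTRAINED problem as a function of $m$, which is given by

$$R(m) = \sum_{i \in N} m\, \frac{Y_i(m + c_i)\, \partial G_i(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))}{1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))} = m\, \frac{G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))}{1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))},$$

where the second equality relies on the fact that $\sum_{i \in N} Y_i(m + c_i)\, \partial G_i(\mathbf{Y}(m\mathbf{e} + \mathbf{c})) = G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))$ by Lemma 2.2.2. Thus, we can compute the optimal objective value of the UNCONSTRAINED problem by maximizing $R(m)$ over all possible values of $m$. Since $dY_i(m + c_i)/dm = -\beta\, Y_i(m + c_i)$, differentiating the objective function above with respect to $m$, we get

$$
\begin{aligned}
\frac{dR(m)}{dm}
&= \frac{G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))}{1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))} - m \sum_{i \in N} \frac{\partial G_i(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))}{(1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c})))^2}\, \beta\, Y_i(m + c_i) \\
&= \frac{G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))}{1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))} \left\{ 1 - \frac{\beta\, m}{1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))} \right\},
\end{aligned}
$$

where the second equality once again uses the fact that $\sum_{i \in N} Y_i(m + c_i)\, \partial G_i(\mathbf{Y}(m\mathbf{e} + \mathbf{c})) = G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))$.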
Because $Y_i(m + c_i)$ is decreasing in $m$, and $\partial G_i(\mathbf{Y}) \geq 0$ for all $\mathbf{Y} \in \mathbb{R}^n_+$, it follows that $G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))$ is decreasing in $m$. Therefore, in the expression for $\frac{dR(m)}{dm}$, the term $1 - \frac{\beta\, m}{1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))}$ is decreasing in $m$; this implies that the derivative $\frac{dR(m)}{dm}$ can change sign from positive to negative only once as the value of $m$ increases, so $R(m)$ is quasiconcave in $m$. Thus, setting the derivative with respect to $m$ to zero provides a maximizer of $R(m)$. By the derivative expression above, if $\frac{dR(m)}{dm} = 0$, then $\beta\, m = 1 + G(\mathbf{Y}(m\mathbf{e} + \mathbf{c}))$, so the optimal markup $m^*$ satisfies

$$
\begin{aligned}
\beta\, m^* &= 1 + G(\mathbf{Y}(m^*\mathbf{e} + \mathbf{c})) = 1 + G\big(e^{\alpha_1 - \beta(m^* + c_1)}, \ldots, e^{\alpha_n - \beta(m^* + c_n)}\big) \\
&= 1 + e^{-\beta m^*}\, G\big(e^{\alpha_1 - \beta c_1}, \ldots, e^{\alpha_n - \beta c_n}\big) = 1 + \gamma\, e^{-\beta m^*} = 1 + \gamma\, e^{-1}\, e^{-(\beta m^* - 1)},
\end{aligned}
$$

where the third equality uses the fact that $G$ is homogeneous of degree one. The last chain of equalities implies that $(\beta\, m^* - 1)\, e^{\beta m^* - 1} = \gamma\, e^{-1}$, so that $W(\gamma\, e^{-1}) = \beta\, m^* - 1$. Solving for $m^*$, we obtain $m^* = (1 + W(\gamma\, e^{-1}))/\beta$, which is the desired expression for the optimal markup. Furthermore, since $p_i^* - c_i = m^*$ for all $i \in N$, Theorem 2.3.1 implies that the optimal objective value of the UNCONSTRAINED problem is $R(\mathbf{p}^*) = m^* - 1/\beta = W(\gamma\, e^{-1})/\beta$.

By Proposition 2.3.2, to obtain the optimal prices, we can simply compute $\gamma$ as in the proposition and set $m^* = (1 + W(\gamma\, e^{-1}))/\beta$, in which case, the optimal price for product $i$ is $m^* + c_i$. When the price sensitivities of the products are the same, the fact that the optimal prices have constant markup is shown in Proposition 1 in [67] for the multinomial logit model, in Lemma 1 in [6] for the nested logit model, and in Lemma 3 in [91] for the paired combinatorial logit model. Theorem 2.3.1 generalizes these results to an arbitrary GEV model. Explicit formulas for the optimal markup are given in Theorem 1 in [41] for the multinomial logit model and in Theorem 1 in [91] for the paired combinatorial logit model. Proposition 2.3.2 generalizes these results to an arbitrary GEV model.
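The recipe in Proposition 2.3.2 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the Lambert-W function is evaluated with a small hand-written Newton iteration (a library routine could be substituted), the instance data are made up, and the generating function used is the multinomial logit one, $G(\mathbf{Y}) = \sum_{i \in N} Y_i$. All function names are hypothetical.

```python
import math

def lambert_w(x, tol=1e-14, max_iter=100):
    """W(x) for x >= 0: the w >= 0 solving w * exp(w) = x, via Newton's method."""
    if x < 0.0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starts at or above the root, so the iterates decrease to it
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def optimal_markup(G, alpha, cost, beta):
    """Return (m_star, optimal_profit, gamma) following Proposition 2.3.2."""
    gamma = G([math.exp(a - beta * c) for a, c in zip(alpha, cost)])
    w = lambert_w(gamma / math.e)
    return (1.0 + w) / beta, w / beta, gamma

def profit_at_markup(G, alpha, cost, beta, m):
    """R(m) = m * G / (1 + G), with G evaluated at Y_i = exp(alpha_i - beta * (c_i + m))."""
    g = G([math.exp(a - beta * (c + m)) for a, c in zip(alpha, cost)])
    return m * g / (1.0 + g)

# Made-up instance; the built-in sum plays the role of the multinomial logit G.
alpha, cost, beta = [1.0, 0.5, 2.0], [1.0, 2.0, 1.5], 0.8
m_star, r_star, gamma = optimal_markup(sum, alpha, cost, beta)
prices = [c + m_star for c in cost]  # optimal price of product i is c_i + m_star
```

Because the formula only needs $\gamma$, swapping in another generating function that is homogeneous of degree one changes a single argument; no search over prices is required.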
Theorem 2.3.1 also allows us to give comparative statics that describe how the optimal prices change as a function of the unit costs. As a function of the unit product costs $\mathbf{c} = (c_1, \ldots, c_n)$ in the UNCONSTRAINED problem, let $\mathbf{p}^*(\mathbf{c}) = (p_1^*(\mathbf{c}), \ldots, p_n^*(\mathbf{c}))$ denote the optimal prices. To facilitate our exposition, we use $\mathbf{e}_i \in \mathbb{R}^n_+$ for the vector with one in the $i$-th entry and zeros everywhere else, and designate $\mathbf{e} \in \mathbb{R}^n_+$ as the vector of all ones. In the next corollary, which is a corollary to Theorem 2.3.1, we show that if the unit cost of a product increases, then its optimal price increases and the optimal prices of the other products decrease, whereas if the unit costs of all products increase by the same amount, then the optimal prices of all products increase as well. We defer the proof to Appendix A.2.

Corollary 2.3.3 (Comparative Statics) For all $\delta \geq 0$,

(a) For all $i \in N$, $p_i^*(\mathbf{c} + \delta \mathbf{e}_i) \geq p_i^*(\mathbf{c})$, and for all $j \neq i$, $p_j^*(\mathbf{c} + \delta \mathbf{e}_i) \leq p_j^*(\mathbf{c})$;

(b) For all $i \in N$, $p_i^*(\mathbf{c} + \delta \mathbf{e}) \geq p_i^*(\mathbf{c})$.

We can give somewhat more general versions of the results in this section. In particular, we partition the set of products $N$ into the disjoint subsets $N^1, \ldots, N^m$ such that $N = \cup_{k=1}^m N^k$ and $N^k \cap N^{k'} = \emptyset$ for $k \neq k'$. Similarly, we partition the vector $\mathbf{Y} = (Y_1, \ldots, Y_n) \in \mathbb{R}^n_+$ into the subvectors $\mathbf{Y}^1, \ldots, \mathbf{Y}^m$ such that each subvector $\mathbf{Y}^k$ is given by $\mathbf{Y}^k = (Y_i : i \in N^k)$. Assume that the products in each partition $N^k$ share the same price sensitivity $\beta_k$, and the generating function $G$ is a separable function of the form $G(\mathbf{Y}) = \sum_{k=1}^m G^k(\mathbf{Y}^k)$, where the functions $G^1, \ldots, G^m$ satisfy the four properties discussed at the beginning of Section 2.2. In Appendix A.3, we use an approach similar to the one used in this section to show that the optimal prices for the products in the same partition have a constant markup and give a formula to compute the optimal markups.
Considering unconstrained pricing problems under the nested logit model, when the products in each nest have the same price sensitivity, Theorem 2 in [90] shows that the optimal prices for the products in each nest have a constant markup and gives a formula that can be used to compute the optimal markup. The generating function for the nested logit model is a separable function of the form $\sum_{k=1}^m \gamma_k\, G^k(\mathbf{Y}^k)$, where the products in a partition $N^k$ correspond to the products in a nest. Thus, our results in Appendix A.3 generalize Theorem 2 in [90] to an arbitrary GEV model with a separable generating function. Throughout the chapter, we do not explicitly work with separable generating functions to minimize notational burden.

2.4 Constrained Pricing

We consider constrained pricing problems where the expected sales of the products are constrained to lie in a convex set. Similar to the previous section, the products have the same price sensitivity parameter $\beta$. The goal is to find the product prices that maximize the expected revenue obtained from each customer, while satisfying the constraints on the expected sales. To formulate the constrained pricing problem, we define the vector $\boldsymbol{\Theta}(\mathbf{p}) = (\Theta_1(\mathbf{p}), \ldots, \Theta_n(\mathbf{p}))$, which includes the purchase probabilities of the products. To capture the constraints on the expected sales, let $M$ denote some generic index set. For each $\ell \in M$, we let $F_\ell$ be a convex function that maps the vector $\mathbf{q} = (q_1, \ldots, q_n) \in \mathbb{R}^n_+$ to a scalar. We are interested in solving the problem

$$\max_{\mathbf{p} \in \mathbb{R}^n} \left\{ \sum_{i \in N} p_i\, \Theta_i(\mathbf{Y}(\mathbf{p})) \ : \ F_\ell(\boldsymbol{\Theta}(\mathbf{p})) \leq 0 \ \ \forall\, \ell \in M \right\}. \tag{CONSTRAINED}$$

The objective function above accounts for the expected revenue from each customer. Interpreting $\Theta_i(\mathbf{p})$ as the expected sales for product $i$, the constraints ensure that the expected sales for the products lie in the convex set $\{\mathbf{q} \in \mathbb{R}^n_+ : F_\ell(\mathbf{q}) \leq 0 \ \ \forall\, \ell \in M\}$.
The CONSTRAINED problem finds applications in the network revenue management setting, where the sale of each product consumes a combination of resources [52]. In this setting, the set $M$ indexes the set of resources. The sale of product $i$ consumes $a_{\ell i}$ units of resource $\ell$. There are $C_\ell$ units of resource $\ell$. The expected number of customer arrivals is $T$. We want to find the product prices to maximize the expected revenue from each customer, while ensuring that the expected consumption of each resource does not exceed its availability. If we charge the prices $\mathbf{p}$, then the expected sales for product $i$ is $T\, \Theta_i(\mathbf{p})$. Thus, the constraint $\sum_{i \in N} a_{\ell i}\, T\, \Theta_i(\mathbf{p}) \leq C_\ell$ ensures that the total expected consumption of resource $\ell$ does not exceed its inventory. In this case, defining $F_\ell$ as $F_\ell(\mathbf{q}) = \sum_{i \in N} a_{\ell i}\, T\, q_i - C_\ell$, the constraints in the CONSTRAINED problem ensure that the expected capacity consumption of each resource does not exceed its inventory.

In the CONSTRAINED problem, the objective function is generally not concave in the prices $\mathbf{p}$. Also, although $F_\ell$ is convex, $F_\ell(\boldsymbol{\Theta}(\mathbf{p}))$ is not necessarily convex in $\mathbf{p}$. Thus, the CONSTRAINED problem is not a convex program (see footnote 1). However, by expressing the CONSTRAINED problem in terms of the purchase probabilities or market shares, we will reformulate the problem into a convex program. In our reformulation, the decision variables $\mathbf{q} = (q_1, \ldots, q_n)$ correspond to the purchase probabilities of the products. We let $\mathbf{p}(\mathbf{q}) = (p_1(\mathbf{q}), \ldots, p_n(\mathbf{q}))$ denote the prices that achieve the purchase probabilities $\mathbf{q}$. Our reformulation of the CONSTRAINED problem is

$$\max_{\mathbf{q} \in \mathbb{R}^n_+} \left\{ \sum_{i \in N} p_i(\mathbf{q})\, q_i \ : \ F_\ell(\mathbf{q}) \leq 0 \ \ \forall\, \ell \in M, \ \ \sum_{i \in N} q_i \leq 1 \right\}. \tag{MARKET-SHARE-BASED}$$

The interpretations of the objective function and the first constraint in the MARKET-SHARE-BASED formulation are similar to those of the CONSTRAINED problem. The last constraint in the MARKET-SHARE-BASED formulation ensures that the total purchase probability of all products does not exceed one.
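The failure of concavity in the prices can be checked numerically on the two-product multinomial logit instance used in footnote 1 ($\alpha_1 = \alpha_2 = 10$, $\beta = 1$, prices $(10, 20)$ and $(20, 10)$). The short Python sketch below simply evaluates the price-based objective at the two points and at their midpoint; the function name is illustrative.

```python
import math

def expected_revenue(p1, p2, alpha=10.0, beta=1.0):
    """Price-based objective under the multinomial logit model with two products."""
    w1 = math.exp(alpha - beta * p1)
    w2 = math.exp(alpha - beta * p2)
    return (p1 * w1 + p2 * w2) / (1.0 + w1 + w2)

f_x = expected_revenue(10.0, 20.0)    # first end point
f_y = expected_revenue(20.0, 10.0)    # second end point (same value by symmetry)
f_mid = expected_revenue(15.0, 15.0)  # midpoint of the two price vectors

# A quasi-concave function would satisfy f_mid >= min(f_x, f_y).
violates_quasiconcavity = f_mid < min(f_x, f_y)
```

A quasi-concave function can never dip below both end points at a midpoint, so a value of `True` for `violates_quasiconcavity` confirms that working directly with prices does not give a convex program for this instance.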
We will establish the following results for the MARKET-SHARE-BASED formulation. In Theorem 2.4.1 in Section 2.4.1, we show that for each market share vector $\mathbf{q}$, there exists a unique price vector $\mathbf{p}(\mathbf{q})$ that achieves the market shares in the vector $\mathbf{q}$. Furthermore, the price vector $\mathbf{p}(\mathbf{q})$ is the solution of an unconstrained minimization problem with a strictly convex objective function. Therefore, computing $\mathbf{p}(\mathbf{q})$ is tractable. Then, in Theorem 2.4.3 in Section 2.4.2, we show that the objective function in the MARKET-SHARE-BASED formulation $\mathbf{q} \mapsto \sum_{i \in N} p_i(\mathbf{q})\, q_i$ is concave in $\mathbf{q}$ and we give an expression for its gradient. Since the constraints in the MARKET-SHARE-BASED formulation are convex in $\mathbf{q}$, we have a convex program. Thus, we can efficiently solve the MARKET-SHARE-BASED formulation and obtain the optimal purchase probabilities $\mathbf{q}^*$ by using standard convex optimization methods [16]. Once we compute the optimal purchase probabilities $\mathbf{q}^*$, we can also compute the corresponding optimal prices $\mathbf{p}(\mathbf{q}^*)$.

Footnote 1: The objective function of the CONSTRAINED problem is not quasi-concave. As an example, consider the multinomial logit choice model with $N = \{1, 2\}$, $\alpha_1 = \alpha_2 = 10$, and $\beta = 1$. Then, the objective function is given by

$$f(p_1, p_2) = \frac{p_1\, e^{10 - p_1} + p_2\, e^{10 - p_2}}{1 + e^{10 - p_1} + e^{10 - p_2}} \qquad \forall\, (p_1, p_2) \in \mathbb{R}^2.$$

If $(x_1, x_2) = (10, 20)$ and $(y_1, y_2) = (20, 10)$, then $f(x_1, x_2) = f(y_1, y_2) \approx 5.0$ but $f(0.5\,(x_1, x_2) + 0.5\,(y_1, y_2)) = f(15, 15) \approx 0.2 < \min\{f(x_1, x_2), f(y_1, y_2)\}$. So, the objective function is not quasi-concave.

2.4.1 Prices as a Function of Purchase Probabilities

We focus on the question of how to compute the unique prices $\mathbf{p}(\mathbf{q}) = (p_1(\mathbf{q}), \ldots, p_n(\mathbf{q}))$ that are necessary to achieve the given purchase probabilities $\mathbf{q} = (q_1, \ldots, q_n)$. The main result of this section is stated in the following theorem.

Theorem 2.4.1 (Inverse Mapping) For each $\mathbf{q} \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i \in N$ and $\sum_{i \in N} q_i < 1$, there exists a unique price vector $\mathbf{p}(\mathbf{q})$ such that $q_i = \Theta_i(\mathbf{Y}(\mathbf{p}(\mathbf{q})))$ for all $i \in N$.
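Moreover, $\mathbf{p}(\mathbf{q})$ is the finite and unique solution to the strictly convex minimization problem

$$\min_{\mathbf{s} \in \mathbb{R}^n} \left\{ \frac{1}{\beta} \log(1 + G(\mathbf{Y}(\mathbf{s}))) + \sum_{i \in N} q_i\, s_i \right\}.$$

The proof of Theorem 2.4.1 makes use of the lemma given below. Throughout this section, all vectors are assumed to be column vectors. For any vector $\mathbf{s} \in \mathbb{R}^n$, $\mathbf{s}^\top$ denotes its transpose and will always be a row vector, whereas $\text{diag}(\mathbf{s})$ denotes an $n$-by-$n$ diagonal matrix whose diagonal entries correspond to the vector $\mathbf{s}$. Also, let $\nabla G(\mathbf{Y}(\mathbf{s}))$ denote the gradient vector of the generating function $G$ evaluated at $\mathbf{Y}(\mathbf{s})$ and $\nabla^2 G(\mathbf{Y}(\mathbf{s}))$ denote the Hessian matrix of $G$ evaluated at $\mathbf{Y}(\mathbf{s})$. Last but not least, we use $\boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s})) \in \mathbb{R}^n$ to denote an $n$-dimensional vector whose entries are the selection probabilities $\Theta_1(\mathbf{Y}(\mathbf{s})), \ldots, \Theta_n(\mathbf{Y}(\mathbf{s}))$.

Fix an arbitrary $\mathbf{q} \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i$ and $\sum_{i \in N} q_i < 1$, and let $f : \mathbb{R}^n \to \mathbb{R}$ be defined by: for all $\mathbf{s} \in \mathbb{R}^n$,

$$f(\mathbf{s}) = \frac{1}{\beta} \log(1 + G(\mathbf{Y}(\mathbf{s}))) + \sum_{i \in N} q_i\, s_i.$$

In the next lemma, we give the expressions for the gradient $\nabla f(\mathbf{s})$ and the Hessian $\nabla^2 f(\mathbf{s})$. The proof of this lemma directly follows by differentiating the function $f$ and using the definition of the choice probabilities in (2.1). We defer the proof to Appendix A.4.

Lemma 2.4.2 (Gradient and Hessian) For all $\mathbf{s} \in \mathbb{R}^n$, $\nabla f(\mathbf{s}) = \mathbf{q} - \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))$ and

$$\frac{1}{\beta} \nabla^2 f(\mathbf{s}) = \text{diag}(\boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))) - \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))\, \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))^\top + \frac{\text{diag}(\mathbf{Y}(\mathbf{s}))\, \nabla^2 G(\mathbf{Y}(\mathbf{s}))\, \text{diag}(\mathbf{Y}(\mathbf{s}))}{1 + G(\mathbf{Y}(\mathbf{s}))}.$$

In the proof of Theorem 2.4.1, we will also use two results in linear algebra. First, if the vector $\mathbf{v} \in \mathbb{R}^n_+$ satisfies $v_i > 0$ for all $i$ and $\sum_{i=1}^n v_i < 1$, then the matrix $\text{diag}(\mathbf{v}) - \mathbf{v}\mathbf{v}^\top$ is positive definite. To see this result, since $1 - \mathbf{v}^\top \text{diag}(\mathbf{v})^{-1} \mathbf{v} = 1 - \sum_{i=1}^n v_i > 0$, by the Sherman-Morrison formula, the inverse of $\text{diag}(\mathbf{v}) - \mathbf{v}\mathbf{v}^\top$ exists and it is given by $\text{diag}(\mathbf{v})^{-1} + \text{diag}(\mathbf{v})^{-1} \mathbf{v}\mathbf{v}^\top \text{diag}(\mathbf{v})^{-1} / (1 - \mathbf{e}^\top \mathbf{v}) = \text{diag}(\mathbf{v})^{-1} + \mathbf{e}\mathbf{e}^\top / (1 - \sum_{i=1}^n v_i)$; see Section 0.7.4 in [68]. The last matrix is clearly positive definite, which implies that $\text{diag}(\mathbf{v}) - \mathbf{v}\mathbf{v}^\top$ is also positive definite.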
Second, if $A$ is a symmetric matrix such that each row sums to zero and all off-diagonal entries are non-positive, then $A$ is positive semidefinite. To see this result, note that under our assumptions $A$ is a symmetric and diagonally dominant matrix with non-negative diagonal entries, and such a matrix is known to be positive semidefinite; see Theorem A.6 in [38]. Here is the proof of Theorem 2.4.1.

Proof of Theorem 2.4.1: Note that the objective function of the minimization problem in the theorem is $f(\mathbf{s})$. We claim that $f$ is strictly convex. For any $\mathbf{s} \in \mathbb{R}^n$, let $Y_i(s_i) = e^{\alpha_i - \beta s_i} > 0$ for all $i \in N$, so that $\Theta_i(\mathbf{Y}(\mathbf{s})) = Y_i(s_i)\, \partial G_i(\mathbf{Y}(\mathbf{s})) / (1 + G(\mathbf{Y}(\mathbf{s}))) > 0$, where the inequality is by the assumption that $\partial G_i(\mathbf{Y}) > 0$ when $Y_i > 0$ for all $i \in N$. Using Lemma 2.2.2, we also have
$$\sum_{i \in N} \Theta_i(\mathbf{Y}(\mathbf{s})) = \frac{\sum_{i \in N} Y_i(s_i)\, \partial G_i(\mathbf{Y}(\mathbf{s}))}{1 + G(\mathbf{Y}(\mathbf{s}))} = \frac{G(\mathbf{Y}(\mathbf{s}))}{1 + G(\mathbf{Y}(\mathbf{s}))} < 1.$$
In this case, by the first linear algebra result, the matrix $\mathrm{diag}(\boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))) - \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))\, \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))^\top$ is positive definite.

Next, consider the matrix $\mathrm{diag}(\mathbf{Y}(\mathbf{s}))\, \nabla^2 G(\mathbf{Y}(\mathbf{s}))\, \mathrm{diag}(\mathbf{Y}(\mathbf{s}))$, which is symmetric and whose $(i, j)$-th component is given by $Y_i(s_i)\, \partial G_{ij}(\mathbf{Y}(\mathbf{s}))\, Y_j(s_j)$. For $i \neq j$, we have $\partial G_{ij}(\mathbf{Y}(\mathbf{s})) \leq 0$ by the property of the generating function $G$, so all off-diagonal entries of the matrix are non-positive. Furthermore, by Lemma 2.2.2, we have $\sum_{j \in N} Y_i(s_i)\, \partial G_{ij}(\mathbf{Y}(\mathbf{s}))\, Y_j(s_j) = 0$, so that each row of the matrix sums to zero. In this case, by the second linear algebra result, the matrix $\mathrm{diag}(\mathbf{Y}(\mathbf{s}))\, \nabla^2 G(\mathbf{Y}(\mathbf{s}))\, \mathrm{diag}(\mathbf{Y}(\mathbf{s}))$ is positive semidefinite.

By the discussion in the previous paragraph, the matrix $\mathrm{diag}(\boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))) - \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))\, \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))^\top$ is positive definite and the matrix $\mathrm{diag}(\mathbf{Y}(\mathbf{s}))\, \nabla^2 G(\mathbf{Y}(\mathbf{s}))\, \mathrm{diag}(\mathbf{Y}(\mathbf{s}))$ is positive semidefinite. Adding a positive definite matrix to a positive semidefinite matrix gives a positive definite matrix. In this case, noting the expression for the Hessian of $f$ given in Lemma 2.4.2, $f$ is strictly convex, which establishes the claim. Therefore, $f$ has at most one minimizer.
Furthermore, we can show that for any $L \geq 0$, there exists an $M \geq 0$ such that having $\|\mathbf{s}\| \geq M$ implies that $f(\mathbf{s}) \geq L$; the proof is straightforward but tedious, and we defer the details to Appendix A.5. Therefore, given some $\mathbf{s}^0$ with $f(\mathbf{s}^0) \geq 0$, there exists $M_0 \geq 0$ such that having $\|\mathbf{s}\| \geq M_0$ implies that $f(\mathbf{s}) \geq f(\mathbf{s}^0)$. In this case, the minimizer of $f$ must lie in the set $\{ \mathbf{s} \in \mathbb{R}^n : \|\mathbf{s}\| \leq M_0 \}$, which implies that $f$ has a finite minimizer. Since $f$ is strictly convex and it has a finite minimizer, its minimizer $\mathbf{p}(\mathbf{q})$ is the solution to the first-order condition $\nabla f(\mathbf{p}(\mathbf{q})) = \mathbf{0}$, where $\mathbf{0}$ is the vector of all zeros. In this case, by the expression for the gradient of $f$ given in Lemma 2.4.2, we must have $\nabla f(\mathbf{p}(\mathbf{q})) = \mathbf{q} - \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{p}(\mathbf{q}))) = \mathbf{0}$, which implies that $q_i = \Theta_i(\mathbf{Y}(\mathbf{p}(\mathbf{q})))$ for all $i \in N$, as desired.

To summarize, given a vector of purchase probabilities $\mathbf{q}$, the unique price vector $\mathbf{p}(\mathbf{q})$ that achieves these purchase probabilities is the unique optimal solution to the minimization problem $\min_{\mathbf{s} \in \mathbb{R}^n} \big\{ \frac{1}{\beta} \log(1 + G(\mathbf{Y}(\mathbf{s}))) + \sum_{i=1}^n q_i s_i \big\}$. Because the objective function in this problem is strictly convex, with its gradient given in Lemma 2.4.2, and there are no constraints on the decision variables, we can compute $\mathbf{p}(\mathbf{q})$ efficiently using standard convex optimization methods. We emphasize that one might be tempted to set $q_i = \Theta_i(\mathbf{Y}(\mathbf{p}))$ for all $i \in N$ and solve for $\mathbf{p}$ in terms of $\mathbf{q}$ in order to compute $\mathbf{p}(\mathbf{q})$. However, solving this system of equations directly is difficult. Even showing that there is a unique solution to this system of equations is not straightforward. Theorem 2.4.1 shows that there is a unique solution to this system of equations, and we can compute the solution by solving an unconstrained convex optimization problem.

2.4.2 Concavity of the Expected Revenue Function and its Gradient

Let $R(\mathbf{q}) = \sum_{i \in N} p_i(\mathbf{q})\, q_i$ denote the expected revenue function that is defined in terms of the market shares.
The main result of this section is stated in the following theorem, which shows that $R(\mathbf{q})$ is concave in $\mathbf{q}$ and provides an expression for its gradient.

Theorem 2.4.3 (Concavity of the Revenue Function in terms of Market Shares) For all $\mathbf{q} \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i$ and $\sum_{i \in N} q_i < 1$, the Hessian matrix $\nabla^2 R(\mathbf{q})$ is negative definite and
$$\nabla R(\mathbf{q}) = \mathbf{p}(\mathbf{q}) - \frac{1}{\beta\,(1 - \mathbf{e}^\top \mathbf{q})}\, \mathbf{e}.$$

Before we proceed to the proof, we discuss the significance of Theorem 2.4.3. As noted at the beginning of Section 2.4, the function $\mathbf{p} \mapsto \sum_{i \in N} p_i\, \Theta_i(\mathbf{Y}(\mathbf{p}))$ is not necessarily concave in the prices $\mathbf{p}$. However, the theorem above shows that when we express the problem in terms of market shares $\mathbf{q}$, the expected revenue function $R(\mathbf{q})$ is concave in $\mathbf{q}$. Using the gradient of the expected revenue function in the theorem, we can then solve the MARKET-SHARE-BASED problem using standard tools from convex programming.

Also, we note that the restriction that $q_i > 0$ for all $i$ and $\sum_{i \in N} q_i < 1$ is necessary for the expected revenue function $R(\mathbf{q})$ and its derivatives to be well-defined. To give an example, we consider the multinomial logit model. Under this choice model, the selection probability of product $i$ is $\Theta_i^{\mathrm{MNL}}(\mathbf{p}) = e^{\alpha_i - \beta p_i} / (1 + \sum_{k \in N} e^{\alpha_k - \beta p_k})$. We can check that $p_i(\mathbf{q}) = \frac{1}{\beta} \big( \alpha_i + \log(1 - \sum_{k \in N} q_k) - \log q_i \big)$, so that $R(\mathbf{q}) = \sum_{i \in N} \frac{1}{\beta} \big( \alpha_i + \log(1 - \sum_{k \in N} q_k) - \log q_i \big)\, q_i$. This expected revenue function and its derivatives are well-defined only when $q_i > 0$ for all $i \in N$ and $\sum_{i \in N} q_i < 1$.

A key ingredient in the proof of Theorem 2.4.3 is the Jacobian matrix $J(\mathbf{q})$ associated with the vector-valued mapping $\mathbf{q} \mapsto \mathbf{p}(\mathbf{q})$, which is given in the following lemma. To characterize this Jacobian, we define the $n$-by-$n$ matrix $B(\mathbf{q}) = (B_{ij}(\mathbf{q}) : i, j \in N)$ as
$$B(\mathbf{q}) = \frac{\mathrm{diag}\big(\mathbf{Y}(\mathbf{p}(\mathbf{q}))\big)\, \nabla^2 G\big(\mathbf{Y}(\mathbf{p}(\mathbf{q}))\big)\, \mathrm{diag}\big(\mathbf{Y}(\mathbf{p}(\mathbf{q}))\big)}{1 + G\big(\mathbf{Y}(\mathbf{p}(\mathbf{q}))\big)}.$$
The proof of Lemma 2.4.4 is given in Appendix A.6.
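To make the inverse mapping of Theorem 2.4.1 concrete, here is a minimal sketch for the multinomial logit special case, where $G(\mathbf{Y}) = \sum_i Y_i$ and $Y_i(s_i) = e^{\alpha_i - \beta s_i}$. It minimizes $f(\mathbf{s})$ by plain gradient descent, using the gradient $\mathbf{q} - \boldsymbol{\Theta}(\mathbf{Y}(\mathbf{s}))$ from Lemma 2.4.2, and compares the result with the closed-form multinomial logit prices. The step size $1/\beta$, the starting point, the parameter values and all function names are assumptions chosen for illustration; the thesis only says that standard convex optimization methods can be used.

```python
import math


def choice_probs(s, alpha, beta):
    """Multinomial logit selection probabilities at prices s."""
    y = [math.exp(a - beta * si) for a, si in zip(alpha, s)]
    denom = 1.0 + sum(y)
    return [yi / denom for yi in y]


def prices_from_shares(q, alpha, beta, tol=1e-12, max_iter=10000):
    """Minimize f(s) = (1/beta) log(1 + sum_i exp(alpha_i - beta s_i)) + q.s.

    Uses gradient descent with fixed step 1/beta (an illustrative choice:
    the Hessian beta * (diag(theta) - theta theta^T) has eigenvalues below beta).
    """
    s = [a / beta for a in alpha]  # arbitrary starting point
    for _ in range(max_iter):
        theta = choice_probs(s, alpha, beta)
        grad = [qi - ti for qi, ti in zip(q, theta)]  # gradient from Lemma 2.4.2
        if max(abs(g) for g in grad) < tol:
            break
        s = [si - g / beta for si, g in zip(s, grad)]
    return s


def mnl_closed_form(q, alpha, beta):
    """Closed-form prices p_i(q) = (alpha_i + log(1 - sum q) - log q_i) / beta."""
    q0 = 1.0 - sum(q)
    return [(a + math.log(q0) - math.log(qi)) / beta for a, qi in zip(alpha, q)]


# Hypothetical instance, not taken from the thesis.
alpha, beta = [1.0, 2.0, 0.5], 1.5
q = [0.2, 0.3, 0.1]

p_numeric = prices_from_shares(q, alpha, beta)
p_exact = mnl_closed_form(q, alpha, beta)
print(p_numeric)
print(p_exact)
```

For a general GEV model there is no closed form to compare against, so only the optimization route remains; the gradient expression in Lemma 2.4.2 is what any first-order method would need.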
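Lemma 2.4.4 (Jacobian) The Jacobian matrix $J(\mathbf{q}) = \big( \partial p_i(\mathbf{q}) / \partial q_j : i, j \in N \big)$ is given by
$$J(\mathbf{q}) = -\frac{1}{\beta} \big( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}) \big)^{-1}.$$

We are ready to give the proof of Theorem 2.4.3.

Proof of Theorem 2.4.3: First, we show the expression for $\nabla R(\mathbf{q})$. Since $R(\mathbf{q}) = \sum_{i \in N} p_i(\mathbf{q})\, q_i$, it follows that
$$\nabla R(\mathbf{q}) = \mathbf{p}(\mathbf{q}) + J(\mathbf{q})^\top \mathbf{q} = \mathbf{p}(\mathbf{q}) - \frac{1}{\beta} \big( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}) \big)^{-1} \mathbf{q},$$
where the last equality follows from Lemma 2.4.4. Consider the matrix $(\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}))^{-1}$ on the right side above. In the proof of Theorem 2.4.1, we show that $\mathrm{diag}(\mathbf{Y}(\mathbf{s}))\, \nabla^2 G(\mathbf{Y}(\mathbf{s}))\, \mathrm{diag}(\mathbf{Y}(\mathbf{s}))$ is positive semidefinite and each of its rows sums to zero. Noting the definition of $B(\mathbf{q})$, we can use precisely the same argument to show that $B(\mathbf{q})$ is positive semidefinite and each of its rows sums to zero as well. Since $\mathrm{diag}(\mathbf{q})$ is positive definite and $B(\mathbf{q})$ is positive semidefinite, $\mathrm{diag}(\mathbf{q}) + B(\mathbf{q})$ is invertible, in which case we get $(\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))^{-1} (\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))\, \mathbf{e} = \mathbf{e}$. We have $B(\mathbf{q})\, \mathbf{e} = \mathbf{0}$ because the rows of $B(\mathbf{q})$ sum to zero. Noting also that $\mathrm{diag}(\mathbf{q})\, \mathbf{e} = \mathbf{q}$, the last equality implies that $(\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))^{-1} \mathbf{q} = \mathbf{e}$. Using the fact that $\mathrm{diag}(\mathbf{q}) + B(\mathbf{q})$ is symmetric, taking the transpose, we have $\mathbf{q}^\top (\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))^{-1} = \mathbf{e}^\top$ as well. In this case, since we have $1 - \mathbf{q}^\top (\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))^{-1} \mathbf{q} = 1 - \mathbf{q}^\top \mathbf{e} = 1 - \sum_{i=1}^n q_i > 0$, by the Sherman-Morrison formula, the inverse of $\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q})$ exists and it is given by
$$\big( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}) \big)^{-1} = \big( \mathrm{diag}(\mathbf{q}) + B(\mathbf{q}) \big)^{-1} + \frac{\big( \mathrm{diag}(\mathbf{q}) + B(\mathbf{q}) \big)^{-1} \mathbf{q}\mathbf{q}^\top \big( \mathrm{diag}(\mathbf{q}) + B(\mathbf{q}) \big)^{-1}}{1 - \mathbf{q}^\top \big( \mathrm{diag}(\mathbf{q}) + B(\mathbf{q}) \big)^{-1} \mathbf{q}} = \big( \mathrm{diag}(\mathbf{q}) + B(\mathbf{q}) \big)^{-1} + \frac{1}{1 - \mathbf{e}^\top \mathbf{q}}\, \mathbf{e}\mathbf{e}^\top.$$
Using the equality above in the expression for $\nabla R(\mathbf{q})$ at the beginning of the proof, together with the fact that $(\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))^{-1} \mathbf{q} = \mathbf{e}$, we get
$$\nabla R(\mathbf{q}) = \mathbf{p}(\mathbf{q}) - \frac{1}{\beta} \left( \mathbf{e} + \frac{\mathbf{e}^\top \mathbf{q}}{1 - \mathbf{e}^\top \mathbf{q}}\, \mathbf{e} \right) = \mathbf{p}(\mathbf{q}) - \frac{1}{\beta\,(1 - \mathbf{e}^\top \mathbf{q})}\, \mathbf{e},$$
which is the desired expression for $\nabla R(\mathbf{q})$.

Second, we show that $\nabla^2 R(\mathbf{q})$ is negative definite. By the discussion at the beginning of the proof, $B(\mathbf{q})$ is positive semidefinite. By the first linear algebra result discussed right after Lemma 2.4.2, $\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top$ is positive definite.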
Thus, $(\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}))^{-1}$ is positive definite. Writing the gradient expression above componentwise, we get $\partial R(\mathbf{q}) / \partial q_i = p_i(\mathbf{q}) - 1 / (\beta\,(1 - \sum_{k \in N} q_k))$; differentiating it with respect to $q_j$, we obtain $\partial^2 R(\mathbf{q}) / \partial q_i \partial q_j = \partial p_i(\mathbf{q}) / \partial q_j - 1 / (\beta\,(1 - \sum_{k \in N} q_k)^2)$. The last equality in matrix notation is
$$\nabla^2 R(\mathbf{q}) = J(\mathbf{q}) - \frac{1}{\beta\,(1 - \sum_{k \in N} q_k)^2}\, \mathbf{e}\mathbf{e}^\top = -\frac{1}{\beta} \left[ \big( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}) \big)^{-1} + \frac{1}{(1 - \sum_{k \in N} q_k)^2}\, \mathbf{e}\mathbf{e}^\top \right],$$
where the last equality uses Lemma 2.4.4. The above equality shows that $\nabla^2 R(\mathbf{q})$ is negative definite, because $\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q})$ is positive definite, so its inverse is also positive definite.

The results that we present in this section demonstrate that any optimization problem that maximizes the expected revenue subject to constraints that are convex in the market shares of the products can be solved efficiently, as long as customers choose under a member of the GEV family with homogeneous price sensitivities. As discussed at the beginning of Section 2.4, our formulation of the constrained pricing problem finds applications in network revenue management settings, where the goal is to set the prices for the products to maximize the expected revenue obtained from each customer, the sale of a product consumes a combination of resources, and the resources have limited inventories [52, 14]. Our formulation also becomes useful when the products have limited inventories and we set the prices for the products to maximize the expected revenue obtained from each customer, subject to the constraint that the expected sales of the products do not exceed their inventories [61, 49]. There may be other applications, where the goal is to set the prices for the products so that the market shares of the products deviate from fixed desired market shares by at most a given margin. We can handle such constraints through convex functions of the market shares as well.
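The gradient formula in Theorem 2.4.3 can be checked numerically in the multinomial logit special case, where $p_i(\mathbf{q})$ has the closed form given earlier in this section. The sketch below compares $\mathbf{p}(\mathbf{q}) - \mathbf{e} / (\beta (1 - \mathbf{e}^\top \mathbf{q}))$ with a central finite-difference gradient of $R(\mathbf{q})$, and also tests the midpoint concavity inequality on one pair of share vectors. All parameter values and function names are illustrative assumptions, and the check covers one instance only.

```python
import math


def mnl_prices(q, alpha, beta):
    """Closed-form multinomial logit prices that achieve the market shares q."""
    q0 = 1.0 - sum(q)
    return [(a + math.log(q0) - math.log(qi)) / beta for a, qi in zip(alpha, q)]


def revenue(q, alpha, beta):
    """Expected revenue R(q) = sum_i p_i(q) q_i."""
    return sum(pi * qi for pi, qi in zip(mnl_prices(q, alpha, beta), q))


def gradient_formula(q, alpha, beta):
    """Gradient from Theorem 2.4.3: p(q) - e / (beta * (1 - e^T q))."""
    shift = 1.0 / (beta * (1.0 - sum(q)))
    return [pi - shift for pi in mnl_prices(q, alpha, beta)]


def gradient_numeric(q, alpha, beta, h=1e-6):
    """Central finite-difference approximation of the gradient of R."""
    grad = []
    for i in range(len(q)):
        up, down = list(q), list(q)
        up[i] += h
        down[i] -= h
        grad.append((revenue(up, alpha, beta) - revenue(down, alpha, beta)) / (2 * h))
    return grad


# Hypothetical instance, not taken from the thesis.
alpha, beta = [1.0, 2.0, 0.5], 1.5
q_a = [0.2, 0.3, 0.1]
q_b = [0.05, 0.1, 0.6]

max_gap = max(abs(a - b) for a, b in zip(gradient_formula(q_a, alpha, beta),
                                         gradient_numeric(q_a, alpha, beta)))

# Concavity implies R at the midpoint is at least the average of the endpoints.
mid = [(a + b) / 2 for a, b in zip(q_a, q_b)]
midpoint_gap = revenue(mid, alpha, beta) - 0.5 * (revenue(q_a, alpha, beta)
                                                  + revenue(q_b, alpha, beta))
print(max_gap, midpoint_gap)
```

A small `max_gap` and a positive `midpoint_gap` are what the theorem predicts for this instance.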
In Appendix A.7, we provide a short numerical study in the network revenue management setting, where the choices of the customers are governed by the paired combinatorial logit model with homogeneous price sensitivities. When we have as many as 200 products and 80 resources, we can use our results to solve the corresponding constrained multi-product pricing problem in less than 20 seconds.

We note that all the results in this section continue to hold when products have marginal costs. In that case, the revenue function is given by $R(\mathbf{q}) = \sum_{i \in N} (p_i(\mathbf{q}) - c_i)\, q_i$, where $c_i$ denotes the marginal cost of product $i$. The statements of all theorems and lemmas remain the same, except in Theorem 2.4.3, where the expression of the gradient $\nabla R(\mathbf{q})$ changes to $\nabla R(\mathbf{q}) = \mathbf{p}(\mathbf{q}) - \frac{1}{\beta\,(1 - \mathbf{e}^\top \mathbf{q})}\, \mathbf{e} - \mathbf{c}$ to include the marginal cost vector $\mathbf{c} = (c_1, \ldots, c_n)$.

Under the multinomial logit model with homogeneous price sensitivities, Proposition 2 in [125] and Section 3.3 in [150] give a formula for the prices that achieve given market shares and show that the expected revenue function is concave in the market shares. Our Theorems 2.4.1 and 2.4.3 generalize these results to an arbitrary GEV model. When the customers choose according to the nested logit model and the products in each nest have the same price sensitivity, the discussion at the beginning of Section 2.1 and Theorem 1 in [90] give a formula for the prices that achieve given market shares and show that the expected revenue function is concave in the market shares. Naturally, if the price sensitivities of all products are the same, then these results apply and imply that the expected revenue function under the nested logit model is concave in the market shares.
Also, considering the same setup discussed at the end of Section 2.3, where the set of products is partitioned into subsets and the generating function is separable by the partitions, in Appendix A.8, we show that if the products in each partition share the same price sensitivity, then we can compute the prices that achieve given market shares efficiently and the expected revenue function is concave in the market shares. The generating function for the nested logit model is separable by the products in each nest, so our results in Appendix A.8 generalize the discussion at the beginning of Section 2.1 and Theorem 1 in [90] to an arbitrary GEV model with a separable generating function.

2.5 Conclusions

This chapter unifies and extends some of the pricing results that were discovered under special cases of the GEV model, such as the multinomial logit, nested logit, and paired combinatorial logit models. The value of our results derives from the fact that they hold under an arbitrary GEV model. For instance, to our knowledge, there has been no attempt to solve multi-product constrained pricing problems under the paired combinatorial logit model. The generality of our results comes at the cost of assuming that the price sensitivity parameters of the products are identical. Existing research has shown that for certain product categories, this is a reasonable assumption. An important avenue for research is to investigate to what extent our results can be extended to non-homogeneous price sensitivities. In Appendices A.3 and A.8, we extend our results to the case where the set of products is partitioned into subsets, the generating function is separable by the products in different partitions, and the products in a partition share the same price sensitivity. This extension is a step towards non-homogeneous price sensitivities, but addressing completely general price sensitivities seems rather nontrivial.
Another interesting research direction is to consider assortment optimization problems under the GEV choice model. The assortment optimization problem has a combinatorial nature, and thus, it appears to need an entirely new line of attack.

Chapter 3

Assortment Optimization under the Paired Combinatorial Logit Models

3.1 Introduction

Traditional revenue management models commonly assume that each customer arrives into the system with the intention to purchase a particular product. If this product is available for purchase, then the customer purchases it; otherwise, the customer leaves the system without a purchase. In reality, however, customers observe the set of available alternatives and make a choice among the available alternatives. Under such a customer choice process, the demand for a particular product depends on the availability of other products. In this case, discrete choice models provide a useful representation of demand since discrete choice models capture the demand for each product as a function of the entire set of products in the offer set. A growing body of literature indicates that capturing the choice process of customers using discrete choice models can significantly improve the quality of operational decisions; see, for example, [50], [130], and [139]. While more refined choice models yield a more accurate representation of the customer choice process, the assortment and other operational problems under more refined choice models become more challenging. Thus, it is useful to identify realistic choice models where the corresponding operational problems remain efficiently solvable.

In this chapter, we study assortment problems under the paired combinatorial logit (PCL) model. There is a fixed revenue for each product. Customers choose among the offered products according to the PCL model, which is discussed in Section 3.2. The goal is to find an offer set that maximizes the expected revenue obtained from a customer.
We consider both the uncapacitated version, where we can offer any subset of products, as well as the capacitated version, where there is an upper bound on the number of products that we can offer. We show that even the uncapacitated assortment problem is strongly NP-hard. We give a framework for constructing approximation algorithms for the assortment problem. We use this framework to develop approximation algorithms for the uncapacitated and capacitated versions. Our computational experiments on randomly generated problem instances demonstrate that our approximation algorithms perform quite well, yielding solutions with optimality gaps of no more than 0.9% on average.

The PCL model is compatible with random utility maximization, where each customer associates random utilities with the alternatives. The utilities are sampled from a certain distribution. The customer knows the utilities and chooses the alternative that provides the largest utility. Other choice models, such as the multinomial logit, nested logit and a mixture of multinomial logit models, are also compatible with random utility maximization. The PCL model allows correlations between the utilities of any pair of alternatives. In contrast, the multinomial logit model assumes that the utilities are independent. In the nested logit model, the alternatives are grouped into nests. There is a single parameter governing the correlation between the utilities of the alternatives in the same nest. The utilities of the alternatives in different nests are independent. A mixture of multinomial logit models allows general correlations, but it presents difficulties in solving the corresponding assortment problem, as discussed in our literature review.

By allowing correlations between the utilities of any pair of alternatives, the PCL model captures the situation where the preference of a customer for one product offers insight into their inclination towards another product.
There is work in the literature showing that there can be significant correlations between the utilities of the alternatives when passengers choose, for example, among travel modes and routes. When such correlations between the utilities of the alternatives are present, the PCL model can provide better predictions of the demand process of the passengers; see [25], [84], and [112]. However, although the PCL model can provide better predictions of the demand process when there are correlations between the utilities of different alternatives, there is little research on understanding the complexity of making operational decisions under the PCL model and providing efficient algorithms for making such decisions. Our work in this chapter is directed towards filling this gap.

3.1.1 Main Contributions

We make four main contributions. First, we show that the uncapacitated assortment problem under the PCL model is strongly NP-hard. Our proof uses a reduction from the max-cut problem. This result is in contrast with the assortment problem under the closely related multinomial logit and nested logit models. There exist polynomial-time algorithms to solve even the capacitated assortment problem under the multinomial logit and nested logit models.

Second, we give a framework to develop approximation algorithms for assortment problems under the PCL model. In particular, we show that the assortment problem is equivalent to finding the fixed point of a function $f : \mathbb{R} \to \mathbb{R}$, whose evaluation at a certain point requires solving a nonlinear integer program. To obtain an $\alpha$-approximation algorithm, we design an upper bound $f^R : \mathbb{R} \to \mathbb{R}$ to the function $f(\cdot)$ and compute the fixed point $\hat{z}$ of $f^R(\cdot)$. In this case, we develop an $\alpha$-approximation algorithm for the nonlinear integer program that computes the value of $f(\cdot)$ at $\hat{z}$. It turns out that an $\alpha$-approximate solution to this nonlinear integer program is an $\alpha$-approximate solution to the assortment problem.
[37], [45], and [51] use the connection between the assortment problem under the nested logit model and the fixed point of a function, but the function involved under the nested logit model is significantly simpler since it can be computed by focusing on one nest at a time.

Third, we use our approximation framework to give an approximation algorithm for the uncapacitated assortment problem. We construct the upper bound $f^R(\cdot)$ through a linear programming (LP) relaxation of the nonlinear integer program that computes $f(\cdot)$. We show that we can compute the fixed point of $f^R(\cdot)$ by solving an LP. After we compute the fixed point $\hat{z}$ of $f^R(\cdot)$, we get a solution to the LP that computes $f^R(\cdot)$ at $\hat{z}$. We use randomized rounding on this solution to get a 0.6-approximate solution to the nonlinear integer program that computes $f(\cdot)$ at $\hat{z}$. Lastly, to obtain a deterministic algorithm, we de-randomize this approach by using the standard method of conditional expectations; see Section 15.1 in [4] and Section 5.2 in [146]. Our framework can allow other approximation algorithms. For example, if we construct $f^R(\cdot)$ by using a semidefinite programming (SDP) relaxation of the nonlinear integer program, then we can use the spherical rounding method of [57] to obtain a 0.79-approximation algorithm for the uncapacitated assortment problem. This approximation algorithm requires solving an SDP. We can theoretically solve an SDP in polynomial time, but solving a large-scale SDP in practice can be computationally difficult. Thus, the SDP relaxation can be less appealing than the LP relaxation.

Fourth, we give an approximation algorithm for the capacitated assortment problem, also by using our approximation framework. Here, we exploit the structural properties of the extreme points of the LP relaxation and use an iterative rounding method, followed by coupled randomized rounding, to develop a 0.25-approximation algorithm.
In this algorithm, if there are $n$ products that can be offered to the customers, then we solve at most $n$ successive LP relaxations, fixing the value of one decision variable after solving each LP relaxation. Once we solve the LP relaxations, we perform coupled randomized rounding on the solution of the last LP relaxation to obtain a solution to the assortment problem. Using the method of conditional expectations, we can de-randomize this solution to obtain a deterministic algorithm with the same performance guarantee.

We provide computational experiments over a large collection of test problems. The practical performance of our approximation algorithms is substantially better than their theoretical performance guarantees. Our approximation algorithms, on average, yield expected revenues within 0.9% of an efficiently computable upper bound on the optimal expected revenue.

3.1.2 Literature Review

Considering operational decisions under the PCL model, the only work we are aware of is [91], where the authors study pricing problems under the PCL model. In their problem, the goal is to choose the prices for the products to maximize the expected revenue obtained from a customer. The authors give sufficient conditions on the price sensitivities of the products to ensure that the pricing problem can be solved efficiently. Despite limited work on solving operational problems under the PCL model, there is considerable work, especially in the transportation literature, on using the PCL model to capture travel mode and route choices. [84] estimate the parameters of the PCL model by using real data on the travel mode choices of the passengers. Their empirical results indicate that there are statistically significant correlations between the utilities that a passenger associates with different travel modes. [25] and [112] use numerical examples to demonstrate that the PCL model can provide improvements over the multinomial logit model in predicting route choices.
The authors argue that different routes overlap with each other to varying extents, creating complex correlations between the utilities provided by different routes. [26] and [76] study various traffic equilibrium problems under the PCL model and discuss the benefits from using this choice model. The numerical work in the transportation literature demonstrates that the PCL model can provide improvements over the multinomial logit and nested logit models in predicting travel mode and route choices, especially when the utilities provided by different alternatives exhibit complex correlation structures.

There is considerable work on assortment and pricing problems under the multinomial logit and nested logit models. In the multinomial logit model, the utilities of the products are independent of each other. In the nested logit model, the products are grouped into disjoint nests. Associated with each nest, there is a dissimilarity parameter characterizing the correlation between the utilities of the products in the same nest, but the utilities of the products in different nests are independent of each other. In the PCL model, there exists one nest for each pair of products, so the nests are overlapping. Associated with each nest, there is a separate dissimilarity parameter characterizing the correlation between the utilities of each pair of products. Therefore, when compared with the multinomial logit and nested logit models, we can use the PCL model to specify a significantly more general correlation structure between the utilities of the products. [132] provides a thorough discussion of the multinomial logit, nested logit and PCL models. [84] discuss the correlation structure of the utilities under the PCL model, including the joint distributions and the correlation coefficients of the utilities.

[50] and [130] give an efficient algorithm for the uncapacitated assortment problem under the multinomial logit model.
[36] formulate an LP to solve the assortment problem under the multinomial logit model when there are constraints on the offered assortment that can be represented through a totally unimodular constraint structure. In a mixture of multinomial logit models, there are multiple customer types and customers of different types choose according to different multinomial logit models. [100] show that a mixture of multinomial logit models can allow arbitrary correlations between the utilities of the alternatives. [19] show that the assortment problem under this choice model is NP-hard. Letting $m$ be the number of customer types, the authors assume that $m$ can get large. [119] show that the same problem is NP-hard even when $m = 2$. [40] show that there is no polynomial-time algorithm with an approximation guarantee of $O(1 / m^{1 - \delta})$ for any constant $\delta > 0$ unless $\mathrm{NP} \subseteq \mathrm{BPP}$.

Considering the assortment problem under the nested logit model, [37] develop an efficient algorithm for the unconstrained version of the problem. Their algorithm requires solving an LP whose size grows polynomially with the number of products and nests. [51] study the assortment problem under the nested logit model when a capacity constraint limits the number of products offered in each nest, whereas [45] study the same problem when a capacity constraint limits the total number of products offered in all nests. For both versions, the authors give efficient algorithms to compute the optimal assortment. [89] consider the assortment problem under the multi-level nested logit model, where the products are hierarchically organized into nests and subnests. The authors give an efficient algorithm to compute the optimal assortment. [34] and [139] empirically use the multinomial logit and nested logit models in airline applications.

[41], [62], [125] and [151] study pricing problems under the multinomial logit model.
Assuming that the price sensitivities of the products are the same, the authors show that the expected revenue is concave when expressed as a function of the purchase probabilities of the products. [152] generalize this result by showing that if the price sensitivities of the products are the same, then the expected revenue in the pricing problem is concave in the purchase probabilities under any generalized extreme value model, which is a broad class of choice models that includes the multinomial logit, nested logit and PCL models. [90] investigate the pricing problem under the nested logit model when the products in the same nest share the same price sensitivities and show that the problem can be formulated as a convex program. [53] analyze the same problem when the products have different price sensitivities. They show that the pricing problem can be cast as a search over a single dimension. [69] show that the pricing problem under the multi-level nested logit model can also be cast as a search over a single dimension. [116] consider the pricing problem under the nested logit model with arbitrary price sensitivities and show how to compute a $(1 - \epsilon)$-approximate solution by solving an LP whose size grows polynomially with the input size and $1 / \epsilon$. The papers discussed in this paragraph focus on pricing problems, where the prices take values over a continuum. Thus, these papers focus on characterizing the stationary points of smooth optimization problems, whereas the decisions in the assortment optimization setting are inherently discrete.

The chapter is organized as follows. In Section 3.2, we formulate the uncapacitated and capacitated assortment problems and show that even the uncapacitated version is strongly NP-hard. In Section 3.3, we give our approximation framework. In Section 3.4, we use our approximation framework to give a 0.6-approximation algorithm for the uncapacitated problem.
Also, we discuss how to obtain a 0.79-approximation algorithm by using an SDP relaxation. In Section 3.5, we use our approximation framework to give a 0.25-approximation algorithm for the capacitated problem. In Section 3.6, we give our computational experiments. In Section 3.7, we conclude.

3.2 Assortment Problem

In this section, we formulate the assortment problem under the PCL model, characterize its complexity and discuss the use of the PCL model in assortment problems.

3.2.1 Problem Formulation and Complexity

The set of products is indexed by $N = \{1, \ldots, n\}$. The revenue of product $i$ is $p_i \geq 0$. We use the vector $\mathbf{x} = (x_1, \ldots, x_n) \in \{0, 1\}^n$ to capture the subset of products that we offer to the customers, where $x_i = 1$ if and only if we offer product $i$. We refer to the vector $\mathbf{x}$ simply as the assortment or the subset of products that we offer. Throughout the chapter, we denote the vectors and matrices in bold font. We denote the collection of nests by $M = \{ (i, j) \in N^2 : i \neq j \}$. For each nest $(i, j) \in M$, we let $\gamma_{ij} \in [0, 1]$ be the dissimilarity parameter of the nest. For each product $i$, we let $v_i$ be the preference weight of product $i$.

Under the PCL model, we can view the choice process of a customer as taking place in two stages. First, the customer either decides to make a purchase in one of the nests or leaves the system without making any purchase. In particular, letting $V_{ij}(\mathbf{x}) = v_i^{1/\gamma_{ij}} x_i + v_j^{1/\gamma_{ij}} x_j$ and using $v_0 > 0$ to denote the preference weight of the no purchase option, if we offer the subset of products $\mathbf{x}$, then a customer decides to make a purchase in nest $(i, j)$ with probability $V_{ij}(\mathbf{x})^{\gamma_{ij}} / \big( v_0 + \sum_{(k, \ell) \in M} V_{k\ell}(\mathbf{x})^{\gamma_{k\ell}} \big)$. Second, if the customer decides to make a purchase in nest $(i, j)$, then she chooses product $i$ with probability $v_i^{1/\gamma_{ij}} x_i / V_{ij}(\mathbf{x})$, whereas she chooses product $j$ with probability $v_j^{1/\gamma_{ij}} x_j / V_{ij}(\mathbf{x})$.
Note that if a customer decides to make a purchase in nest $(i, j)$, then she must choose one of the products in this nest. If we offer the subset of products $\mathbf{x}$ and a customer has already decided to make a purchase in nest $(i, j)$, then the expected revenue that we obtain from the customer is
$$R_{ij}(\mathbf{x}) = \frac{p_i\, v_i^{1/\gamma_{ij}} x_i + p_j\, v_j^{1/\gamma_{ij}} x_j}{V_{ij}(\mathbf{x})}.$$
We use $\pi(\mathbf{x})$ to denote the expected revenue that we obtain from a customer when we offer the subset of products $\mathbf{x}$. In this case, we have
$$\pi(\mathbf{x}) = \sum_{(i, j) \in M} \frac{V_{ij}(\mathbf{x})^{\gamma_{ij}}}{v_0 + \sum_{(k, \ell) \in M} V_{k\ell}(\mathbf{x})^{\gamma_{k\ell}}}\, R_{ij}(\mathbf{x}) = \frac{\sum_{(i, j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}}\, R_{ij}(\mathbf{x})}{v_0 + \sum_{(i, j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}}}.$$

Throughout the chapter, we consider both uncapacitated and capacitated assortment problems. In the uncapacitated assortment problem, we can offer any subset of products to the customers. In the capacitated assortment problem, we have an upper bound on the number of products that we can offer to the customers. To capture both the uncapacitated and capacitated assortment problems succinctly, for some $c \in \mathbb{Z}_+$, we use $\mathcal{F} = \{ \mathbf{x} \in \{0, 1\}^n : \sum_{i \in N} x_i \leq c \}$ to denote the feasible subsets of products that we can offer to the customers. Since there are $n$ products, the constraint $\sum_{i \in N} x_i \leq c$ is not binding when we have $c \geq n$. Thus, we obtain the uncapacitated assortment problem by choosing a value of $c$ that is no smaller than $n$, whereas we obtain the capacitated assortment problem with other values of $c$. In the assortment problem, our goal is to find a feasible subset of products to offer that maximizes the expected revenue obtained from a customer, corresponding to the combinatorial optimization problem
$$z^* = \max_{\mathbf{x} \in \mathcal{F}} \pi(\mathbf{x}) = \max_{\mathbf{x} \in \mathcal{F}} \left\{ \frac{\sum_{(i, j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}}\, R_{ij}(\mathbf{x})}{v_0 + \sum_{(i, j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}}} \right\}. \qquad \text{(Assortment)}$$

We note that our formulation of the PCL model is slightly different from the one that often appears in the literature.
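The expected revenue $\pi(\mathbf{x})$ and the Assortment problem defined above can be sketched in a few lines for very small instances. The code below evaluates $\pi(\mathbf{x})$ over the nests $(i, j)$ with $i \neq j$ and then enumerates every assortment with at most $c$ products. This brute-force search is only an illustration of the problem statement, not one of the approximation algorithms developed in this chapter; it takes time exponential in $n$. The function names and the numerical instance are hypothetical, and the sketch assumes $\gamma_{ij} \in (0, 1]$ so that the exponent $1/\gamma_{ij}$ is defined.

```python
from itertools import product


def expected_revenue(x, p, v, gamma, v0):
    """Expected revenue pi(x) under the PCL model with nests (i, j), i != j.

    Assumes every gamma[i][j] lies in (0, 1]; gamma = 0 is excluded here
    to avoid dividing by zero in the exponent 1 / gamma.
    """
    n = len(x)
    numerator, denominator = 0.0, v0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            g = gamma[i][j]
            w_i = v[i] ** (1.0 / g) * x[i]
            w_j = v[j] ** (1.0 / g) * x[j]
            V = w_i + w_j
            if V == 0.0:
                continue  # neither product is offered, so the nest contributes nothing
            numerator += V ** g * (p[i] * w_i + p[j] * w_j) / V  # V^gamma * R_ij
            denominator += V ** g
    return numerator / denominator


def best_assortment(p, v, gamma, v0, c):
    """Enumerate all x in {0,1}^n with sum(x) <= c and return a maximizer of pi."""
    n = len(p)
    feasible = (x for x in product((0, 1), repeat=n) if sum(x) <= c)
    return max(feasible, key=lambda x: expected_revenue(x, p, v, gamma, v0))


# Hypothetical instance, not taken from the thesis.
p = [10.0, 8.0, 6.0]
v = [1.0, 2.0, 3.0]
v0 = 4.0
gamma = [[0.5] * 3 for _ in range(3)]

x_best = best_assortment(p, v, gamma, v0, c=2)
print(x_best, expected_revenue(x_best, p, v, gamma, v0))
```

When every $\gamma_{ij}$ equals 1, each product appears in $2(n - 1)$ nests with its weight unchanged, so $\pi(\mathbf{x})$ collapses to a multinomial-logit-style ratio; this gives a convenient hand check of `expected_revenue` on small inputs.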
In the existing literature, the collection of nests is often $\{(i,j) \in N^2 : i < j\}$, whereas in our formulation, the collection of nests is $\{(i,j) \in N^2 : i \ne j\}$. If we let $\gamma_{ij} = \gamma_{ji}$ for all $(i,j) \in N^2$ with $i > j$ and double the preference weight of the no purchase option in our formulation, then it is simple to check that the two formulations of the PCL model are consistent. Our formulation of the PCL model will significantly reduce the notational burden. In the next theorem, we show that the Assortment problem is strongly NP-hard even when $\mathcal{F} = \{0,1\}^n$, so that the problem is uncapacitated, and $\gamma_{ij} = \gamma_{ji}$ for all $(i,j) \in M$, so that we focus on the formulation of the PCL model that often appears in the literature.

Theorem 3.2.1 (Computational Complexity) The Assortment problem is strongly NP-hard, even when we have $\mathcal{F} = \{0,1\}^n$ and $\gamma_{ij} = \gamma_{ji}$ for all $(i,j) \in M$.

The proof of Theorem 3.2.1 is in Appendix B.1. It uses a reduction from the max-cut problem, which is a well-known NP-hard problem; see Section A.2.2 in [54]. Motivated by this complexity result, we focus on developing approximation algorithms for the Assortment problem. For $\alpha \in [0,1]$, an $\alpha$-approximation algorithm is a polynomial-time algorithm that, for any problem instance, computes an assortment $\hat{\mathbf{x}} \in \mathcal{F}$ whose expected revenue is at least $\alpha$ times the optimal expected revenue; that is, since the optimal expected revenue is $z^*$, we have $\pi(\hat{\mathbf{x}}) \ge \alpha\, z^*$.

3.2.2 Paired Combinatorial Logit Model in Assortment Problems

In the Assortment problem, the assortment that we offer is a decision variable. Nevertheless, we use the PCL model with the same parameters to capture the choices of the customers within different assortments, simply dropping the products that are not offered. We can justify our approach by using the fact that the PCL model is a generalized extreme value model, which is a broad class of choice models that is based on random utility maximization.
In a generalized extreme value model, if we offer the assortment $\mathbf{x}$, then a customer associates the random utilities $\{\mu_i(x_i) + \epsilon_i : i \in N\}$ with the products, where $\mu_i(x_i)$ is the deterministic component and $\epsilon_i$ is the random shock for the utility of product $i$. For some fixed $\beta_i \in \mathbb{R}$, the deterministic component is given by $\mu_i(x_i) = \beta_i$ if $x_i = 1$, whereas $\mu_i(x_i) = -\infty$ if $x_i = 0$. So, if a product is not offered, then its utility is negative infinity. Similarly, a customer associates the random utility $\mu_0 + \epsilon_0$ with the no purchase option. The no purchase option is always available, but for notational uniformity, we use $\mu_0(x_0)$ to denote the deterministic component of the utility of the no purchase option. For some fixed $\beta_0 \in \mathbb{R}$, we have $\mu_0(x_0) = \beta_0$. The random shocks $(\epsilon_0, \epsilon_1, \ldots, \epsilon_n)$ have a cumulative distribution function of the form
$$\mathbb{P}\{\epsilon_0 \le u_0,\ \epsilon_1 \le u_1,\ \ldots,\ \epsilon_n \le u_n\} = \exp\big(-G(e^{-u_0}, e^{-u_1}, \ldots, e^{-u_n})\big)$$
for some function $G : \mathbb{R}_+^{n+1} \to \mathbb{R}_+$. Different choices of $G$ yield different generalized extreme value models. A customer chooses the alternative that provides the largest utility. Therefore, if we offer the assortment $\mathbf{x}$, then a customer chooses product $i$ with probability $\mathbb{P}\{\mu_i(x_i) + \epsilon_i = \max_{j \in N \cup \{0\}} \mu_j(x_j) + \epsilon_j\}$. Theorem 1 in [98] shows that if the function $G(\cdot,\cdot,\ldots,\cdot)$ satisfies a number of properties that ensure that $F(u_0, u_1, \ldots, u_n) = \exp\big(-G(e^{-u_0}, e^{-u_1}, \ldots, e^{-u_n})\big)$ is a cumulative distribution function, then the purchase probability of product $i$ under a generalized extreme value model has a closed-form expression given by
$$\mathbb{P}\Big\{\mu_i(x_i) + \epsilon_i = \max_{j \in N \cup \{0\}} \mu_j(x_j) + \epsilon_j\Big\} = \frac{e^{\mu_i(x_i)}\, \partial_i G\big(e^{\mu_0(x_0)}, e^{\mu_1(x_1)}, \ldots, e^{\mu_n(x_n)}\big)}{G\big(e^{\mu_0(x_0)}, e^{\mu_1(x_1)}, \ldots, e^{\mu_n(x_n)}\big)},$$
where $\partial_i G(y_0, y_1, \ldots, y_n)$ is the partial derivative of $G(\cdot,\cdot,\ldots,\cdot)$ with respect to the $i$-th coordinate evaluated at $(y_0, y_1, \ldots, y_n)$.
The PCL model is a generalized extreme value model that corresponds to the choice of $G(\cdot,\cdot,\ldots,\cdot)$ given by $G(y_0, y_1, \ldots, y_n) = y_0 + \sum_{(i,j)\in M} \big(y_i^{1/\gamma_{ij}} + y_j^{1/\gamma_{ij}}\big)^{\gamma_{ij}}$. Recalling that $v_i$ is the preference weight of product $i$ in the previous section, the preference weight of product $i$ is related to the deterministic component of the utility of this product through the relationship $v_i = e^{\beta_i}$. So, since $\mu_i(x_i) = \beta_i = \log v_i$ if $x_i = 1$, whereas $\mu_i(x_i) = -\infty$ if $x_i = 0$, we get $e^{\mu_i(x_i)} = v_i\, x_i$. Since $M$ consists of ordered pairs, product $i$ appears in the nests $(i,j)$ and $(j,i)$ for every $j \ne i$, and all of these nests contribute to the partial derivative of $G$ with respect to the $i$-th coordinate. In this case, if we offer the assortment $\mathbf{x}$, then the purchase probability of product $i$ under the PCL model is
$$\mathbb{P}\Big\{\mu_i(x_i) + \epsilon_i = \max_{j \in N \cup \{0\}} \mu_j(x_j) + \epsilon_j\Big\} = \frac{e^{\mu_i(x_i)}\, \partial_i G\big(e^{\mu_0(x_0)}, e^{\mu_1(x_1)}, \ldots, e^{\mu_n(x_n)}\big)}{G\big(e^{\mu_0(x_0)}, e^{\mu_1(x_1)}, \ldots, e^{\mu_n(x_n)}\big)}$$
$$= \frac{e^{\mu_i(x_i)} \sum_{(k,\ell)\in M :\, i \in \{k,\ell\}} e^{\mu_i(x_i)(1/\gamma_{k\ell} - 1)} \big(e^{\mu_k(x_k)/\gamma_{k\ell}} + e^{\mu_\ell(x_\ell)/\gamma_{k\ell}}\big)^{\gamma_{k\ell} - 1}}{e^{\mu_0(x_0)} + \sum_{(s,t)\in M} \big(e^{\mu_s(x_s)/\gamma_{st}} + e^{\mu_t(x_t)/\gamma_{st}}\big)^{\gamma_{st}}}$$
$$= \sum_{(k,\ell)\in M :\, i \in \{k,\ell\}} \frac{v_i^{1/\gamma_{k\ell}}\, x_i}{V_{k\ell}(\mathbf{x})} \cdot \frac{V_{k\ell}(\mathbf{x})^{\gamma_{k\ell}}}{v_0 + \sum_{(s,t)\in M} V_{st}(\mathbf{x})^{\gamma_{st}}},$$
where the second equality follows directly by differentiating $G(\cdot,\cdot,\ldots,\cdot)$ and the third equality holds since $e^{\mu_k(x_k)/\gamma_{k\ell}} + e^{\mu_\ell(x_\ell)/\gamma_{k\ell}} = v_k^{1/\gamma_{k\ell}} x_k + v_\ell^{1/\gamma_{k\ell}} x_\ell = V_{k\ell}(\mathbf{x})$. In the choice process discussed in the previous section, if we offer the assortment $\mathbf{x}$, then a customer chooses nest $(k,\ell)$ with probability $V_{k\ell}(\mathbf{x})^{\gamma_{k\ell}} / \big(v_0 + \sum_{(s,t)\in M} V_{st}(\mathbf{x})^{\gamma_{st}}\big)$. If the customer chooses a nest $(k,\ell)$ that contains product $i$, then she purchases product $i$ with probability $v_i^{1/\gamma_{k\ell}} x_i / V_{k\ell}(\mathbf{x})$. Thus, summing over the nests that contain product $i$, the purchase probability of product $i$ under any assortment $\mathbf{x}$ in the choice process discussed in the previous section is the same as the purchase probability on the right side above, which is obtained by using random utility maximization, justifying the use of the PCL model with the same parameters to capture choices within different assortments.
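The agreement between the generalized extreme value formula and the two-stage choice process can be checked numerically. The Python sketch below evaluates $y_i\, \partial_i G / G$ with a central finite difference and compares it with the two-stage probability summed over every nest that contains product $i$. The function names, the step size `h` and the test instance are illustrative assumptions, and the check is only run for offered products, where $y_i = v_i > 0$.

```python
from itertools import permutations


def G(y0, y, gamma):
    """PCL generating function: y0 + sum over ordered pairs (i, j), i != j."""
    total = y0
    for i, j in permutations(range(len(y)), 2):
        g = gamma[i][j]
        total += (y[i] ** (1.0 / g) + y[j] ** (1.0 / g)) ** g
    return total


def gev_probability(i, v, v0, gamma, x, h=1e-6):
    """y_i * dG/dy_i / G at y_k = v_k * x_k, via a central finite difference."""
    y = [v[k] * x[k] for k in range(len(v))]
    up, down = list(y), list(y)
    up[i] += h
    down[i] -= h
    dG = (G(v0, up, gamma) - G(v0, down, gamma)) / (2.0 * h)
    return y[i] * dG / G(v0, y, gamma)


def two_stage_probability(i, v, v0, gamma, x):
    """Nest choice times within-nest choice, summed over nests containing i."""
    nests = list(permutations(range(len(v)), 2))

    def V(k, l):
        g = gamma[k][l]
        return v[k] ** (1.0 / g) * x[k] + v[l] ** (1.0 / g) * x[l]

    denom = v0 + sum(V(k, l) ** gamma[k][l] for k, l in nests)
    total = 0.0
    for k, l in nests:
        if i in (k, l) and V(k, l) > 0.0:
            g = gamma[k][l]
            total += (v[i] ** (1.0 / g) * x[i] / V(k, l)) * V(k, l) ** g / denom
    return total


if __name__ == "__main__":
    v, x, v0 = [1.0, 2.0, 0.5, 1.5], [1, 1, 0, 1], 1.0
    # Hypothetical, deliberately asymmetric dissimilarity parameters in (0, 1).
    gamma = [[0.3 + 0.1 * ((i + 2 * j) % 7) for j in range(4)] for i in range(4)]
    for i in range(4):
        if x[i]:
            print(i, gev_probability(i, v, v0, gamma, x),
                  two_stage_probability(i, v, v0, gamma, x))
```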
Figure 3.1: Correlation coefficient of $(\epsilon_1, \epsilon_2)$ as a function of $\gamma_{12}$ (horizontal axis: $\gamma_{12}$; vertical axis: correlation coefficient of $\epsilon_1$ and $\epsilon_2$).

Since the cumulative distribution function of $(\epsilon_0, \epsilon_1, \ldots, \epsilon_n)$ is $\mathbb{P}\{\epsilon_0 \le u_0, \epsilon_1 \le u_1, \ldots, \epsilon_n \le u_n\} = \exp\big(-G(e^{-u_0}, e^{-u_1}, \ldots, e^{-u_n})\big)$ with $G(y_0, y_1, \ldots, y_n) = y_0 + \sum_{(i,j)\in M} \big(y_i^{1/\gamma_{ij}} + y_j^{1/\gamma_{ij}}\big)^{\gamma_{ij}}$, noting that $G(\cdot,\cdot,\ldots,\cdot)$ is not separable by the products, we cannot express the cumulative distribution function of $(\epsilon_0, \epsilon_1, \ldots, \epsilon_n)$ as a product of the individual cumulative distribution functions of $\epsilon_0, \epsilon_1, \ldots, \epsilon_n$. Thus, the random shocks are not independent. [84] discuss the correlation structure of the utilities under the PCL model, showing that the joint distribution of the pair $(\epsilon_i, \epsilon_j)$ depends only on the parameters $(\gamma_{ij}, \gamma_{ji})$. Considering the case $\gamma_{ij} = \gamma_{ji}$, we use $\gamma_{ij}$ with $i < j$ to denote the common value of $\gamma_{ij}$ and $\gamma_{ji}$. In Figure 3.1, we plot the correlation coefficient of $(\epsilon_1, \epsilon_2)$ as a function of $\gamma_{12}$ for the case with $n = 3$. As $\gamma_{12}$ increases, the correlation coefficient of $(\epsilon_1, \epsilon_2)$ decreases. If $\gamma_{12} = 1$, then $\epsilon_1$ and $\epsilon_2$ are uncorrelated.

We can use the parameters $\{\gamma_{ij} : (i,j) \in M\}$ to induce rather intricate correlations between the utilities. We can set $\gamma_{13} = 1$ to ensure that the utilities of alternatives 1 and 3 are uncorrelated, but we can also set $\gamma_{12} < 1$ and $\gamma_{23} < 1$ to ensure that there are correlations between the utilities of alternatives 1 and 2, as well as between those of alternatives 2 and 3. Such modeling flexibility can be useful in a variety of settings. [84] give an example where different alternatives correspond to different paths for commuting. Alternative 1 takes roads a and c. Alternative 2 takes roads b and c. Alternative 3 takes roads b and d. Since alternatives 1 and 2 share road c, the travel times provided by these alternatives are expected to be correlated, resulting in correlations between the utilities of alternatives 1 and 2. Similarly, alternatives 2 and 3 share road b.
Alternatives 1 and 3 do not share a road, in which case, the travel times provided by these alternatives may be uncorrelated, so there may be no correlation between the utilities of alternatives 1 and 3. Similar modeling flexibility can be useful when different products display different levels of similarity to each other.

The type of correlation structure discussed above is not possible under the multinomial logit and nested logit models. Under the multinomial logit model, the utilities of the different alternatives are uncorrelated. Under the nested logit model, the alternatives are grouped in nests. The utilities of the alternatives in the same nest are correlated and there is a single parameter that governs the correlation between the utilities of the alternatives in the same nest. The utilities of the alternatives in different nests are uncorrelated. Thus, under the nested logit model, if the utilities of alternatives 1 and 2 are correlated and the utilities of alternatives 2 and 3 are correlated, then the utilities of alternatives 1 and 3 must be correlated. As stated in Section 2.3 in [84], neither the PCL nor the nested logit model is a special case of the other, but the flexible correlation structure provided by the PCL model can be useful in a variety of settings. To further substantiate this point, in the next subsection, we provide a small numerical example showing that the PCL model can fit complex choice behavior significantly better than the multinomial logit model.

The generalized nested logit model is a generalized extreme value model that subsumes the multinomial logit, nested logit and PCL models as special cases. As discussed in Section 4.4.2 in [132], we can interpret the choice process of a customer under any generalized nested logit model as taking place in two stages. In the first stage, the customer chooses a nest. In the second stage, the customer chooses a product within the nest.
Under the nested logit model, since each alternative appears in only one nest with the utilities of the alternatives in the same nest being correlated and the utilities of the alternatives in different nests being uncorrelated, it is often the case that the two stages correspond to the actual thought process that a customer goes through when making a purchase. In the first stage, the customer chooses a group of products that are similar to each other, which is captured by a nest. In the second stage, the customer chooses a product within this group. We can interpret the choice process of a customer under the PCL model as taking place in two stages, but it is more difficult to argue that the two stages correspond to the actual thought process that a customer goes through when making a purchase. Instead, the two stages occur simply as a result of the correlation structure between the utilities and we only use the two stages to intuitively describe the PCL model. Lastly, since only the random shocks for the products that are offered play a role in the choice process, it is enough to sample the random shocks for the offered products according to their corresponding joint distribution. This approach is equivalent to sampling the random shocks for all products, but ignoring those for the products that are not offered.

3.2.3 A Small Numerical Example: Predictive Power of the PCL Model

In this subsection, we present a small numerical example. We generate choice data with a complex ground choice model and fit the data with the PCL model and the multinomial logit model. A comparison of out-of-sample metrics reveals a significant edge for the PCL model, as presented in Figure 3.3.

Setup of the example: To better motivate the use of the paired combinatorial logit (PCL) model, we consider the route choice of a commuter who needs to travel from the origin node to the destination node in the network given in Figure 3.2.
The edges in the network are labeled by $\{e_1, e_2, e_3, e_4, f_1, f_2, f_3, f_4\}$. The travel times on the edges are random, but the commuter knows the travel times on the edges before choosing a route. The (dis)utility of a route is given by the sum of the travel times on the edges that are in the route. The travel times on different edges are independent of each other, but since two different routes can include the same edge, the travel times for the different routes can be dependent. The commuter chooses the route that provides the largest utility. To avoid proliferation of the routes that can be chosen by the commuter from the origin to the destination, we limit the set of possible routes that the commuter can take to five routes $R_1, R_2, R_3, R_4$ and $R_5$, which are given below (see footnote 1).

$R_1 : e_1 \to e_2 \to e_3 \to e_4$
$R_2 : e_1 \to e_2 \to e_3 \to f_4$
$R_3 : e_1 \to e_2 \to f_3 \to f_4$
$R_4 : e_1 \to f_2 \to f_3 \to f_4$
$R_5 : f_1 \to f_2 \to f_3 \to f_4$

Footnote 1: We can consider the edges $e_1, e_2, e_3$ and $e_4$ as highway segments and $f_1, f_2, f_3$ and $f_4$ as local segments. In this case, the set of five routes $R_1, R_2, R_3, R_4$ and $R_5$ naturally occurs when the commuter is not willing to get back on a highway segment once she gets on a local segment.

Figure 3.2: Network for the commuting example, with an origin node, a destination node and the edges $e_1, \ldots, e_4$ and $f_1, \ldots, f_4$.

Correlation structure for the utilities: There are two interesting features of the setup above. First, the utilities for each pair of routes may have a different level of correlation between them. Therefore, it is desirable to use a choice model that allows us to control the correlation between the utilities of each pair of routes, bringing up the PCL model as a natural candidate for describing the choice process of the commuter among the routes. Second, the correlation structure for the utilities of the routes is quite different from the correlation structure that is specified by the nested logit model.
We explain these features in the two bullet points below.

• Since the different routes have different numbers of common edges, there can be different levels of correlation between the utilities of the routes. For example, routes $R_1$ and $R_2$ have three common edges. We expect the utilities of these routes to have relatively high correlation. On the other hand, routes $R_1$ and $R_4$ have only one common edge. We expect the utilities of these routes to have relatively low correlation. Since the routes $R_1$ and $R_5$ do not have a common edge, their utilities are independent of each other.

• We cannot partition the routes into disjoint nests in such a way that the utilities of the routes in different nests are independent of each other and the utilities of the routes in the same nest are dependent on each other. In particular, since routes $R_1, R_2, R_3$ and $R_4$ share edge $e_1$, their utilities are dependent on each other. Thus, we may want to put these routes in the same nest. Similarly, since routes $R_2, R_3, R_4$ and $R_5$ share edge $f_4$, their utilities are dependent on each other as well. Thus, we may want to put these routes in the same nest as well. If, however, we include all of the five routes in one nest, then the routes $R_1$ and $R_5$, whose utilities are independent of each other, end up being in the same nest.

Thus, the utilities of each pair of routes may have a different level of correlation between them. Furthermore, the correlation structure of the utilities does not fit the one under the nested logit model. The PCL model, with the parameter $\gamma_{ij}$ for each pair of routes $i$ and $j$, allows us to capture a different level of correlation between each pair of routes $i$ and $j$.

Fit of the PCL model: Consider the case where the utility provided by each edge is a normal random variable with mean 3 and standard deviation 1. The utility of a route is the sum of the utilities of the edges included in the route. The commuter chooses the route that provides the largest utility.
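The ground choice model just described is easy to simulate. The Python sketch below encodes the five routes as tuples of edges, computes the exact correlation between the utilities of two routes (the number of shared edges divided by four, since the edge utilities are independent with a common variance) and estimates the choice probabilities within an assortment of routes by Monte Carlo simulation. The function names, the sample size and the seed are illustrative assumptions, and the sketch assumes that the commuter always picks one of the offered routes.

```python
import random

# Routes from the commuting example, as tuples of edge labels.
ROUTES = {
    "R1": ("e1", "e2", "e3", "e4"),
    "R2": ("e1", "e2", "e3", "f4"),
    "R3": ("e1", "e2", "f3", "f4"),
    "R4": ("e1", "f2", "f3", "f4"),
    "R5": ("f1", "f2", "f3", "f4"),
}
EDGES = sorted({edge for route in ROUTES.values() for edge in route})


def utility_correlation(a, b):
    """Correlation of route utilities when edge utilities are i.i.d."""
    shared = len(set(ROUTES[a]) & set(ROUTES[b]))
    return shared / (len(ROUTES[a]) * len(ROUTES[b])) ** 0.5


def simulate_choice_probabilities(assortment, samples=20000, seed=0,
                                  mean=3.0, std=1.0):
    """Monte Carlo estimate of the ground choice probabilities within an assortment."""
    rng = random.Random(seed)
    counts = {route: 0 for route in assortment}
    for _ in range(samples):
        edge_utility = {edge: rng.gauss(mean, std) for edge in EDGES}
        best = max(assortment,
                   key=lambda r: sum(edge_utility[e] for e in ROUTES[r]))
        counts[best] += 1
    return {route: count / samples for route, count in counts.items()}


if __name__ == "__main__":
    for other in ("R2", "R4", "R5"):
        print("corr(R1, %s) =" % other, utility_correlation("R1", other))
    print(simulate_choice_probabilities(["R1", "R3", "R5"]))
```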
Assuming that the commuter chooses among the routes according to such a ground choice model, we offer the commuter $\tau$ randomly generated assortments of routes and sample the choice of the commuter within each of these assortments. In this way, we generate the training data $\{(S_t, i_t) : t = 1, \ldots, \tau\}$, where $S_t$ is the $t$-th offered assortment and $i_t$ is the choice of the commuter from this assortment according to the ground choice model. We vary $\tau$ to work with different levels of data availability. Note that the ground choice model is not a PCL model.

We fit a PCL and a multinomial logit (MNL) model to the training data. We check the mean absolute percent errors (MAPE) and the out-of-sample log-likelihoods of the fitted choice models. In particular, let $P_i^{\mathrm{GRN}}(S)$ be the probability that the commuter chooses route $i$ within the assortment of routes $S$ under the ground choice model. Similarly, let $P_i^{\mathrm{PCL}}(S)$ be the probability that the commuter chooses route $i$ within the assortment of routes $S$ under the fitted PCL model. Letting $N = \{R_1, \ldots, R_5\}$ be the set of routes, the MAPE for the fitted PCL model is
$$\mathrm{MAPE}^{\mathrm{PCL}} = \frac{1}{\sum_{S \in 2^N} |S|} \sum_{S \in 2^N} \sum_{i \in S} \frac{\big|P_i^{\mathrm{GRN}}(S) - P_i^{\mathrm{PCL}}(S)\big|}{P_i^{\mathrm{GRN}}(S)}.$$
Note that we can compute the choice probability $P_i^{\mathrm{GRN}}(S)$ for the ground choice model by using simulation. We define the MAPE for the fitted MNL model similarly. Also, by using the same approach that we use to generate the training data $\{(S_t, i_t) : t = 1, \ldots, \tau\}$, we generated another 10,000 assortment-choice pairs for the commuter from the ground choice model. Using these data points as the testing data, we compute the out-of-sample log-likelihoods of the fitted PCL and MNL models. Note that although the PCL model has more parameters than the MNL model, it is not necessarily guaranteed that the MAPE and the out-of-sample log-likelihood for the fitted PCL model will be better than those for the fitted MNL model.
With its larger number of parameters, the PCL model may overfit to the data and its out-of-sample performance may be poor.

We show our results in Figure 3.3. In the two panels, the horizontal axes show the number of data points $\tau$ that we use to fit the PCL and MNL models. The vertical axes of the two panels show, respectively, the MAPE and the out-of-sample log-likelihoods of the fitted PCL and MNL models (see footnote 2). The overall observation from our results is that if $\tau$ is reasonably large so that we have a reasonably large amount of data, then the MAPE and out-of-sample log-likelihood for the fitted PCL model are significantly better than those for the fitted MNL model. However, if we have too little data, then the performance measures for the fitted PCL and MNL models are roughly comparable. Note that the number of parameters that we estimate for the PCL model is $O(|N|^2)$, whereas the number of parameters that we estimate for the MNL model is $O(|N|)$. Therefore, if we have too little data, then it may be difficult to estimate the larger number of parameters for the PCL model. For example, if $\tau = 100$, then the out-of-sample log-likelihood for the fitted MNL model is better than that for the fitted PCL model. However, as the data availability increases, the fitted PCL model quickly starts to perform significantly better than the fitted MNL model. For example, if $\tau = 1000$ or larger, then the MAPE and the out-of-sample log-likelihood for the fitted PCL model are significantly better than those of the fitted MNL model.

Footnote 2: To reduce the effect of noise, we repeated our experiments 100 times and report the average of the MAPEs and out-of-sample log-likelihoods over the 100 repetitions.

Figure 3.3: MAPEs and out-of-sample log-likelihoods for the fitted PCL and MNL models.

3.3 A General Framework for Approximation Algorithms

In this section, we provide a general framework that is useful for developing approximation algorithms for the Assortment problem.
Our framework is applicable to both the uncapacitated and capacitated problems simultaneously.

3.3.1 Connection to a Fixed Point Problem

Note that we have
$$\pi(\mathbf{x}) = \frac{\sum_{(i,j)\in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} R_{ij}(\mathbf{x})}{v_0 + \sum_{(i,j)\in M} V_{ij}(\mathbf{x})^{\gamma_{ij}}} \ge z$$
if and only if $\sum_{(i,j)\in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z) \ge v_0\, z$. Therefore, to check whether there exists a feasible subset of products that provides a revenue of $z$ or more, we can check whether $\sum_{(i,j)\in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z) \ge v_0\, z$ for some $\mathbf{x} \in \mathcal{F}$. Motivated by this observation, we define the function $f : \mathbb{R} \to \mathbb{R}$ as
$$f(z) = \max_{\mathbf{x}\in\mathcal{F}} \left\{ \sum_{(i,j)\in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z) \right\}. \qquad \text{(Function Evaluation)}$$

For each $\mathbf{x} \in \mathcal{F}$, the objective function of the Function Evaluation problem on the right side above is decreasing in $z$, which implies that $f(z)$ is also decreasing in $z$. It is also not difficult to see that $f(\cdot)$ is continuous. For any $z^+ \ge z^-$, letting $\mathbf{x}^*$ be an optimal solution to the Function Evaluation problem when we solve this problem with $z = z^-$, we have $f(z^-) = \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} (R_{ij}(\mathbf{x}^*) - z^-)$ and $f(z^+) \ge \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} (R_{ij}(\mathbf{x}^*) - z^+)$, where the inequality uses the fact that $\mathbf{x}^*$ is a feasible but not necessarily an optimal solution to the Function Evaluation problem when we solve this problem with $z = z^+$. Subtracting the last inequality from the last equality and noting that $f(\cdot)$ is decreasing, we obtain $0 \le f(z^-) - f(z^+) \le \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} (z^+ - z^-)$. Letting $V_{\max} = \max\{v_i : i \in N\}$, we have $V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} \le 2\, V_{\max}$ for each nest by the definition of $V_{ij}(\mathbf{x})$. Thus, summing over the $|M|$ nests, we obtain $0 \le f(z^-) - f(z^+) \le 2\, |M|\, V_{\max} (z^+ - z^-)$, which implies that $f(\cdot)$ is continuous. Lastly, $\mathbf{x} = \mathbf{0} \in \mathbb{R}^n$ is a feasible solution to the Function Evaluation problem and provides an objective value of zero, which implies that $f(\cdot) \ge 0$.

Therefore, $f(\cdot)$ is decreasing and continuous, satisfying $f(0) \ge 0$. On the other hand, defining the function $g : \mathbb{R} \to \mathbb{R}$ as $g(z) = v_0\, z$, $g(\cdot)$ is strictly increasing and continuous, satisfying $g(0) = 0$ and $\lim_{z\to\infty} g(z) = \infty$.
In this case, there exists a unique $\hat{z} \ge 0$ satisfying $f(\hat{z}) = v_0\, \hat{z}$. Note that the value of $\hat{z}$ that satisfies $f(\hat{z}) = v_0\, \hat{z}$ is the fixed point of the function $f(\cdot)/v_0$. It is possible to show that this value of $\hat{z}$ corresponds to the optimal objective value of the Assortment problem and we can solve the Function Evaluation problem with $z = \hat{z}$ to obtain an optimal solution to the Assortment problem. We do not give a proof for this result, since this result follows as a corollary to a more general result that we shortly give in Theorem 3.3.1.

Noting that the Function Evaluation problem is a nonlinear integer program, finding the fixed point of $f(\cdot)/v_0$ can be difficult. To get around this difficulty, we will construct an upper bound $f^R(\cdot)$ on $f(\cdot)$ so that $f^R(z) \ge f(z)$ for all $z \in \mathbb{R}$. We will construct the upper bound $f^R(\cdot)$ by using an LP or SDP relaxation of the Function Evaluation problem, so we can compute the upper bound $f^R(z)$ at any $z \in \mathbb{R}$ efficiently. The upper bound $f^R(\cdot)$ will be decreasing and continuous, satisfying $f^R(\cdot) \ge 0$. Therefore, by the same argument in the previous paragraph, there exists a unique $\hat{z} \ge 0$ satisfying $f^R(\hat{z}) = v_0\, \hat{z}$, which is the fixed point of the function $f^R(\cdot)/v_0$. In the next theorem, we show that this value of $\hat{z}$ upper bounds the optimal objective value of the Assortment problem and we can use this value of $\hat{z}$ to obtain an approximate solution.

Theorem 3.3.1 (Approximation Framework) Assume that $f^R(\cdot)$ satisfies $f^R(z) \ge f(z)$ for all $z \in \mathbb{R}$. Let $\hat{z} \ge 0$ satisfy $f^R(\hat{z}) = v_0\, \hat{z}$ and $\hat{\mathbf{x}} \in \mathcal{F}$ be such that
$$\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z}) \qquad \text{(Sufficient Condition)}$$
for some $\alpha \in [0,1]$. Then, we have $\pi(\hat{\mathbf{x}}) \ge \alpha\, \hat{z} \ge \alpha\, z^*$; so, $\hat{z}$ upper bounds the optimal objective value of the Assortment problem and $\hat{\mathbf{x}}$ is an $\alpha$-approximate solution to the Assortment problem.
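Proof: Noting that $v_0\, \hat{z} = f^R(\hat{z})$, we have $\alpha\, v_0\, \hat{z} = \alpha\, f^R(\hat{z}) \le \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \le \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \alpha\, \hat{z})$, where the first inequality uses the Sufficient Condition and the second inequality uses $\alpha \le 1$ and $\hat{z} \ge 0$. Thus, we have $\alpha\, v_0\, \hat{z} \le \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \alpha\, \hat{z})$, in which case, solving for $\hat{z}$ in the last inequality, we get $\alpha\, \hat{z} \le \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} R_{ij}(\hat{\mathbf{x}}) / \big(v_0 + \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}}\big) = \pi(\hat{\mathbf{x}})$. Next, we show that $\hat{z} \ge z^*$. We let $\mathbf{x}^*$ be an optimal solution to the Assortment problem. Since $\mathbf{x}^*$ is a feasible but not necessarily an optimal solution to the Function Evaluation problem when we solve this problem with $z = \hat{z}$, we have $f(\hat{z}) \ge \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} (R_{ij}(\mathbf{x}^*) - \hat{z})$. Noting that $v_0\, \hat{z} = f^R(\hat{z}) \ge f(\hat{z})$, we obtain $v_0\, \hat{z} \ge \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} (R_{ij}(\mathbf{x}^*) - \hat{z})$. Solving for $\hat{z}$ in this inequality, we get $\hat{z} \ge \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}} R_{ij}(\mathbf{x}^*) / \big(v_0 + \sum_{(i,j)\in M} V_{ij}(\mathbf{x}^*)^{\gamma_{ij}}\big) = \pi(\mathbf{x}^*) = z^*$.

By Theorem 3.3.1, we can obtain an approximate solution to the Assortment problem by using the fixed point of $f^R(\cdot)/v_0$, but this theorem also indicates that we can obtain an optimal solution to the Assortment problem by using the fixed point of $f(\cdot)/v_0$. In particular, letting $\hat{z} \ge 0$ be such that $f(\hat{z}) = v_0\, \hat{z}$, let $\hat{\mathbf{x}}$ be an optimal solution to the Function Evaluation problem with $z = \hat{z}$. Thus, we have $f(\hat{z}) = \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z})$, which implies that $\hat{z}$ and $\hat{\mathbf{x}}$ satisfy the Sufficient Condition with $f^R(\cdot) = f(\cdot)$ and $\alpha = 1$. In this case, by Theorem 3.3.1, we obtain $\pi(\hat{\mathbf{x}}) \ge \hat{z} \ge z^*$. Since $\hat{\mathbf{x}}$ is a feasible but not necessarily an optimal solution to the Assortment problem, we also have $z^* \ge \pi(\hat{\mathbf{x}})$, which yields $z^* \ge \pi(\hat{\mathbf{x}}) \ge \hat{z} \ge z^*$. Thus, all of the inequalities in the last chain of inequalities hold as equalities and $\hat{\mathbf{x}}$ is an optimal solution to the Assortment problem.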
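The exact fixed-point characterization can be checked by brute force on a small instance. The Python sketch below evaluates $f(z)$ by enumerating all feasible assortments, finds the solution of $f(z) = v_0 z$ by bisection on the interval $[0, \max_i p_i]$ and lets us compare it with the optimal expected revenue obtained by enumeration. Enumeration is exponential in $n$, so this is only a sanity check and not the LP-based approach developed in the text; the function names and the instance are illustrative assumptions, and all $\gamma_{ij}$ are assumed to be strictly positive.

```python
from itertools import permutations, product


def nest_values(p, v, gamma, x):
    """Pairs (V_ij(x)^gamma_ij, R_ij(x)) over the nonempty nests (i, j), i != j."""
    values = []
    for i, j in permutations(range(len(v)), 2):
        g = gamma[i][j]
        w_i = v[i] ** (1.0 / g) * x[i]
        w_j = v[j] ** (1.0 / g) * x[j]
        V = w_i + w_j
        if V > 0.0:
            values.append((V ** g, (p[i] * w_i + p[j] * w_j) / V))
    return values


def revenue(p, v, v0, gamma, x):
    """Expected revenue pi(x) of the assortment x."""
    values = nest_values(p, v, gamma, x)
    return sum(a * r for a, r in values) / (v0 + sum(a for a, _ in values))


def feasible(n, c):
    """All 0/1 vectors of length n with at most c ones."""
    return [x for x in product((0, 1), repeat=n) if sum(x) <= c]


def f(z, p, v, gamma, c):
    """Function Evaluation problem, solved by enumeration."""
    return max(sum(a * (r - z) for a, r in nest_values(p, v, gamma, x))
               for x in feasible(len(v), c))


def fixed_point(p, v, v0, gamma, c, tol=1e-10):
    """Bisection for f(z) = v0 * z; f(z) - v0 * z is strictly decreasing."""
    lo, hi = 0.0, max(p)  # f(0) >= 0, and f(max p) = 0 <= v0 * max p
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, p, v, gamma, c) > v0 * mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    p, v, v0 = [10.0, 8.0, 6.0, 4.0], [0.5, 1.0, 1.5, 2.0], 2.0
    gamma = [[0.4 + 0.1 * ((i + j) % 5) for j in range(4)] for i in range(4)]
    for c in (2, 4):
        z_hat = fixed_point(p, v, v0, gamma, c)
        z_star = max(revenue(p, v, v0, gamma, x) for x in feasible(4, c))
        print(c, z_hat, z_star)
```

Bisection applies because $f(z) - v_0 z$ is continuous and strictly decreasing, nonnegative at $z = 0$ and nonpositive at $z = \max_i p_i$, where no nest can contribute a positive term.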
By Theorem 3.3.1, to obtain an $\alpha$-approximate solution to the Assortment problem, it is enough to execute the following three steps.

Approximation Framework
Step 1: Construct an upper bound $f^R(\cdot)$ on $f(\cdot)$ such that $f^R(z) \ge f(z)$ for all $z \in \mathbb{R}$.
Step 2: Find the fixed point $\hat{z}$ of $f^R(\cdot)/v_0$; that is, find the value of $\hat{z}$ such that $f^R(\hat{z}) = v_0\, \hat{z}$.
Step 3: Find an assortment $\hat{\mathbf{x}} \in \mathcal{F}$ such that $\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$.

In Section 3.3.2, we show how to construct the upper bound $f^R(\cdot)$ on $f(\cdot)$ by using an LP relaxation of the Function Evaluation problem. In Section 3.3.3, we show how to compute the fixed point of $f^R(\cdot)/v_0$ by solving an LP. Using these results, we can execute Steps 1 and 2 in our approximation framework. In Section 3.4, we tackle the uncapacitated problem and use randomized rounding on an optimal solution of an LP relaxation of the Function Evaluation problem to construct an assortment $\hat{\mathbf{x}}$ that satisfies the Sufficient Condition with $\alpha = 0.6$, yielding a 0.6-approximation algorithm. We also discuss an SDP relaxation to satisfy the Sufficient Condition with $\alpha = 0.79$. In Section 3.5, we tackle the capacitated problem and use iterative rounding to construct an assortment that satisfies the Sufficient Condition with $\alpha = 0.25$, yielding a 0.25-approximation algorithm.

[37], [51] and [45] use analogues of the Function Evaluation problem and Theorem 3.3.1 to solve assortment problems under the nested logit model, but these authors face rather different algorithmic challenges. First, under the nested logit model, since each product appears in one nest, the analogue of the Function Evaluation problem decomposes into one subproblem for each nest. The authors focus on the subproblem for each nest separately. Second, using an equivalent LP formulation for the subproblem for each nest, the authors characterize the optimal solutions for the subproblem for all possible values of $z \in \mathbb{R}$.
In this way, they construct a collection of candidate assortments for each nest that is guaranteed to include the optimal assortment to offer in the nest. The size of the candidate collection for each nest is polynomial in the number of products. Third, the authors solve an LP to stitch together an optimal solution to the assortment problem by picking one assortment from the candidate collection for each nest. The steps described above are not possible under the PCL model. First, since a product appears in multiple nests under the PCL model, the Function Evaluation problem does not decompose by the nests. Second, the idea of constructing a collection of candidate assortments for each nest does not yield a useful algorithmic approach under the PCL model, since there are two products in each nest, yielding four possible assortments in each nest anyway. Third, there is a stronger interaction between the nests under the PCL model. If we offer an assortment that includes product $i$ in nest $(i,j)$, then we must offer assortments that include product $i$ in all nests $\{(i,\ell) : \ell \in N \setminus \{i\}\}$. Due to this interaction, we cannot solve an LP to stitch together an optimal solution to the assortment problem by picking one assortment from a candidate collection for each nest.

3.3.2 Constructing an Upper Bound

We construct an upper bound $f^R(\cdot)$ on $f(\cdot)$ by using an LP relaxation of the Function Evaluation problem. Noting the definition of $V_{ij}(\mathbf{x})$ and $R_{ij}(\mathbf{x})$, we have
$$V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z) = \big(v_i^{1/\gamma_{ij}} x_i + v_j^{1/\gamma_{ij}} x_j\big)^{\gamma_{ij}}\, \frac{(p_i - z)\, v_i^{1/\gamma_{ij}} x_i + (p_j - z)\, v_j^{1/\gamma_{ij}} x_j}{v_i^{1/\gamma_{ij}} x_i + v_j^{1/\gamma_{ij}} x_j}.$$
We let $\rho_{ij}(z)$ be the expression on the right side above when $x_i = 1$ and $x_j = 1$ and $\theta_i(z)$ be the expression on the right side above when $x_i = 1$ and $x_j = 0$.
In other words, we have
$$\rho_{ij}(z) = \big(v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}\big)^{\gamma_{ij}}\, \frac{(p_i - z)\, v_i^{1/\gamma_{ij}} + (p_j - z)\, v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \qquad \text{and} \qquad \theta_i(z) = v_i\, (p_i - z).$$
There are only four possible values of $(x_i, x_j)$. In this case, letting $\mu_{ij}(z) = \rho_{ij}(z) - \theta_i(z) - \theta_j(z)$ for notational brevity, we can express $V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z)$ succinctly as
$$V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z) = \rho_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i (1 - x_j) + \theta_j(z)\, (1 - x_i)\, x_j = \mu_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i + \theta_j(z)\, x_j.$$
Writing its objective function as $\sum_{(i,j)\in M} \big(\mu_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i + \theta_j(z)\, x_j\big)$, the Function Evaluation problem is equivalent to
$$f(z) = \max \left\{ \sum_{(i,j)\in M} \big(\mu_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i + \theta_j(z)\, x_j\big) \;:\; \sum_{i\in N} x_i \le c,\ \ x_i \in \{0,1\}\ \ \forall\, i \in N \right\}.$$
In general, the sign of $\mu_{ij}(z)$ can be positive or negative, but we shortly show that $\mu_{ij}(z) \le 0$ whenever $p_i \ge z$ and $p_j \ge z$.

To construct an upper bound $f^R(\cdot)$ on $f(\cdot)$, we use a standard approach to linearize the term $x_i x_j$ in the objective function above. Define the decision variable $y_{ij} \in \{0,1\}$ with the interpretation that $y_{ij} = x_i x_j$. To ensure that $y_{ij}$ takes the value $x_i x_j$, we impose the constraints $y_{ij} \ge x_i + x_j - 1$, $y_{ij} \le x_i$ and $y_{ij} \le x_j$. If $x_i = 0$ or $x_j = 0$, then the constraints $y_{ij} \le x_i$ and $y_{ij} \le x_j$ ensure that $y_{ij} = 0$. If $x_i = 1$ and $x_j = 1$, then the constraint $y_{ij} \ge x_i + x_j - 1$ ensures that $y_{ij} = 1$. We define our upper bound $f^R(\cdot)$ on $f(\cdot)$ by using the LP relaxation
$$
\begin{aligned}
f^R(z) = \max\ \ & \sum_{(i,j)\in M} \big(\mu_{ij}(z)\, y_{ij} + \theta_i(z)\, x_i + \theta_j(z)\, x_j\big) && \text{(Upper Bound)}\\
\text{s.t.}\ \ & y_{ij} \ge x_i + x_j - 1 && \forall\, (i,j) \in M\\
& y_{ij} \le x_i,\ \ y_{ij} \le x_j && \forall\, (i,j) \in M\\
& \sum_{i\in N} x_i \le c\\
& 0 \le x_i \le 1\ \ \forall\, i \in N, \quad y_{ij} \ge 0\ \ \forall\, (i,j) \in M.
\end{aligned}
$$
Since the Upper Bound problem is an LP relaxation of the Function Evaluation problem, we have $f^R(z) \ge f(z)$ for all $z \in \mathbb{R}$. Setting $x_i = 0$ for all $i \in N$ and $y_{ij} = 0$ for all $(i,j) \in M$ gives a feasible solution to the Upper Bound problem, so $f^R(\cdot) \ge 0$.
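The four-case identity behind this linearization, together with the sign of $\mu_{ij}(z)$, can be checked numerically for a single nest. The Python sketch below compares $V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z)$ with $\mu_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i + \theta_j(z)\, x_j$ for each of the four values of $(x_i, x_j)$. The function names and the numbers in the example are illustrative assumptions, and the value of $z$ is chosen so that $p_i \ge z$ and $p_j \ge z$.

```python
def theta(p, v, z):
    """theta_i(z) = v_i * (p_i - z)."""
    return v * (p - z)


def rho(p_i, p_j, v_i, v_j, g, z):
    """rho_ij(z): the nest term when both products are offered."""
    w_i, w_j = v_i ** (1.0 / g), v_j ** (1.0 / g)
    return (w_i + w_j) ** g * ((p_i - z) * w_i + (p_j - z) * w_j) / (w_i + w_j)


def mu(p_i, p_j, v_i, v_j, g, z):
    """mu_ij(z) = rho_ij(z) - theta_i(z) - theta_j(z)."""
    return rho(p_i, p_j, v_i, v_j, g, z) - theta(p_i, v_i, z) - theta(p_j, v_j, z)


def nest_objective(p_i, p_j, v_i, v_j, g, z, x_i, x_j):
    """V_ij(x)^gamma_ij * (R_ij(x) - z), with value 0 for an empty nest."""
    w_i, w_j = v_i ** (1.0 / g) * x_i, v_j ** (1.0 / g) * x_j
    V = w_i + w_j
    if V == 0.0:
        return 0.0
    return V ** g * ((p_i - z) * w_i + (p_j - z) * w_j) / V


def linearized(p_i, p_j, v_i, v_j, g, z, x_i, x_j):
    """mu_ij(z) x_i x_j + theta_i(z) x_i + theta_j(z) x_j."""
    return (mu(p_i, p_j, v_i, v_j, g, z) * x_i * x_j
            + theta(p_i, v_i, z) * x_i + theta(p_j, v_j, z) * x_j)


if __name__ == "__main__":
    params = (10.0, 6.0, 1.2, 0.7, 0.4, 3.0)  # p_i, p_j, v_i, v_j, gamma_ij, z
    for x_i in (0, 1):
        for x_j in (0, 1):
            print((x_i, x_j), nest_objective(*params, x_i, x_j),
                  linearized(*params, x_i, x_j))
    print("mu_ij(z) =", mu(*params))
```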
Since $\theta_i(z)$ and $\mu_{ij}(z)$ are continuous in $z$ and the optimal objective value of a bounded LP is continuous in its objective function coefficients, $f^R(z)$ is continuous in $z$. It is not immediately clear that $f^R(z)$ is decreasing in $z$ since it is not immediately clear that the objective function coefficient $\mu_{ij}(z)$ in the Upper Bound problem is decreasing in $z$. In the next lemma, we show that $f^R(z)$ is decreasing in $z$. Since $f^R(\cdot)$ is decreasing and continuous with $f^R(0) \ge 0$, there exists a unique $\hat{z} \ge 0$ satisfying $f^R(\hat{z}) = v_0\, \hat{z}$.

Lemma 3.3.2 (Monotonicity of Upper Bound) The optimal objective value of the Upper Bound problem is decreasing in $z$.

Proof: Consider $z^+ \ge z^-$ and let $(\mathbf{x}^*, \mathbf{y}^*)$ with $\mathbf{y}^* = \{y^*_{ij} : (i,j) \in M\}$ be an optimal solution to the Upper Bound problem when we solve this problem with $z = z^+$. Since $\mu_{ij}(z) = \rho_{ij}(z) - \theta_i(z) - \theta_j(z)$ by the definition of $\mu_{ij}(z)$, we obtain
$$
\begin{aligned}
f^R(z^+) &= \sum_{(i,j)\in M} \big(\mu_{ij}(z^+)\, y^*_{ij} + \theta_i(z^+)\, x^*_i + \theta_j(z^+)\, x^*_j\big)\\
&= \sum_{(i,j)\in M} \big(\rho_{ij}(z^+)\, y^*_{ij} + \theta_i(z^+)\, (x^*_i - y^*_{ij}) + \theta_j(z^+)\, (x^*_j - y^*_{ij})\big)\\
&\le \sum_{(i,j)\in M} \big(\rho_{ij}(z^-)\, y^*_{ij} + \theta_i(z^-)\, (x^*_i - y^*_{ij}) + \theta_j(z^-)\, (x^*_j - y^*_{ij})\big)\\
&= \sum_{(i,j)\in M} \big(\mu_{ij}(z^-)\, y^*_{ij} + \theta_i(z^-)\, x^*_i + \theta_j(z^-)\, x^*_j\big) \;\le\; f^R(z^-),
\end{aligned}
$$
where the first inequality is by the fact that $\rho_{ij}(z)$ and $\theta_i(z)$ are decreasing in $z$, along with the fact that $y^*_{ij} \le x^*_i$ and $y^*_{ij} \le x^*_j$, whereas the second inequality is by the fact that $(\mathbf{x}^*, \mathbf{y}^*)$ is a feasible but not necessarily an optimal solution to the Upper Bound problem with $z = z^-$.

One useful property of the Upper Bound problem is that there exists an optimal solution to this problem where the decision variable $x_i$ takes a non-zero value only when $p_i \ge z$ and the decision variable $y_{ij}$ takes a non-zero value only when $p_i \ge z$ and $p_j \ge z$. Thus, we need to keep the decision variable $x_i$ only when $p_i \ge z$ and we need to keep the decision variable $y_{ij}$ only when $p_i \ge z$ and $p_j \ge z$. This property allows us to significantly simplify the Upper Bound problem.
In particular, let $N(z) = \{i \in N : p_i \ge z\}$ and $M(z) = \{(i,j) \in N(z)^2 : i \ne j\}$. In Lemma B.2.1 in Appendix B.2, we show that there exists an optimal solution $\boldsymbol{x}^* = \{x^*_i : i \in N\}$ and $\boldsymbol{y}^* = \{y^*_{ij} : (i,j) \in M\}$ to the Upper Bound problem such that $x^*_i = 0$ for all $i \notin N(z)$ and $y^*_{ij} = 0$ for all $(i,j) \notin M(z)$. The proof of this result follows by showing that if $\hat{\boldsymbol{x}} = \{\hat{x}_i : i \in N\}$ and $\hat{\boldsymbol{y}} = \{\hat{y}_{ij} : (i,j) \in M\}$ is a feasible solution to the Upper Bound problem, then we can set $\hat{x}_i = 0$ for all $i \notin N(z)$ and $\hat{y}_{ij} = 0$ for all $(i,j) \notin M(z)$ while making sure that the solution $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ remains feasible to the Upper Bound problem and we do not degrade the objective value provided by this solution. In this case, letting $1(\cdot)$ be the indicator function and dropping the decision variable $x_i$ for all $i \notin N(z)$ and the decision variable $y_{ij}$ for all $(i,j) \notin M(z)$, we write the objective function of the Upper Bound problem as
\[
\begin{aligned}
&\sum_{(i,j)\in M} 1(i \in N(z),\, j \in N(z))\, \big(\mu_{ij}(z)\, y_{ij} + \theta_i(z)\, x_i + \theta_j(z)\, x_j\big) \\
&\quad + \sum_{(i,j)\in M} 1(i \in N(z),\, j \notin N(z))\, \theta_i(z)\, x_i + \sum_{(i,j)\in M} 1(i \notin N(z),\, j \in N(z))\, \theta_j(z)\, x_j .
\end{aligned}
\]
For the last two sums, we have $\sum_{(i,j)\in M} 1(i \in N(z),\, j \notin N(z))\, \theta_i(z)\, x_i = |N \setminus N(z)| \sum_{i \in N(z)} \theta_i(z)\, x_i$ and $\sum_{(i,j)\in M} 1(i \notin N(z),\, j \in N(z))\, \theta_j(z)\, x_j = |N \setminus N(z)| \sum_{j \in N(z)} \theta_j(z)\, x_j$. Thus, the objective function of the Upper Bound problem takes the form $\sum_{(i,j)\in M(z)} (\mu_{ij}(z)\, y_{ij} + \theta_i(z)\, x_i + \theta_j(z)\, x_j) + 2\,|N \setminus N(z)| \sum_{i\in N(z)} \theta_i(z)\, x_i$. A simple lemma, given as Lemma B.2.2 in Appendix B.2, shows that $\mu_{ij}(z) \le 0$ for all $(i,j) \in M(z)$. So, the decision variable $y_{ij}$ takes its smallest possible value in the Upper Bound problem, which implies that the constraints $y_{ij} \le x_i$ and $y_{ij} \le x_j$ are redundant.
In this case, the Upper Bound problem is equivalent to the problem
\[
\begin{aligned}
f^R(z) \;=\; \max \quad & \sum_{(i,j)\in M(z)} \big(\mu_{ij}(z)\, y_{ij} + \theta_i(z)\, x_i + \theta_j(z)\, x_j\big) + 2\,|N \setminus N(z)| \sum_{i\in N(z)} \theta_i(z)\, x_i \\
\text{s.t.} \quad & y_{ij} \ge x_i + x_j - 1 \qquad \forall\, (i,j) \in M(z) \\
& \sum_{i\in N(z)} x_i \le c \\
& 0 \le x_i \le 1 \;\; \forall\, i \in N(z), \qquad y_{ij} \ge 0 \;\; \forall\, (i,j) \in M(z).
\end{aligned}
\tag{Compact Upper Bound}
\]
Both the Upper Bound and Compact Upper Bound problems will be useful. We will use the Upper Bound problem to find the fixed point of $f^R(\cdot)/v_0$. We will use the Compact Upper Bound problem above to find an assortment $\hat{\boldsymbol{x}}$ satisfying the Sufficient Condition.

Noting the objective function $\sum_{(i,j)\in M} (\mu_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i + \theta_j(z)\, x_j)$ in the equivalent formulation of the Function Evaluation problem, this problem does not decompose by the nests even when we focus on the uncapacitated assortment problem. Therefore, unlike the approach used by [37], [51] and [45] for tackling assortment problems under the nested logit model, since the Function Evaluation problem does not decompose by the nests under the PCL model, it is difficult to characterize the optimal or approximate solutions to the Function Evaluation problem for all values of $z \in \mathbb{R}$. Instead, we approximate the Function Evaluation problem without decomposing it. Rather than characterizing approximate solutions for all values of $z \in \mathbb{R}$, we construct an approximate solution to the Function Evaluation problem with $z = \hat{z}$, where $\hat{z}$ is the fixed point of $f^R(\cdot)/v_0$.

3.3.3 Computing the Fixed Point

To compute the fixed point of $f^R(\cdot)/v_0$, we use the dual of the Upper Bound problem. For each $(i,j) \in M$, let $\alpha_{ij}$, $\beta_{ij}$, and $\gamma_{ij}$ be the dual variables of the constraints $y_{ij} \ge x_i + x_j - 1$, $y_{ij} \le x_i$ and $y_{ij} \le x_j$, respectively. For each $i \in N$, let $\lambda_i$ be the dual variable of the constraint $x_i \le 1$. Let $\delta$ be the dual variable of the constraint $\sum_{i\in N} x_i \le c$.
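The dual of the Upper Bound problem is
\[
\begin{aligned}
f^R(z) \;=\; \min \quad & c\,\delta + \sum_{i\in N} \lambda_i + \sum_{(i,j)\in M} \alpha_{ij} \\
\text{s.t.} \quad & -\alpha_{ij} + \beta_{ij} + \gamma_{ij} \ge \mu_{ij}(z) \qquad \forall\, (i,j) \in M \\
& \delta + \lambda_i + \sum_{j\in N} 1(j \ne i)\,(\alpha_{ij} + \alpha_{ji} - \beta_{ij} - \gamma_{ji}) \ge 2\,(n-1)\,\theta_i(z) \qquad \forall\, i \in N \\
& \alpha_{ij} \ge 0, \;\; \beta_{ij} \ge 0, \;\; \gamma_{ij} \ge 0 \;\; \forall\, (i,j) \in M, \qquad \lambda_i \ge 0 \;\; \forall\, i \in N, \qquad \delta \ge 0.
\end{aligned}
\tag{Dual}
\]
In the Dual problem above, the decision variables are $\boldsymbol{\alpha} = \{\alpha_{ij} : (i,j) \in M\}$, $\boldsymbol{\beta} = \{\beta_{ij} : (i,j) \in M\}$, $\boldsymbol{\gamma} = \{\gamma_{ij} : (i,j) \in M\}$, $\boldsymbol{\lambda} = \{\lambda_i : i \in N\}$ and $\delta$. We treat $z$ as fixed. In the next theorem, we show that if we treat $z$ as a decision variable and add one constraint to the Dual problem that involves the decision variable $z$, then we can recover the fixed point of $f^R(\cdot)/v_0$.

Theorem 3.3.3 (Fixed Point Computation by Using an LP) Let $(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\lambda}}, \hat{\delta}, \hat{z})$ be an optimal solution to the problem
\[
\begin{aligned}
\min \quad & c\,\delta + \sum_{i\in N} \lambda_i + \sum_{(i,j)\in M} \alpha_{ij} \\
\text{s.t.} \quad & -\alpha_{ij} + \beta_{ij} + \gamma_{ij} \ge \mu_{ij}(z) \qquad \forall\, (i,j) \in M \\
& \delta + \lambda_i + \sum_{j\in N} 1(j \ne i)\,(\alpha_{ij} + \alpha_{ji} - \beta_{ij} - \gamma_{ji}) \ge 2\,(n-1)\,\theta_i(z) \qquad \forall\, i \in N \\
& c\,\delta + \sum_{i\in N} \lambda_i + \sum_{(i,j)\in M} \alpha_{ij} = v_0\, z \\
& \alpha_{ij} \ge 0, \;\; \beta_{ij} \ge 0, \;\; \gamma_{ij} \ge 0 \;\; \forall\, (i,j) \in M, \qquad \lambda_i \ge 0 \;\; \forall\, i \in N, \qquad \delta \ge 0, \qquad z \text{ is free}.
\end{aligned}
\tag{Fixed Point}
\]
Then, we have $f^R(\hat{z}) = v_0\, \hat{z}$; so, $\hat{z}$ is the fixed point of $f^R(\cdot)/v_0$.

Proof: Let $\bar{z}$ be the fixed point of $f^R(\cdot)/v_0$ so that $f^R(\bar{z}) = v_0\, \bar{z}$. We will show that $\hat{z} = \bar{z}$. Let $(\bar{\boldsymbol{\alpha}}, \bar{\boldsymbol{\beta}}, \bar{\boldsymbol{\gamma}}, \bar{\boldsymbol{\lambda}}, \bar{\delta})$ be an optimal solution to the Dual problem when we solve this problem with $z = \bar{z}$. Thus, we have $c\,\bar{\delta} + \sum_{i\in N} \bar{\lambda}_i + \sum_{(i,j)\in M} \bar{\alpha}_{ij} = f^R(\bar{z}) = v_0\, \bar{z}$, which implies that the solution $(\bar{\boldsymbol{\alpha}}, \bar{\boldsymbol{\beta}}, \bar{\boldsymbol{\gamma}}, \bar{\boldsymbol{\lambda}}, \bar{\delta}, \bar{z})$ satisfies the last constraint in the Fixed Point problem in the theorem. Furthermore, since the solution $(\bar{\boldsymbol{\alpha}}, \bar{\boldsymbol{\beta}}, \bar{\boldsymbol{\gamma}}, \bar{\boldsymbol{\lambda}}, \bar{\delta})$ is feasible to the Dual problem, it satisfies the first two constraints in the Fixed Point problem as well.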
Thus, $(\bar{\boldsymbol{\alpha}}, \bar{\boldsymbol{\beta}}, \bar{\boldsymbol{\gamma}}, \bar{\boldsymbol{\lambda}}, \bar{\delta}, \bar{z})$ is a feasible but not necessarily an optimal solution to the Fixed Point problem, which implies that
\[
v_0\, \bar{z} = f^R(\bar{z}) = c\,\bar{\delta} + \sum_{i\in N} \bar{\lambda}_i + \sum_{(i,j)\in M} \bar{\alpha}_{ij} \;\ge\; c\,\hat{\delta} + \sum_{i\in N} \hat{\lambda}_i + \sum_{(i,j)\in M} \hat{\alpha}_{ij} = v_0\, \hat{z},
\]
where the last equality uses the fact that $(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\lambda}}, \hat{\delta}, \hat{z})$ satisfies the last constraint in the Fixed Point problem. The chain of inequalities above implies that $\bar{z} \ge \hat{z}$. Next, we show that $\bar{z} \le \hat{z}$. Note that $(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\lambda}}, \hat{\delta})$ is a feasible solution to the Dual problem with $z = \hat{z}$, so that the objective value of the Dual problem at the solution $(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\lambda}}, \hat{\delta})$ is no smaller than its optimal objective value. Therefore, we have $f^R(\hat{z}) \le c\,\hat{\delta} + \sum_{i\in N} \hat{\lambda}_i + \sum_{(i,j)\in M} \hat{\alpha}_{ij} = v_0\, \hat{z}$, where the equality uses the fact that $(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\lambda}}, \hat{\delta}, \hat{z})$ satisfies the last constraint in the Fixed Point problem. Since $f^R(\cdot)$ is a decreasing function, having $f^R(\bar{z}) = v_0\, \bar{z}$ and $f^R(\hat{z}) \le v_0\, \hat{z}$ implies that $\bar{z} \le \hat{z}$.

Since $\mu_{ij}(z)$ and $\theta_i(z)$ are linear functions of $z$, the Fixed Point problem is an LP. Thus, we can compute the fixed point of $f^R(\cdot)/v_0$ by solving an LP.

3.4 Applying the Approximation Framework to the Uncapacitated Problem

In Sections 3.3.2 and 3.3.3, we show how to construct an upper bound $f^R(\cdot)$ on $f(\cdot)$ by using an LP relaxation of the Function Evaluation problem and how to find the fixed point of $f^R(\cdot)/v_0$ by using the Fixed Point problem. This discussion allows us to execute Steps 1 and 2 in our approximation framework that we give in Section 3.3.1. In this section, we focus on Step 3 in our approximation framework for the uncapacitated problem. In particular, letting $\hat{z}$ be such that $f^R(\hat{z}) = v_0\, \hat{z}$, we give an efficient approach to find a subset of products $\hat{\boldsymbol{x}}$ that satisfies $\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{x}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{x}}) - \hat{z}) \ge 0.6\, f^R(\hat{z})$. In this case, Theorem 3.3.1 implies that $\hat{\boldsymbol{x}}$ is a 0.6-approximate solution to the uncapacitated assortment problem.
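For reference, the linearity in $z$ that makes the Fixed Point problem an LP can be read off by expanding the definitions of $\rho_{ij}(z)$, $\theta_i(z)$ and $\mu_{ij}(z)$ from Section 3.3.2. The shorthand $a = v_i^{1/\gamma_{ij}}$ and $b = v_j^{1/\gamma_{ij}}$ below is introduced only for this illustration and, inside this block, $\gamma_{ij}$ denotes the dissimilarity parameter rather than the dual variable.

```latex
\begin{align*}
\rho_{ij}(z) &= (a+b)^{\gamma_{ij}-1}\,(p_i\,a + p_j\,b) \;-\; (a+b)^{\gamma_{ij}}\, z, \\
\theta_i(z)  &= v_i\,p_i \;-\; v_i\, z, \\
\mu_{ij}(z)  &= \Big[(a+b)^{\gamma_{ij}-1}\,(p_i\,a + p_j\,b) - v_i\,p_i - v_j\,p_j\Big]
               \;-\; \Big[(a+b)^{\gamma_{ij}} - v_i - v_j\Big]\, z .
\end{align*}
```

Each quantity is an intercept minus a slope times $z$, so moving the $z$ terms to the left-hand side of the first two constraints of the Fixed Point problem leaves constraints that are linear in $(\boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\gamma}, \boldsymbol{\lambda}, \delta, z)$.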
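Letting $\hat{z}$ satisfy $f^R(\hat{z}) = v_0\, \hat{z}$, since the value of $\hat{z}$ is fixed, to simplify our notation, we exclude the reference to $\hat{z}$ in this section. In particular, we let $\mu_{ij} = \mu_{ij}(\hat{z})$, $\theta_i = \theta_i(\hat{z})$, $\rho_{ij} = \rho_{ij}(\hat{z})$, $f^R = f^R(\hat{z})$, $\hat{N} = N(\hat{z})$ and $\hat{M} = M(\hat{z})$. Also, since we consider the uncapacitated assortment problem, we omit the constraint $\sum_{i\in N(z)} x_i \le c$. Thus, we write the Compact Upper Bound problem as
\[
\begin{aligned}
f^R \;=\; \max \quad & \sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, y_{ij} + \theta_i\, x_i + \theta_j\, x_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x_i \\
\text{s.t.} \quad & y_{ij} \ge x_i + x_j - 1 \qquad \forall\, (i,j) \in \hat{M} \\
& 0 \le x_i \le 1 \;\; \forall\, i \in \hat{N}, \qquad y_{ij} \ge 0 \;\; \forall\, (i,j) \in \hat{M}.
\end{aligned}
\]
Our goal is to find $\hat{\boldsymbol{x}}$ that satisfies $\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{x}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{x}}) - \hat{z}) \ge 0.6\, f^R$, where $f^R$ is the optimal objective value of the problem above. We use randomized rounding for this purpose. Let $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ be an optimal solution to the problem above. We define a random subset of products $\hat{\boldsymbol{X}} = \{\hat{X}_i : i \in N\}$ by independently rounding each coordinate of $\boldsymbol{x}^*$ as follows. For each $i \in \hat{N}$, we set
\[
\hat{X}_i = \begin{cases} 1 & \text{with probability } x^*_i \\ 0 & \text{with probability } 1 - x^*_i . \end{cases}
\]
For each $i \in N \setminus \hat{N}$, we set $\hat{X}_i = 0$. Note that the subset of products $\hat{\boldsymbol{X}}$ is a random variable with $\mathbb{E}\{\hat{X}_i\} = x^*_i$ for all $i \in \hat{N}$. In the next theorem, we give the main result of this section.

Theorem 3.4.1 (0.6-Approximation) Let $\hat{\boldsymbol{X}}$ be obtained by using the randomized rounding approach. Then, we have
\[
\mathbb{E}\Big\{ \sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z}) \Big\} \ge 0.6\, f^R .
\]

Proof: Here, we will show that $\mathbb{E}\{\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge 0.5\, f^R$ and briefly discuss how to refine the analysis to get the approximation guarantee of 0.6. The details of the refined analysis are in Appendix B.3. As discussed at the beginning of Section 3.3.2, we have $V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z}) = \mu_{ij}\, \hat{X}_i \hat{X}_j + \theta_i\, \hat{X}_i + \theta_j\, \hat{X}_j$.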
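So, since $\{\hat{X}_i : i \in N\}$ are independent and $\mathbb{E}\{\hat{X}_i\} = x^*_i$, we have
\[
\mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} =
\begin{cases}
\mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j & \text{if } i \in \hat{N},\; j \in \hat{N},\; i \ne j \\
\theta_i\, x^*_i & \text{if } i \in \hat{N},\; j \notin \hat{N} \\
\theta_j\, x^*_j & \text{if } i \notin \hat{N},\; j \in \hat{N} \\
0 & \text{if } i \notin \hat{N},\; j \notin \hat{N}.
\end{cases}
\]
Letting $[a]^+ = \max\{a, 0\}$ and considering the four cases above through the indicator function, we can write $\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\}$ equivalently as
\[
\begin{aligned}
\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\}
&= \sum_{(i,j)\in M} 1(i \in \hat{N},\, j \in \hat{N})\,\big(\mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j\big) \\
&\qquad + \sum_{(i,j)\in M} 1(i \in \hat{N},\, j \notin \hat{N})\, \theta_i\, x^*_i + \sum_{(i,j)\in M} 1(i \notin \hat{N},\, j \in \hat{N})\, \theta_j\, x^*_j \\
&= \sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x^*_i \\
&= \sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, [x^*_i + x^*_j - 1]^+ + \theta_i\, x^*_i + \theta_j\, x^*_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x^*_i \\
&\qquad + \sum_{(i,j)\in \hat{M}} \mu_{ij}\,\big(x^*_i x^*_j - [x^*_i + x^*_j - 1]^+\big) \\
&= f^R + \sum_{(i,j)\in \hat{M}} \mu_{ij}\,\big(x^*_i x^*_j - [x^*_i + x^*_j - 1]^+\big) \\
&\ge f^R + \frac{1}{4} \sum_{(i,j)\in \hat{M}} \mu_{ij} .
\end{aligned}
\]
In the chain of inequalities above, the fourth equality follows because we have $\mu_{ij} \le 0$ for all $(i,j) \in \hat{M}$ by Lemma B.2.2 so that the decision variable $y_{ij}$ takes its smallest possible value in an optimal solution to the Compact Upper Bound problem, which implies that $y^*_{ij} = [x^*_i + x^*_j - 1]^+$. The last inequality follows from the fact that we have $0 \le ab - [a + b - 1]^+ \le 1/4$ for any $a, b \in [0,1]$. To complete the proof, we proceed to giving a lower bound on $f^R$. Let $\hat{x}_i = \frac{1}{2}$ for all $i \in \hat{N}$ and $\hat{y}_{ij} = 0$ for all $(i,j) \in \hat{M}$. In this case, $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ is a feasible solution to the LP that computes $f^R$ at the beginning of this section, which implies that the objective value of this LP at $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ provides a lower bound on $f^R$.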
Therefore, we can lower bound $f^R$ as
\[
f^R \;\ge\; \sum_{(i,j)\in \hat{M}} \frac{\theta_i + \theta_j}{2} + |N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i \;\ge\; \sum_{(i,j)\in \hat{M}} \frac{\theta_i + \theta_j}{2} \;\ge\; \sum_{(i,j)\in \hat{M}} \frac{\theta_i + \theta_j - \rho_{ij}}{2} \;=\; -\sum_{(i,j)\in \hat{M}} \frac{\mu_{ij}}{2},
\]
where the second inequality holds since $\theta_i \ge 0$ for all $i \in \hat{N}$, the third inequality holds since $\rho_{ij} \ge 0$ for all $(i,j) \in \hat{M}$ and the equality follows from the definition of $\mu_{ij}$. Using the lower bound above on $f^R$ in the earlier chain of inequalities, we have
\[
\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \;\ge\; f^R + \frac{1}{4} \sum_{(i,j)\in \hat{M}} \mu_{ij} \;\ge\; \frac{1}{2}\, f^R,
\]
which is the desired result.

Next, we briefly discuss how to refine the analysis to improve the approximation guarantee to 0.6. The refined analysis is lengthy and we defer its details to Appendix B.3. The discussion above uses a lower bound on $f^R$ that is based on a feasible solution $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ with $\hat{x}_i = \frac{1}{2}$ for all $i \in \hat{N}$ and $\hat{y}_{ij} = 0$ for all $(i,j) \in \hat{M}$. This lower bound may not be tight. In our refined analysis, we show that if $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ is an extreme point of the feasible region in the LP that computes $f^R$ at the beginning of this section, then we have $\hat{x}_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$. Motivated by this observation, we enumerate over a relatively small collection of feasible solutions $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ to the LP that computes $f^R$, where we have $\hat{x}_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$ and $\hat{y}_{ij} = [\hat{x}_i + \hat{x}_j - 1]^+$ for all $(i,j) \in \hat{M}$. We pick the best one of these solutions to obtain a tighter lower bound on $f^R$. In this case, we can show that $\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge 0.6\, f^R$.

The subset of products $\hat{\boldsymbol{X}}$ is a random variable, but in Theorem 3.3.1, we need a deterministic subset of products $\hat{\boldsymbol{x}}$ that satisfies the Sufficient Condition. In particular, even if the subset of products $\hat{\boldsymbol{X}}$ satisfies $\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge 0.6\, f^R$, Theorem 3.3.1 does not necessarily imply that $\mathbb{E}\{\pi(\hat{\boldsymbol{X}})\} \ge 0.6\, z^*$.
Nevertheless, we can use the method of conditional expectations to de-randomize the subset of products $\hat{\boldsymbol{X}}$ so that we obtain a deterministic subset of products $\hat{\boldsymbol{x}}$ that satisfies $\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{x}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{x}}) - \hat{z}) \ge 0.6\, f^R$, in which case, we obtain $\pi(\hat{\boldsymbol{x}}) \ge 0.6\, z^*$ by Theorem 3.3.1. Therefore, the subset of products $\hat{\boldsymbol{x}}$ that we obtain by de-randomizing the subset of products $\hat{\boldsymbol{X}}$ is a 0.6-approximate solution to the Assortment problem.

The method of conditional expectations is standard in the discrete optimization literature; see Section 15.1 in [4] and Section 5.2 in [146]. We give an overview of our use of the method of conditional expectations but defer the detailed discussion of this method to Appendix B.4. In the method of conditional expectations, we inductively construct a subset of products $\boldsymbol{x}^{(k)} = (\hat{x}_1, \ldots, \hat{x}_k, \hat{X}_{k+1}, \ldots, \hat{X}_n)$ for all $k \in N$, where the first $k$ products in this subset are deterministic and the last $n - k$ products are random variables. In particular, we start with $\boldsymbol{x}^{(0)} = \hat{\boldsymbol{X}}$ and construct the subset of products $\boldsymbol{x}^{(k)} = (\hat{x}_1, \ldots, \hat{x}_{k-1}, \hat{x}_k, \hat{X}_{k+1}, \ldots, \hat{X}_n)$ by using the subset of products $\boldsymbol{x}^{(k-1)} = (\hat{x}_1, \ldots, \hat{x}_{k-1}, \hat{X}_k, \hat{X}_{k+1}, \ldots, \hat{X}_n)$. These subsets of products are constructed in such a way that they satisfy $\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\boldsymbol{x}^{(k)})^{\gamma_{ij}}\,(R_{ij}(\boldsymbol{x}^{(k)}) - \hat{z})\} \ge 0.6\, f^R$ for all $k \in N$. In this case, the subset of products $\boldsymbol{x}^{(n)} = (\hat{x}_1, \ldots, \hat{x}_n)$ corresponds to a deterministic subset of products and it satisfies $\sum_{(i,j)\in M} V_{ij}(\boldsymbol{x}^{(n)})^{\gamma_{ij}}\,(R_{ij}(\boldsymbol{x}^{(n)}) - \hat{z}) \ge 0.6\, f^R$. In other words, the deterministic subset of products $\boldsymbol{x}^{(n)}$ satisfies the Sufficient Condition with $\alpha = 0.6$. Constructing the subset of products $\boldsymbol{x}^{(k)}$ by using the subset of products $\boldsymbol{x}^{(k-1)}$ takes $O(n)$ operations, in which case, the method of conditional expectations takes $O(n^2)$ operations.

Thus, for the uncapacitated problem, we can execute the approximation framework given in Section 3.3.1 as follows.
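As a brief aside on the de-randomization step just described, before the summary of the overall procedure continues: the sketch below illustrates one way the method of conditional expectations could be carried out for independent rounding. It is not the procedure of Appendix B.4. It assumes the quadratic form $\sum_{(i,j)\in M} (\mu_{ij} x_i x_j + \theta_i x_i + \theta_j x_j)$ from Section 3.3.2 as the objective, takes $M$ to be all ordered pairs $i \ne j$, and uses the fact that this expression is linear in each coordinate, so its expectation under independent rounding is obtained by plugging in the marginals. The function names and the data are hypothetical.

```python
def expected_value(mu, theta, q):
    """Sum over ordered pairs i != j of mu[i][j] q_i q_j + theta_i q_i + theta_j q_j.

    With independent 0/1 coordinates whose means are q, this is the expected
    objective; with q in {0, 1}^n it is the deterministic objective.
    """
    n = len(q)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += mu[i][j] * q[i] * q[j] + theta[i] * q[i] + theta[j] * q[j]
    return total


def derandomize(mu, theta, q):
    """Fix coordinates one at a time without decreasing the conditional expectation."""
    q = list(q)
    n = len(q)
    for k in range(n):
        # Change in the expectation when q_k moves from 0 to 1 (O(n) work per product).
        gain = 2 * (n - 1) * theta[k] + sum(
            (mu[k][j] + mu[j][k]) * q[j] for j in range(n) if j != k
        )
        q[k] = 1 if gain >= 0 else 0
    return q


if __name__ == "__main__":
    mu = [[0.0, -1.0, -0.5], [-1.0, 0.0, -2.0], [-0.5, -2.0, 0.0]]
    theta = [1.0, 0.8, 0.3]
    marginals = [0.5, 0.5, 0.5]
    x_hat = derandomize(mu, theta, marginals)
    print(x_hat, expected_value(mu, theta, x_hat), expected_value(mu, theta, marginals))
```

Each step costs $O(n)$ and there are $n$ steps, which matches the $O(n^2)$ count stated above; the final 0/1 vector attains an objective value at least as large as the expectation under the original marginals.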
We solve the LP given in the Fixed Point problem in Theorem 3.3.3 to compute the fixed point $\hat{z}$ of $f^R(\cdot)/v_0$. Next, recalling that we set $\mu_{ij} = \mu_{ij}(\hat{z})$, $\theta_i = \theta_i(\hat{z})$, $\hat{M} = M(\hat{z})$ and $\hat{N} = N(\hat{z})$ for notational brevity, we solve the LP that computes $f^R$ at the beginning of this section to obtain the optimal solution $(\boldsymbol{x}^*, \boldsymbol{y}^*)$. The solution $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ characterizes the random subset of products $\hat{\boldsymbol{X}}$ through our randomized rounding approach discussed in this section. Lastly, we de-randomize the subset of products $\hat{\boldsymbol{X}}$ by using the method of conditional expectations to obtain a deterministic subset of products $\hat{\boldsymbol{x}}$ satisfying $\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{x}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{x}}) - \hat{z}) \ge 0.6\, f^R$. The subset of products $\hat{\boldsymbol{x}}$ is a 0.6-approximate solution to the uncapacitated problem. Considering the computational effort, each of the two LP formulations that we solve has $O(n^2)$ decision variables and $O(n^2)$ constraints. The method of conditional expectations takes $O(n^2)$ operations.

We can also use an SDP relaxation of the Function Evaluation problem to construct an upper bound $f^R(\cdot)$ on $f(\cdot)$. This SDP has $O(n^2)$ decision variables and $O(n^2)$ constraints. We can compute the fixed point $\hat{z}$ of $f^R(\cdot)/v_0$ by solving an SDP of the same size. In this case, we can construct a subset of products $\hat{\boldsymbol{X}}$ that satisfies $\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge 0.79\, f^R(\hat{z})$, where $f^R(\cdot)$ refers to the upper bound constructed by using the SDP relaxation. This approach provides a stronger approximation guarantee, but comes at the expense of solving an SDP. We summarize this approach in the next theorem and defer the details to Appendix B.5.

Theorem 3.4.2 (SDP Relaxation) There exists an algorithm to find the fixed point $\hat{z}$ of a function $f^R(\cdot)/v_0$ that satisfies $f^R(z) \ge f(z)$ for all $z \in \mathbb{R}$ and to construct a random subset of products $\hat{\boldsymbol{X}}$ that satisfies $\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge 0.79\, f^R(\hat{z})$.
This algorithm requires solving two SDP formulations, each with $O(n^2)$ decision variables and $O(n^2)$ constraints.

3.5 Applying the Approximation Framework to the Capacitated Problem

In this section, we consider Step 3 in our approximation framework for the capacitated assortment problem. Letting $\hat{z}$ satisfy $f^R(\hat{z}) = v_0\, \hat{z}$, we focus on finding a subset of products $\hat{\boldsymbol{x}}$ such that $\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{x}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{x}}) - \hat{z}) \ge \frac{1}{4}\, f^R(\hat{z})$ and $\sum_{i\in N} \hat{x}_i \le c$. In this case, by Theorem 3.3.1, the subset of products $\hat{\boldsymbol{x}}$ is a 0.25-approximate solution to the Assortment problem.

3.5.1 Half-Integral Solutions Through Iterative Rounding

As in the previous section, since the value of $\hat{z}$ is fixed, to simplify our notation, we drop the argument $\hat{z}$ in the Compact Upper Bound problem. In particular, we let $\mu_{ij} = \mu_{ij}(\hat{z})$, $\theta_i = \theta_i(\hat{z})$, $\rho_{ij} = \rho_{ij}(\hat{z})$, $f^R = f^R(\hat{z})$, $\hat{N} = N(\hat{z})$ and $\hat{M} = M(\hat{z})$. Noting that the optimal objective value of the Compact Upper Bound problem is $f^R$, we write the Compact Upper Bound problem as
\[
\begin{aligned}
f^R \;=\; \max \quad & \sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, y_{ij} + \theta_i\, x_i + \theta_j\, x_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x_i \\
\text{s.t.} \quad & y_{ij} \ge x_i + x_j - 1 \qquad \forall\, (i,j) \in \hat{M} \\
& \sum_{i\in \hat{N}} x_i \le c \\
& 0 \le x_i \le 1 \;\; \forall\, i \in \hat{N}, \qquad y_{ij} \ge 0 \;\; \forall\, (i,j) \in \hat{M}.
\end{aligned}
\]
We can construct counterexamples to show that all of the decision variables $\{x_i : i \in \hat{N}\}$ can take strictly positive values in an extreme point of the set of feasible solutions to the LP above. In Example B.6.1 in Appendix B.6, we give one such counterexample. In this case, if we construct a random subset of products by using randomized rounding on an optimal solution to the LP above, then the random subset of products may violate the capacity constraint. To address this difficulty, we will use an iterative rounding algorithm, where we iteratively solve a modified version of the LP above after fixing some of the decision variables $\{x_i : i \in \hat{N}\}$ at $\frac{1}{2}$.
In this way, we construct a feasible solution to the LP above, where the decision variables $\{x_i : i \in \hat{N}\}$ ultimately all take values in $\{0, \frac{1}{2}, 1\}$. In the feasible solution that we construct, since the smallest strictly positive value for the decision variables $\{x_i : i \in \hat{N}\}$ is $\frac{1}{2}$, noting the constraint $\sum_{i\in \hat{N}} x_i \le c$, no more than $2c$ of the decision variables $\{x_i : i \in \hat{N}\}$ can take strictly positive values. In this case, we will be able to use a coupled randomized rounding algorithm on the feasible solution to obtain a random subset of products that includes no more than $c$ products.

Since we will solve the LP above after fixing some of the decision variables $\{x_i : i \in \hat{N}\}$ at $\frac{1}{2}$, we study the extreme points of the set of feasible solutions to this LP with some of the decision variables $\{x_i : i \in \hat{N}\}$ fixed at $\frac{1}{2}$. In particular, if we fix the decision variables $\{x_i : i \in H\}$ at $\frac{1}{2}$, then the set of feasible solutions to the LP is given by the polyhedron
\[
\mathcal{P}(H) = \Big\{ (\boldsymbol{x}, \boldsymbol{y}) \in [0,1]^{|\hat{N}|} \times \mathbb{R}_+^{|\hat{M}|} \;:\; y_{ij} \ge x_i + x_j - 1 \;\; \forall\, (i,j) \in \hat{M}, \;\; \sum_{i\in \hat{N}} x_i \le c, \;\; x_i = \tfrac{1}{2} \;\; \forall\, i \in H \Big\}.
\]
In the next lemma, we show a useful structural property of the extreme points of $\mathcal{P}(H)$. In particular, it turns out that if none of the decision variables $\{x_i : i \in \hat{N}\}$ take a fractional value larger than $\frac{1}{2}$ in an extreme point of $\mathcal{P}(H)$, then all of the decision variables $\{x_i : i \in \hat{N}\}$ must take values in $\{0, \frac{1}{2}, 1\}$. This result shortly becomes useful for arguing that our iterative rounding algorithm terminates with a feasible solution to the LP that computes $f^R$ at the beginning of this section, where all of the decision variables $\{x_i : i \in \hat{N}\}$ are guaranteed to take values in $\{0, \frac{1}{2}, 1\}$. We defer the proof of the next lemma to Appendix B.6.

Lemma 3.5.1 (Extreme Points) For any $H \subseteq \hat{N}$, let $(\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}})$ be an extreme point of $\mathcal{P}(H)$. If there is no product $i \in \hat{N}$ such that $\frac{1}{2} < \hat{x}_i < 1$, then we have $\hat{x}_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$.
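To make the definition of $\mathcal{P}(H)$ concrete, the following sketch checks membership of a candidate point and builds the smallest $\boldsymbol{y}$ compatible with a given $\boldsymbol{x}$. It is a hedged illustration rather than part of the algorithm: the dictionary-based representation, the tolerance, and the names `in_P`, `smallest_y` and `count_positive` are assumptions made here, and the pair set is taken to be all ordered pairs $i \ne j$ of the products supplied.

```python
def smallest_y(x):
    """y_ij = [x_i + x_j - 1]^+ for every ordered pair i != j."""
    return {(i, j): max(x[i] + x[j] - 1.0, 0.0) for i in x for j in x if i != j}


def in_P(x, y, H, c, tol=1e-9):
    """Check whether (x, y) lies in P(H) for capacity c.

    x: dict product -> value, y: dict (i, j) -> value, H: products fixed at 1/2.
    """
    if any(v < -tol or v > 1 + tol for v in x.values()):
        return False  # x must lie in [0, 1]
    if sum(x.values()) > c + tol:
        return False  # capacity constraint
    if any(abs(x[i] - 0.5) > tol for i in H):
        return False  # fixed variables must equal 1/2
    for i in x:
        for j in x:
            if i != j:
                y_ij = y[(i, j)]
                if y_ij < -tol or y_ij < x[i] + x[j] - 1 - tol:
                    return False  # y >= 0 and y >= x_i + x_j - 1
    return True


def count_positive(x):
    """Number of strictly positive coordinates of x."""
    return sum(1 for v in x.values() if v > 0)


if __name__ == "__main__":
    x = {1: 0.5, 2: 1.0, 3: 0.0}
    y = smallest_y(x)
    print(in_P(x, y, H={1}, c=2), count_positive(x))
```

For a half-integral point that satisfies the capacity constraint, `count_positive` never exceeds $2c$, which is the counting observation made above.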
We consider the following iterative rounding algorithm to construct a feasible solution to the LP that computes $f^R$ at the beginning of this section.

Iterative Rounding

Step 1: Set the iteration counter to $k = 1$ and initialize the set of fixed variables $H^k = \emptyset$.

Step 2: Let $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ be an optimal solution to the LP
\[
f^k = \max \Big\{ \sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, y_{ij} + \theta_i\, x_i + \theta_j\, x_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x_i \;:\; (\boldsymbol{x}, \boldsymbol{y}) \in \mathcal{P}(H^k) \Big\}.
\tag{Variable Fixing}
\]

Step 3: If there exists some $i_k \in \hat{N}$ such that $\frac{1}{2} < x^k_{i_k} < 1$, then set $H^{k+1} = H^k \cup \{i_k\}$, increase $k$ by one and go to Step 2. Otherwise, stop.

Without loss of generality, we assume that the optimal solution $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ to the Variable Fixing problem in Step 2 is an extreme point of $\mathcal{P}(H^k)$. In Step 3 of the iterative rounding algorithm, if there does not exist some $i \in \hat{N}$ such that $\frac{1}{2} < x^k_i < 1$, then we stop. By Lemma 3.5.1, if there does not exist some $i \in \hat{N}$ such that $\frac{1}{2} < x^k_i < 1$, then we have $x^k_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$. Therefore, if the iterative rounding algorithm stops at iteration $k$ with the solution $(\boldsymbol{x}^k, \boldsymbol{y}^k)$, then we must have $x^k_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$. Similar iterative rounding approaches are often used to design approximation algorithms for discrete optimization problems; see [86].

We use $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ to denote an optimal solution to the Variable Fixing problem at the last iteration of the iterative rounding algorithm. By the discussion in the previous paragraph, we have $x^*_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$. Also, since this solution is feasible to the Variable Fixing problem, we have $\sum_{i\in \hat{N}} x^*_i \le c$. Therefore, noting that the smallest strictly positive value of $\{x^*_i : i \in \hat{N}\}$ is $\frac{1}{2}$, no more than $2c$ of the decision variables $\{x^*_i : i \in \hat{N}\}$ can take strictly positive values. Nevertheless, including each product $i \in \hat{N}$ in a subset with probability $x^*_i$ may not provide a solution that satisfies the capacity constraint, because this subset can include as many as $2c$ products.
Instead, we use a coupled randomized rounding approach to obtain a random subset of products that satisfies the capacity constraint with probability one.

3.5.2 Feasible Subsets Through Coupled Randomized Rounding

Letting $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ be an optimal solution to the Variable Fixing problem at the last iteration of the iterative rounding algorithm, we use the following coupled randomized rounding approach to obtain a random subset of products $\hat{\boldsymbol{X}} = \{\hat{X}_i : i \in N\}$.

Coupled Randomized Rounding

Recall that we have $x^*_i \in \{0, \frac{1}{2}, 1\}$ for all $i \in \hat{N}$. We assume that the number of the decision variables in $\{x^*_i : i \in \hat{N}\}$ that take the value $\frac{1}{2}$ is even. The idea of coupled randomized rounding is similar under the odd case. Letting $2\ell$ be the number of the decision variables in $\{x^*_i : i \in \hat{N}\}$ that take the value $\frac{1}{2}$, we use $\{x^*_{i_1}, x^*_{j_1}, x^*_{i_2}, x^*_{j_2}, \ldots, x^*_{i_\ell}, x^*_{j_\ell}\}$ to denote these decision variables. We view each of $(i_1, j_1), (i_2, j_2), \ldots, (i_\ell, j_\ell)$ as a pair. Using the solution $(\boldsymbol{x}^*, \boldsymbol{y}^*)$, we define the random subset of products $\hat{\boldsymbol{X}} = \{\hat{X}_i : i \in N\}$ as follows. For each pair $(i_m, j_m)$, we set $(\hat{X}_{i_m}, \hat{X}_{j_m}) = (1, 0)$ with probability $0.5$, whereas we set $(\hat{X}_{i_m}, \hat{X}_{j_m}) = (0, 1)$ with probability $0.5$. Note that the decision variables $\{x^*_i : i \in \hat{N}\}$ that are not in $\{x^*_{i_1}, x^*_{j_1}, x^*_{i_2}, x^*_{j_2}, \ldots, x^*_{i_\ell}, x^*_{j_\ell}\}$ take the value zero or one. For all $i \in \hat{N} \setminus \{i_1, j_1, i_2, j_2, \ldots, i_\ell, j_\ell\}$, we set $\hat{X}_i = 1$ with probability $x^*_i$ and $\hat{X}_i = 0$ with probability $1 - x^*_i$. Finally, we set $\hat{X}_i = 0$ for all $i \in N \setminus \hat{N}$.

In our construction, $\hat{X}_{i_m}$ and $\hat{X}_{j_m}$ for the pair $(i_m, j_m)$ are dependent, but the components of $\hat{\boldsymbol{X}}$ corresponding to different pairs are independent. Also, we have $\mathbb{E}\{\hat{X}_i\} = x^*_i$ for all $i \in \hat{N}$. Lastly, the subset of products $\hat{\boldsymbol{X}}$ always satisfies the capacity constraint $\sum_{i\in N} \hat{X}_i \le c$. In particular, we let $S = \{i \in \hat{N} : x^*_i = \frac{1}{2}\}$ and $L = \{i \in \hat{N} : x^*_i = 1\}$. We have $|S| = 2\ell$.
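The construction above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how the pairs $(i_m, j_m)$ are formed, so the sketch pairs consecutive products in sorted order; it covers the even case only; and products outside $\hat{N}$ are simply omitted from the input. The function name `coupled_rounding` and the example data are hypothetical.

```python
import random


def coupled_rounding(x_star, rng):
    """Round a half-integral solution so that exactly one product of each pair is kept.

    x_star: dict product -> value in {0, 0.5, 1}; rng: a random.Random instance.
    Returns a dict product -> 0 or 1.
    """
    halves = sorted(i for i, v in x_star.items() if v == 0.5)
    if len(halves) % 2 != 0:
        raise ValueError("this sketch covers only the even case")
    # Products at 0 or 1 are rounded with probability x*_i, i.e. kept as they are.
    X = {i: int(v == 1) for i, v in x_star.items()}
    # Assumed pairing rule: consecutive products in sorted order form a pair.
    for m in range(0, len(halves), 2):
        i_m, j_m = halves[m], halves[m + 1]
        if rng.random() < 0.5:
            X[i_m], X[j_m] = 1, 0
        else:
            X[i_m], X[j_m] = 0, 1
    return X


if __name__ == "__main__":
    x_star = {1: 0.5, 2: 0.5, 3: 1.0, 4: 0.0, 5: 0.5, 6: 0.5}
    rng = random.Random(0)
    draw = coupled_rounding(x_star, rng)
    print(draw, sum(draw.values()))
```

In this example there are two pairs and one product at value one, so every draw offers exactly three products, regardless of the random seed.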
By the definition of $\hat{\boldsymbol{X}}$, there are exactly $\ell + |L|$ products in $\hat{\boldsymbol{X}}$, so $\sum_{i\in N} \hat{X}_i = \ell + |L|$. Since $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ is a feasible solution to the Variable Fixing problem, we have $\sum_{i\in \hat{N}} x^*_i \le c$, but we can write the last sum as $\sum_{i\in \hat{N}} x^*_i = 2\ell \cdot \frac{1}{2} + |L| = \ell + |L|$, indicating that $\hat{\boldsymbol{X}}$ satisfies the capacity constraint $\sum_{i\in N} \hat{X}_i = \ell + |L| \le c$. The main result of this section is stated in the following theorem.

Theorem 3.5.2 (0.25-Approximation) Let $\hat{\boldsymbol{X}}$ be obtained by using the coupled randomized rounding approach. Then, we have
\[
\sum_{(i,j)\in M} \mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge \frac{1}{4}\, f^R .
\]

We give the proof of this theorem in Section 3.5.3. The subset of products $\hat{\boldsymbol{X}}$ is random, but we can use the method of conditional expectations discussed in Section 3.4 and Appendix B.4 to obtain a deterministic subset of products $\hat{\boldsymbol{x}}$ that satisfies $\sum_{(i,j)\in M} V_{ij}(\hat{\boldsymbol{x}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{x}}) - \hat{z}) \ge \frac{1}{4}\, f^R$ and $\sum_{i\in N} \hat{x}_i \le c$. The only difference is that we condition on the pair $(\hat{X}_{i_m}, \hat{X}_{j_m})$ when we consider the products $(i_m, j_m)$ that are paired in the coupled randomized rounding approach. For a product $i$ that is not paired, we condition on $\hat{X}_i$. Using the same argument as in Appendix B.4, for the capacitated problem, the method of conditional expectations takes $O(n^2)$ operations.

Thus, for the capacitated problem, we can execute the approximation framework given in Section 3.3.1 as follows. We solve the LP given in the Fixed Point problem in Theorem 3.3.3 to compute the fixed point $\hat{z}$ of $f^R(\cdot)/v_0$. Next, recalling that we set $\mu_{ij} = \mu_{ij}(\hat{z})$, $\theta_i = \theta_i(\hat{z})$, $\hat{M} = M(\hat{z})$ and $\hat{N} = N(\hat{z})$ in the Variable Fixing problem, we execute the iterative rounding algorithm. Let $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ be an optimal solution to the Variable Fixing problem at the last iteration of the iterative rounding algorithm. Through the
Lastly, we de-randomize the subset of products ^ X by using the method of conditional expectations to obtain a deterministic subset of products ^ x satisfyingå (i; j)2M V i j (^ x) g i j (R i j (^ x) ˆ z) 1 4 f R andå i2N ˆ x i c. Considering the computational effort, to compute the fixed point of f R ()=v 0 , we solve an LP with O(n 2 ) decision variables and O(n 2 ) constraints. The iterative rounding algorithm terminates in O(n) iterations. At each iteration of the iterative rounding algorithm, we solve an LP with O(n 2 ) decision variables and O(n 2 ) constraints. The method of conditional expectations takes O(n 2 ) operations. 3.5.3 Proof of Theorem 3.5.2 The proof of Theorem 3.5.2 relies on the next two lemmas. As the iterations of the iterative rounding algo- rithm progress, we fix additional decision variables at the value 1 2 in the Variable Fixing problem. There- fore, noting that the optimal objective value of the Variable Fixing problem at iteration k is f k , since the Variable Fixing problem at iteration k+ 1 has one more decision variable fixed at 1 2 , we have f k+1 f k for all k= 1;2;:::. In the next lemma, we give an upper bound on the degradation in the optimal objective value of the Variable Fixing problem at the successive iterations of the iterative rounding algorithm. Lemma 3.5.3 (Reduction in Objective) For all k= 1;2;:::, we have f k f k+1 (jNj 1)q i k . Proof: We have å (i; j)2 ˆ M q i x i = å i2 ˆ N å j2 ˆ N 1(i6= j)q i x i = (j ˆ Nj 1)å i2 ˆ N q i x i . Similarly, we have å (i; j)2 ˆ M q j x j =(j ˆ Nj1)å i2 ˆ N q i x i . 
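In this case, since $|\hat{N}| - 1 + |N \setminus \hat{N}| = |N| - 1$, we can write the objective function of the Variable Fixing problem as
\[
\begin{aligned}
\sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, y_{ij} + \theta_i\, x_i + \theta_j\, x_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x_i
&= \sum_{(i,j)\in \hat{M}} \mu_{ij}\, y_{ij} + 2\,(|\hat{N}| - 1) \sum_{i\in \hat{N}} \theta_i\, x_i + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x_i \\
&= \sum_{(i,j)\in \hat{M}} \mu_{ij}\, y_{ij} + 2\,(|N| - 1) \sum_{i\in \hat{N}} \theta_i\, x_i .
\end{aligned}
\]
Since the iterative rounding algorithm did not stop at iteration $k$, we have $x^k_{i_k} \in (\frac{1}{2}, 1)$. We define the solution $(\tilde{\boldsymbol{x}}, \tilde{\boldsymbol{y}})$ to the Variable Fixing problem as follows. We set $\tilde{x}_i = x^k_i$ for all $i \in \hat{N} \setminus \{i_k\}$ and $\tilde{x}_{i_k} = \frac{1}{2}$. Also, we set $\tilde{y}_{ij} = [\tilde{x}_i + \tilde{x}_j - 1]^+$ for all $(i,j) \in \hat{M}$. We claim that $(\tilde{\boldsymbol{x}}, \tilde{\boldsymbol{y}})$ is a feasible solution to the Variable Fixing problem at iteration $k+1$. In particular, since $x^k_{i_k} \in (\frac{1}{2}, 1)$, but $\tilde{x}_{i_k} = \frac{1}{2}$, we have $\tilde{x}_i \le x^k_i$ for all $i \in \hat{N}$. Therefore, $\sum_{i\in \hat{N}} \tilde{x}_i \le \sum_{i\in \hat{N}} x^k_i \le c$, where the last inequality uses the fact that $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ is a feasible solution to the Variable Fixing problem at iteration $k$ so that it satisfies the capacity constraint. Also, we have $x^k_i = \frac{1}{2}$ for all $i \in H^k$. Since $H^{k+1} = H^k \cup \{i_k\}$ and $\tilde{x}_{i_k} = \frac{1}{2}$, we have $\tilde{x}_i = \frac{1}{2}$ for all $i \in H^{k+1}$. Thus, it follows that $(\tilde{\boldsymbol{x}}, \tilde{\boldsymbol{y}})$ is a feasible solution to the Variable Fixing problem at iteration $k+1$. Since $(\tilde{\boldsymbol{x}}, \tilde{\boldsymbol{y}})$ is not necessarily an optimal solution to the same problem at iteration $k+1$, we have
\[
\begin{aligned}
f^k - f^{k+1} &\le f^k - \sum_{(i,j)\in \hat{M}} \mu_{ij}\, \tilde{y}_{ij} - 2\,(|N| - 1) \sum_{i\in \hat{N}} \theta_i\, \tilde{x}_i \\
&= \sum_{(i,j)\in \hat{M}} \mu_{ij}\,(y^k_{ij} - \tilde{y}_{ij}) + 2\,(|N| - 1)\, \theta_{i_k} \Big(x^k_{i_k} - \frac{1}{2}\Big) + 2\,(|N| - 1) \sum_{i\in \hat{N} \setminus \{i_k\}} \theta_i\,(x^k_i - \tilde{x}_i) \\
&\le \sum_{(i,j)\in \hat{M}} \mu_{ij}\,(y^k_{ij} - \tilde{y}_{ij}) + (|N| - 1)\, \theta_{i_k},
\end{aligned}
\]
where the equality uses the fact that $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ is an optimal solution to the Variable Fixing problem at iteration $k$ and the second inequality uses the fact that $x^k_{i_k} \le 1$ and $\tilde{x}_i = x^k_i$ for all $i \in \hat{N} \setminus \{i_k\}$. Assume for the moment that $y^k_{ij} = [x^k_i + x^k_j - 1]^+$ for all $(i,j) \in \hat{M}$. By our construction, we have $\tilde{y}_{ij} = [\tilde{x}_i + \tilde{x}_j - 1]^+$ as well.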
Earlier in this paragraph, we showed that $x^k_i \ge \tilde{x}_i$ for all $i \in \hat{N}$. Since $[\cdot]^+$ is an increasing function, we get $y^k_{ij} \ge \tilde{y}_{ij}$ for all $(i,j) \in \hat{M}$. By Lemma B.2.2, we have $\mu_{ij} \le 0$ for all $(i,j) \in \hat{M}$. In this case, we obtain $\sum_{(i,j)\in \hat{M}} \mu_{ij}\,(y^k_{ij} - \tilde{y}_{ij}) + (|N| - 1)\, \theta_{i_k} \le (|N| - 1)\, \theta_{i_k}$ and the desired result follows from the chain of inequalities above.

To complete the proof, we argue that $y^k_{ij} = [x^k_i + x^k_j - 1]^+$ for all $(i,j) \in \hat{M}$. Note that $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ is an extreme point of $\mathcal{P}(H^k)$. If $y^k_{ij} > [x^k_i + x^k_j - 1]^+$ for some $(i,j) \in \hat{M}$, then we can perturb only this component of $\boldsymbol{y}^k$ by $+\epsilon$ and $-\epsilon$ for a small enough $\epsilon > 0$, while keeping the other components of $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ constant. The two points that we obtain are in $\mathcal{P}(H^k)$ and $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ is a convex combination of the two points, which contradicts the fact that $(\boldsymbol{x}^k, \boldsymbol{y}^k)$ is an extreme point of $\mathcal{P}(H^k)$.

In the next lemma, we accumulate the degradation in the optimal objective value of the Variable Fixing problem over the iterations of the iterative rounding algorithm to compare the optimal objective value of the Variable Fixing problem at the last iteration with $f^R$.

Lemma 3.5.4 (Objective at Last Iteration) If $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ is an optimal solution to the Variable Fixing problem at the last iteration of the iterative rounding algorithm, then we have
\[
\sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x^*_i \;\ge\; \frac{1}{2}\, f^R .
\]

Proof: By the discussion at the beginning of the proof of Lemma 3.5.3, we can write the objective function of the Variable Fixing problem as $\sum_{(i,j)\in \hat{M}} \mu_{ij}\, y_{ij} + 2\,(|N| - 1) \sum_{i\in \hat{N}} \theta_i\, x_i$. We let $K$ be the last iteration of the iterative rounding algorithm. Consider the Variable Fixing problem at iteration $K$. In this problem, we fix the decision variables in $\{x_i : i \in H^K\}$ at $\frac{1}{2}$ and have $H^K = \{i_1, \ldots, i_{K-1}\}$ by the construction of the iterative rounding algorithm.
If we set $x_i = \frac{1}{2}$ for all $i \in H^K$, $x_i = 0$ for all $i \in \hat{N} \setminus H^K$ and $y_{ij} = 0$ for all $(i,j) \in \hat{M}$, then we obtain a feasible solution to the Variable Fixing problem at iteration $K$ and this solution provides the objective value of $2\,(|N| - 1) \sum_{i\in H^K} \frac{\theta_i}{2} = (|N| - 1) \sum_{i\in H^K} \theta_i$. Since the optimal objective value of the Variable Fixing problem at iteration $K$ is $f^K$, we get $f^K \ge (|N| - 1) \sum_{i\in H^K} \theta_i$. By Lemma 3.5.3, we also have $f^k - f^{k+1} \le (|N| - 1)\, \theta_{i_k}$ for all $k = 1, \ldots, K-1$. In this case, we obtain
\[
\begin{aligned}
\sum_{(i,j)\in \hat{M}} \big(\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j\big) + 2\,|N \setminus \hat{N}| \sum_{i\in \hat{N}} \theta_i\, x^*_i
&= f^K - \frac{1}{2}\,(|N| - 1) \sum_{i\in H^K} \theta_i + \frac{1}{2}\,(|N| - 1) \sum_{i\in H^K} \theta_i \\
&\ge f^K - \frac{1}{2}\,(|N| - 1) \sum_{i\in H^K} \theta_i + \frac{1}{2} \sum_{k=1}^{K-1} (f^k - f^{k+1}) \\
&\ge \frac{1}{2}\, f^K + \frac{1}{2} \sum_{k=1}^{K-1} (f^k - f^{k+1}) \;=\; \frac{1}{2}\, f^1 .
\end{aligned}
\]
In the chain of inequalities above, the first equality is from the fact that $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ is an optimal solution to the Variable Fixing problem at iteration $K$. The first inequality is by the fact that $f^k - f^{k+1} \le (|N| - 1)\, \theta_{i_k}$ for all $k = 1, \ldots, K-1$ and $H^K = \{i_1, \ldots, i_{K-1}\}$. The second inequality holds since $f^K \ge (|N| - 1) \sum_{i\in H^K} \theta_i$. Since $H^1 = \emptyset$, the Variable Fixing problem at the first iteration is identical to the LP that computes $f^R$ at the beginning of Section 3.5.1. Therefore, we get $f^1 = f^R$, in which case, the desired result follows from the chain of inequalities above.

Finally, here is the proof of Theorem 3.5.2.

Proof of Theorem 3.5.2: Since $(\boldsymbol{x}^*, \boldsymbol{y}^*)$ is an extreme point solution to the Variable Fixing problem, by the same discussion at the end of the proof of Lemma 3.5.3, we have $y^*_{ij} = [x^*_i + x^*_j - 1]^+$ for all $(i,j) \in \hat{M}$. Also, by the same discussion at the beginning of Section 3.3.2, we have $V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z}) = \mu_{ij}\, \hat{X}_i \hat{X}_j + \theta_i\, \hat{X}_i + \theta_j\, \hat{X}_j$. There are four cases to consider.

Case 1: Suppose $i \in \hat{N}$ and $j \in \hat{N}$ with $i \ne j$. We claim that $\mathbb{E}\{V_{ij}(\hat{\boldsymbol{X}})^{\gamma_{ij}}\,(R_{ij}(\hat{\boldsymbol{X}}) - \hat{z})\} \ge \frac{1}{2}\,(\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j)$ in this case.
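First, assume that the products $i$ and $j$ are paired in the coupled randomized rounding approach. Thus, we must have $x^*_i = \tfrac{1}{2}$ and $x^*_j = \tfrac{1}{2}$, so that $y^*_{ij} = [x^*_i + x^*_j - 1]^+ = 0$. Also, since products $i$ and $j$ are paired, we have $(\hat{X}_i, \hat{X}_j) = (1, 0)$ or $(\hat{X}_i, \hat{X}_j) = (0, 1)$, so that $\hat{X}_i \hat{X}_j = 0$. So, we get
$$\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} = \mu_{ij}\, \mathbb{E}\{\hat{X}_i \hat{X}_j\} + \theta_i\, \mathbb{E}\{\hat{X}_i\} + \theta_j\, \mathbb{E}\{\hat{X}_j\} = \frac{\theta_i}{2} + \frac{\theta_j}{2} \ge \tfrac{1}{2} (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j).$$

Second, assume that the products $i$ and $j$ are not paired. Thus, $\hat{X}_i$ and $\hat{X}_j$ are independent, in which case, $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} = \mu_{ij}\, \mathbb{E}\{\hat{X}_i\}\, \mathbb{E}\{\hat{X}_j\} + \theta_i\, \mathbb{E}\{\hat{X}_i\} + \theta_j\, \mathbb{E}\{\hat{X}_j\} = \mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j$. If $x^*_i \in \{0, 1\}$ or $x^*_j \in \{0, 1\}$, then we have $[x^*_i + x^*_j - 1]^+ = x^*_i x^*_j$. Thus, we get
$$\begin{aligned}
\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\}
&= \mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j
\ge \tfrac{1}{2} (\mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j) \\
&= \tfrac{1}{2} (\mu_{ij}\, [x^*_i + x^*_j - 1]^+ + \theta_i\, x^*_i + \theta_j\, x^*_j)
= \tfrac{1}{2} (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j),
\end{aligned}$$
where the inequality uses the fact that $\mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j = \rho_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i (1 - x^*_j) + \theta_j\, (1 - x^*_i)\, x^*_j$ and $\rho_{ij} \ge 0$, $\theta_i \ge 0$ and $\theta_j \ge 0$ for all $(i,j) \in \hat{M}$. If $x^*_i = \tfrac{1}{2}$ and $x^*_j = \tfrac{1}{2}$, then $y^*_{ij} = [x^*_i + x^*_j - 1]^+ = 0$, so since $\rho_{ij} \ge 0$, we obtain
$$\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} = \mu_{ij}\, x^*_i x^*_j + \theta_i\, x^*_i + \theta_j\, x^*_j = \frac{\rho_{ij} - \theta_i - \theta_j}{4} + \frac{\theta_i}{2} + \frac{\theta_j}{2} \ge \frac{\theta_i}{4} + \frac{\theta_j}{4} = \tfrac{1}{2} (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j),$$
where the last equality uses the fact that $y^*_{ij} = 0$. In all cases, we get $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \ge \tfrac{1}{2} (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j)$ for all $i \in \hat{N}$ and $j \in \hat{N}$ with $i \ne j$, which is the desired claim.

Case 2: Suppose $i \in \hat{N}$ and $j \notin \hat{N}$. Since $\hat{X}_j = 0$, we get $V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z}) = \mu_{ij}\, \hat{X}_i \hat{X}_j + \theta_i\, \hat{X}_i + \theta_j\, \hat{X}_j = \theta_i\, \hat{X}_i$. Thus, we have $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} = \theta_i\, \mathbb{E}\{\hat{X}_i\} = \theta_i\, x^*_i \ge \tfrac{1}{2}\, \theta_i\, x^*_i$.

Case 3: Suppose $i \notin \hat{N}$ and $j \in \hat{N}$.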
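Using the same argument as in Case 2, it follows that $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \ge \tfrac{1}{2}\, \theta_j\, x^*_j$.

Case 4: Suppose $i \notin \hat{N}$ and $j \notin \hat{N}$ with $i \ne j$. In this case, noting that $\hat{X}_i = 0$ and $\hat{X}_j = 0$, we get $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} = 0$.

Collecting the four cases above, if $i \in \hat{N}$ and $j \in \hat{N}$, then we have $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \ge \tfrac{1}{2} (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j)$. Also, if $i \in \hat{N}$ and $j \notin \hat{N}$, then we have $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \ge \tfrac{1}{2}\, \theta_i\, x^*_i$. Similarly, if $i \notin \hat{N}$ and $j \in \hat{N}$, then we have $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \ge \tfrac{1}{2}\, \theta_j\, x^*_j$. Finally, if $i \notin \hat{N}$ and $j \notin \hat{N}$, then we have $\mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} = 0$. Therefore, we get
$$\begin{aligned}
\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\}
&\ge \tfrac{1}{2} \Big( \sum_{(i,j) \in M} \mathbf{1}(i \in \hat{N},\, j \in \hat{N})\, (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j) \\
&\qquad + \sum_{(i,j) \in M} \mathbf{1}(i \in \hat{N},\, j \notin \hat{N})\, \theta_i\, x^*_i + \sum_{(i,j) \in M} \mathbf{1}(i \notin \hat{N},\, j \in \hat{N})\, \theta_j\, x^*_j \Big) \\
&= \tfrac{1}{2} \Big( \sum_{(i,j) \in \hat{M}} (\mu_{ij}\, y^*_{ij} + \theta_i\, x^*_i + \theta_j\, x^*_j) + 2\,|N \setminus \hat{N}| \sum_{i \in \hat{N}} \theta_i\, x^*_i \Big) \ge \tfrac{1}{4}\, f^R,
\end{aligned}$$
where the equality follows because $\sum_{(i,j) \in M} \mathbf{1}(i \in \hat{N},\, j \notin \hat{N})\, \theta_i\, x^*_i = |N \setminus \hat{N}| \sum_{i \in \hat{N}} \theta_i\, x^*_i = \sum_{(i,j) \in M} \mathbf{1}(i \notin \hat{N},\, j \in \hat{N})\, \theta_j\, x^*_j$ and the last inequality follows from Lemma 3.5.4.

As an alternative to the coupled randomized rounding approach, letting $S = \{i \in \hat{N} : x^*_i = \tfrac{1}{2}\}$ and recalling that we have $|S| = 2\ell$ in our discussion of the coupled randomized rounding approach, we can simply sample $\ell$ elements of $S$ without replacement. Using $\hat{S}$ to denote these elements, we can define the random subset of products $\hat{\mathbf{X}} = \{\hat{X}_i : i \in N\}$ as follows. For each $i \in \hat{N}$, we set $\hat{X}_i = 1$ if $i \in \hat{S}$ and $\hat{X}_i = 0$ if $i \notin \hat{S}$. For each $i \in N \setminus \hat{N}$, we set $\hat{X}_i = 0$. In this case, the subset of products $\hat{\mathbf{X}}$ still satisfies Theorem 3.5.2. The only difference in the proof of Theorem 3.5.2 is that since the subset $\hat{S}$ is obtained by sampling $\ell$ elements of $S$ without replacement, for $i, j \in \hat{N}$ with $i \ne j$, we have $\hat{X}_i = 1$ and $\hat{X}_j = 1$ with probability $\binom{2\ell - 2}{\ell - 2} \big/ \binom{2\ell}{\ell}$.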
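As a quick sanity check on the sampling-without-replacement alternative, the following Python sketch evaluates the probability that two fixed elements of $S$ are both selected and compares it with a brute-force count over all subsets of size $\ell$. The function names and the small values of $\ell$ are illustrative assumptions and do not appear in the original development.

```python
from itertools import combinations
from math import comb


def prob_both_selected(ell: int) -> float:
    """P(two fixed elements are both chosen) when ell of 2*ell items are
    sampled uniformly without replacement: C(2l-2, l-2) / C(2l, l)."""
    if ell < 2:
        return 0.0  # with ell = 1 only a single element is chosen
    return comb(2 * ell - 2, ell - 2) / comb(2 * ell, ell)


def prob_both_selected_brute_force(ell: int) -> float:
    """Enumerate every size-ell subset of {0, ..., 2*ell - 1} and count
    the subsets that contain both element 0 and element 1."""
    subsets = list(combinations(range(2 * ell), ell))
    hits = sum(1 for s in subsets if 0 in s and 1 in s)
    return hits / len(subsets)


for ell in (2, 3, 5):
    closed_form = prob_both_selected(ell)
    brute_force = prob_both_selected_brute_force(ell)
    print(ell, round(closed_form, 6), round(brute_force, 6))
```

The ratio simplifies to $(\ell - 1) / (2(2\ell - 1))$, which is strictly below $\tfrac{1}{4}$, the value that independent rounding with marginals $\tfrac{1}{2}$ would give.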
Furthermore, for any $i, j \in \hat{N}$ with $i \ne j$, $\hat{X}_i$ and $\hat{X}_j$ are never independent, so computing the conditional expectations involved in the method of conditional expectations gets slightly more complicated.

3.6 Computational Experiments

In this section, we present computational experiments to test the performance of our approximation algorithms on a large number of randomly generated test problems.

3.6.1 Computational Setup

We work with both uncapacitated and capacitated problems. To obtain an $\alpha$-approximate solution, we need to compute the fixed point $\hat{z}$ of $f^R(\cdot)/v_0$ that satisfies $f^R(\hat{z}) = v_0\, \hat{z}$ and find a subset of products $\hat{\mathbf{x}}$ that satisfies $\sum_{(i,j) \in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$. To compute $\hat{z}$, we solve the Fixed Point problem in Theorem 3.3.3.

For uncapacitated problems, to find the subset of products $\hat{\mathbf{x}}$ that satisfies $\sum_{(i,j) \in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$, we let $(\mathbf{x}^*, \mathbf{y}^*)$ be an optimal solution to the LP that computes $f^R$ at the beginning of Section 3.4. The solution $(\mathbf{x}^*, \mathbf{y}^*)$ characterizes the random subset of products $\hat{\mathbf{X}}$ through the randomized rounding approach in Section 3.4. We use the method of conditional expectations on $(\mathbf{x}^*, \mathbf{y}^*)$ to de-randomize $\hat{\mathbf{X}}$, so that we obtain the desired deterministic subset $\hat{\mathbf{x}}$.

For capacitated problems, to find the subset of products $\hat{\mathbf{x}}$ that satisfies $\sum_{(i,j) \in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$, we let $(\mathbf{x}^*, \mathbf{y}^*)$ be an optimal solution obtained at the last iteration of the iterative rounding algorithm in Section 3.5.1. The solution $(\mathbf{x}^*, \mathbf{y}^*)$ characterizes the random subset of products $\hat{\mathbf{X}}$ through the coupled randomized rounding approach in Section 3.5.2. We use the method of conditional expectations on $(\mathbf{x}^*, \mathbf{y}^*)$ to de-randomize $\hat{\mathbf{X}}$, so that we obtain the desired deterministic subset $\hat{\mathbf{x}}$.
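To illustrate the de-randomization step, here is a minimal Python sketch of the method of conditional expectations for the simpler case of independent rounding, where $\hat{X}_i$ is Bernoulli with mean $x_i$. It assumes an objective of the form $\sum_{(i,j)} \mu_{ij} X_i X_j + \sum_i c_i X_i$ with $i \ne j$ in every pair; the aggregated linear coefficients `lin`, the function names, and the numbers are hypothetical. The coupled rounding used for capacitated problems would require conditional expectations under its own joint distribution, which this sketch does not cover.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def multilinear_value(x: List[float], mu: Dict[Pair, float],
                      lin: List[float]) -> float:
    """E[sum mu_ij X_i X_j + sum lin_i X_i] for independent X_i ~ Bernoulli(x_i).
    Valid because every pair has i != j, so E[X_i X_j] = x_i * x_j."""
    pair_part = sum(m * x[i] * x[j] for (i, j), m in mu.items())
    linear_part = sum(c * xi for c, xi in zip(lin, x))
    return pair_part + linear_part


def derandomize(x: List[float], mu: Dict[Pair, float],
                lin: List[float]) -> List[int]:
    """Fix coordinates one at a time to whichever of 0 or 1 gives the larger
    conditional expectation. The expectation is linear in each coordinate,
    so it never decreases along the way."""
    current = list(x)
    for i in range(len(current)):
        with_one = current[:i] + [1.0] + current[i + 1:]
        with_zero = current[:i] + [0.0] + current[i + 1:]
        if multilinear_value(with_one, mu, lin) >= multilinear_value(with_zero, mu, lin):
            current = with_one
        else:
            current = with_zero
    return [int(value) for value in current]


# Hypothetical data: three products with all marginals equal to 1/2.
mu = {(0, 1): -1.0, (1, 2): -0.5, (0, 2): 0.3}
lin = [1.0, 0.4, 0.2]
x_frac = [0.5, 0.5, 0.5]

x_det = derandomize(x_frac, mu, lin)
print(x_det, multilinear_value([float(b) for b in x_det], mu, lin))
```

The deterministic subset returned this way attains at least the expected objective value of the random subset, which is the property the approximation guarantee relies on.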
We do not test the performance of the approximation algorithm that is based on an SDP relaxation, because the approximation algorithm that is based on an LP relaxation already performs quite well. Moreover, although we can solve an SDP in polynomial time in theory, the size of our test problems prevents us from solving SDP relaxations for the large number of problem instances that we consider. We carry out our computational experiments using MATLAB and Gurobi 6.5.0 on a 3.1 GHz Intel Core i7 CPU with 16 GB of RAM.

By Theorem 3.3.1, if $\hat{z}$ satisfies $f^R(\hat{z}) = v_0\, \hat{z}$, then we have $\hat{z} \ge z^*$, so $\hat{z}$ is an upper bound on the optimal expected revenue $z^*$ for the Assortment problem. Recalling that $\pi(\hat{\mathbf{x}})$ is the expected revenue from the subset of products $\hat{\mathbf{x}}$, to evaluate the quality of the subset of products $\hat{\mathbf{x}}$ obtained by our approximation algorithms, we report the quantity $100 \times \pi(\hat{\mathbf{x}}) / \hat{z}$, which corresponds to the percentage of the upper bound captured by the subset $\hat{\mathbf{x}}$. This quantity provides a conservative estimate of the optimality gaps of the solutions obtained by our approximation algorithms, because $\hat{z}$ is an upper bound on the optimal expected revenue, rather than the optimal expected revenue itself. To compute the upper bound $\hat{z}$, we solve the Fixed Point problem in Theorem 3.3.3, which is an LP. Letting $\hat{z}$ be the value of the decision variable $z$ in an optimal solution to the Fixed Point problem, by Theorem 3.3.3, $\hat{z}$ satisfies $f^R(\hat{z}) = v_0\, \hat{z}$, in which case, by Theorem 3.3.1, $\hat{z}$ is an upper bound on the optimal expected revenue $z^*$.

In our test problems, we have $n = 50$ or $n = 100$ products. Finding the optimal subset of products through enumeration requires checking the expected revenue from $O(2^n)$ assortments, which is not computationally feasible for the sizes of our problem instances. Thus, we provide comparisons with an upper bound on the optimal expected revenue.
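To make the enumeration argument concrete, the sketch below evaluates the expected revenue of an assortment and finds the best assortment by brute force on a tiny instance. It assumes the standard PCL expressions, which are not restated in this section: $V_{ij}(\mathbf{x}) = v_i^{1/\gamma_{ij}} x_i + v_j^{1/\gamma_{ij}} x_j$, $R_{ij}(\mathbf{x}) = (p_i v_i^{1/\gamma_{ij}} x_i + p_j v_j^{1/\gamma_{ij}} x_j) / V_{ij}(\mathbf{x})$ and $\pi(\mathbf{x}) = \sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} R_{ij}(\mathbf{x}) / (v_0 + \sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}})$, with $M$ taken as all ordered pairs $i \ne j$. The instance data and function names are hypothetical. Note that the ratio printed here is relative to the true optimum, which is only available because the instance is tiny, whereas the experiments report the ratio relative to the upper bound $\hat{z}$.

```python
from itertools import product
from typing import Dict, Optional, Sequence, Tuple

Pair = Tuple[int, int]


def expected_revenue(x: Sequence[int], v: Sequence[float], p: Sequence[float],
                     gamma: Dict[Pair, float], v0: float) -> float:
    """Expected revenue pi(x) under the assumed PCL form described above."""
    numerator = 0.0
    denominator = v0
    for (i, j), g in gamma.items():
        w_i = v[i] ** (1.0 / g) * x[i]
        w_j = v[j] ** (1.0 / g) * x[j]
        nest_total = w_i + w_j                      # V_ij(x)
        if nest_total == 0.0:
            continue                                # empty nest contributes nothing
        nest_weight = nest_total ** g               # V_ij(x)^gamma_ij
        nest_revenue = (p[i] * w_i + p[j] * w_j) / nest_total   # R_ij(x)
        numerator += nest_weight * nest_revenue
        denominator += nest_weight
    return numerator / denominator


def best_by_enumeration(v: Sequence[float], p: Sequence[float],
                        gamma: Dict[Pair, float],
                        v0: float) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Check all 2^n assortments; only sensible for very small n."""
    best_x, best_value = None, float("-inf")
    for x in product((0, 1), repeat=len(v)):
        value = expected_revenue(x, v, p, gamma, v0)
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


# Hypothetical 4-product instance (values chosen only for illustration).
v = [0.9, 0.6, 0.3, 0.5]
p = [0.1, 0.4, 0.7, 0.5]
gamma = {(i, j): 0.5 for i in range(4) for j in range(4) if i != j}
v0 = 1.0

x_star, z_star = best_by_enumeration(v, p, gamma, v0)
offer_all = expected_revenue((1, 1, 1, 1), v, p, gamma, v0)
print(f"best assortment {x_star}, expected revenue {z_star:.4f}")
print(f"offering everything captures {100 * offer_all / z_star:.1f}% of the optimum")
print(f"assortments to check for n = 50: {2 ** 50:,}")
```

With $n = 50$ the loop would have to visit more than $10^{15}$ assortments, which is why the experiments compare against an upper bound instead.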
3.6.2 Uncapacitated Problems

We randomly generate a large number of test problems and check the performance of our approximation algorithm on each test problem. To generate the dissimilarity parameters of the nests, we sample $\gamma_{ij}$ from the uniform distribution over $[0, \bar{\gamma}]$ for all $(i,j) \in M$, where $\bar{\gamma}$ is a parameter that we vary in our computational experiments. To generate the preference weights of the products, we sample $v_i$ from the uniform distribution over $[0, 1]$ for all $i \in N$. Using $\mathbf{1} \in \mathbb{R}^n$ to denote the vector of all ones, if we offer all products, then a customer leaves without a purchase with probability $v_0 / (v_0 + \sum_{(i,j) \in M} V_{ij}(\mathbf{1})^{\gamma_{ij}})$. To generate the preference weight of the no purchase option, we set $v_0 = \phi_0 \sum_{(i,j) \in M} V_{ij}(\mathbf{1})^{\gamma_{ij}} / (1 - \phi_0)$, where $\phi_0$ is a parameter that we vary. In this case, if we offer all products, then a customer leaves without a purchase with probability $\phi_0$.

We work with two classes of test problems when generating the revenues of the products. In the first class, we sample the revenue $p_i$ of each product $i$ from the uniform distribution over $[0, 1]$. We refer to these problem instances as independent instances since the preference weights and the revenues are independent. In the second class, we set the revenue $p_i$ of each product $i$ as $p_i = 1 - v_i$. We refer to these problem instances as correlated instances since the preference weights and the revenues are correlated. In the correlated instances, more expensive products have smaller preference weights, making them less desirable.

As stated earlier, we use $n = 50$ or $n = 100$ products. We vary $\bar{\gamma}$ over $\{0.1, 0.5, 1.0\}$ and $\phi_0$ over $\{0.25, 0.50, 0.75\}$. Using I and C to, respectively, refer to the independent and correlated instances, we label our test problems as $(T, n, \bar{\gamma}, \phi_0) \in \{\text{I}, \text{C}\} \times \{50, 100\} \times \{0.1, 0.5, 1.0\} \times \{0.25, 0.50, 0.75\}$, where $T$ is the class of the test problem and $n$, $\bar{\gamma}$ and $\phi_0$ are as discussed above. In this way, we obtain 36 parameter combinations.
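The generation procedure above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the experiments (which were run in MATLAB): it takes $M$ to be all ordered pairs $i \ne j$, uses the standard PCL expression $V_{ij}(\mathbf{1}) = v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}$, and samples $\gamma_{ij}$ from $(0, \bar{\gamma}]$ so that $1/\gamma_{ij}$ stays finite. The function name and the returned dictionary keys are hypothetical.

```python
import random
from itertools import product
from typing import Dict, Tuple

Pair = Tuple[int, int]


def generate_instance(kind: str, n: int, gamma_bar: float, phi0: float,
                      rng: random.Random) -> Dict[str, object]:
    """Generate one test problem of class 'I' (independent) or 'C' (correlated)."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    # rng.random() lies in [0, 1), so gamma lies in (0, gamma_bar].
    gamma: Dict[Pair, float] = {
        pair: gamma_bar * (1.0 - rng.random()) for pair in pairs
    }
    v = [rng.random() for _ in range(n)]             # preference weights
    if kind == "I":
        p = [rng.random() for _ in range(n)]         # revenues independent of v
    elif kind == "C":
        p = [1.0 - v_i for v_i in v]                 # revenues correlated with v
    else:
        raise ValueError("kind must be 'I' or 'C'")
    # Total nest weight when every product is offered: sum of V_ij(1)^gamma_ij.
    total_weight = sum(
        (v[i] ** (1.0 / g) + v[j] ** (1.0 / g)) ** g for (i, j), g in gamma.items()
    )
    # Chosen so that the no-purchase probability with all products offered is phi0.
    v0 = phi0 * total_weight / (1.0 - phi0)
    return {"gamma": gamma, "v": v, "p": p, "v0": v0, "total_weight": total_weight}


# The 36 parameter combinations (T, n, gamma_bar, phi0).
GRID = list(product(("I", "C"), (50, 100), (0.1, 0.5, 1.0), (0.25, 0.50, 0.75)))

instance = generate_instance("C", 20, 0.5, 0.25, random.Random(0))
no_purchase = instance["v0"] / (instance["v0"] + instance["total_weight"])
print(len(GRID), round(no_purchase, 6))
```

A seeded `random.Random` object is passed in so that each of the 100 test problems per combination could be reproduced by varying the seed.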
In each parameter combination, we randomly generate 100 individual test problems. We use our approximation algorithm to obtain an approximate solution for each test problem. For test problem $s$, we use $\hat{\mathbf{x}}^s$ to denote the solution obtained by our approximation algorithm and $\hat{z}^s$ to denote the value of $\hat{z}$ satisfying $f^R(\hat{z}) = v_0\, \hat{z}$. The data $\{100 \times \pi(\hat{\mathbf{x}}^s) / \hat{z}^s : s = 1, \ldots, 100\}$ characterizes the quality of the solutions obtained for the 100 test problems in a parameter combination.

We give our computational results in Table 3.1. The first column in this table shows the parameter combination. The second, third, fourth and fifth columns, respectively, show the average, 5th percentile, 95th percentile and standard deviation of the data $\{100 \times \pi(\hat{\mathbf{x}}^s) / \hat{z}^s : s = 1, \ldots, 100\}$. The last column shows the average CPU seconds to run our approximation algorithm, where the average is computed over the 100 test problems in a parameter configuration. The results in Table 3.1 indicate that our approximation algorithm performs quite well. Over all test problems, on average, our approximation algorithm obtains 99.5% of the upper bound on the optimal expected revenue. The 5th percentile of the optimality gaps is no larger than 2.5%. Over the largest test problems with $n = 100$, on average, our approximation algorithm runs in 0.15 seconds.

Param. Conf.                  100 π(x̂)/ẑ                      CPU
(T, n, γ̄, φ_0)         Min    5th    Avg.   95th   Std.   Secs.

Independent instances
(I, 50, 0.1, 0.25)     96.6   97.5   98.6   99.8   0.7    0.08
(I, 50, 0.1, 0.25)     98.8   99.2   99.7   100    0.3    0.04
(I, 50, 0.1, 0.25)     99.7   99.8   99.9   100    0.1    0.05
(I, 50, 0.5, 0.25)     96.7   97.6   98.8   99.8   0.7    0.04
(I, 50, 0.5, 0.25)     99.1   99.4   99.8   100    0.2    0.05
(I, 50, 0.5, 0.25)     99.7   99.8   100    100    0.1    0.05
(I, 50, 1.0, 0.50)     96.6   97.7   98.9   99.7   0.7    0.04
(I, 50, 1.0, 0.50)     99.1   99.4   99.8   100    0.2    0.05
(I, 50, 1.0, 0.50)     99.8   99.9   100    100    0.1    0.05
(I, 100, 0.1, 0.50)    97.5   97.9   98.7   99.5   0.5    0.19
(I, 100, 0.1, 0.50)    99.3   99.4   99.7   99.9   0.2    0.20
(I, 100, 0.1, 0.50)    99.8   99.8   100    100    0.1    0.21
(I, 100, 0.5, 0.75)    97.3   97.9   98.7   99.5   0.5    0.18
(I, 100, 0.5, 0.75)    99.1   99.4   99.7   100    0.2    0.22
(I, 100, 0.5, 0.75)    99.8   99.9   100    100    0.1    0.22
(I, 100, 1.0, 0.75)    97.5   98.1   98.8   99.6   0.5    0.18
(I, 100, 1.0, 0.75)    99.3   99.5   99.7   99.9   0.1    0.21
(I, 100, 1.0, 0.75)    99.9   99.9   100    100    0.1    0.23
Average                              99.5

Correlated instances
(C, 50, 0.1, 0.25)     96.2   97.5   98.6   99.7   0.7    0.04
(C, 50, 0.1, 0.25)     99.0   99.2   99.7   100    0.2    0.05
(C, 50, 0.1, 0.25)     99.8   99.8   100    100    0.1    0.05
(C, 50, 0.5, 0.25)     97.2   97.7   98.8   99.8   0.6    0.04
(C, 50, 0.5, 0.25)     99.2   99.4   99.7   100    0.2    0.05
(C, 50, 0.5, 0.25)     99.8   99.9   100    100    0.0    0.05
(C, 50, 1.0, 0.50)     97.0   97.8   98.8   99.8   0.6    0.04
(C, 50, 1.0, 0.50)     99.2   99.3   99.8   100    0.2    0.05
(C, 50, 1.0, 0.50)     99.8   99.9   100    100    0.1    0.05
(C, 100, 0.1, 0.50)    97.1   97.6   98.5   99.3   0.5    0.19
(C, 100, 0.1, 0.50)    99.1   99.3   99.7   100    0.2    0.20
(C, 100, 0.1, 0.50)    99.8   99.9   100    100    0.1    0.21
(C, 100, 0.5, 0.75)    97.4   98.0   98.7   99.5   0.5    0.19
(C, 100, 0.5, 0.75)    99.2   99.5   99.8   100    0.2    0.21
(C, 100, 0.5, 0.75)    99.8   99.9   100    100    0.1    0.23
(C, 100, 1.0, 0.75)    97.8   98.1   98.8   99.5   0.5    0.19
(C, 100, 1.0, 0.75)    99.4   99.5   99.8   100    0.1    0.21
(C, 100, 1.0, 0.75)    99.9   99.9   100    100    0.1    0.23
Average                              99.5

Table 3.1: Computational results for the uncapacitated test problems.

3.6.3 Capacitated Problems

The approach that we use to generate the capacitated test problems is the same as the one that we use to generate the uncapacitated ones, but we also need to choose the available capacity in the capacitated test problems. We set the capacity $c$ as $c = \lceil \delta\, n \rceil$, where $\delta$ is a parameter that we vary. We label our test problems by $(T, n, \bar{\gamma}, \phi_0, \delta) \in \{\text{I}, \text{C}\} \times \{50, 100\} \times \{0.1, 0.5, 1.0\} \times \{0.25, 0.75\} \times \{0.2, 0.5, 0.8\}$, which yields 72 parameter combinations.
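As a small illustration of the capacitated setup, the following Python sketch enumerates the parameter grid and computes the capacity $c = \lceil \delta\, n \rceil$. Parameter values are held as decimal strings and converted with `Fraction` so that the ceiling is not affected by floating-point rounding; this representation and the helper name `capacity` are assumptions made for the sketch.

```python
from fractions import Fraction
from itertools import product
from math import ceil


def capacity(n: int, delta: str) -> int:
    """Capacity c = ceil(delta * n), computed in exact rational arithmetic."""
    return ceil(Fraction(delta) * n)


# The 72 parameter combinations (T, n, gamma_bar, phi0, delta).
CAPACITATED_GRID = list(product(
    ("I", "C"),
    (50, 100),
    ("0.1", "0.5", "1.0"),
    ("0.25", "0.75"),
    ("0.2", "0.5", "0.8"),
))

print(len(CAPACITATED_GRID))
for n in (50, 100):
    print(n, [capacity(n, delta) for delta in ("0.2", "0.5", "0.8")])
```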
We randomly generate 100 individual test problems in each parameter combination. Using $\hat{\mathbf{x}}^s$ and $\hat{z}^s$ with the same interpretation that we have for the uncapacitated test problems, the data $\{100 \times \pi(\hat{\mathbf{x}}^s) / \hat{z}^s : s = 1, \ldots, 100\}$ continues to characterize the quality of the solutions obtained for the 100 test problems in a parameter combination. We give our computational results in Table 3.2. The layout of this table is identical to that of Table 3.1. The optimality gaps reported in Table 3.2 are slightly larger than those in Table 3.1. On average, our approximation algorithm obtains 98.9% of the upper bound on the optimal expected revenue. In other words, we have an average optimality gap no larger than 1.1%. The larger optimality gaps in Table 3.2 can be attributed to the performance of the approximation algorithm being inferior or the upper bounds being looser, but it is not possible to say which one of these factors plays a dominant role without knowing the optimal expected revenues. Over the largest test problems with $n = 100$, on average, our approximation algorithm runs in 0.25 seconds. To put this running time in perspective, if we partition the 100 products into two nests and assume that the customers choose under the nested logit model, then the average running time for the approach proposed by [45] is 0.86 CPU seconds. We solve multiple LP relaxations in the iterative rounding algorithm. Over all of our test problems, the iterative rounding algorithm terminated after solving at most five LP relaxations, with only 1.37 LP relaxations on average.

Param. Conf.                       100 π(x̂)/ẑ                      CPU
(T, n, γ̄, φ_0, δ)           Min    5th    Avg.   95th   Std.   Secs.

Independent instances
(I, 50, 0.1, 0.25, 0.2)     96.6   98.2   99.5   100    0.7    0.05
(I, 50, 0.1, 0.25, 0.5)     94.1   96.2   98.0   99.6   1.1    0.05
(I, 50, 0.1, 0.25, 0.8)     91.0   93.1   95.9   98.3   1.6    0.05
(I, 50, 0.1, 0.75, 0.2)     99.7   99.9   100    100    0.0    0.07
(I, 50, 0.1, 0.75, 0.5)     99.7   99.8   100    100    0.1    0.07
(I, 50, 0.1, 0.75, 0.8)     99.3   99.6   99.9   100    0.1    0.07
(I, 50, 0.5, 0.25, 0.2)     97.3   98.2   99.3   100    0.7    0.05
(I, 50, 0.5, 0.25, 0.5)     94.8   95.5   97.9   99.2   1.1    0.05
(I, 50, 0.5, 0.25, 0.8)     93.4   93.8   96.5   98.3   1.4    0.05
(I, 50, 0.5, 0.75, 0.2)     99.7   99.8   100    100    0.1    0.06
(I, 50, 0.5, 0.75, 0.5)     99.6   99.8   100    100    0.1    0.07
(I, 50, 0.5, 0.75, 0.8)     99.5   99.6   99.9   100    0.1    0.08
(I, 50, 1.0, 0.25, 0.2)     96.6   98.2   99.4   100    0.7    0.06
(I, 50, 1.0, 0.25, 0.5)     94.9   96.2   98.1   99.6   1.1    0.06
(I, 50, 1.0, 0.25, 0.8)     92.9   94.5   96.6   99.0   1.4    0.05
(I, 50, 1.0, 0.75, 0.2)     99.7   99.8   100    100    0.1    0.07
(I, 50, 1.0, 0.75, 0.5)     99.7   99.7   99.9   100    0.1    0.07
(I, 50, 1.0, 0.75, 0.8)     99.6   99.7   99.9   100    0.1    0.07
(I, 100, 0.1, 0.25, 0.2)    98.1   98.6   99.5   100    0.4    0.24
(I, 100, 0.1, 0.25, 0.5)    96.4   96.8   98.1   99.3   0.7    0.23
(I, 100, 0.1, 0.25, 0.8)    90.8   93.6   95.8   97.6   1.2    0.21
(I, 100, 0.1, 0.75, 0.2)    99.9   99.9   100    100    0.0    0.29
(I, 100, 0.1, 0.75, 0.5)    99.8   99.8   100    100    0.1    0.33
(I, 100, 0.1, 0.75, 0.8)    99.6   99.7   99.9   100    0.1    0.36
(I, 100, 0.5, 0.25, 0.2)    98.0   98.9   99.5   100    0.4    0.25
(I, 100, 0.5, 0.25, 0.5)    95.5   96.6   98.0   99.0   0.8    0.26
(I, 100, 0.5, 0.25, 0.8)    94.0   94.9   96.4   97.9   1.0    0.22
(I, 100, 0.5, 0.75, 0.2)    99.8   99.9   100    100    0.0    0.32
(I, 100, 0.5, 0.75, 0.5)    99.8   99.8   99.9   100    0.1    0.36
(I, 100, 0.5, 0.75, 0.8)    99.6   99.7   99.9   100    0.1    0.34
(I, 100, 1.0, 0.25, 0.2)    97.9   98.3   99.4   100    0.5    0.29
(I, 100, 1.0, 0.25, 0.5)    96.6   96.8   98.0   99.1   0.7    0.28
(I, 100, 1.0, 0.25, 0.8)    93.6   94.6   96.4   98.1   1.0    0.24
(I, 100, 1.0, 0.75, 0.2)    99.8   99.9   100    100    0.1    0.34
(I, 100, 1.0, 0.75, 0.5)    99.7   99.8   99.9   100    0.1    0.37
(I, 100, 1.0, 0.75, 0.8)    99.6   99.7   99.9   100    0.1    0.33
Average                                   98.9

Correlated instances
(C, 50, 0.1, 0.25, 0.2)     97.9   98.4   99.5   100    0.6    0.06
(C, 50, 0.1, 0.25, 0.5)     94.9   96.1   98.1   99.6   1.1    0.06
(C, 50, 0.1, 0.25, 0.8)     92.4   93.7   96.0   97.9   1.3    0.06
(C, 50, 0.1, 0.75, 0.2)     99.8   99.9   100    100    0.0    0.07
(C, 50, 0.1, 0.75, 0.5)     99.5   99.7   99.9   100    0.1    0.07
(C, 50, 0.1, 0.75, 0.8)     99.5   99.6   99.9   100    0.1    0.07
(C, 50, 0.5, 0.25, 0.2)     98.0   98.3   99.4   100    0.6    0.06
(C, 50, 0.5, 0.25, 0.5)     93.3   95.8   97.9   99.5   1.1    0.06
(C, 50, 0.5, 0.25, 0.8)     92.8   94.3   96.5   98.3   1.3    0.06
(C, 50, 0.5, 0.75, 0.2)     99.7   99.9   100    100    0.1    0.07
(C, 50, 0.5, 0.75, 0.5)     99.5   99.7   99.9   100    0.1    0.07
(C, 50, 0.5, 0.75, 0.8)     99.6   99.7   99.9   100    0.1    0.07
(C, 50, 1.0, 0.25, 0.2)     97.6   98.4   99.5   100    0.6    0.07
(C, 50, 1.0, 0.25, 0.5)     94.7   95.7   98.0   99.5   1.2    0.07
(C, 50, 1.0, 0.25, 0.8)     92.5   94.2   96.5   98.4   1.3    0.06
(C, 50, 1.0, 0.75, 0.2)     99.7   99.8   100    100    0.1    0.07
(C, 50, 1.0, 0.75, 0.5)     99.4   99.7   99.9   100    0.1    0.08
(C, 50, 1.0, 0.75, 0.8)     99.3   99.6   99.9   100    0.1    0.07
(C, 100, 0.1, 0.25, 0.2)    98.8   98.9   99.5   100    0.3    0.27
(C, 100, 0.1, 0.25, 0.5)    96.0   96.8   98.1   99.0   0.7    0.27
(C, 100, 0.1, 0.25, 0.8)    93.6   93.9   95.8   97.5   1.0    0.24
(C, 100, 0.1, 0.75, 0.2)    99.9   99.9   100    100    0.0    0.30
(C, 100, 0.1, 0.75, 0.5)    99.7   99.8   99.9   100    0.1    0.33
(C, 100, 0.1, 0.75, 0.8)    99.6   99.7   99.9   100    0.1    0.35
(C, 100, 0.5, 0.25, 0.2)    98.6   98.8   99.5   100    0.4    0.28
(C, 100, 0.5, 0.25, 0.5)    95.8   96.7   98.0   99.1   0.8    0.28
(C, 100, 0.5, 0.25, 0.8)    93.7   94.7   96.3   98.0   1.1    0.25
(C, 100, 0.5, 0.75, 0.2)    99.8   99.9   100    100    0.0    0.32
(C, 100, 0.5, 0.75, 0.5)    99.8   99.9   100    100    0.1    0.36
(C, 100, 0.5, 0.75, 0.8)    99.6   99.7   99.9   100    0.1    0.35
(C, 100, 1.0, 0.25, 0.2)    98.2   98.7   99.4   100    0.4    0.29
(C, 100, 1.0, 0.25, 0.5)    95.6   96.5   97.9   99.0   0.8    0.31
(C, 100, 1.0, 0.25, 0.8)    93.9   95.0   96.5   98.0   0.9    0.25
(C, 100, 1.0, 0.75, 0.2)    99.9   99.9   100    100    0.0    0.33
(C, 100, 1.0, 0.75, 0.5)    99.8   99.9   99.9   100    0.1    0.37
(C, 100, 1.0, 0.75, 0.8)    99.7   99.7   99.9   100    0.1    0.35
Average                                   98.9

Table 3.2: Computational results for the capacitated test problems.

3.7 Conclusions

In this chapter, we developed approximation algorithms for the uncapacitated and capacitated assortment problems under the PCL model. We can extend our work to a slightly more general version of the PCL model. In particular, the generalized nested logit model is a more general version of both the nested logit and PCL models. Under the generalized nested logit model, each product can be in multiple nests and each nest can include an arbitrary number of products. For each product and nest combination, there is a membership parameter that characterizes the extent to which the product is a member of the nest.
Considering the generalized nested logit model with at most two products in each nest, we can generalize our results to tackle the uncapacitated and capacitated assortment problems under this choice model. We discuss this extension in Appendix B.7.

There are several future research directions to pursue. First, our approximation algorithms exploit the fact that we can formulate the Function Evaluation problem as an integer program by linearizing the quadratic terms in the objective function. Our performance guarantees are based on the fact that we can choose the values of the decision variables within $\{0, \tfrac{1}{2}, 1\}$ to construct a provably good feasible solution to the LP relaxation of the integer program. This observation does not hold when each nest includes more than two products. Solving assortment problems under variants of the generalized nested logit model that have more than two products in each nest is a worthwhile and highly non-trivial extension.

Second, we can consider a variant of the multinomial logit model with synergies between the pairs of products, where the deterministic component of the utility of product $i$ increases by $\Delta_i$ when product $i$ is offered along with some other product $i_p$. Using $\mathbf{x}$ to denote the subset of offered products, if we formulate an assortment problem under such a choice model, then the objective function can be written as a ratio of two quadratic functions of $\mathbf{x}$. An interesting question is whether we can use an approach similar to ours to develop approximation algorithms under this variant of the multinomial logit model. A straightforward extension of our approach does not work when the increase $\Delta_i$ in the deterministic component of the utility of product $i$ has no relationship with the base deterministic component of the utility of product $i$ when this product is not offered along with product $i_p$. More research is needed in this direction.
Third, although there is work on pricing problems under the PCL model, this work assumes that the price sensitivities of the products satisfy certain conditions. We can formulate a pricing problem as a variant of our assortment problem. In particular, we can create multiple copies of a product, corresponding to offering a product at different price levels. In this case, we need to impose the constraint that we offer at most one copy of a particular product, meaning that each product should have one price level, if offered. Our efforts to extend our approximation algorithms to this type of pricing problem showed that the pricing problem is considerably more difficult, and more work is also needed in this direction.

Fourth, the PCL model is flexible as it allows a rather general correlation structure among the utilities of the products. Empirical studies in the route choice domain demonstrate that the flexibility provided by the PCL model may be beneficial. It would be useful to conduct additional empirical studies in the operations management domain to understand the benefits of the PCL model in predicting customer purchase behavior.

Chapter 4
Position Ranking and Auctions for Online Marketplaces

4.1 Introduction

E-commerce platforms such as Amazon, eBay, Taobao, and Google Shopping connect sellers and consumers. When a consumer enters a search keyword related to a product of interest, the platform's search engine returns a list. Typically, the consumer looks for a desired item by searching downward in the list. With a large volume of returned results, consumers rarely consider all of the items, as examining each option is costly. For example, [81] demonstrate that consumers typically only view up to 10 to 15 items of a longer list, because of search costs. Therefore, the way in which products are ranked and displayed is a critical decision of these platforms.
Most prior work on this topic has investigated this problem, often empirically, from the perspective of consumers and shows that efficient rankings benefit users and improve their surplus [29, 81, 134]. For instance, [134] finds that rankings based on a product's expected utility lead to a twofold increase in consumer welfare. Researchers have also recognized that platforms have goals beyond consumer welfare. For example, [87] show how to optimally rank search results to maximize an objective that synthesizes search-result relevance and sales revenue. (This work follows the conventional separability assumption on click-through rates, i.e., the click probability is the product of a position effect and a product effect, and does not explicitly model externalities.) In addition, to attract and retain sellers, a platform needs to ensure a sufficient seller profit.

We study the problem of ranking products by formulating a multi-objective optimization problem that takes consumer welfare, supply-side surplus (the aggregate benefit of the sellers and the platform), and sales revenue into account. We refer to this problem as the weighted surplus maximization problem, which is formally defined in Section 4.2. One key challenge in obtaining a satisfactory practical solution to this problem is information asymmetry. That is, the platform may be unaware of the private benefits to sellers of each consumer purchase, for example, profits, brand effects, and so on. We demonstrate that making an uninformed ranking decision can lead to a large loss of surplus to the platform, especially when the number of items is large.

We use sponsored search, i.e., selling the top slots, to resolve this predicament. In practice, leading online marketplaces like Taobao and Amazon have implemented sponsored search programs that allow sellers to bid for prominent positions within search results. Taobao introduced this program in 2008 and Amazon followed in 2012 after cultivating its own marketplace.
Since that time, Amazon has rapidly developed its sponsored search program, achieving 100% annual revenue growth and driving more than $1.5 billion in sales revenue globally in 2015 [127], and maintaining a 71% growth rate at the end of 2018. In a typical sponsored search, the few top returned results are "sponsored" and clearly marked as such, while the other results are "organic." Our work theoretically shows that a sponsored search program, when implemented appropriately, is a robust approach to obtaining a high weighted surplus under information asymmetry.

The main message of our work is as follows. Position ranking is a critical operational problem for platforms, especially under information asymmetry. The problem is challenging because of both optimization and mechanism design considerations. Under stylized assumptions of consumer searches on a platform, we show that with complete information, the optimal ranking has a simple structure. With information asymmetry, we note that the well-known Vickrey-Clarke-Groves (VCG) mechanism is not applicable, and we construct a near-optimal solution with sponsored search to extract this private information. Although extensions of the consumer search process lead to a non-trivial ranking problem, we can still derive near-optimal yet simple solutions. We next discuss our results and contributions in more detail.

A novel modeling framework with consumer search. Our work combines consumer sequential search costs, product ranking, and position auctions for online shopping intermediaries. Although sponsored searches and position auctions, as used by search engines such as Google, Bing, and Yahoo! [43, 137], have been studied in the context of online adwords auctions, most of these studies have assumed that the probability of a user choosing a product (i.e., clicking on an ad) can be estimated by multiplying a position-dependent factor and a product-dependent factor.
This assumption, often referred to as separability, does not take into account the externalities imposed by sellers on each other, nor does it explicitly model consumer search costs. In our base model, the consumer balances the trade-off between accepting a suboptimal product choice and incurring additional information acquisition costs in the search process [143]; therefore, a highly desirable product displayed in a prominent position could negatively impact the purchase probabilities of other items. As such, we derive a consumer purchase model with seller externalities resulting from consumer search costs. Although this externalities effect has been reported by empirical studies [e.g., 72], it has not yet drawn much systematic theoretical research in mechanism design. Furthermore, the introduction of search cost allows us to explicitly account for consumer welfare in the platform's objective function. In this work, we consider maximization of the weighted surplus, which is composed of the supply-side surplus, consumer surplus, and aggregate sales revenue.

A simple sorting solution optimal to the search result ranking problem. Our work theoretically answers the question regarding the optimal platform-ranking design highlighted in the empirical research [29, 81, 134, 149]. In a complete information setting in which sellers' private valuations are observed, we show that the optimal ranking problem can be solved by sorting products according to their net surplus values, which can be interpreted as the weighted sum of a seller's private valuation and his quality score (Theorem 4.3.1).

Significant welfare loss with information asymmetry. A comparison of the complete and incomplete information settings shows that as the number of sellers increases, the worst-case average welfare loss due to information asymmetry goes to infinity (Propositions 4.4.1 and 4.7.4).
To understand this potentially large welfare loss, consider a simple example in which there are $n$ items from $n$ different sellers returned with a consumer search, and the consumer evaluates items one by one, moving down the returned list. We assume that conditional on viewing an item, the consumer accepts it and leaves with probability 0.2. Assume that one item is much more profitable than the rest. If the platform observes the private seller profits, it is optimal for the platform to rank this item at the top to maximize seller profit, in which case the item is purchased with probability 0.2. However, if the profit information is not available and the platform ranks this item at the bottom, it is purchased with probability $(1 - 0.2)^4 \times 0.2 \approx 0.082$ when $n = 5$, and $(1 - 0.2)^9 \times 0.2 \approx 0.027$ when $n = 10$. The inefficiency due to information asymmetry worsens quickly as the number of sellers increases. This inefficiency requires an effective treatment in light of the typically large volume of results returned for consumer searches.

A sponsored search program retains high surplus value. Motivated by the optimal ranking with complete information, we propose a simple ranking rule, the surplus-ordered ranking (SOR), in the sponsored search to select the sellers ranked in the top slots to close the welfare gap. We show numerically and theoretically (Theorem 4.4.4) that the difference between the optimal surplus from the complete information benchmark and that of selling only a few slots with SOR is marginal.

Implementing SOR using mechanism design. We also study the implementation of SOR under the mechanism design approach. Compared with the conventional framework, additional practical concerns arise in our context. Sellers who are not ranked on top, albeit a part of the mechanism, are considered organic results, and they are not subject to a sponsored search surcharge. This zero-payment property imposes additional complexities for the mechanism design.
We construct a mechanism to implement SOR that satisfies all of the desired constraints (Theorem 4.5.1).

Extensions to more general settings. We extend the base model to capture more general consumer search processes and demonstrate that the results and insights from the base model are robust. For example, in practice, platforms can display items and share partial item information with the consumer, such as pictures and prices, through the list-page. The consumer could then use such information to choose desired items, click on them to visit their item-pages, and learn further details. To incorporate such information-exchange processes, we extend our base model. Specifically, during the sequential search on the list-page, whenever a consumer finds a good item, she visits its item-page to learn further information and finalize the purchase decision. This model is a novel extension building upon the optimal sequential search framework from [143], but the analysis under this model is quite difficult. We derive near-optimal ranking rules by slightly modifying the simple sorting solution that is optimal in our base model (Theorem 4.6.2).

Organization. The chapter is organized as follows. In Section 4.2, we introduce our model. In Section 4.3, we solve the optimal ranking problem with social welfare as the objective, in both complete and incomplete information settings. In Section 4.4, we compare the social welfare obtained in the two settings, remarking that the surplus loss can be arbitrarily bad with incomplete information, and propose using SOR to sell the top slots to improve the surplus. Through numerical and theoretical studies, we characterize the SOR performance. In Section 4.5, we investigate SOR in the mechanism design framework and discuss its payment function. In Section 4.6, we extend our model and discuss ranking rules with performance guarantees under the extension. Auxiliary results to support the arguments in the main body are presented in the Appendix.
In particular, in Section C.1, we characterize the optimal stopping problem of the consumer when she can sort the items by metrics such as price using tools provided by the platform. In Section C.2, we extend our base model to a satisficing choice model, in which the consumer searches greedily and stops with a good-enough item. In Section C.3, we show that the well-known Vickrey-Clarke-Groves (VCG) mechanism cannot be used in our setting to sell the top slots. In Section C.4, we derive an alternative representation of the payment function for SOR. In Section C.5, we show that we can implement our mechanism as a Nash equilibrium in a modified generalized second price (GSP) auction when all slots are sold. In Section C.6, we consider the optimal ranking and auctions if, instead of maximizing the weighted surplus, the platform is interested in maximizing its direct profit. The approach used in the main text can be extended to this case to derive similar optimal ranking rules and robust auction techniques.

4.1.1 Literature Review

Assortment Optimization. In the management science literature, the product-showcasing decision has been studied extensively as assortment optimization, in which researchers consider how a seller should select a subset of products to offer in order to maximize profit, typically in the context of the retail or airline industry. Central to assortment optimization problems are consumer choice models, which describe how consumers choose a desired product given an assortment. It is often assumed that consumers aim to maximize their utility through purchasing. For example, the seminal work by [130] assumes that consumers choose according to the multinomial logit (MNL) model from the set of items offered and demonstrates the optimality of revenue-ordered solutions. Later works, such as [37] and [89], extend the MNL to other models based on consumer random utility maximization.
Recently, this literature has paid growing attention to learning and modeling consumer exploratory behavior. For example, [133] look at a model that simultaneously optimizes the profit while learning consumer behavior. [8] study the consider-then-choose model, under which all consumers share the same ranking list for products, but different types of consumers have different consideration sets. They show that the problem is generally hard to approximate, and they focus on developing efficient algorithms for special cases under empirically vetted assumptions. One work closely related to our study is [142], who assume that a consumer first forms her consideration set by balancing the search cost and the expected product utility in the consideration set, and then chooses the product that optimizes her utility.

Our work differs from this literature in three major aspects. First, because the platform may have a contractual obligation to display all third-party seller items, we study the item ranking problem instead of the assortment optimization problem. The importance of ranking in an online setting is also driven by consumer exploratory behavior. Consumers tend to search sequentially, discovering and evaluating products downward in a list. Due to these differences, the techniques for solving assortment optimization problems do not extend to our setting. Second, while assortment optimization problems typically focus on maximizing a seller’s own profits, we consider the welfare of all involved participants. Motivated by consumer search behavior, we introduce the optimal sequential search as the micro-economic basis. In particular, this approach integrates consumer surplus into the aggregate welfare consideration from a platform perspective. Third, the information asymmetry between the platform and third-party sellers calls for the mechanism design approach. We use auctions to sell the top slots to generate efficient rankings.

Consumer search theory.
Our work builds on the optimal sequential search theory pioneered by [143]. With different specifications of the consumer search process, several authors [63, 111, 117] have developed and advanced the theory of simultaneous search, in which consumers usually decide the consideration set first and then search for the best alternative within the set. A large body of literature in economics and marketing research focuses on applications of search problems. For example, [147] considers the implications of search for market competition and equilibrium. [42] study efficient market design that reduces consumer search costs. [142] provide a comparison of the sequential and simultaneous search models with simulations. For a comprehensive review of this literature, please refer to [106]. The theory of sequential search has recently been extended to consider the gradual revelation of information with one or more alternatives [17, 77, 102].

On the empirical side, the sequential search theory has recently been applied to the study of e-commerce platform-ranking design. [81] demonstrate that demand estimates from a sequential search model provide accurate predictions of actual product sales ranks. Furthermore, [24] apply the sequential search theory in the empirical research of search results ranking and policy simulations to show that, apart from the position and identity of a link, consumer behavior is also an important factor in online demand. [29] propose a structural model of consumer sequential search on e-commerce platforms under uncertainty about product-attribute levels. In [134], the author uses a data set with experimental variation in the ranking to identify the causal effect of rankings. It is shown that item rankings affect consumer behavior only through the search cost, whereas the utility a consumer derives from an item is intrinsic to that item.
Exploiting the “opaque offers” feature of the data set, the author demonstrates that rankings determine consumer behavior through lowering total search costs instead of affecting consumers’ expectations of utility. These characteristics of consumer behavior are fully captured in our search model.

Online adwords auctions. Deciding sellers’ positions with information asymmetry naturally leads to an auction-design approach. Our work is closely related to studies on online adwords auctions. Seminal works such as [43] and [137] characterize the GSP auction and VCG mechanism under the aforementioned separable assumption on click-through rates. [43] show that no dominant strategies exist in the GSP auction and that the VCG outcome is a locally envy-free equilibrium, meaning that no seller is willing to swap his position with the one above him. [137] focuses on a special set of Nash equilibria of the GSP auction, namely the symmetric Nash equilibria. These studies show that although the GSP auction looks similar to the VCG mechanism in such an environment, the properties of GSP are in fact very different. Later studies re-examine the GSP auction format with different outlooks but similar separable assumptions on click-through rates. [23] model adwords auctions as a dynamic game of incomplete information and connect sellers’ best responses in this game to the equilibrium of the complete information game. On the auction revenue side, [94] show that GSP achieves at least a constant fraction of the optimal auction revenue.

Several studies explore more general forms of click-through probabilities to incorporate seller externalities for online adwords auctions [2, 10, 28, 79, 83]. The focus of these studies remains the properties of the GSP auction and its connections to the VCG mechanism.
[2] propose a simple Markovian user model, while [79] present a cascade model in which consumers make independent random decisions with ad-specific probabilities on whether to continue scanning after viewing each advertisement. [10] examine position auctions with explicit consideration of consumer search behavior using a model different from ours. They model the quality of an item from the consumers’ perspective as the probability of meeting their demands, which results in seller externalities, and analyze consumer search strategies, equilibrium bidding, and the welfare benefits of position auctions. [28] and [83] generalize the work of [10] by considering a dynamic environment and endogenizing purchase prices, respectively. Inspired by observations in empirical studies [e.g., 134], we use the optimal sequential search theory [143] as the micro-economic foundation, which allows us to calculate consumer surplus. Related to the online adwords-auction literature, in Section C.5 we discuss the GSP auction in our context. Compared with the literature, we adopt a different definition of social welfare and consider a more general objective in the study of e-commerce platform-ranking and auctions. We construct a modified GSP auction that supports the socially efficient outcome in our setting.

4.2 Model and Preliminaries

In this section, we introduce the basic components of our model, as well as our notation. We briefly overview the setting for the online marketplace in Section 4.2.1. We then model the consumer rational choice behavior and derive the sellers’ purchase probabilities in Section 4.2.2.

4.2.1 The Marketplace

Consider an online marketplace with n sellers. Each one sells one product to consumers, and each consumer seeks to buy one item through a keyword search.
Throughout this manuscript, we will call the operator of this marketplace “the platform,” and use “he” to refer to the sellers and “she” for the consumer, who maximizes her utility through purchasing. We also use “seller” and “item” interchangeably. The platform makes the ranking and display decision. Specifically, the n items from the sellers are presented to the consumer in a list. Let $N = \{1, 2, \ldots, n\}$. We use a permutation $\pi \in \Pi : N \mapsto N$ to denote such a ranking of sellers: $\pi(j) = i$ means that slot $j$ in the list is assigned to seller (or item) $i$, and $\Pi$ is the set of all feasible rankings. In addition, given $\pi$, we write $\sigma(\cdot)$ for its inverse mapping. Thus, $\pi(j) = i$ is equivalent to $\sigma(i) = j$.

Assumptions on the seller side. Let us assume that $(u_i, p_i)$ are i.i.d. random variables that correspond to seller $i \in N$ (or item $i$), where $u_i$ is the mean utility that item $i$ provides to the consumer because of its product characteristics other than the price, and $p_i$ is the price of item $i$. Once the sellers are present, both $u_i$ and $p_i$ are observed by the platform. Furthermore, with sellers in the market, the platform forms a belief about the private valuation of seller $i$, $\theta_i$, which is the gain of seller $i$ from selling one item. For example, this could include private profits and the branding effect. This belief is denoted by the distribution $F_i(\theta_i)$. To simplify the notation, we denote its support by $\Theta_i = [0, \bar{\theta}_i]$, where $0 \le \bar{\theta}_i \le \infty$. We assume $\theta_i$ is independent of $\theta_j$ for $i \ne j$.

Assumptions on the consumer side. Let us fix the ranking of items, without loss of generality, to be the same as the index. We now explain the decision process of the representative consumer. The utility that item $i$ provides to the consumer is given by $V_i = u_i - p_i + \varepsilon_i$, where $\varepsilon_i$ is a zero-mean random variable capturing the consumer idiosyncrasy. Therefore, $u_i - p_i$ is the expected utility that the consumer derives from the item, or the intrinsic part of the utility. We assume $\varepsilon_i$ are i.i.d.
Also, we assume that, given the ranking rule, the consumer always believes that the products are homogeneous in the sense that $V_i$ follows a continuous i.i.d. distribution for $i \in N$, determined by the distributions of $u_i - p_i$ and $\varepsilon_i$. This assumption on consumer beliefs is motivated by the findings of [29], who use the consumer search theory to study platform-ranking design empirically. They demonstrate that the alternative assumption, in which consumers understand the ranking rule, leads to an inferior estimation of model parameters. Many factors may have contributed to this seemingly counterintuitive phenomenon. First, before evaluating the item, the consumer does not know the realization of $u_i$, $p_i$, or $\varepsilon_i$. This is especially true if the platform chooses to reveal limited information about the items when returning the search results, or if the consumer believes that the mean utility that an item provides is highly correlated with its price, i.e., high-price products offer high utility. Second, while the platform decides the ranking of the items with potentially complex algorithms, it is practical to assume that deciphering the ranking algorithm is beyond the consumer’s ability. Third, the simple stylized model enabled by this assumption may actually offer robust findings in general. We examine this point in Section 4.6, in which we extend our results to a more general setting. To lighten the notational burden and for ease of exposition, we start our analysis under this stylized assumption.

Before entering the search, the representative consumer observes her outside option with utility $V_0 = \varepsilon_0$, where $\varepsilon_0$ follows some random distribution with a zero mean after normalization. She incurs search cost $s$ for evaluating each item and learns the realization of $V_i$ after evaluating item $i$. We assume that the distribution of $\varepsilon_0$ is independent of everything else.
Therefore, the consumer may not want to evaluate all the items, or even enter the search, depending on the realization of random variables. Should she enter the search, she starts from the beginning of the list and, after viewing the current item, decides whether to examine the next one so as to optimize her expected utility. At any time, she can leave the system, either purchasing the current best item or without making any purchase. This constitutes an optimal stopping problem. Throughout this chapter we assume that all distributions have densities for mathematical convenience. Notice that if $s \to 0$ and $\varepsilon_i$ are i.i.d. Gumbel(0, 1) distributed for all $i \in N \cup \{0\}$, the widely used multinomial logit (MNL) model can be viewed as a special case of our setting. Lastly, we use bold-face letters such as $\mathbf{u}$, $\mathbf{p}$, and $\boldsymbol{\theta}$ to denote vectors.

Platform objective. For the long-term development of the marketplace, we assume the platform wants to maximize a combination of several welfare measures, including consumer surplus, the aggregate revenue of the platform, and supply-side surplus (the aggregate profit of both platform and sellers), by choosing a ranking of the sellers. We explain this idea in more detail in the ensuing sections.

4.2.2 The Consumer Search Model

In our work, consumers are modeled as rational utility maximizers. This view has been commonly used in other types of studies regarding product showcasing, such as assortment optimization problems [133, 142]. Our model is based on the consumer sequential search of [143].

Let us assume for now that $\pi(i) = i$. In the search, the representative consumer decides whether to examine the next item or simply buy the current best item and leave. Let $v$ be the current best utility before evaluating the next item. Note that the ex-ante distributions of $V_i$ are i.i.d. across all items from the perspective of the consumer. Let $V$ be the random utility of the next item, following the same distribution. Suppose that she incurs search cost $s$ and evaluates the next item.
$P(V \le v)\, v$ then accounts for the expected utility of the best item if the next item is worse than the current best, while $P(V > v)\, E[V \mid V > v]$ represents the expected utility of the best item if the next one is better. Thus, given that the best option so far has utility $v$, $P(V \le v)\, v + P(V > v)\, E[V \mid V > v] - s$ is the consumer’s expected utility if she examines only one more item and ignores the past sunk costs. Intuitively, a myopic consumer chooses to buy the current best if $v \ge P(V \le v)\, v + P(V > v)\, E[V \mid V > v] - s$, and she evaluates the next one otherwise. When $s > 0$, let us define $v^*$ as the unique solution to the equation

$v^* = P(V \le v^*)\, v^* + P(V > v^*)\, E[V \mid V > v^*] - s.$ (4.1)

The existence and uniqueness of the solution will be established in the e-companion. When $s = 0$, we assume $v^* = +\infty$. The existence of this myopic strategy suggests that if $v < v^*$, the consumer will keep viewing the next item. In fact, as shown in the next lemma, $v^*$ is the threshold for the optimal stopping rule. The sketch of the proof is deferred to the appendix.

Lemma 4.2.1 (Optimal Consumer Search) The consumer searches if and only if $v^* > V_0$. Upon entering the search, (i) she continues the search until she first views an item $i$ such that $V_i > v^*$; (ii) otherwise, she views all items and leaves with the option with the highest utility, i.e., choosing $i$ such that $i \in \arg\max_{j \in N \cup \{0\}} \{V_j\}$ (see footnote 2).

In Section C.1, we consider the case in which the consumer can sort the items using metrics such as the price or relevance of the items before she carries out the search, and prove a generalized version of Lemma 4.2.1 (Proposition C.1.1).

Purchase probabilities and seller externalities. We now discuss the implications of consumer search for the purchase probabilities of sellers. Lemma 4.2.1 suggests that the demand for a particular item arises in two ways. If $V_i > v^*$ and $i$ is the first such item, the consumer purchases item $i$ immediately.
On the other hand, the consumer may return to item $i$ after viewing all items, when $V_i \le v^*$ and $V_i = \max_{k \in N \cup \{0\}} V_k$. We refer to the former as the fresh demand of item $i$ and the latter as the returned demand, and denote them by $q_{j,\pi}$ (for the item in slot $j$ under ranking $\pi$) and $q_i^0$, respectively. Notice that the returned demand, $q_i^0$, is invariant to ranking. The detailed expressions of these purchase probabilities are presented in Corollary 4.7.1. As can be seen there, one feature of our model is that the purchase probability decays exponentially. In the context of online adwords, [48] show that a similar exponential decay model predicts the click-through rate very well.

Footnote 2: We assume that the consumer continues the search if she is indifferent between doing so and stopping. If the consumer finishes the list and there are multiple items with the largest utility, she picks the one with the smallest item index.

We mention that the standard view of the purchase probability in a similar context is the separable assumption adopted in most of the studies on the online adwords auction [43, 108, 137] and also in [87]. Under that assumption, the purchase rate depends only on the seller and the position, so it can hardly account for the seller externalities highlighted in the empirical work by [72]. Seller externalities mean that the purchase probabilities are also influenced by the quality of other sellers, and our model with consumer sequential search provides a micro-economic foundation for this phenomenon. See the further discussion on this point following Corollary 4.7.1. We remark that seller externalities play important roles in the development of our other theoretical results, e.g., the discussions in Section 4.5.

4.3 Weighted Surplus Maximization and Optimal Ranking

How to rank sellers or items when displaying items to consumers is a fundamental problem of platform operations, as highlighted in empirical works [149, 29, 134]. This is our central topic in this section.
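Before the ranking problem is formulated, the threshold $v^*$ of equation (4.1) in Section 4.2.2 can be made concrete numerically. The following is only an illustrative sketch, not part of the chapter's analysis. It assumes that $V$ is normally distributed (the chapter only requires a density), rearranges (4.1) into the equivalent condition $E[(V - v^*)^+] = s$, and solves it by bisection. The function names, the normal assumption, and the helper `first_accept_prob` (the chance that slot $j$ is the first accepted one when every item is rejected with the same probability $\alpha$) are hypothetical choices made for this illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_gain(v, mu=0.0, sigma=1.0):
    """E[(V - v)^+] for V ~ Normal(mu, sigma^2); normality is an assumption."""
    z = (v - mu) / sigma
    return sigma * (normal_pdf(z) - z * (1.0 - normal_cdf(z)))


def reservation_utility(s, mu=0.0, sigma=1.0):
    """Solve E[(V - v)^+] = s for v, a rearrangement of equation (4.1)."""
    if s <= 0.0:
        return math.inf  # the text sets v* = +infinity when s = 0
    lo, hi = mu - sigma, mu + sigma
    # expected_gain is decreasing in v, so widen the bracket until it holds s.
    while expected_gain(lo, mu, sigma) < s:
        lo -= sigma
    while expected_gain(hi, mu, sigma) > s:
        hi += sigma
    for _ in range(200):  # fixed number of bisection steps
        mid = 0.5 * (lo + hi)
        if expected_gain(mid, mu, sigma) > s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def first_accept_prob(alpha, j):
    """Probability that slot j is the first accepted one, common rejection prob. alpha."""
    return alpha ** (j - 1) * (1.0 - alpha)


v_star = reservation_utility(0.1)   # search cost s = 0.1, V ~ N(0, 1)
alpha = normal_cdf(v_star)          # P(V < v*), chance an item is not accepted on sight
```

With a rejection probability of 0.8, `first_accept_prob(0.8, 5)` and `first_accept_prob(0.8, 10)` give the bottom-slot purchase probabilities used in the introductory example for $n = 5$ and $n = 10$.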
The consumer search model allows us to explicitly formulate a comprehensive objective on the social welfare side, as outlined in Section 4.3.1. We investigate this problem in both the complete information setting, where the valuations $\theta_i$ are known to the platform, and the incomplete information setting, where the valuations are private. In Section 4.3.2, we show that, in both settings, the solutions are certain sorting rules.

4.3.1 Weighted Surplus Maximization

Compared with previous work in the online adwords-auction literature, we consider a more general optimization problem with multiple objectives. Supply-side surplus represents the total surplus of the sellers and the platform, which has been generally referred to as social surplus in previous online adwords auction work, where consumer surplus is absent [43, 108, 137]. Aggregate revenue represents the total sales revenue from consumers. Consumer surplus is the total net surplus the consumer obtains from searching and purchasing, and it is the expected utility of the consumer search minus that of the outside option. The general form of the objective function is then the weighted surplus:

$\max\ \gamma_1 \cdot \text{supply-side surplus} + \gamma_2 \cdot \text{aggregate revenue} + \gamma_3 \cdot \text{consumer surplus},$ (4.2)

for some $\gamma_i \ge 0$, $i = 1, 2, 3$. Note that the objective value depends on the sellers’ ranking. By adopting this objective, we primarily focus on the long-term operational goals and we optimize the welfare. However, most of our results in Sections 4.3 to 4.5 can be extended if the platform only cares about its profit, which includes the sellers’ commission fee, the platform’s own product revenue, and the sponsored search auction revenue. See the discussion in Section C.6 for more details.

Let us first consider the complete information setting, under which the platform knows $\boldsymbol{\theta}$. Supply-side surplus and aggregate revenue can then be written as $\sum_{j=1}^{n} q_{j,\pi}\, \theta_{\pi(j)} + \sum_{i=1}^{n} q_i^0\, \theta_i$ and $\sum_{j=1}^{n} q_{j,\pi}\, p_{\pi(j)} + \sum_{i=1}^{n} q_i^0\, p_i$, respectively.
For consumer surplus, note that the consumer enters the search if and only if $V_0 \le v^*$. If the consumer purchases the item ranked in slot $j$ in the fresh demand, her expected utility from the item will be $E[V_{\pi(j)} \mid V_{\pi(j)} > v^*]$. Using the independence of random variables, we can write the expected consumer surplus beyond the outside option $V_0$ as

$\sum_{j=1}^{n} q_{j,\pi} \left( E[V_{\pi(j)} \mid V_{\pi(j)} > v^*] - E[V_0 \mid V_0 \le v^*] - js \right) + \sum_{i=1}^{n} q_i^0 \left( E[V_i - V_0 \mid A_i] - ns \right),$

where we denote the event $A_i \overset{\text{def}}{=} \{V_i \le v^*\} \cap \{ i = \min\{ j : j \in \arg\max\{V_k : k \in N \cup \{0\}\} \} \}$.

In summary, the weighted surplus maximization problem with complete information can be expanded as

$\max_{\pi \in \Pi}\ W_{\pi}(\boldsymbol{\theta}) \overset{\text{def}}{=} \sum_{j=1}^{n} q_{j,\pi} \left[ \gamma_1 \theta_{\pi(j)} + \gamma_2 p_{\pi(j)} + \gamma_3 \left( E[V_{\pi(j)} \mid V_{\pi(j)} > v^*] - E[V_0 \mid V_0 \le v^*] - js \right) \right] + \sum_{i=1}^{n} q_i^0 \left[ \gamma_1 \theta_i + \gamma_2 p_i + \gamma_3 \left( E[V_i - V_0 \mid A_i] - ns \right) \right].$ (MAX-WEIGHTED-SURPLUS)

Let us define $\Lambda_{\pi}(\boldsymbol{\theta}) \overset{\text{def}}{=} \sum_{j=1}^{n} q_{j,\pi} \left[ \gamma_1 \theta_{\pi(j)} + \gamma_2 p_{\pi(j)} + \gamma_3 \left( E[V_{\pi(j)} \mid V_{\pi(j)} > v^*] - js - E[V_0 \mid V_0 \le v^*] \right) \right]$ to represent the weighted surplus from the fresh demand. In addition, we use $W_0(\boldsymbol{\theta}) \overset{\text{def}}{=} \sum_{i=1}^{n} q_i^0 \left[ \gamma_1 \theta_i + \gamma_2 p_i + \gamma_3 \left( E[V_i - V_0 \mid A_i] - ns \right) \right]$ for the weighted surplus from the returned demand, which is independent of the sellers’ ranking. As a result, $W_{\pi}(\boldsymbol{\theta}) = \Lambda_{\pi}(\boldsymbol{\theta}) + W_0(\boldsymbol{\theta})$, and $\max_{\pi \in \Pi} W_{\pi}(\boldsymbol{\theta})$ is equivalent to $\max_{\pi \in \Pi} \Lambda_{\pi}(\boldsymbol{\theta})$. We next study the efficient sellers’ ranking that optimizes this objective.

4.3.2 The Optimal Ranking of Sellers

Let us define $\alpha_i = P(V_i < v^*)$. We define the net surplus of seller $i$ as

$\rho_i(\theta_i) \overset{\text{def}}{=} \gamma_1 \theta_i + w_i,$ (4.3)

where

$w_i \overset{\text{def}}{=} \gamma_2 p_i + \gamma_3 \left( E[V_i \mid V_i > v^*] - s/(1-\alpha_i) \right).$ (4.4)

Note that $w_i$ can be thought of as the quality score of seller $i$, and is independent of the private seller valuation $\theta_i$. In particular, the term $E[V_i \mid V_i > v^*] - s/(1-\alpha_i)$ strikes a balance between the expected utility conditional on purchase, i.e., $E[V_i \mid V_i > v^*]$, and the normalized search cost $s/(1-\alpha_i)$.
As $(1-\alpha_i)$ increases, ceteris paribus, the item becomes more preferable because it enables the consumer to end the search quickly and save more search cost. We write $\rho_i$ as a function of $\theta_i$ to highlight its dependence on the private information. Intuitively, $\rho_i(\theta_i)$ is the net value added to the marketplace by seller $i$. As will be shown in Theorem 4.3.1, $\rho_i(\theta_i)$ is critical in determining the optimal ranking. We define the social net surplus as

$W(\boldsymbol{\theta}) \overset{\text{def}}{=} \max_{\pi \in \Pi} \left\{ \sum_{j \in N} (1-\alpha_{\pi(j)}) \left( \prod_{0 \le k < j} \alpha_{\pi(k)} \right) \rho_{\pi(j)}(\theta_{\pi(j)}) \right\}.$

By Lemma 4.7.2, the problem at hand is equivalent to maximizing the social net surplus. Theorem 4.3.1 further states that the optimal ranking follows a simple sorting rule.

Theorem 4.3.1 (Optimal Ranking with Complete Information) Ranking $\pi^*(\boldsymbol{\theta})$, given by decreasing order of $\rho_i(\theta_i)$ (ties can be broken arbitrarily), is optimal for the optimization problem MAX-WEIGHTED-SURPLUS.

We refer to ranking $\pi^*(\boldsymbol{\theta})$ as the optimal ranking with complete information. If the platform is the central decision maker who observes everything and optimizes the weighted surplus, this is the ranking rule that should be adopted. Theorem 4.3.1 is proved by contradiction using an interchange argument: if a ranking does not possess the monotonicity described, we can locally swap a pair of sellers to improve the objective function value. If the platform only cares about a single objective, such as aggregate revenue or supply-side surplus, the rule reduces to sorting with respect to a single criterion, such as the price or the private benefit of sellers. Similarly, if consumer surplus maximization is considered, it is optimal to rank in decreasing order of $E[V_i \mid V_i > v^*] - s/(1-\alpha_i)$. Further discussion of consumer surplus maximization is provided in Section 4.7, in which we show that $E[V_i \mid V_i > v^*] - s/(1-\alpha_i)$ is monotone in $u_i - p_i$ under some regularity condition.

Optimal ranking with incomplete information. The aforementioned discussion assumes complete information.
In practice, valuations are often private and non-observable. The platform may only have distributional information regarding $\theta_i$ when optimizing the expected weighted surplus. In this case, using the same line of analysis as in the complete information setting, our objective function of interest can be written in the following form:

$W(E[\boldsymbol{\theta}]) = \max_{\pi \in \Pi} \left\{ \sum_{j \in N} (1-\alpha_{\pi(j)}) \left( \prod_{0 \le k < j} \alpha_{\pi(k)} \right) \rho_{\pi(j)}(E[\theta_{\pi(j)}]) \right\}.$

The next theorem characterizes the optimal ranking with incomplete information. As its proof is similar to that of Theorem 4.3.1, it is omitted.

Theorem 4.3.2 (Optimal Ranking with Incomplete Information) Ranking $\hat{\pi}(E[\boldsymbol{\theta}])$, given by decreasing order of $\rho_i(E[\theta_i]) = \gamma_1 E[\theta_i] + w_i$ (ties can be broken arbitrarily), is optimal for the weighted surplus maximization with incomplete information.

This ranking will be similarly referred to as the optimal ranking with incomplete information. Theorem 4.3.1 and Theorem 4.3.2 are parallel results regarding the optimal ranking of sellers for the platform with different information structures. From now on, with the understanding that $\pi^*$ and $\hat{\pi}$ are functions of $\boldsymbol{\theta}$ and $E[\boldsymbol{\theta}]$, respectively, we drop the dependency on them in the notation.

4.4 Value of Selling k Slots

In this section, we analyze the operational inefficiency due to information asymmetry. In Section 4.4.1, we first reveal a notable gap between the weighted surplus from the optimal ranking with incomplete information, $\hat{\pi}(E[\boldsymbol{\theta}])$, and that from the optimal ranking with complete information, $\pi^*(\boldsymbol{\theta})$. We show that, as the number of sellers in the market increases, the average surplus loss in the worst case can be arbitrarily bad. Following this discussion, and motivated by sponsored search practices, in Section 4.4.2 we propose a remedy: selling the top few slots using the surplus-ordered ranking (SOR) mechanism. We further prove a distribution-free performance guarantee for SOR in terms of weighted surplus maximization in Section 4.4.3.
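As a numerical companion to Theorems 4.3.1 and 4.3.2, the sorting rules can be checked on a small instance. The sketch below evaluates the social net surplus expression for a given ranking, compares the sorted ranking with brute-force enumeration, and evaluates the incomplete-information ranking under the true valuations to show the gap studied in this section. It is an illustration under stated assumptions, not the chapter's code: all numbers are made up, the function names are hypothetical, and the product runs only over the sellers ranked ahead (any factor common to every ranking is omitted, which does not change which ranking is best).

```python
from itertools import permutations


def net_surplus(theta, w, gamma1):
    """rho_i = gamma1 * theta_i + w_i, as in equation (4.3)."""
    return [gamma1 * t + wi for t, wi in zip(theta, w)]


def social_net_surplus(order, alpha, rho):
    """Sum over slots of (1 - alpha_i) * prod(alpha of sellers ahead) * rho_i."""
    total, reach = 0.0, 1.0  # reach = probability that the consumer gets this far
    for i in order:
        total += reach * (1.0 - alpha[i]) * rho[i]
        reach *= alpha[i]
    return total


def sorted_ranking(rho):
    """Rank sellers in decreasing order of net surplus (ties kept in index order)."""
    return sorted(range(len(rho)), key=lambda i: -rho[i])


def brute_force_value(alpha, rho):
    """Best objective value over all n! rankings (only feasible for tiny n)."""
    return max(social_net_surplus(order, alpha, rho)
               for order in permutations(range(len(rho))))


# Hypothetical instance with four sellers.
alpha = [0.9, 0.5, 0.7, 0.8]            # P(V_i < v*), chance of moving past seller i
w = [0.3, 0.1, 0.4, 0.2]                # quality scores w_i
theta = [0.2, 1.5, 0.6, 0.9]            # realized private valuations
expected_theta = [0.7, 0.7, 0.7, 0.7]   # platform's prior means
gamma1 = 1.0

rho_complete = net_surplus(theta, w, gamma1)
rho_incomplete = net_surplus(expected_theta, w, gamma1)

pi_star = sorted_ranking(rho_complete)    # ranking of Theorem 4.3.1
pi_hat = sorted_ranking(rho_incomplete)   # ranking of Theorem 4.3.2

best = brute_force_value(alpha, rho_complete)
value_star = social_net_surplus(pi_star, alpha, rho_complete)
value_hat = social_net_surplus(pi_hat, alpha, rho_complete)  # judged by true valuations
```

In this instance `value_star` coincides with `best`, while `value_hat` is lower because the ranking built from prior means places the seller with the highest realized valuation last.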
4.4.1 Inefficiency of Incomplete Information

Recall our setup from Section 4.2, in which seller $i$ arrives with $(u_i, p_i)$ randomly drawn from an i.i.d. distribution. After $(\mathbf{u}, \mathbf{p})$ is drawn, $\boldsymbol{\theta}$ is realized, with the platform having formed a belief about its distribution. Let us consider the ratio $E[W_{\pi^*}(\boldsymbol{\theta})] / E[W_{\hat{\pi}}(\boldsymbol{\theta})]$ to shed light on the average performance of the ranking rule $\hat{\pi}$, especially when the number of sellers in the market becomes large. The expectation here is taken over the ex-ante distribution of all random variables. Intuitively, the more sellers in the market, the more potentially useful an informed ranking. We assume an infinite pool of sellers characterized by the i.i.d. sequence $\{(u_i, p_i)\}_{i=1}^{\infty}$. With only the first $n$ sellers from the sequence in the market, we compare the weighted surplus from rankings $\pi^*$ and $\hat{\pi}$. Denote by $\boldsymbol{\theta}^n$ the private valuation vector of the first $n$ sellers. We impose a mild technical assumption on the boundedness of the ex-ante distributions, namely that $E\left[ e^{\gamma_1 \theta_i + \gamma_2 p_i + \gamma_3 E[V_i \mid V_i > v^*]} \right] \le \eta < +\infty$ for some positive constant $\eta$. This assumption is satisfied for all distributions with a finite moment-generating function evaluated at 1. Notice that when $\gamma_1 = 0$, or when seller valuations are not part of the objective, rankings $\pi^*$ and $\hat{\pi}$ generate the same weighted surplus. Thus, we focus on the case $\gamma_1 > 0$ and show the following proposition.

Proposition 4.4.1 The ratio of the expected weighted surplus from the optimal ranking with complete information $\pi^*$ to that with incomplete information $\hat{\pi}$ satisfies $E[W_{\pi^*}(\boldsymbol{\theta}^n)] / E[W_{\hat{\pi}}(\boldsymbol{\theta}^n)] = O(\log n)$.

We show in the following example that this bound is tight (see footnote 3).

Example 4.4.2 We assume that $\gamma_1 = 1$ and $\gamma_2 = \gamma_3 = 0$. We simply assume that $(u_i, p_i)$ are i.i.d. distributed such that $u_i = p_i$. Therefore, the prior and posterior probabilities of purchasing item $i$ given $u_i$ and $p_i$ are both $1 - \alpha$ for some $\alpha \in (0, 1)$. Also, $\theta_i$ are always i.i.d.
distributed exponential random variables with a mean of 1/2, independent of other random variables and consistent with the belief of the platform. One can verify that all of our previously mentioned assumptions are satisfied. With this setup, we formally show in Proposition 4.7.4 that the bound mentioned in Proposition 4.4.1 is tight. Therefore, the inefficiency due to information asymmetry can deteriorate rapidly as the number of sellers grows.

Footnote 3: Notation-wise, $O(\log n)$ means that the term is bounded above by $a_1 \log n$ for some positive constant $a_1$. Similarly, in Proposition 4.7.4, $\Theta(\log n)$ means that there exist positive constants $a_1$ and $a_2$ with $a_1 \le a_2$ such that the term is lower bounded by $a_1 \log n$ and upper bounded by $a_2 \log n$.

Recall our motivating example of the e-commerce platform Amazon, on which each search query typically returns many search results. For example, 50,000 results were returned for the simple search words “selfie stick.” Prior to 2012, Amazon only presented organic search results, and no positions were auctioned off. For such an uninformed ranking, Propositions 4.4.1 and 4.7.4 imply substantial room for improvement.

On the technical side, although in Propositions 4.4.1 and 4.7.4 we adopt an ex-ante perspective for compact and clean results, the same insights carry through if we consider the interim problem, under which the platform observes $(\mathbf{u}, \mathbf{p})$ and takes expectations only over $\boldsymbol{\theta}$. In particular, Proposition 4.7.4 holds unchanged in this case. It also holds if we only consider the weighted surplus from the fresh demand, since the weighted surplus from the returned demand converges to zero. On the intuition side, the consumer search cost implies that items in top slots are visited much more frequently than other items. Also, the best of the $n$ realized random surplus values is put on top by the optimal ranking, $\pi^*$. For many distributions, the exponential distribution included, the maximum of $n$ realizations behaves like $\log n$.
As a result, the surplus from $\pi^*$ is roughly $\log n$. In contrast, an uninformative ranking like $\hat{\pi}$ fails to take into account the randomness of private valuations and only generates constant expected surplus values. As a result, as illustrated by Proposition 4.7.4, the uninformative ranking results in a substantial loss. On the other hand, the importance of top slots implies that one only needs to sell the top slots to perform reasonably well, which can be accomplished by the sponsored search, as we illustrate next.

4.4.2 Surplus-Ordered Ranking (SOR)

Growing toward a market platform that attracts more third-party sellers, Amazon would like to solicit information from sellers to form a better ranking. The sponsored products program was introduced around 2012 as a tool for sellers to bid for salient positions within search results. The top few positions are sold through auctions, while the organic search results fill the remaining positions. Motivated by this business practice, we investigate selling a few slots with a mechanism design approach.

Suppose we want to sell $k$ slots, where $1 \le k \le n$. We need a ranking rule to decide to whom to sell the slots (the allocation function) and how they pay (the payment function). Here we first discuss the ranking rule, which determines the weighted surplus. We defer discussion of the payment function to the next section. By making $k$ sellers prominent, as in the sponsored search program, we seek to make an informed ranking that achieves a higher surplus than the base order. The base order captures the default or off-the-shelf ranking the platform would use without knowing sellers’ private valuations, and it is independent of $\boldsymbol{\theta}$. A good choice would be the optimal ranking with incomplete information, $\hat{\pi}$. Selling $k$ slots simply uses the information we have to promote $k$ sellers to the top from the base order, subject to the payment collected from them.
Furthermore, in light of Theorem 4.3.1, it seems to be a good rule of thumb to rank the $k$ sellers greedily in the following sense. Recall that the net surplus of a seller with valuation $\theta_i$ is defined as $\rho_i(\theta_i) = \omega_i + \gamma_1 \theta_i$.

Definition 1 (Surplus-Ordered Ranking) Given base order $\pi_0$, the surplus-ordered ranking (SOR), $\pi^{\mathrm{SOR}}_k(\theta)$ (with inverse mapping $\sigma^{\mathrm{SOR}}_k(\theta)$), is the ranking rule that satisfies the following properties:
(i) $\pi^{\mathrm{SOR}}_k(\theta)$ ranks the $k$ sellers with the highest net surplus in the top $k$ positions in decreasing order of their net surplus. 4
(ii) The order of the sellers ranked in positions $k+1$ to $n$ is consistent with the base order: $\sigma^{\mathrm{SOR}}_k(\theta; i) > \sigma^{\mathrm{SOR}}_k(\theta; j)$ if and only if $\sigma_0(i) > \sigma_0(j)$, for all sellers $i$ and $j$ such that $\sigma^{\mathrm{SOR}}_k(\theta; i), \sigma^{\mathrm{SOR}}_k(\theta; j) \ge k+1$.

One can see the similarity of SOR and the optimal ranking in the complete information setting: the first $k$ sellers are selected and ranked according to their net surplus values. The rest of the sellers are ranked in the same order as the base order. SOR will be used as our primary tool to sell the top slots. In the next subsection, we will look into the details of its performance. Similar to the treatment of $\pi^*$ and $\hat{\pi}$, we will omit the dependency on $\theta$ in the notation of $\pi^{\mathrm{SOR}}_k(\theta)$.

4.4.3 The Value of SOR

In this part, we study the performance of SOR. Because returned demand and the associated weighted surplus are independent of the ranking, we focus on the weighted surplus with fresh demand, $\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)$. We first observe the following.

Proposition 4.4.3 The expected weighted surplus is increasing in the number of slots being sold. In other words, if $k \le k'$, then $E[\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)] \le E[\Lambda_{\pi^{\mathrm{SOR}}_{k'}}(\theta)]$.

4 For ease of exposition, we break the ties according to base order $\pi_0$.
[Figure 4.1: The surplus gained from a different number of slots sold with SOR. Two panels ($\alpha = 0.7$ on the left, $\alpha = 0.8$ on the right) plot the surplus value $E[\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)]$ against $k$, the number of slots sold, from the base order through $k = 2, 4, 6, 8, 10$ to the optimal ranking, for standard deviations 0.2, 0.3, and 0.4.]

When $k = n$, SOR reduces to the benchmark optimal ranking. However, in practice, we are typically constrained by the number of positions we can auction off. We focus on the performance of SOR in such an environment. For the numerical experiments, we assume that $\gamma_1 = 1$ and $\gamma_2 = \gamma_3 = 0$ in order to focus on private information. We assume the $\theta_i$ are i.i.d. Beta distributed. The Beta distribution is flexible in modeling various shapes of symmetric and bounded distributions. We use the standard deviation of $\theta_i$ as the main instrument to capture seller heterogeneity. That is, zero variance implies no value of information, but a large variance implies a large difference among sellers as well as a potentially large gap between $\pi^*$ and $\hat{\pi}$. Large variances are expected to have a negative influence on the performance of the optimal ranking with incomplete information. We assume a common $\alpha$ value for all the sellers, which is set to either 70% or 80%. Recall that $\alpha_i$ is the probability of not accepting an item immediately, as defined in (4.7).

We plot the expected surplus from selling 1 to 10 slots ($n = 20$), the expected surplus associated with the optimal ranking with incomplete information (base order), and the optimal expected weighted surplus in Figure 4.1. Note that for the Beta distribution with our specification, the range of the standard deviation is $(0, 0.5)$. Also, we have $\alpha = 0.7$ on the left and $\alpha = 0.8$ on the right. The figure shows that selling a few slots is effective: the curves appear to be concave and increase quickly when the number of slots sold is small. The difference between $k = 10$ and the optimal ranking is marginal. Numerical experiments also suggest that the ratio $E[\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)] / E[\Lambda_{\pi^*}(\theta)]$ is decreasing in the standard deviation, ceteris paribus.
In addition, seller heterogeneity negatively impacts the SOR performance, and the gap between the complete and incomplete information settings becomes large as the standard deviation increases. Furthermore, a comparison of the left and right panels suggests that the left-panel curves are steeper: smaller values of $\alpha$ seem to imply a faster increase of the surplus.

[Figure 4.2: The trade-off contour of standard deviation and $\alpha$. Two panels (target ratio = 95% and target ratio = 90%) plot the standard deviation, from 0 to 0.5, against $\alpha$, from 0 to 1, with one curve for $k = 3$ and one for $k = 4$.]

To further explore these parameters, in Figure 4.2, we plot the highest standard deviation as a function of $\alpha$ such that selling a certain number of slots results in the target ratio of $E[\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)] / E[\Lambda_{\pi^*}(\theta)]$. Our intuition is that smaller values of $\alpha$ imply more importance of the top slots, in which case SOR can be more useful. Because a higher standard deviation negatively impacts SOR performance, SOR achieves a higher target ratio in the shaded regions. We are ambitious here, and although we only consider selling 3 or 4 slots, we demand a high target ratio of 90% or 95%. Also, we set sellers' valuations to be Beta distributed with a mean of 0.5, and let $n = 20$. Note that the standard deviation of the Beta distribution is bounded by 0.5, and for most of the standard deviation values, selling a few slots ($k = 3$ or 4) can capture at least 90% of the surplus, as they fall below the curves.

One particular feature of these functions is their non-monotonicity. When $\alpha$ is small, the top slots are critical, as the consumer is highly likely to buy the items in the top slots. As a result, the corresponding value of the standard deviation is decreasing in $\alpha$. However, when $\alpha$ is close to one, it is highly unlikely that the consumer stops before evaluating all items, and thus the ranking is irrelevant. This explains the upward trajectory of the curve.
Motivated by the encouraging numerical experiments, we investigate the performance guarantee for selling $k$ slots using SOR theoretically. In our model with consumer sequential search, the top positions are very important. As shown in Corollary 4.7.1, the purchase probabilities drop by a multiplicative factor when moving down the list by one position. Carefully leveraging this property allows us to show the following result. Given the quality scores $\omega_1, \omega_2, \ldots, \omega_n$, we define $\omega = \min_{i \in N} \omega_i$.

Theorem 4.4.4 (Performance Guarantee of SOR) Suppose that $\omega \ge 0$, and that without loss of generality $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$. For any given $k$, the ratio of the weighted surplus from the fresh demand of SOR and that of the optimal ranking with complete information satisfies

$$\frac{E[\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)]}{E[\Lambda_{\pi^*}(\theta)]} \ge 1 - \prod_{1 \le i \le k} \alpha_i .$$

The assumption $\omega_i \ge 0$ is mainly introduced to avoid negative surplus values, in which case a bound cannot possibly be obtained. When only the supply-side surplus is concerned, which is usually the case in the previous literature on online adwords auctions, this assumption is automatically satisfied. This technical assumption essentially says that, even with the lowest possible valuation, the platform still benefits society, as measured by the fresh demand.

Several characteristics of this performance guarantee are worth noting. First, our bounds only depend on the $\alpha_i$'s. In particular, the distributional information of $\theta$ and the degree of ex-post seller heterogeneity do not play a role. Second, Theorem 4.4.4 implies that the surplus loss from SOR is at least exponentially decreasing with more slots sold, because $1 - \prod_{1 \le i \le k} \alpha_i \ge 1 - \alpha_1^k$. Third, even when the number of sellers $n \uparrow \infty$, as long as $\alpha_i$ is uniformly bounded by some constant, say $\alpha_U$, we can be certain that selling $k$ slots is effective in the sense that $E[\Lambda_{\pi^{\mathrm{SOR}}_k}(\theta)] / E[\Lambda_{\pi^*}(\theta)] \ge 1 - \prod_{1 \le i \le k} \alpha_i \ge 1 - \alpha_U^k$.
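The ranking rule of Definition 1 and the guarantee of Theorem 4.4.4 can be explored with a small simulation. The following is a minimal sketch under simplifying assumptions that are ours, not the dissertation's exact experimental setup: $\gamma_1 = 1$ and $\gamma_2 = \gamma_3 = 0$ with $\omega_i = 0$ (so the net surplus equals $\theta_i$), a common $\alpha$ for every seller, the outside option ignored, Beta(2, 2) valuations (standard deviation about 0.22), and the index order as base order. All function names are hypothetical.

```python
import random


def sor_ranking(net_surplus, base_order, k):
    """Surplus-ordered ranking: top k sellers by net surplus, the rest in base order."""
    base_pos = {seller: pos for pos, seller in enumerate(base_order)}
    # Ties in net surplus are broken according to the base order.
    top = sorted(base_order, key=lambda i: (-net_surplus[i], base_pos[i]))[:k]
    chosen = set(top)
    return top + [i for i in base_order if i not in chosen]


def fresh_surplus(order, net_surplus, alpha):
    """Sum over positions of (purchase probability) x (net surplus), common alpha."""
    total, reach = 0.0, 1.0
    for seller in order:
        total += reach * (1.0 - alpha) * net_surplus[seller]
        reach *= alpha  # the consumer moves on with probability alpha
    return total


def average_ratio(n, k, alpha, trials=2000, seed=0):
    """Ratio of average SOR surplus (k slots sold) to average optimal surplus."""
    rng = random.Random(seed)
    base_order = list(range(n))
    sor_total = opt_total = 0.0
    for _ in range(trials):
        theta = [rng.betavariate(2.0, 2.0) for _ in range(n)]
        sor_total += fresh_surplus(sor_ranking(theta, base_order, k), theta, alpha)
        # Selling all n slots reproduces the complete-information optimum.
        opt_total += fresh_surplus(sor_ranking(theta, base_order, n), theta, alpha)
    return sor_total / opt_total


if __name__ == "__main__":
    for k in (0, 1, 2, 4, 10):
        print(k, round(average_ratio(20, k, 0.7), 4), "lower bound:", round(1 - 0.7 ** k, 4))
```

In this simplified setting the simulated ratio should rise with $k$ and stay above $1 - \alpha^k$, in line with Proposition 4.4.3 and Theorem 4.4.4; the bound is typically far from tight for small $k$ because it does not use any distributional information.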
This provides a sharp contrast to the discussions of Section 4.4.1, in which we showed that the inefficiency created from information asymmetry can be arbitrarily large as the number of sellers grows. In particular, the setup of Example 4.4.2 falls into the case of bounded $\alpha_i$ values. A key condition that enables the performance guarantee is the fast decay of purchase probability owing to consumer sequential search. As highlighted in the discussion after Proposition 4.4.1, one driving force of the bad performance of $\hat{\pi}$ is the fact that the maximum of $n$ random observations grows like $\log n$. However, as long as we ensure that we sell the top slots to the right sellers, this loss can be largely alleviated.

4.5 Mechanism Design of Selling k Slots

We now design the mechanism to implement SOR. By the revelation principle, we will focus on direct mechanisms. In our context, a direct mechanism $M(b) = (\pi(b), t(b))$ (or $M(b) = (\sigma(b), t(b))$) maps the bids from agents (sellers in our case) in $\Theta = \Theta_1 \times \Theta_2 \times \cdots \times \Theta_n$, where $\Theta_i = [0, \bar{\theta}_i]$ is the support for seller $i$'s valuation, to a ranking rule $\pi(b)$ and a payment rule $t(b)$ (e.g., see [104]). Using $\pi^{\mathrm{SOR}}_k$ as the ranking rule, we focus on the payment rule in this section. With the envelope theorem representation of incentive compatibility, we show next that there exists a unique payment function for SOR that satisfies the desired properties we need for selling top slots.

As is common in the literature, we require that our mechanism for selling the top slots satisfies incentive compatibility, individual rationality, and non-negative payments. Furthermore, sellers not ranked in the top positions shall not be charged (zero-payment property). We will first show that with SOR the purchase probabilities from the fresh demand are monotone in sellers' private valuations.
With this observation, we establish the existence of the payment function that satisfies incentive compatibility by using the well-known envelope theorem representation of incentive compatibility first proposed by [104]. A mechanism is dominant-strategy incentive compatible if and only if the following two conditions hold: (i) the monotonicity condition, under which the purchase probability is increasing in the seller's own bid, and (ii) the envelope condition, under which the seller's payoff can be represented in an integral form. The envelope condition ensures incentive compatibility. Please refer to Proposition 4.7.5 and [9] for more detailed discussions on this approach. We further observe that individual rationality and non-negative payment can be established once we determine the appropriate payment function.

Theorem 4.5.1 (Payment Function of SOR) There exists a unique payment function

$$t^{\mathrm{SOR}}_k(b; i) = q_{\sigma^{\mathrm{SOR}}_k(b),\, i}\; b_i - \int_0^{b_i} q_{\sigma^{\mathrm{SOR}}_k(\eta,\, b_{-i}),\, i}\; d\eta \quad \text{for all } i \in N$$

such that the SOR mechanism $M^{\mathrm{SOR}}_k(b) = \{\pi^{\mathrm{SOR}}_k(b), t^{\mathrm{SOR}}_k(b)\}$ satisfies the following four desired properties: (i) dominant-strategy incentive compatibility; (ii) ex-post individual rationality; (iii) non-negative payment; and (iv) (zero-payment property) if $\sigma^{\mathrm{SOR}}_k(b; i) \ge k+1$, then $t^{\mathrm{SOR}}_k(b; i) = 0$.

The first three properties for the mechanism have been commonly seen in the literature. The last property is desirable and important in our context of only selling the top $k$ slots. In other words, only the sellers promoted to the top $k$ positions by the platform are subject to payments. Other sellers are not part of the sponsored search results. Intuitively, there are two important aspects in the design such that SOR satisfies the zero-payment property. First, the use of the base order allows us to fix the order of the sellers not selected into the top $k$ positions. Therefore, the sellers only change the weighted surplus when their valuations change the identity of the top sellers.
Second, the choice and order of top sellers follow the simple sorting criterion of SOR, motivated by the optimal ranking, $\pi^*$.

The VCG mechanism is non-implementable. A natural question to ask is whether the VCG mechanism can be used in our context. We formally provide a negative answer to this question in Section C.3, in which we define the set of feasible ranking rules that one can adopt to sell the top slots. Intuitively, these rankings should leverage the knowledge of private seller valuations and promote a subset of sellers to the top positions, while the rest of the sellers should be ranked behind, with their relative order the same as the base order, because no payments are collected from them. However, these properties exclude the VCG mechanism. To see this, recall that seller externalities in choice probabilities, as discussed in Section 4.2.2, imply that the purchase probability of a seller is influenced by the choice of sellers ranked above him. Therefore, in the VCG mechanism, the valuation change of a seller who is not ranked on top, i.e., an "organic seller," may alter the selection of top sellers, i.e., "sponsored sellers," when maximizing the weighted surplus. In this way the "organic" seller imposes a negative impact on other sellers, for which he has to pay. Thus, the VCG mechanism violates the zero-payment property. Please see further discussion of this point in Section C.3, in which we construct a counterexample to show that VCG violates the zero-payment property.

While Theorem 4.5.1 establishes that SOR satisfies the desired properties we need, the integral form of the payment function can be inconvenient in practice. We discuss an alternative constructive presentation of the payment function in Section C.4, which also provides intuitive characterizations of the payment function form and a compact way to determine it.
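To illustrate how the integral payment in Theorem 4.5.1 behaves, the sketch below evaluates it numerically for a toy instance. It is not the constructive presentation of Section C.4: it simply approximates the integral with a midpoint rule. The assumptions are ours: net surplus $\rho_i = \omega_i + \gamma_1 b_i$, fresh-demand purchase probabilities of the product form in Corollary 4.7.1 with the outside option ignored, and hypothetical function and parameter names throughout.

```python
def sor_ranking(net_surplus, base_order, k):
    """Top k sellers by net surplus (ties by base order), the rest in base order."""
    base_pos = {seller: pos for pos, seller in enumerate(base_order)}
    top = sorted(base_order, key=lambda i: (-net_surplus[i], base_pos[i]))[:k]
    chosen = set(top)
    return top + [i for i in base_order if i not in chosen]


def fresh_purchase_prob(order, alphas, seller):
    """(1 - alpha_seller) times the product of alphas of sellers ranked above."""
    reach = 1.0
    for current in order:
        if current == seller:
            return reach * (1.0 - alphas[current])
        reach *= alphas[current]
    raise ValueError("seller not in ranking")


def sor_payment(bids, omegas, alphas, base_order, k, seller, gamma1=1.0, steps=4000):
    """Approximate t(b; i) = q(b) * b_i - integral_0^{b_i} q(eta, b_{-i}) d eta."""

    def prob_at(own_bid):
        trial = list(bids)
        trial[seller] = own_bid  # only this seller's bid is varied
        rho = [omegas[i] + gamma1 * trial[i] for i in range(len(trial))]
        return fresh_purchase_prob(sor_ranking(rho, base_order, k), alphas, seller)

    own = bids[seller]
    width = own / steps
    # Midpoint rule; the integrand is a step function of the seller's own bid.
    integral = sum(prob_at((m + 0.5) * width) for m in range(steps)) * width
    return prob_at(own) * own - integral


if __name__ == "__main__":
    bids, omegas, alphas = [0.9, 0.5, 0.7], [0.0, 0.0, 0.0], [0.6, 0.6, 0.6]
    for i in range(3):
        print(i, sor_payment(bids, omegas, alphas, [0, 1, 2], k=1, seller=i))
```

In this instance only the seller promoted to the single sold slot should face a positive payment, while the other two pay zero up to rounding, which matches the zero-payment property (iv).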
4.6 Extension: Consumer Search with List-Page

In the base model, the consumer decides whether to pay search cost $s$ and obtain the item information. In practice, websites can interact with the consumer to facilitate her search, and this changes the consumer search strategy. In this section we incorporate such changes and generalize the base model. We show that with some modifications to the simple sorting rule that is optimal in the base model (see Theorem 4.3.1), one obtains ranking rules with provably good performance guarantees in such settings. We then conduct numerical experiments and show that the simple sorting rule performs well under the generalized setting as well.

More specifically, the e-commerce platform may display some item characteristics, such as pictures and prices, through the list-page returned with the consumer's search query. After evaluating these characteristics, the consumer decides whether to click on an item and open its item-page, which provides further details of the item for examination. To incorporate such a refined search process, we denote by $s_L$ ($\ge 0$) and $s_I$ ($\ge 0$) the search costs that the consumer incurs when examining an item on the list-page and an item-page, respectively. We introduce a model under which the consumer sequentially searches through the list-page and present an asymptotically optimal ranking rule. Whenever the consumer finds a desirable product, she may visit its item-page and make a purchase immediately.

4.6.1 Model Settings

Without loss of generality, we assume for now that the items are ranked on the list-page in decreasing order of their indexes. The consumer acquires information from both the list-page and item-pages. Specifically, she examines items on the list-page sequentially. Upon paying $s_L$ and evaluating item $i$ on the list-page, the consumer learns the intrinsic part of item $i$'s utility, i.e., $u_i - p_i$.
Given this, she can choose to (1) examine its item-page to learn the idiosyncratic part of the utility $\varepsilon_i$, or (2) skip its item-page and evaluate item $i+1$ on the list-page, if $i \ne n$, or (3) finalize her purchasing option, choosing one of the examined items or the outside option. After examining the item-page of item $i$, she may then (1) finalize the purchasing decision, or (2) return to the list-page and evaluate item $i+1$, provided that $i \ne n$. Whenever the consumer finalizes the purchasing decision, the process terminates. We refer to this model as the sequential search model with the list-page. For an illustration of the search process, please see Figure 4.3. In the figure, each dashed line represents a possible next action for the consumer. Each circle represents an item on the list-page. We use item 0 to represent the outside option. Each square represents an item-page.

[Figure 4.3: A flow chart illustration of the choice process of the sequential search model with the list-page. The actions shown are "Pay $s_L$ to learn $u_i - p_i$", "Pay $s_I$ to learn $\varepsilon_i$", and "Leave with $V_{\max}$".]

The consumer maximizes her expected utility and chooses her next option optimally. We assume that before observing any item information, the consumer believes that the triples $(u_i, p_i, \varepsilon_i)$ are ex-ante i.i.d., similar to Section 2. We assume that $s_I + s_L > 0$ in order to exclude the trivial case.

4.6.2 Consumer Search Strategy

We next characterize the optimal consumer behavior under this model. We use $G(\cdot)$ and $H(\cdot)$ ($g(\cdot)$ and $h(\cdot)$) to denote the c.d.f. (p.d.f.) of $u_i - p_i$ and $\varepsilon_i$, respectively. Also, let $V_{\max}$ be the random variable for the utility of the current best option so far. We can define $v_k(V_{\max})$ as the continuation value of the consumer's dynamic program when the consumer is about to view the next item on the list-page that has $k$ items left.
Note that $v_k(\cdot)$ can be characterized recursively through standard dynamic programming techniques. Let us first assume that $s_I > 0$ and $s_L > 0$. Denote by $\xi_k(V_{\max})$ the unique solution to the following equation in $\xi$,

$$\int_{V_{\max} - \xi}^{+\infty} \big( v_k(\xi + \varepsilon) - v_k(V_{\max}) \big)\, h(\varepsilon)\, d\varepsilon = s_I .$$

Further details about $v_k(\cdot)$, and the existence and uniqueness of $\xi_k(V_{\max})$, are discussed in the E-companion, along with the proof of Lemma 4.6.1. We also let $\varepsilon^*$ be the unique solution to $\int_{\varepsilon^*}^{+\infty} (z - \varepsilon^*)\, h(z)\, dz = s_I$ and $\bar{v}^*$ be the unique solution to the following equation with variable $v$:

$$\int_{v - \varepsilon^*}^{+\infty} \left( \int_{v - x}^{+\infty} (x + \varepsilon - v)\, h(\varepsilon)\, d\varepsilon - s_I \right) g(x)\, dx = s_L . \qquad (4.5)$$

The existence and uniqueness of $\varepsilon^*$ and $\bar{v}^*$ can be established in the same way as for (4.1). When $s_L = 0$, we let $\bar{v}^* = +\infty$. When $s_I = 0$, we let $\varepsilon^* = +\infty$ and $\xi_k(\cdot) = -\infty$. The next result follows. 5

Lemma 4.6.1 (Optimal Consumer Search II) Under the sequential search model with the list-page, the optimal stopping policy of the consumer, $\Gamma^*$, is as follows.
(i) The consumer stops searching and accepts the current best item if and only if $V_{\max} > \bar{v}^*$.
(ii) After the consumer looks at item $i$ on the list-page (when $V_{\max} \le \bar{v}^*$), she will enter the item-page of item $i$ if and only if $u_i - p_i \ge \xi_{n-i}(V_{\max})$.
(iii) When $V_{\max} \le \bar{v}^*$, it follows that $\xi_k(V_{\max}) \le \bar{v}^* - \varepsilon^*$ for $0 \le k \le n-1$. Moreover, $\xi_k(V_{\max})$ is continuous, and increasing in $V_{\max}$ and $k$.

5 We break ties by assuming that the consumer will always look at one more item on the list-page if she is indifferent to taking this action and that the consumer will always enter an item-page when she is indifferent.

Note that Lemma 4.6.1 is parallel to Lemma 4.2.1. In particular, $\bar{v}^*$ is analogous to $v^*$ and (4.5) is analogous to (4.1). However, contrary to the base model, the consumer now has the option of not examining the item-page if the list-page shows that the item's intrinsic utility is low. The threshold for this decision, $\xi_k(V_{\max})$, depends on both the current best utility $V_{\max}$ and the number of items left, $k$.
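The threshold defined by $\int_{\varepsilon^*}^{+\infty} (z - \varepsilon^*)\, h(z)\, dz = s_I$ is easy to compute once a distribution is fixed. The sketch below assumes, purely for illustration, that the idiosyncratic term is standard normal (the text does not impose this), in which case the left-hand side has the closed form $\varphi(e) - e\,(1 - \Phi(e))$ and is strictly decreasing in $e$, so bisection applies. Function names are hypothetical.

```python
import math


def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(x: float) -> float:
    """Upper tail 1 - Phi(x) of the standard normal."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def expected_excess(eps: float) -> float:
    """Integral of (z - eps) * phi(z) over z >= eps, standard normal case."""
    return normal_pdf(eps) - eps * normal_sf(eps)


def solve_eps_star(s_item: float, lo: float = -50.0, hi: float = 50.0) -> float:
    """Bisection for the threshold at which the expected excess equals s_item."""
    if s_item <= 0.0:
        return math.inf  # free item-pages: the threshold is +infinity
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_excess(mid) > s_item:
            lo = mid  # excess still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for cost in (0.05, 0.1, 0.2):
        print(cost, solve_eps_star(cost))
```

A larger item-page cost $s_I$ yields a smaller threshold, so the consumer requires a higher intrinsic utility $u_i - p_i$ before opening an item-page. Solving (4.5) for $\bar{v}^*$ would follow the same pattern with a numerical double integral and is omitted here.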
Item (iii) of Lemma 4.6.1 concerns the properties of this threshold. The monotonicity property is intuitive. As the current best utility becomes higher, or more items are left to check, the consumer becomes more conservative. Furthermore, as $k \to \infty$, $\xi_k(V_{\max})$ converges to $\bar{v}^* - \varepsilon^*$ for all $V_{\max} \le \bar{v}^*$. In fact, as $k \to \infty$, the utility-to-go of the search becomes $\bar{v}^*$ when $V_{\max} \le \bar{v}^*$. In this case, the consumer enters the item-page of item $i$ if and only if $u_i - p_i$ is above $\bar{v}^* - \varepsilon^*$, the unique solution to the utility break-even equation

$$\bar{v}^* = \bar{v}^* H(\bar{v}^* - \xi) + \int_{\bar{v}^* - \xi}^{+\infty} (\xi + z)\, h(z)\, dz - s_I$$

with variable $\xi$. Due to the monotonicity property of the utility-to-go in the number of items left, $\bar{v}^* - \varepsilon^*$ provides an upper bound to $\xi_k(\cdot)$.

The fact that the threshold $\xi_k(V_{\max})$ depends on the number of items left $k$ greatly complicates the analysis of weighted surplus maximization. Despite this challenge, the consumer's behavior with the list-page is still similar to our base model. Specifically, when $s_I = 0$, i.e., checking the item-page is free, viewing an item-page can only weakly increase the consumer's utility. As a result, in this case the consumer always views the item-pages before she leaves with a satisfying item, whose utility exceeds $\bar{v}^* = v^*$, where $v^*$ is the threshold for stopping in Lemma 4.2.1, and our model reduces to the base model discussed in previous sections. When $s_L = 0$, i.e., it is free to look at the list-page, the consumer never stops searching before finishing the entire list. In other words, $\bar{v}^* = +\infty$.

The Value of Sharing Item Information. Through further comparison to the base model, we also remark on the value of sharing item information through the list-page. One can observe that the consumer's expected utility is increasing in $s_I / s_L$ if $s_I + s_L$ is kept constant.
Intuitively, as $s_I / s_L$ becomes larger, the consumer uses a smaller portion of the overall search cost to learn the information about the intrinsic utilities, and she can screen out the bad ones more easily. Formally, a consumer with a higher value of $s_I / s_L$ can always mimic the optimal search strategy of a consumer with a lower value of $s_I / s_L$. Note that by doing so, the consumer with the higher value of $s_I / s_L$ obtains higher utility than the other. Therefore, the consumer's expected utility is increasing in $s_I / s_L$. Combined with our discussion in the preceding paragraph, this observation implies that if $s = s_I + s_L$, then compared with the baseline model, the consumer's expected utility is higher under the sequential search model with the list-page. In contrast, the list-page may not benefit the platform because the list-page may reduce the revenue. To show this, it is sufficient to consider only one item. One can construct an outside option distribution such that the item is fully examined without the list-page, while the probability that the consumer does not enter the item-page is positive with the list-page. Therefore, the list-page may reduce the purchase probability of the item.

4.6.3 Ranking under the Sequential Search Model with the List-Page

In this part, we discuss how to rank in order to maximize the weighted surplus given by expression (4.2). We assume that the platform observes the complete information, similar to the discussion in Section 4.3.1. Notice that as long as the ranking rule constructed enjoys the property that sellers' rankings are monotone in their private valuations $\theta_i$, we can still adopt the same mechanism design approaches as in Sections 4.4 and 4.5 to construct a mechanism that sells the top slots to overcome information asymmetry. Consequently, we primarily consider the ranking problem in this section.
In the current setting, contrary to the results in Section 4.3.1, difficulties arise due to the non-stationary nature of the optimal policy $\Gamma^*$. The returned demand is no longer independent of the ranking, and the nice expression for the weighted surplus from the fresh demand shown in Lemma 4.7.2 cannot be derived. Furthermore, even if only aggregate revenue is considered, i.e., $\gamma_1 = \gamma_3 = 0$, it is easy to provide a counterexample to show that the surplus-ordered ranking given in Theorem 4.3.1 is no longer optimal. To avoid the extreme case in which the consumer always finishes the entire list, i.e., $\bar{v}^* = +\infty$, we assume $s_L > 0$ in this part.

To derive a near-optimal ranking, one may consider as an approximation for $\Gamma^*$ the following simple stationary policy $\Gamma'$ of consumer search: (i) the consumer stops searching on the list-page and accepts the current best item if and only if $V_{\max} > \bar{v}^*$; and (ii) provided that the consumer looks at item $i$ on the list-page, the consumer enters the item-page of item $i$ if and only if $u_i - p_i \ge \bar{v}^* - \varepsilon^*$. Unlike $\Gamma^*$, $\Gamma'$ is characterized by two fixed thresholds independent of the number of items remaining. Moreover, when the number of sellers is large, one can show that $\Gamma'$ is asymptotically optimal, which is consistent with the discussion following Lemma 4.6.1 and formally proved in the E-Companion with the discussion of Theorem 4.6.2. As our study is motivated by the observation that on many platforms the search query typically returns a large number of results, we expect $\Gamma'$ to provide high-quality approximations in practice.

Under $\Gamma'$, one can show that the following ranking rule is optimal. We denote $\bar{\alpha}_i \overset{\mathrm{def}}{=} P(V_i \le \bar{v}^* \mid u_i, p_i)$, $\bar{\omega}_i \overset{\mathrm{def}}{=} \gamma_2 p_i + \gamma_3 \big( E[V_i \mid V_i > \bar{v}^*] - s/(1 - \bar{\alpha}_i) \big)$, where $s = s_I + s_L$, and $\bar{\rho}_i \overset{\mathrm{def}}{=} \gamma_1 \theta_i + \bar{\omega}_i$. These definitions are analogous to (4.7) in Section 4.2, and to (4.3) and (4.4) in Section 4.3.

Definition 2 (Pseudo Surplus-Ordered Ranking) Denote $\mathcal{G} = \{ i : u_i - p_i \ge \bar{v}^* - \varepsilon^* \}$.
The pseudo surplus-ordered ranking (PSOR or $\pi^{\mathrm{PSOR}}$) is given by ranking the items in $\mathcal{G}$ on top in decreasing order of their net surplus $\bar{\omega}_i$, followed by items in $N \setminus \mathcal{G}$ in decreasing order of their net surplus $\bar{\omega}_i$.

The asymptotic optimality of $\Gamma'$ hints that when $n$ is large, $\Gamma^*$ must converge to $\Gamma'$ due to the optimality of $\Gamma^*$ for fixed $n$. Therefore, the optimality of $\pi^{\mathrm{PSOR}}$ shown in the first part of Theorem 4.6.2 suggests that it can serve as a good approximation to the optimal ranking when the consumer is searching with policy $\Gamma^*$ and $n$ is large. Under the following technical assumptions, we show in Theorem 4.6.2 that $\pi^{\mathrm{PSOR}}$ is near optimal.

(A1) If $\gamma_1 + \gamma_2 > 0$, there exist positive constants $\underline{C}$ and $\bar{C}$ such that for any item $i$ sampled from the distribution of $u_i - p_i$ and $\varepsilon_i$, $\gamma_1 \theta_i + \gamma_2 p_i \in [\underline{C}, \bar{C}]$.
(A2) When $\gamma_3 > 0$, $E[V_i \mid V_i > \bar{v}^*] - s/(1 - \bar{\alpha}_i) \ge E[V_0 \mid V_0 \le \bar{v}^*] \ge 0$ if $i \in \mathcal{G}$.
(A3) $M = \sup g(\cdot) < +\infty$.

With $\Gamma^*$, let us denote by $\Sigma^*$ the surplus from the optimal ranking of items and by $\Sigma^{\mathrm{PSOR}}$ the surplus we obtain from $\pi^{\mathrm{PSOR}}$. The next result regarding the performance of $\pi^{\mathrm{PSOR}}$ follows.

Theorem 4.6.2 The following results hold.
(i) $\pi^{\mathrm{PSOR}}$ is the optimal ranking if the consumer adopts search policy $\Gamma'$.
(ii) Assume that the consumer adopts $\Gamma^*$ and that assumptions (A1), (A2), and (A3) hold. If the expected derived utility $u_i - p_i$ and random term $\varepsilon_i$ of item $i$ ($i = 1, \ldots, n$) are sampled from the distributions $G(\cdot)$ and $H(\cdot)$, respectively, then $\Sigma^{\mathrm{PSOR}} \ge (1 - e^{-\Omega(n)})\, \Sigma^*$ with probability at least $1 - O(1/n^2)$.

Note that the constants in Theorem 4.6.2 depend on $\underline{C}$, $\bar{C}$, and the distributions $G(\cdot)$ and $H(\cdot)$. Assumption (A1) is easily satisfied if $\theta_i$ is also sampled from a fixed distribution, and the distributions of $\theta_i$ and $p_i$ are upper bounded and positively lower bounded. Assumption (A2) essentially states that the system benefits the consumer compared with the outside option, and it is introduced to avoid the bad case in which the surplus value is negative.
Assumption (A3) is satisfied for most commonly used distributions. Furthermore, Theorem 4.6.2 item (ii) suggests that if the expected derived utilities and random terms are indeed sampled from the distributions $G(\cdot)$ and $H(\cdot)$, respectively, then with high probability, $\pi^{\mathrm{PSOR}}$ gives a decent performance guarantee. While the performance of $\pi^{\mathrm{PSOR}}$ can be undesirable for a single consumer search, when the platform interacts with the consumer through numerous search keywords with a large number of items sold for each search keyword, this result suggests that $\pi^{\mathrm{PSOR}}$ enjoys close-to-optimal performance in most cases. Therefore, $\pi^{\mathrm{PSOR}}$ provides a practical solution. Furthermore, as will be shown in the next subsection, $\pi^{\mathrm{PSOR}}$ indeed exhibits near-optimal performance numerically.

4.7 Auxiliary Results

In this part, we provide auxiliary results that support the arguments in the main text.

Purchase probabilities. We first provide the expressions of the fresh demand and returned demand purchase probabilities. For ease of notation, we extend any ranking rule to $\pi : N \cup \{0\} \mapsto N \cup \{0\}$ by defining $\pi(0) = 0$. Because $(u, p)$ is observable to the platform, following Lemma 4.2.1, we have

Corollary 4.7.1 (Purchase Probabilities) Given the realization of $(u, p)$, and ranking $\pi$ such that $\pi(j) = i$, the purchase probability from the fresh demand for item $i$ is

$$q_{j,\pi} = q_{\sigma,i} \overset{\mathrm{def}}{=} P\Big( \{ V_{\pi(j)} > v^* \} \cap \Big( \bigcap_{0 \le k < j} \{ V_{\pi(k)} \le v^* \} \Big) \Big) = (1 - \alpha_{\pi(j)}) \prod_{0 \le k < j} \alpha_{\pi(k)} , \qquad (4.6)$$

where

$$\alpha_i = P(V_i \le v^*) . \qquad (4.7)$$

The returned demand for item $i$ can be written as

$$q_i^0 \overset{\mathrm{def}}{=} P\Big( \{ V_i \le v^* \} \cap \big\{ i = \min \{ j : j \in \arg\max \{ V_k : k \in N \cup \{0\} \} \} \big\} \Big) . \qquad (4.8)$$

The last equation in (4.6) holds because $\varepsilon_i$ and $\varepsilon_k$ are independent if $i \ne k$. We emphasize that $\alpha_i$, $q_{j,\pi}$, and $q_i^0$ are functions of $(u, p)$. For ease of notation, we suppress the dependence on $(u, p)$.

As explained in Section 4.2, seller externalities mean that the purchase probabilities are also influenced by the quality of other sellers.
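The product form in (4.6) and the resulting seller externalities can be seen in a few lines of code. This is a minimal sketch with hypothetical names; `alpha0` stands for the outside option's $\alpha_0$ at position 0, and the numerical values are made up for illustration.

```python
def fresh_purchase_probs(order, alphas, alpha0=1.0):
    """Return q_j for positions j = 1..n, following the product form in (4.6).

    order[j-1] is the item shown in position j; alphas[i] is the probability
    that item i is not accepted immediately; alpha0 plays the same role for
    the outside option, which occupies position 0.
    """
    probs, reach = [], alpha0
    for item in order:
        probs.append(reach * (1.0 - alphas[item]))
        reach *= alphas[item]  # the consumer continues past this item
    return probs


if __name__ == "__main__":
    order = [0, 1, 2]
    print(fresh_purchase_probs(order, [0.5, 0.8, 0.6], alpha0=0.9))
    # Improving the top item (smaller alpha) lowers the probabilities below it.
    print(fresh_purchase_probs(order, [0.3, 0.8, 0.6], alpha0=0.9))
```

Changing the $\alpha$ of an item only affects the items ranked below it, and the probabilities sum to $\alpha_0 (1 - \prod_{1 \le i \le n} \alpha_i)$ by telescoping.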
Note that in our model a high-quality seller with a low price gives rise to a high probability of the event $V_i > v^*$. This "steals" the demand from the sellers ranked behind him. As a result, the purchase probability from the fresh demand of a seller is negatively impacted by the quality of only the sellers ranked above him. The fresh demand thus provides an intuitive characterization of the position effect when seller externalities exist [70, 79]. The returned demand, on the other hand, depends on the intrinsic utilities of all the items. In our work, we use consumer sequential search as the microeconomic foundation to derive seller externalities. On the application side, one may sacrifice the rationality assumption of our model while incorporating more practical concerns to build a consumer choice model that keeps the key features of ours. We discuss such a model in Section C.2, in which the consumer performs a "greedy" search in the sense that she stops whenever there is an item with utility above a given threshold.

Reformulation of weighted surplus maximization. In preparation for Theorem 4.3.1, we present

Lemma 4.7.2 The weighted surplus with fresh demand is equivalent to

$$\Lambda_\pi(\theta) = \sum_{j=1}^{n} q_{j,\pi}\, \rho_{\pi(j)}(\theta_{\pi(j)}) + \gamma_3\, n s \prod_{0 \le i \le n} \alpha_i - \gamma_3\, \alpha_0 \Big( 1 - \prod_{1 \le i \le n} \alpha_i \Big) E[V_0 \mid V_0 \le v^*] ,$$

where $q_{j,\pi} = (1 - \alpha_{\pi(j)}) \prod_{0 \le k < j} \alpha_{\pi(k)}$.

Notice that only the first term in $\Lambda_\pi(\theta)$, which is exactly the social net surplus $\Omega(\theta)$, depends on the ranking.

Consumer surplus maximization. Recall that a distribution $G(\cdot)$ with p.d.f. $g(\cdot)$ is said to have an increasing failure rate if $g(\cdot)/\bar{G}(\cdot)$ is increasing. Following Theorem 4.3.1, we have

Corollary 4.7.3 To maximize consumer surplus, it is optimal to rank sellers in increasing order of $E[V_i \mid V_i > v^*] - s/(1 - \alpha_i)$. Furthermore, if the distribution of $\varepsilon_i$ has an increasing failure rate (IFR), e.g., Gumbel distributions, it is optimal to rank items in increasing order of expected item utilities, i.e., $u_i - p_i$.
This result says that as far as the consumer surplus is concerned, for many commonly used distributions that are well known to have an IFR, including normal, uniform, and exponential distributions, it is indeed optimal to rank sellers in increasing order of $u_i - p_i$. Note that Gumbel distributions are often used to model consumer choice behavior. For example, in the well-known MNL model, the tail distributions of random utilities from each choice are i.i.d. Gumbel [135, 132]. Although in optimizing consumer surplus it seems intuitive to rank sellers using $u_i - p_i$, one can construct examples to show that ranking with $u_i - p_i$ is suboptimal when the distribution of $\varepsilon_i$ does not possess an IFR.

Matching the upper bound. Regarding Example 4.4.2, we show

Proposition 4.7.4 In Example 4.4.2, the ratio of the expected weighted surplus from the optimal ranking with complete information $\pi^*$ and the optimal ranking with incomplete information $\hat{\pi}$ satisfies $E[\Omega_{\pi^*}(\theta^n)] / E[\Omega_{\hat{\pi}}(\theta^n)] = \Theta(\log n)$.

Envelope theorem representation of incentive compatibility. Now, we present and interpret Proposition 4.7.5, which is used in the proof of Theorem 4.5.1.

Proposition 4.7.5 (Envelope Theorem Representation of Incentive Compatibility) A mechanism $M(b) = (\sigma(b), t(b))$ is dominant-strategy incentive compatible if and only if the following two conditions hold:
(i) (Monotonicity) $q_{\sigma(b_i, b_{-i}),\, i} \ge q_{\sigma(b'_i, b_{-i}),\, i}$ if $b_i \ge b'_i$;
(ii) (Envelope Condition) $q_{\sigma(b),\, i}\; b_i - t(b; i) = \int_0^{b_i} q_{\sigma(\eta, b_{-i}),\, i}\; d\eta + C_i(b_{-i})$.

We emphasize that $C_i(b_{-i})$ is independent of $b_i$. Dominant-strategy incentive compatibility means that regardless of the reported valuations of other sellers, it is always in the interest of the seller to report his true valuation. For more detailed discussions in a game-theoretical setting, please refer to [108]. The first item in this proposition states that the purchase probability determined by the ranking function must be increasing.
We prove this for SOR by considering two potential bids, $b_i$ and $b'_i$, with $b_i > b'_i$. If seller $i$ stays in the same position for both, then by the definition of SOR, the set of sellers ranked above him is the same. Accordingly, $q_{\sigma^{\mathrm{SOR}}_k(b_i, b_{-i}),\,i} = q_{\sigma^{\mathrm{SOR}}_k(b'_i, b_{-i}),\,i}$. On the other hand, if seller $i$ is ranked in a higher position with bid $b_i$, every seller ranked above him with $b_i$ is also ranked above him with $b'_i$, and it must be that $q_{\sigma^{\mathrm{SOR}}_k(b_i, b_{-i}),\,i} > q_{\sigma^{\mathrm{SOR}}_k(b'_i, b_{-i}),\,i}$ by the definition of SOR.

Appendix A
Additional Technical Arguments for Chapter 2

A.1 Finiteness of the Optimal Prices

We show that the optimal prices in the UNCONSTRAINED problem are finite. In the next lemma, we begin by showing that if we increase the prices of some products, then the purchase probabilities of the remaining products, as well as the probability of no-purchase, increase.

Lemma A.1.1 For some $M \subseteq N$, assume that the prices $\hat{\boldsymbol p} = (\hat p_1, \ldots, \hat p_n)$ and $\tilde{\boldsymbol p} = (\tilde p_1, \ldots, \tilde p_n)$ satisfy $\hat p_i \ge \tilde p_i$ for all $i \in M$ and $\hat p_i = \tilde p_i$ for all $i \in N \setminus M$. Then, we have $\Theta_i(\boldsymbol Y(\hat{\boldsymbol p})) \ge \Theta_i(\boldsymbol Y(\tilde{\boldsymbol p}))$ for all $i \in N \setminus M$ and $\Theta_0(\boldsymbol Y(\hat{\boldsymbol p})) \ge \Theta_0(\boldsymbol Y(\tilde{\boldsymbol p}))$.

Proof: First, we show that $\Theta_i(\boldsymbol Y(\hat{\boldsymbol p})) \ge \Theta_i(\boldsymbol Y(\tilde{\boldsymbol p}))$ for all $i \in N \setminus M$. Fix $i \in N \setminus M$. Noting that $Y_j(p_j) = \exp(\alpha_j - \beta p_j)$, since $\hat p_j \ge \tilde p_j$ for all $j \in M$ and $\hat p_j = \tilde p_j$ for all $j \in N \setminus M$, we have $Y_j(\hat p_j) \le Y_j(\tilde p_j)$ for all $j \in M$ and $Y_j(\hat p_j) = Y_j(\tilde p_j)$ for all $j \in N \setminus M$. Furthermore, noting that the function $G$ satisfies $\partial G_{ij}(\boldsymbol Y) \le 0$ for all $j \in M$, $\partial G_i(\boldsymbol Y)$ is decreasing in $Y_j$ for all $j \in M$. In this case, having $Y_j(\hat p_j) \le Y_j(\tilde p_j)$ for all $j \in M$ and $Y_j(\hat p_j) = Y_j(\tilde p_j)$ for all $j \in N \setminus M$ implies that $\partial G_i(\boldsymbol Y(\hat{\boldsymbol p})) \ge \partial G_i(\boldsymbol Y(\tilde{\boldsymbol p}))$. Similarly, since the function $G$ satisfies $\partial G_j(\boldsymbol Y) \ge 0$ for all $j \in N$, $G(\boldsymbol Y)$ is increasing in $Y_j$ for all $j \in N$, in which case, using the fact that $Y_j(\hat p_j) \le Y_j(\tilde p_j)$ for all $j \in M$ and $Y_j(\hat p_j) = Y_j(\tilde p_j)$ for all $j \in N \setminus M$, we obtain $G(\boldsymbol Y(\hat{\boldsymbol p})) \le G(\boldsymbol Y(\tilde{\boldsymbol p}))$.
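Lastly, since $i \in N \setminus M$, we have $Y_i(\hat p_i) = Y_i(\tilde p_i)$. Because $\partial G_i(\boldsymbol Y(\hat{\boldsymbol p})) \ge \partial G_i(\boldsymbol Y(\tilde{\boldsymbol p}))$ and $G(\boldsymbol Y(\hat{\boldsymbol p})) \le G(\boldsymbol Y(\tilde{\boldsymbol p}))$, using the definition of the choice probability under the GEV model, we get

$$\Theta_i(\boldsymbol Y(\hat{\boldsymbol p})) = \frac{Y_i(\hat p_i)\,\partial G_i(\boldsymbol Y(\hat{\boldsymbol p}))}{1 + G(\boldsymbol Y(\hat{\boldsymbol p}))} \;\ge\; \frac{Y_i(\tilde p_i)\,\partial G_i(\boldsymbol Y(\tilde{\boldsymbol p}))}{1 + G(\boldsymbol Y(\tilde{\boldsymbol p}))} = \Theta_i(\boldsymbol Y(\tilde{\boldsymbol p})).$$

Second, we show that $\Theta_0(\boldsymbol Y(\hat{\boldsymbol p})) \ge \Theta_0(\boldsymbol Y(\tilde{\boldsymbol p}))$. By the discussion at the beginning of the proof, we have $G(\boldsymbol Y(\hat{\boldsymbol p})) \le G(\boldsymbol Y(\tilde{\boldsymbol p}))$. The no-purchase probability at prices $\boldsymbol p$ is given by

$$\Theta_0(\boldsymbol Y(\boldsymbol p)) = 1 - \sum_{i \in N} \Theta_i(\boldsymbol Y(\boldsymbol p)) = 1 - \frac{\sum_{i \in N} Y_i(p_i)\,\partial G_i(\boldsymbol Y(\boldsymbol p))}{1 + G(\boldsymbol Y(\boldsymbol p))} = \frac{1}{1 + G(\boldsymbol Y(\boldsymbol p))},$$

where the last equality above is by Lemma 2.2.2. In this case, since $G(\boldsymbol Y(\hat{\boldsymbol p})) \le G(\boldsymbol Y(\tilde{\boldsymbol p}))$, we obtain $\Theta_0(\boldsymbol Y(\hat{\boldsymbol p})) = 1/(1 + G(\boldsymbol Y(\hat{\boldsymbol p}))) \ge 1/(1 + G(\boldsymbol Y(\tilde{\boldsymbol p}))) = \Theta_0(\boldsymbol Y(\tilde{\boldsymbol p}))$.

In the next proposition, we use the lemma above to show that the optimal prices in the UNCONSTRAINED problem are finite.

Proposition A.1.2 There exists an optimal solution $\boldsymbol p^*$ to the UNCONSTRAINED problem such that $p^*_i \in [c_i, \infty)$ for all $i \in N$.

Proof: Let $\boldsymbol p^*$ be an optimal solution to the UNCONSTRAINED problem and set $N^- = \{i \in N : p^*_i < c_i\}$ to capture the set of products whose prices are below their unit costs. We define the prices $\boldsymbol p^-$ as $p^-_i = c_i$ for all $i \in N^-$ and $p^-_i = p^*_i$ for all $i \in N \setminus N^-$. Note that we have $p^-_i \ge p^*_i$ for all $i \in N^-$ and $p^-_i = p^*_i$ for all $i \in N \setminus N^-$, in which case, by Lemma A.1.1, we get $\Theta_i(\boldsymbol Y(\boldsymbol p^-)) \ge \Theta_i(\boldsymbol Y(\boldsymbol p^*))$ for all $i \in N \setminus N^-$. Therefore, we have

$$\sum_{i \in N} (p^*_i - c_i)\,\Theta_i(\boldsymbol Y(\boldsymbol p^*)) \;\le\; \sum_{i \in N \setminus N^-} (p^*_i - c_i)\,\Theta_i(\boldsymbol Y(\boldsymbol p^*)) \;\le\; \sum_{i \in N \setminus N^-} (p^-_i - c_i)\,\Theta_i(\boldsymbol Y(\boldsymbol p^-)) = \sum_{i \in N} (p^-_i - c_i)\,\Theta_i(\boldsymbol Y(\boldsymbol p^-)),$$

where the first inequality uses the fact that $p^*_i < c_i$ for all $i \in N^-$, the second inequality uses the fact that $p^-_i = p^*_i \ge c_i$ and $\Theta_i(\boldsymbol Y(\boldsymbol p^-)) \ge \Theta_i(\boldsymbol Y(\boldsymbol p^*))$ for all $i \in N \setminus N^-$, and the last equality uses the fact that $p^-_i = c_i$ for all $i \in N^-$.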
By the last chain of inequalities, the objective value of the UNCONSTRAINED problem corresponding to the pricesp is at least as large as the one corresponding to the pricesp , which implies that there exists an optimal solutionp such that p i c i for all i2 N. Thus, there exists an optimal solution such that the prices of all of the products are lower bounded by their corresponding unit costs. In the rest of the proof, we can assume that p i c i for all i2 N. Let N + =fi2 N : p i =¥g to capture the set of products whose prices are infinite. Noting that Y i (p i )= exp(a i b p i ), we have lim p i !¥ Y i (p i )= 0 and lim p i !¥ p i Y i (p i )= 0, in which case, by the definition of the purchase probabilities in (2.1), we haveQ i (Y(p ))= 0 and p i Q i (Y(p ))= 0 for all i2 N + . Letting ¯ p= maxfp i : i2 Nn N + g<¥, define the pricesp + as p + i = ¯ p+ c i for all i2 N + and p + i = p i for all i2 Nn N + . Note that we have p i p + i for all i2 N + and p i = p + i for all i2 Nn N + . In this case, by Lemma A.1.1, we obtain Q i (Y(p ))Q i (Y(p + )) for all i2 Nn N + and Q 0 (Y(p ))Q 0 (Y(p + )). Using the last two inequalities, we get å i2N +Q i (Y(p + ))+å i2NnN +Q i (Y(p + ))=å i2N Q i (Y(p + ))= 1Q 0 (Y(p + )) 1Q 0 (Y(p ))=å i2N Q i (Y(p ))=å i2NnN +Q i (Y(p )), where the last equality uses 98 the fact thatQ i (Y(p ))= 0 for all i2 N + . 
Focusing on the first and last expressions in the last chain of inequalities, we haveå i2N +Q i (Y(p + ))å i2NnN +(Q i (Y(p ))Q i (Y(p + ))), which implies that å i2N (p i c i )Q i (Y(p )) = å i2NnN + (p i c i )Q i (Y(p )) = å i2NnN + (p + i c i )Q i (Y(p + )) + å i2NnN + (p i c i )(Q i (Y(p ))Q i (Y(p + ))) å i2NnN + (p + i c i )Q i (Y(p + )) + å i2NnN + ¯ p(Q i (Y(p ))Q i (Y(p + ))) å i2NnN + (p + i c i )Q i (Y(p + )) + å i2N + ¯ pQ i (Y(p + ))= å i2N (p + i c i )Q i (Y(p + )); where the first equality holds because Q i (Y(p )) = 0 and p i Q i (Y(p )) = 0 for all i2 N + , the first inequality is by the fact that ¯ p p i for all i2 Nn N + andQ i (Y(p ))Q i (Y(p + )) for all i2 Nn N + , and the second inequality holds as å i2N +Q i (Y(p + ))å i2NnN +(Q i (Y(p ))Q i (Y(p + ))). Thus, the expected profit at the pricesp + is at least as large as the one at the pricesp , in which case, there exists an optimal solutionp such that p i <¥ for all i2 N. A.2 Proof of Corollary 2.3.3 As a function of the unit product costsc, let R (c) be the optimal expected profit in the UNCONSTRAINED problem. We claim that R (c) R (c+de i ) R (c+de) R (c)d. Note that the chain of inequalities R (c) R (c+d e i ) R (c+de) follows immediately because we havecc+de i c+de, and as the unit costs increase, the optimal expected profits decrease. To establish that R (c+de) R (c)d, observe that R (c+de) = å i2N (p i (c+de) c i d)Q i (p (c+de)) å i2N (p i (c) c i d)Q i (p (c)) = å i2N (p i (c) c i )Q i (p (c))d å i2N Q i (p (c)) R (c)d; where the first inequality follows becausep (c) may not be optimal when the unit costs arec + de and the last inequality follows from the fact that R (c)=å i2N (p i (c) c i )Q i (p (c)) andå i2N Q i (p (c)) 1. Thus, the claim holds. To show the first part of the corollary, as a function of the unit costs of the products, we let m (c) be the optimal markup in the UNCONSTRAINED problem. 
Noting that $R^*(\boldsymbol c + \delta \boldsymbol e_i) \ge R^*(\boldsymbol c) - \delta$ by the discussion at the beginning of the proof, by Theorem 2.3.1, we obtain $m^*(\boldsymbol c + \delta \boldsymbol e_i) = 1/\beta + R^*(\boldsymbol c + \delta \boldsymbol e_i) \ge 1/\beta + R^*(\boldsymbol c) - \delta = m^*(\boldsymbol c) - \delta$, which implies that $p^*_i(\boldsymbol c + \delta \boldsymbol e_i) = m^*(\boldsymbol c + \delta \boldsymbol e_i) + c_i + \delta \ge m^*(\boldsymbol c) + c_i = p^*_i(\boldsymbol c)$. Similarly, we also have $m^*(\boldsymbol c + \delta \boldsymbol e_i) = 1/\beta + R^*(\boldsymbol c + \delta \boldsymbol e_i) \le 1/\beta + R^*(\boldsymbol c) = m^*(\boldsymbol c)$. Thus, for all $j \ne i$, we get $p^*_j(\boldsymbol c + \delta \boldsymbol e_i) = m^*(\boldsymbol c + \delta \boldsymbol e_i) + c_j \le m^*(\boldsymbol c) + c_j = p^*_j(\boldsymbol c)$, which completes the first part.

Considering the second part of the corollary, by Theorem 2.3.1 and the discussion at the beginning of the proof, we have $m^*(\boldsymbol c + \delta \boldsymbol e) = 1/\beta + R^*(\boldsymbol c + \delta \boldsymbol e) \ge 1/\beta + R^*(\boldsymbol c) - \delta = m^*(\boldsymbol c) - \delta$. Thus, we get $p^*_i(\boldsymbol c + \delta \boldsymbol e) = m^*(\boldsymbol c + \delta \boldsymbol e) + c_i + \delta \ge m^*(\boldsymbol c) + c_i = p^*_i(\boldsymbol c)$, which completes the second part.

A.3 Optimal Markup Under Separable Generating Functions

We consider the UNCONSTRAINED problem when the products are partitioned into disjoint subsets, the generating function is separable by the partitions, and the products in each partition share the same price sensitivity parameter. We partition the set of products $N$ into the subsets $N^1, \ldots, N^m$ such that $N = \cup_{k=1}^m N^k$ and $N^k \cap N^{k'} = \emptyset$ for $k \ne k'$. Similarly, we partition the vector $\boldsymbol Y = (Y_1, \ldots, Y_n)$ into the subvectors $\boldsymbol Y^1, \ldots, \boldsymbol Y^m$ such that each subvector $\boldsymbol Y^k$ is given by $\boldsymbol Y^k = (Y_i : i \in N^k)$. We assume that the generating function $G$ is separable by the partitions so that $G(\boldsymbol Y) = \sum_{k=1}^m G^k(\boldsymbol Y^k)$, where the functions $G^1, \ldots, G^m$ satisfy the four properties discussed at the beginning of Section 2.2. Furthermore, we assume that the products in each partition $N^k$ share the same price sensitivity $\beta^k$. In the next theorem, we show that the optimal prices for the products in each partition have a constant markup and we give a formula to compute the optimal markup. In this theorem, we let $\boldsymbol Y^k(\boldsymbol p) = (Y_i(p_i) : i \in N^k) = (e^{\alpha_i - \beta^k p_i} : i \in N^k)$, $\boldsymbol p^*$ be the optimal solution to the UNCONSTRAINED problem, and $\boldsymbol c = (c_1, \ldots, c_n)$ be the vector of unit product costs.
Theorem A.3.1 For all i2 N k , p i c i = 1 b k + R(p ), so that the products in N k have a constant markup. Furthermore, letting R be the unique value of R satisfying R= m å k=1 1 b k e (b k R+1) G k (Y k (c)); the optimal expected profit in the UNCONSTRAINED problem is R(p )= R , in which case, the optimal markup for the products in N k is p i c i = 1 b k + R(p )= 1 b k + R . Proof: For notational brevity, let R k (p) = å j2N k(p j c j )Y j (p j )¶G k j (Y k (p))=(1+ G(Y(p)), so that R(p) =å m k=1 R k (p). Fix product i. Let k be such that i2 N k . Noting that the definition of R k (p) is 100 similar to that of R(p), following precisely the same approach that we use right before Theorem 2.3.1, we can verify that ¶R k (p) ¶ p i =b k Y i (p i )¶G k i (Y k (p)) 1+ G(Y(p)) ( 1 b k (p i c i ) å j2N k (p j c j ) Y j (p j )¶G k ji (Y k (p)) ¶G k i (Y k (p)) + R k (p) ) : One the other hand, consider any ` such that i62 N ` . In the definition of R ` (p), which is given by å j2N `(p j c j )Y j (p j )¶G ` j (Y ` (p))=(1+ G(Y(p)), since i62 N ` , only the expression 1+ G(Y(p)) in the denominator depends on the price of product i. In this case, differentiating R ` (p) with respect to p i , it follows that ¶R ` (p) ¶ p i = å j2N ` (p j c j ) Y j (p j )¶G ` j (Y ` (p)) (1+ G(Y(p))) 2 ¶G k i (Y k (p))b k Y i (p i )=b k Y i (p i )¶G k i (Y k (p)) 1+ G(Y(p)) R ` (p): Since R(p)= R k (p)+å `6=k R ` (p), we have¶R(p)=¶ p i =¶R k (p)=¶ p i +å `6=k ¶R ` (p)=¶ p i . Therefore, the two equalities above yield ¶R(p) ¶ p i =b k Y i (p i )¶G k i (Y k (p)) 1+ G(Y(p)) ( 1 b k (p i c i ) å j2N k (p j c j ) Y j (p j )¶G k ji (Y k (p)) ¶G k i (Y k (p)) + R(p) ) : Once we have the expression above for the derivative of the expected profit function, we can follow pre- cisely the same argument in the proof of Theorem 2.3.1 to get p i c i = 1 b k + R(p ). The right side of the equation in the theorem is decreasing in R, whereas the left side is strictly in- creasing. 
Furthermore, the right side of the equation evaluated at $R = 0$ is non-negative, whereas the left side evaluated at $R = 0$ is zero. Thus, there exists a unique value of $R$ satisfying the equation in the theorem. We proceed to showing that the value of $R$ satisfying this equation corresponds to the optimal expected profit in the UNCONSTRAINED problem. The optimal price of each product $i \in N^k$ is of the form $p^*_i = c_i + \frac{1}{\beta^k} + R(\boldsymbol p^*)$, so that we have $Y_i(p^*_i) = e^{\alpha_i - \beta^k (c_i + \frac{1}{\beta^k} + R(\boldsymbol p^*))} = e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, Y_i(c_i)$. In this case, noting that $p^*_i - c_i = \frac{1}{\beta^k} + R(\boldsymbol p^*)$ for all $i \in N^k$ and plugging the optimal prices into the expected profit function in the UNCONSTRAINED problem, the optimal expected profit satisfies the equation

$$R(\boldsymbol p^*) = \frac{\sum_{k=1}^m \sum_{i \in N^k} \left(\frac{1}{\beta^k} + R(\boldsymbol p^*)\right) e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, Y_i(c_i)\, \partial G^k_i\!\left(e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, \boldsymbol Y^k(\boldsymbol c)\right)}{1 + \sum_{k=1}^m G^k\!\left(e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, \boldsymbol Y^k(\boldsymbol c)\right)}$$

$$= \frac{\sum_{k=1}^m \left(\frac{1}{\beta^k} + R(\boldsymbol p^*)\right) G^k\!\left(e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, \boldsymbol Y^k(\boldsymbol c)\right)}{1 + \sum_{k=1}^m G^k\!\left(e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, \boldsymbol Y^k(\boldsymbol c)\right)} = \frac{\sum_{k=1}^m \left(\frac{1}{\beta^k} + R(\boldsymbol p^*)\right) e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, G^k(\boldsymbol Y^k(\boldsymbol c))}{1 + \sum_{k=1}^m e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, G^k(\boldsymbol Y^k(\boldsymbol c))},$$

where the second equality uses Lemma 2.2.2 and the fact that $G^k$ is a generating function, and the third equality uses the fact that $G^1, \ldots, G^m$ satisfy the properties of a generating function so that these functions are homogeneous of degree one. Focusing on the first and last expressions in the chain of equalities above and rearranging the terms, we get $R(\boldsymbol p^*) = \sum_{k=1}^m \frac{1}{\beta^k}\, e^{-(\beta^k R(\boldsymbol p^*) + 1)}\, G^k(\boldsymbol Y^k(\boldsymbol c))$.

There are instances of existing GEV models with a separable generating function. Under the nested logit model, the generating function is $G(\boldsymbol Y) = \sum_{k=1}^m \big(\sum_{j \in N^k} Y_j^{1/\lambda_k}\big)^{\lambda_k}$, where $(\lambda_1, \ldots, \lambda_m)$ are constants, each in the interval $(0, 1]$. The products in each partition $N^k$ correspond to the products in a particular nest. Letting $G^k(\boldsymbol Y^k) = \big(\sum_{j \in N^k} Y_j^{1/\lambda_k}\big)^{\lambda_k}$, this generating function is a separable function of the form $G(\boldsymbol Y) = \sum_{k=1}^m G^k(\boldsymbol Y^k)$.
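As an illustration of how the fixed-point equation in Theorem A.3.1 could be used numerically, the following is a minimal sketch for the nested logit model. It is not the computational procedure of the dissertation: the bisection routine, the function names (`solve_optimal_profit`, `expected_profit`) and all parameter values are assumptions chosen for illustration. The sketch solves $R = \sum_k \frac{1}{\beta^k} e^{-(\beta^k R + 1)} G^k(\boldsymbol Y^k(\boldsymbol c))$ for $R$, sets each price to cost plus the markup $1/\beta^k + R$, and then evaluates the expected profit directly from the choice probabilities.

```python
import math

def nest_value(Y_nest, lam):
    """Nested logit generating function of one nest: (sum_j Y_j^(1/lam))^lam."""
    return sum(y ** (1.0 / lam) for y in Y_nest) ** lam

def solve_optimal_profit(nests, tol=1e-12):
    """Bisection on R - sum_k (1/beta_k) exp(-(beta_k R + 1)) G_k(Y_k(c)) = 0."""
    consts = []
    for nest in nests:
        # Attractiveness evaluated at unit costs: Y_i(c_i) = exp(alpha_i - beta_k c_i).
        Y_c = [math.exp(a - nest["beta"] * c)
               for a, c in zip(nest["alpha"], nest["cost"])]
        consts.append((nest["beta"], nest_value(Y_c, nest["lam"])))

    def gap(R):
        # Left side minus right side; strictly increasing in R and <= 0 at R = 0.
        return R - sum(math.exp(-(b * R + 1.0)) * g / b for b, g in consts)

    lo, hi = 0.0, 1.0
    while gap(hi) < 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_profit(nests, prices):
    """Expected profit sum_i (p_i - c_i) * Y_i dG_i / (1 + G) under nested logit."""
    total_G, weighted = 0.0, 0.0
    for nest, p in zip(nests, prices):
        lam = nest["lam"]
        Y = [math.exp(a - nest["beta"] * pi) for a, pi in zip(nest["alpha"], p)]
        S = sum(y ** (1.0 / lam) for y in Y)
        total_G += S ** lam
        for y, pi, c in zip(Y, p, nest["cost"]):
            # Y_i * dG/dY_i = S^(lam - 1) * Y_i^(1/lam) for a product in this nest.
            weighted += (pi - c) * S ** (lam - 1.0) * y ** (1.0 / lam)
    return weighted / (1.0 + total_G)

# Hypothetical two-nest instance (all numbers are illustrative assumptions).
nests = [
    {"alpha": [1.0, 0.5], "cost": [0.3, 0.2], "beta": 1.0, "lam": 0.7},
    {"alpha": [0.8, 0.2, 0.0], "cost": [0.5, 0.1, 0.4], "beta": 2.0, "lam": 0.5},
]
R_star = solve_optimal_profit(nests)
prices = [[c + 1.0 / nest["beta"] + R_star for c in nest["cost"]] for nest in nests]
profit_at_prices = expected_profit(nests, prices)
```

In this sketch, `profit_at_prices` should agree with `R_star` up to the bisection tolerance, which mirrors the statement that the optimal expected profit equals the solution of the fixed-point equation.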
Also, under the $d$-level nested logit model, the products are organized in a tree. The tree starts at a root node. The products are at the leaf nodes. The degree of overlap between the paths from the root node to the leaf nodes corresponding to different products captures the degree of substitution between the different products. Figure A.1 shows a sample tree. [89] show that the generating function under the $d$-level nested logit model is of the form $G(\boldsymbol Y) = \sum_{k=1}^m G^k(\boldsymbol Y^k)$, where $1, \ldots, m$ correspond to the children nodes of the root node, and the partition $N^k$ corresponds to the subset of products such that if $i \in N^k$, then the path from the root node to the leaf node corresponding to product $i$ passes through the $k$-th child node of the root node. Thus, if the products in each partition $N^k$ share the same price sensitivity, then the optimal prices for these products have a constant markup. To our knowledge, this result was not known earlier. Lastly, we note that if $G$ is a generating function, then for $\theta > 0$, $\theta G$ is a generating function as well. Therefore, Theorem A.3.1 applies when $G$ is a separable function of the form $G(\boldsymbol Y) = \sum_{k=1}^m \theta_k\, G^k(\boldsymbol Y^k)$ for positive scalars $\theta_1, \ldots, \theta_m$.

[Figure A.1: A sample tree for the $d$-level nested logit model with $n = 16$ and $m = 4$. The tree shows the root node, the children of the root node labeled $1, 2, \ldots, m$, and the products $1, 2, \ldots, n$ at the leaf nodes, grouped into the partitions $N^1, N^2, \ldots, N^m$.]

A.4 Proof of Lemma 2.4.2

Note that $Y_i(s_i) = e^{\alpha_i - \beta s_i}$, so $dY_i(s_i)/ds_i = -\beta\, Y_i(s_i)$. In this case, using the definition of $f$ and differentiating, for all $i \in N$, we obtain

$$\frac{\partial f(\boldsymbol s)}{\partial s_i} = -\frac{Y_i(s_i)\,\partial G_i(\boldsymbol Y(\boldsymbol s))}{1 + G(\boldsymbol Y(\boldsymbol s))} + q_i = q_i - \Theta_i(\boldsymbol Y(\boldsymbol s)),$$

where the second equality uses the definition of the selection probabilities in (2.1). The equality above establishes the expression for the gradient of $f$.
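The gradient expression $\partial f(\boldsymbol s)/\partial s_i = q_i - \Theta_i(\boldsymbol Y(\boldsymbol s))$ can be sanity-checked numerically. The sketch below does so with central finite differences under a nested logit generating function; the choice of nested logit, the helper names and all parameter values are assumptions made only for this illustration.

```python
import math

def G_and_grad(Y, nests, lams):
    """Nested logit G(Y) = sum_k (sum_{j in nest k} Y_j^(1/lam_k))^lam_k and dG/dY_i."""
    G = 0.0
    grad = [0.0] * len(Y)
    for nest, lam in zip(nests, lams):
        S = sum(Y[j] ** (1.0 / lam) for j in nest)
        G += S ** lam
        for i in nest:
            grad[i] = S ** (lam - 1.0) * Y[i] ** (1.0 / lam - 1.0)
    return G, grad

def attractiveness(s, alpha, beta):
    """Y_i(s_i) = exp(alpha_i - beta * s_i)."""
    return [math.exp(a - beta * si) for a, si in zip(alpha, s)]

def f(s, q, alpha, beta, nests, lams):
    """f(s) = (1/beta) log(1 + G(Y(s))) + sum_i q_i s_i."""
    G, _ = G_and_grad(attractiveness(s, alpha, beta), nests, lams)
    return math.log(1.0 + G) / beta + sum(qi * si for qi, si in zip(q, s))

def purchase_probs(s, alpha, beta, nests, lams):
    """Theta_i(Y(s)) = Y_i dG_i(Y) / (1 + G(Y))."""
    Y = attractiveness(s, alpha, beta)
    G, grad = G_and_grad(Y, nests, lams)
    return [Y[i] * grad[i] / (1.0 + G) for i in range(len(Y))]

# Hypothetical instance: four products in two nests.
alpha = [1.0, 0.4, -0.2, 0.7]
beta = 1.5
nests = [[0, 1], [2, 3]]
lams = [0.6, 0.9]
q = [0.15, 0.20, 0.10, 0.25]
s = [0.3, 0.8, 0.1, 0.5]

theta = purchase_probs(s, alpha, beta, nests, lams)
analytic = [qi - ti for qi, ti in zip(q, theta)]

h = 1e-6
numeric = []
for i in range(len(s)):
    up = s[:i] + [s[i] + h] + s[i + 1:]
    down = s[:i] + [s[i] - h] + s[i + 1:]
    numeric.append((f(up, q, alpha, beta, nests, lams)
                    - f(down, q, alpha, beta, nests, lams)) / (2.0 * h))

max_gradient_gap = max(abs(a - b) for a, b in zip(analytic, numeric))
```

A small value of `max_gradient_gap` is consistent with the formula; it is a numerical check under the stated assumptions, not a proof.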
Letting 1 l fg be the indicator function and differentiating the middle expression above once more, we get 1 b ¶ 2 f(s) ¶s i ¶s j = 1 l fi= jg Y i (s i )¶G i (Y(s)) 1+ G(Y(s)) + Y i (s i )¶G i j (Y(s))Y j (s j ) 1+ G(Y(s)) Y i (s i )¶G i (Y(s))¶G j (Y(s))Y j (s j ) (1+ G(Y(s))) 2 ; where we use the fact that the derivative of h(x)=g(x) with respect to x is given by the formula h 0 (x)=g(x) h(x)g 0 (x)=g(x) 2 . By (2.1), we have 1 l fi= jg Y i (s i )¶G i (Y(s)) 1+ G(Y(s)) = 1 l fi= jg Q i (Y(s)); Y i (s i )¶G i (Y(s))¶G j (Y(s))Y j (s j ) (1+ G(Y(s))) 2 =Q i (Y(s))Q j (Y(s)); and Y i (s i )¶G i j (Y(s))Y j (s j ) is the (i; j)-th entry of the matrix diag(Y(s))Ñ 2 G(Y(s))diag(Y(s)). Putting everything together, we have 1 b Ñ 2 f(s)= diag(Q(Y(s))) + diag(Y(s))Ñ 2 G(Y(s))diag(Y(s)) 1+ G(Y(s)) Q(Y(s))Q(Y(s)) > ; 103 which is the desired result. A.5 Product Prices to Achieve Given Market Shares For fixedq2R n + such that q i > 0 for all i2 N and å i2N q i < 1, let f :R n !R be defined by: for all s2R n , f(s)= 1 b log(1+ G(Y(s)))+å i2N q i s i . In the next lemma, we show that the value f(s) gets arbitrarily large as the norm ofs gets arbitrarily large. Lemma A.5.1 Given any L 0, there exists an M 0, such ifksk M, then f(s) L. Proof: Let D = min i2N fY i (0)¶G i (Y(0))g. Since Y i (0)> 0 for all i2 N, noting the assumption that ¶G i (Y)> 0 when Y i > 0 for all i2 N, we haveD> 0. Given any L 0, we choose M 0 such that it satisfies the equality L= min ( 1 b logD+ min i2N fq i g 1 å i2N q i ! M 2 p n ; min i2N fq i g M 2 p n ) : The right side of the equality above evaluated at M = 0 is non-positive. Furthermore, the right side is strictly increasing in M. Therefore, for any L 0, there exists M 0 that satisfies the equality above. We show that ifksk M, then f(s) L. Consider two cases. Case 1: Assume that there exists j2 N such that s j M p n , and we have s i min `2N fq ` g M 2 p n for all i2 N. 
In this case, since log(1+ G(Y(s))) 0, we obtain f(s) = 1 b log(1+ G(Y(s)))+ å i2N q i s i å i2Nnf jg q i s i + q j s j å i2Nnf jg q i ! min `2N fq ` g M 2 p n + q j M p n min `2N fq ` g M 2 p n + min `2N fq ` g M p n = min `2N fq ` g M 2 p n L; where the third inequality uses the fact thatå i2N q i < 1 and the last inequality follows from the definition of L. Thus, we have f(s) L, as desired. Case 2: This is the opposite of Case 1. Assume that we have s i < M p n for all i2 N, or there exists j2 N such that s j <min `2N fq ` g M 2 p n . In this case, we claim that there must exist j2 N such that s j min `2N fq ` g M 2 p n . To see the claim, assume on the contrary, we have s j >min `2N fq ` g M 2 p n for all j2 N. Thus, noting the requirements of Case 2, we must have s i < M p n for all i2 N, in which case, we 104 get M p n min `2N fq ` g M 2 p n < s j < M p n , sojs j j< M p n for all j, which contradicts the fact thatksk M. So, the claim holds and there exists j2 N such that s j min `2N fq ` g M 2 p n . In the rest of the proof, we let ` = argmin i2N fs i g and t = min i2N fs i g so that we have t min i2N fq i g M 2 p n , s i t for all i2 N, and s ` = t. Using Lemma 2.2.2, we have G(Y(ste)) = å i2N Y i (s i t)¶G i (Y(ste)) Y ` (s ` t)¶G ` (Y(ste)) Y ` (0)¶G ` (Y(0)) D: In the chain of inequalities above, the first inequality holds because, for any price vector p, we have Y i (p i ) 0 and¶G i (Y(p)) 0 for all i2 N. To see the second inequality, note that Y i (p i ) is decreasing in p i and s i t 0, so that Y i (s i t) Y i (0) for all i6=`. Furthermore, by the properties of a generating function discussed at the beginning of Section 2.2, since¶G `i (Y) 0 for all i6=`,¶G ` (Y) 0 is decreasing in Y i for all i6=`. Also noting that we have Y ` (s ` t)= Y ` (0), we get¶G ` (Y(ste))¶G ` (Y(0)) because Y i (s i t) Y i (0) for all i6=`. The last inequality uses the definition ofD. Thus, the chain of inequalities above yields G(Y(ste))D. 
In addition, observe that log(G(Y(s+ te)))= log(G(e b t Y(s)))= log(e b t G(Y(s)))=b t+ log(G(Y(s))), where the first equality follows from the fact that Y i (p i )= e a i b p i and the second equality follows from the fact that G is homogeneous of degree one. Thus, we have f(s) = 1 b log(1+ G(Y(s)))+ å i2N q i s i 1 b log(G(Y(ste+te)))+ å i2N q i s i = 1 b b t+ log(G(Y(ste))) + å i2N q i s i = 1 b log(G(Y(ste)))+ å i2N q i (s i t) 1 å i2N q i t 1 b logD 1 å i2N q i t 1 b logD+ min i2N fq i g 1 å i2N q i M 2 p n L; where the second equality is by the fact that log(G(Y(s+ te))) =b t+ log(G(Y(s))), the second inequality holds because G(Y(ste))D and s i t 0 for all i2 N, the third inequality follows since t min i2N fq i g M 2 p n , and the last inequality is by the definition of L. So, f(s) L. 105 A.6 Proof of Lemma 2.4.4 Let f(s)= 1 b log(1+ G(Y(s)))+å i2N q i s i denote the objective function of the optimization problem in Theorem 2.4.1. By Lemma 2.4.2, we haveÑ f(s)=qQ(Y(s)) for alls2R n . Sincep(q) is the unique minimizer of f , it follows that for all i2 N, 0= ¶ f(s) ¶s i s=p(q) = q i Q i (Y(p(q))): Taking the derivative of the above equation with respect to q j , we obtain that for all i; j2 N, 0= 1 l fi= jg å `2N ¶Q i (Y(s)) ¶s ` s=p(q) ¶ p ` (q) ¶q j : SinceÑ f(s)=qQ(Y(s)) by Lemma 2.4.2, we have thatÑ 2 f(s) = ¶Q i (Y(s)) ¶s ` : i;`2 N , and thus, the expressionå `2N ¶Q i (Y(s)) ¶s ` s=p(q) ¶ p ` (q) ¶q j is the inner product between the i-th row of the Hessian matrixÑ 2 f(p(q)) and the j-th column of the Jacobian matrix J(q). Letting I denote the n-by-n identity matrix, we can write the above system of equations in matrix notation as 0= I+Ñ 2 f(p(q))J(q): Note that, by definition,Q(Y(p(q)))=q, and thus, it follows from Lemma 2.4.2 that Ñ 2 f(p(q))=b diag(q)qq > + diag(Y(p(q)))Ñ 2 G(Y(p(q)))diag(Y(p(q))) 1+ G(Y(p(q))) =b(diag(q)qq > + B(q)); where the last equality follows from the definition of B(q). 
Since $\nabla^2 f(\cdot)$ is positive definite, and thus invertible, the desired result follows from the fact that $\boldsymbol 0 = I + \nabla^2 f(\boldsymbol p(\boldsymbol q))\, J(\boldsymbol q)$.

A.7 Numerical Study for Constrained Pricing

We provide a numerical study to understand how efficiently we can solve instances of the CONSTRAINED problem by using our results in Section 2.4. In our numerical study, we consider the network revenue management setting, where we have a set of resources with limited inventories and the sale of a product consumes a combination of resources. We index the set of resources by $M$. The sale of product $i$ consumes $a_{\ell i}$ units of resource $\ell$. There are $C_\ell$ units of resource $\ell$. The expected number of customer arrivals over the selling horizon is $T$. The goal is to find the prices to charge for the products to maximize the expected revenue obtained from each customer, while making sure that the expected consumption of each resource does not exceed its inventory. As discussed at the beginning of Section 2.4, we can formulate this problem as an instance of the CONSTRAINED problem, where the function $F_\ell$ is given by $F_\ell(\boldsymbol q) = \sum_{i \in N} a_{\ell i}\, T\, q_i - C_\ell$. However, the objective function of the CONSTRAINED problem is not necessarily concave in the prices $\boldsymbol p$ and its feasible space is not necessarily convex.

To obtain the optimal prices $\boldsymbol p^*$ in the CONSTRAINED problem, we solve the MARKET-SHARE-BASED formulation to obtain the optimal purchase probabilities $\boldsymbol q^*$. In Theorem 2.4.3, we show that the objective function of the MARKET-SHARE-BASED formulation is concave in the purchase probabilities, and we give an expression that can be used to compute the gradient of the objective function. With the function $F_\ell$ as given above, the constraints in the MARKET-SHARE-BASED formulation are linear in the purchase probabilities. Therefore, we can obtain the optimal purchase probabilities by solving the MARKET-SHARE-BASED formulation through standard convex optimization methods.
In our numerical study, we use the primal-dual algorithm for convex programs; see Chapter 11 in [16]. In Theorem 2.4.1, we show how to compute the prices that achieve given purchase probabilities. Once we obtain the optimal purchase probabilities $\boldsymbol q^*$ through the MARKET-SHARE-BASED formulation, we compute the optimal prices $\boldsymbol p^* = \boldsymbol p(\boldsymbol q^*)$ that achieve the optimal purchase probabilities. In our numerical study, we generate a large number of test problems with varying sizes. We begin by describing how we generate our test problems. Next, we give our numerical results.

Generation of Test Problems: We generate our test problems as follows. In all of our test problems, the choice process of the customers is governed by the paired combinatorial logit model, whose generating function $G$ is given by

$$G(\boldsymbol Y) = \sum_{(i,j) \in N^2 :\, i \ne j} \left( Y_i^{1/\tau_{(i,j)}} + Y_j^{1/\tau_{(i,j)}} \right)^{\tau_{(i,j)}}.$$

This generating function differs from the one corresponding to the paired combinatorial logit model in Example 2.2.1 by a constant factor, but if we multiply a generating function by a constant, then the generating function still satisfies the properties at the beginning of Section 2.2. To come up with the parameters of the paired combinatorial logit model, we sample $\tau_{(i,j)}$ from the uniform distribution over $[0.2, 1]$ for all $i, j \in N$ with $i \ne j$. Recalling that the deterministic utility component for product $i$ is $\alpha_i - \beta p_i$, to come up with $(\alpha_1, \ldots, \alpha_n)$ and $\beta$, we sample $\alpha_i$ from the uniform distribution over $[-2, 2]$ for all $i \in N$, and we sample $\beta$ from the uniform distribution over $[1, 4]$.

Table A.1: Numerical results for constrained pricing.

Param. Combination       CPU       Param. Combination       CPU
$(m, n, \zeta, \kappa)$  Secs.     $(m, n, \zeta, \kappa)$  Secs.
(20, 50, 0.02, 0.5)      1.33      (60, 100, 0.02, 0.5)     4.23
(20, 50, 0.02, 0.8)      1.04      (60, 100, 0.02, 0.8)     3.32
(20, 50, 0.2, 0.5)       1.95      (60, 100, 0.2, 0.5)      6.21
(20, 50, 0.2, 0.8)       1.32      (60, 100, 0.2, 0.8)      4.98
(20, 100, 0.02, 0.5)     1.83      (60, 200, 0.02, 0.5)     8.27
(20, 100, 0.02, 0.8)     1.54      (60, 200, 0.02, 0.8)     6.64
(20, 100, 0.2, 0.5)      3.5       (60, 200, 0.2, 0.5)      12.62
(20, 100, 0.2, 0.8)      2.11      (60, 200, 0.2, 0.8)      9.72
(40, 50, 0.02, 0.5)      1.88      (80, 100, 0.02, 0.5)     6.81
(40, 50, 0.02, 0.8)      0.98      (80, 100, 0.02, 0.8)     4.31
(40, 50, 0.2, 0.5)       2.75      (80, 100, 0.2, 0.5)      5.58
(40, 50, 0.2, 0.8)       3.14      (80, 100, 0.2, 0.8)      5.6
(40, 100, 0.02, 0.5)     3.41      (80, 200, 0.02, 0.5)     11.54
(40, 100, 0.02, 0.8)     2.72      (80, 200, 0.02, 0.8)     10.88
(40, 100, 0.2, 0.5)      6.24      (80, 200, 0.2, 0.5)      18.3
(40, 100, 0.2, 0.8)      4.42      (80, 200, 0.2, 0.8)      15.08

We index the set of resources by $M = \{1, \ldots, m\}$. For each product $i$, we randomly choose a resource $n_i$ and set $a_{n_i, i} = 1$. For each other resource $\ell \in M \setminus \{n_i\}$, we set $a_{\ell i} = 1$ with probability $\zeta$ and $a_{\ell i} = 0$ with probability $1 - \zeta$, where $\zeta$ is a parameter that we vary. In this way, the expected number of resources used by a product is given by $1 + (m - 1)\zeta$, and we vary $\zeta$ to control the expected number of resources used by a product. To come up with the capacities for the resources, we solve the unconstrained pricing problem $\max_{\boldsymbol p \in \mathbb{R}^n} \{ \sum_{i \in N} p_i\, \Theta_i(\boldsymbol Y(\boldsymbol p)) \}$. Using $\boldsymbol p^{\mathrm{UNC}}$ to denote an optimal solution to the unconstrained problem, we set the capacity of resource $\ell$ as $C_\ell = \kappa \sum_{i \in N} T\, a_{\ell i}\, \Theta_i(\boldsymbol Y(\boldsymbol p^{\mathrm{UNC}}))$, where $\kappa$ is another parameter that we vary to control the tightness of the resource capacities. Thus, the capacity of resource $\ell$ is a $\kappa$ fraction of the total expected capacity consumed when we charge the optimal prices $\boldsymbol p^{\mathrm{UNC}}$ in the unconstrained problem. We fix the expected number of customers at $T = 100$.
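The resource-consumption part of the test problem generation can be sketched as follows. This is only an illustration of the sampling scheme described above, not the code used in the study; the function names, the random seed and the placeholder purchase probabilities are assumptions.

```python
import random

def generate_consumption(m, n, zeta, rng):
    """Return an m-by-n 0/1 matrix a, where a[l][i] = 1 if product i uses resource l.

    Each product uses one uniformly chosen resource for sure and every other
    resource independently with probability zeta.
    """
    a = [[0] * n for _ in range(m)]
    for i in range(n):
        sure_resource = rng.randrange(m)
        for l in range(m):
            if l == sure_resource or rng.random() < zeta:
                a[l][i] = 1
    return a

def capacities(a, theta_unc, T, kappa):
    """C_l = kappa * sum_i T * a[l][i] * theta_unc[i].

    theta_unc stands for the purchase probabilities at the unconstrained optimal
    prices; computing them is outside the scope of this sketch.
    """
    return [kappa * T * sum(a_li * th for a_li, th in zip(row, theta_unc))
            for row in a]

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
m, n, zeta, kappa, T = 20, 50, 0.2, 0.8, 100
a = generate_consumption(m, n, zeta, rng)
resources_per_product = [sum(a[l][i] for l in range(m)) for i in range(n)]

# Placeholder probabilities (an assumption): equal shares summing to 0.5.
theta_placeholder = [0.5 / n] * n
C = capacities(a, theta_placeholder, T, kappa)
```

Averaged over many products, the entries of `resources_per_product` should be close to $1 + (m - 1)\zeta$, which is 4.8 for the values used here.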
In our test problems, we vary the number of resources $m$ and the number of products $n$ over $(m, n) \in \{(20, 50), (20, 100), (40, 50), (40, 100), (60, 100), (60, 200), (80, 100), (80, 200)\}$, yielding a total of eight combinations. We vary the parameters $\zeta$ and $\kappa$ over $\zeta \in \{0.02, 0.2\}$ and $\kappa \in \{0.5, 0.8\}$. This setup yields 32 parameter combinations for $(m, n, \zeta, \kappa)$. In each parameter combination, we generate 100 test problems by using the approach described above.

Numerical Results: In Table A.1, we show the average CPU seconds required to solve our test problems, where the average is taken over all 100 test problems in a parameter combination. The first column in the table shows the parameters of the test problem by using the tuple $(m, n, \zeta, \kappa)$. The second column shows the average CPU seconds. Our numerical results indicate that we can solve even the largest test problems with $n = 200$ products and $m = 80$ resources within 20 seconds on average. Over all of our test problems, the average CPU time is 5.44 seconds. Since the MARKET-SHARE-BASED formulation is a convex program, our approach always obtains the optimal purchase probabilities and the corresponding optimal prices. To give some context to our numerical results, we also tried computing the optimal prices by directly solving the CONSTRAINED problem through the fmincon routine in Matlab. Since the objective function of the CONSTRAINED problem may not be concave in the prices and the feasible space may not be convex, directly solving the CONSTRAINED problem may not yield the optimal prices. For economy of space, we provide only summary statistics. In 35% of our test problems, directly solving the CONSTRAINED problem yields optimality gaps of 1% or more. In 27% of our test problems, directly solving the CONSTRAINED problem yields optimality gaps of 5% or more. Over all test problems, the average CPU time to directly solve the CONSTRAINED problem by using the fmincon routine in Matlab is 38.28 seconds.
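The parameter grid and the overall average reported above can be reproduced from the entries of Table A.1. The short sketch below enumerates the grid and averages the 32 tabulated CPU times; the variable names are illustrative, and the times are copied from the table in the order of its rows.

```python
from itertools import product

size_pairs = [(20, 50), (20, 100), (40, 50), (40, 100),
              (60, 100), (60, 200), (80, 100), (80, 200)]
zetas = [0.02, 0.2]
kappas = [0.5, 0.8]

# One tuple (m, n, zeta, kappa) per parameter combination, in table order.
grid = [(m, n, z, k) for (m, n), z, k in product(size_pairs, zetas, kappas)]

# Average CPU seconds per combination, copied from Table A.1.
cpu_secs = [
    1.33, 1.04, 1.95, 1.32, 1.83, 1.54, 3.5, 2.11,
    1.88, 0.98, 2.75, 3.14, 3.41, 2.72, 6.24, 4.42,
    4.23, 3.32, 6.21, 4.98, 8.27, 6.64, 12.62, 9.72,
    6.81, 4.31, 5.58, 5.6, 11.54, 10.88, 18.3, 15.08,
]

table = dict(zip(grid, cpu_secs))
overall_average = sum(cpu_secs) / len(cpu_secs)
slowest = max(table, key=table.get)
```

Because every combination contains the same number of test problems, the unweighted mean of the tabulated values is the overall average; it agrees with the reported 5.44 seconds up to the rounding of the table entries.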
Thus, our approach, where we solve the MARKET-SHARE-BASED formulation to find the optimal purchase probabilities and compute the corresponding optimal prices, provides advantages in terms of both solution quality and CPU time.

A.8 Concavity of Expected Revenue Under Separable Generating Functions

We consider the CONSTRAINED problem under the same setup discussed in Appendix A.3. In particular, we partition the set of products $N$ into the subsets $N^1, \ldots, N^m$ such that $N = \cup_{k=1}^m N^k$ and $N^k \cap N^{k'} = \emptyset$ for $k \ne k'$. Similarly, we partition the vector $\boldsymbol Y = (Y_1, \ldots, Y_n)$ into the subvectors $\boldsymbol Y^1, \ldots, \boldsymbol Y^m$ such that each subvector $\boldsymbol Y^k$ is given by $\boldsymbol Y^k = (Y_i : i \in N^k)$. We assume that the generating function $G$ is separable by the partitions such that $G(\boldsymbol Y) = \sum_{k=1}^m G^k(\boldsymbol Y^k)$, where the functions $G^1, \ldots, G^m$ satisfy the four properties discussed at the beginning of Section 2.2. Also, we assume that the products in each partition $N^k$ share the same price sensitivity $\beta^k$.

Observe that we can carry out a change of variables to compute the prices that achieve given purchase probabilities. In particular, we let $\hat p_i = \beta^k p_i$ for all $i \in N^k$ and $k = 1, \ldots, m$. Furthermore, we let $\hat Y_i(\hat p_i) = e^{\alpha_i - \hat p_i}$ and define the vector $\hat{\boldsymbol Y}(\hat{\boldsymbol p}) = (\hat Y_1(\hat p_1), \ldots, \hat Y_n(\hat p_n))$. For given purchase probabilities $\boldsymbol q \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i \in N$ and $\sum_{i \in N} q_i < 1$, we consider the problem

$$\min_{\boldsymbol s \in \mathbb{R}^n} \left\{ \log\!\big(1 + G(\hat{\boldsymbol Y}(\boldsymbol s))\big) + \sum_{i \in N} q_i\, s_i \right\}.$$

The problem above is a specialized version of the problem given in Theorem 2.4.1, when the price sensitivities of all of the products are one. Thus, by Theorem 2.4.1, the objective function of the problem above is strictly convex and there exists a finite and unique solution to this problem. Letting $\hat{\boldsymbol p}(\boldsymbol q) = (\hat p_1(\boldsymbol q), \ldots, \hat p_n(\boldsymbol q))$ be the optimal solution to the problem above, by our change of variables, it follows that if we set $p_i(\boldsymbol q) = \frac{1}{\beta^k}\, \hat p_i(\boldsymbol q)$ for all $i \in N^k$ and $k = 1, \ldots, m$, then the prices $\boldsymbol p(\boldsymbol q)$ achieve the purchase probabilities $\boldsymbol q$.
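To make the change of variables concrete, the sketch below recovers prices from target purchase probabilities in the simplest separable case, which is an assumption of this example: every product forms its own partition, so $G(\boldsymbol Y) = \sum_{i \in N} Y_i$ (the multinomial logit model) with product-specific price sensitivities. Plain gradient descent with unit step size is used here only for illustration; it is not the solver used in the dissertation. In this special case the minimizer has the closed form $\hat p_i = \alpha_i - \log(q_i / q_0)$ with $q_0 = 1 - \sum_{i \in N} q_i$, which the sketch uses as a cross-check.

```python
import math

def recover_prices(q, alpha, beta, iters=5000):
    """Minimize log(1 + sum_i exp(alpha_i - s_i)) + sum_i q_i s_i, then rescale.

    The gradient of the objective is q_i - Theta_i(s). Its Hessian,
    diag(Theta) - Theta Theta^T, has eigenvalues at most 1, so a unit step
    size is safe for gradient descent.
    """
    s = [0.0] * len(q)
    for _ in range(iters):
        Y_hat = [math.exp(a - si) for a, si in zip(alpha, s)]
        denom = 1.0 + sum(Y_hat)
        s = [si - (qi - y / denom) for si, qi, y in zip(s, q, Y_hat)]
    # Undo the change of variables: p_i = s_i / beta_i.
    return [si / b for si, b in zip(s, beta)]

def purchase_probs(prices, alpha, beta):
    """Purchase probabilities under MNL with product-specific price sensitivities."""
    Y = [math.exp(a - b * p) for a, b, p in zip(alpha, beta, prices)]
    denom = 1.0 + sum(Y)
    return [y / denom for y in Y]

# Hypothetical instance (all numbers are illustrative assumptions).
alpha = [1.0, 0.5, -0.5]
beta = [1.0, 2.0, 0.5]
q_target = [0.2, 0.3, 0.1]

prices = recover_prices(q_target, alpha, beta)
q_achieved = purchase_probs(prices, alpha, beta)

q0 = 1.0 - sum(q_target)
closed_form = [(a - math.log(qi / q0)) / b
               for a, qi, b in zip(alpha, q_target, beta)]
```

Here `q_achieved` should match `q_target`, and `prices` should match `closed_form`, up to numerical tolerance.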
Note that our change of variables applies even when the price sensitivities of the products are completely arbitrary. In other words, even if the price sensitivities of the products are arbitrary, given purchase probabilities $\boldsymbol q \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i \in N$ and $\sum_{i \in N} q_i < 1$, we can compute the prices $\boldsymbol p(\boldsymbol q)$ that achieve these purchase probabilities by solving a convex optimization problem. However, if the price sensitivities are arbitrary, then the expected revenue may not be concave in the purchase probabilities.

Going back to the case where the generating function is separable by the partitions and the products in a partition have the same price sensitivity parameter, let $\boldsymbol\beta \in \mathbb{R}^n_+$ be the vector of price sensitivities, so that $\boldsymbol\beta = (\beta_1, \ldots, \beta_n)$, where $\beta_i = \beta^k$ for all $i \in N^k$ and $k = 1, \ldots, m$. In the following theorem, which is the main result of this section, we show that if the generating function is separable by the partitions and the products in each partition share the same price sensitivity parameter, then the expected revenue function is concave in the purchase probabilities. Also, we give an expression for the gradient of the expected revenue function.

Theorem A.8.1 For all $\boldsymbol q \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i$ and $\sum_{i \in N} q_i < 1$, the Hessian matrix $\nabla^2 R(\boldsymbol q)$ is negative definite and

$$\nabla R(\boldsymbol q) = \boldsymbol p(\boldsymbol q) - \mathrm{diag}(\boldsymbol\beta)^{-1} \boldsymbol e - \frac{\boldsymbol e^\top \mathrm{diag}(\boldsymbol\beta)^{-1} \boldsymbol q}{1 - \boldsymbol e^\top \boldsymbol q}\, \boldsymbol e.$$

The proof of the theorem above uses a sequence of three lemmas. We begin by defining a matrix that will ultimately be useful to characterize the Jacobian matrix of the vector-valued mapping $\boldsymbol q \mapsto \boldsymbol p(\boldsymbol q)$. Recalling that $\hat Y_i(\hat p_i) = e^{\alpha_i - \hat p_i}$, and $\hat{\boldsymbol p}(\boldsymbol q)$ is the optimal solution to the optimization problem given at the beginning of this section, we define the matrix

$$\hat B(\boldsymbol q) = \frac{\mathrm{diag}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))\, \nabla^2 G(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))\, \mathrm{diag}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))}{1 + G(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))}.$$

The matrix $\hat B(\boldsymbol q)$ is a specialized version of the matrix $B(\boldsymbol q)$ defined right before Lemma 2.4.4, when the price sensitivities of all of the products are one.
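So, by the discussion in the proof of Theorem 2.4.3, $\hat B(\boldsymbol q)$ is positive semidefinite. Since $\nabla^2 G(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))$ is symmetric, $\hat B(\boldsymbol q)$ is symmetric as well. In the following lemma, we give other useful properties of $\hat B(\boldsymbol q)$.

Lemma A.8.2 We have $\hat B(\boldsymbol q)\, \mathrm{diag}(\boldsymbol\beta)^{-1} \boldsymbol e = \boldsymbol 0$, $\hat B(\boldsymbol q)\, \mathrm{diag}(\boldsymbol\beta)\, \boldsymbol e = \boldsymbol 0$, and $\hat B(\boldsymbol q)\, \boldsymbol e = \boldsymbol 0$.

Proof: Note that the $(i, j)$-th entry of $\hat B(\boldsymbol q)$ is $\hat Y_i(\hat p_i(\boldsymbol q))\, \partial G_{ij}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))\, \hat Y_j(\hat p_j(\boldsymbol q)) / (1 + G(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q))))$. Furthermore, since the generating function $G$ is a separable function of the form $G(\boldsymbol Y) = \sum_{k=1}^m G^k(\boldsymbol Y^k)$, letting $\hat{\boldsymbol Y}^k(\hat{\boldsymbol p}(\boldsymbol q)) = (\hat Y_i(\hat p_i(\boldsymbol q)) : i \in N^k)$, for $i \in N^k$, we have $\partial G_{ij}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q))) = \partial G^k_{ij}(\hat{\boldsymbol Y}^k(\hat{\boldsymbol p}(\boldsymbol q)))$ when $j \in N^k$, whereas $\partial G_{ij}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q))) = 0$ when $j \notin N^k$. In this case, noting that we use $\beta_i$ to denote the price sensitivity of product $i$, for $i \in N^k$, the $i$-th entry of the vector $\hat B(\boldsymbol q)\, \mathrm{diag}(\boldsymbol\beta)^{-1} \boldsymbol e$ is given by

$$\frac{\sum_{j \in N} \hat Y_i(\hat p_i(\boldsymbol q))\, \partial G_{ij}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))\, \hat Y_j(\hat p_j(\boldsymbol q))\, \frac{1}{\beta_j}}{1 + G(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))} = \frac{\hat Y_i(\hat p_i(\boldsymbol q)) \sum_{j \in N^k} \partial G^k_{ij}(\hat{\boldsymbol Y}^k(\hat{\boldsymbol p}(\boldsymbol q)))\, \hat Y_j(\hat p_j(\boldsymbol q))\, \frac{1}{\beta^k}}{1 + G(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q)))} = 0.$$

In the equalities above, the first equality is by the fact that $\partial G_{ij}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q))) = \partial G^k_{ij}(\hat{\boldsymbol Y}^k(\hat{\boldsymbol p}(\boldsymbol q)))$ when $j \in N^k$, whereas $\partial G_{ij}(\hat{\boldsymbol Y}(\hat{\boldsymbol p}(\boldsymbol q))) = 0$ when $j \notin N^k$, along with the fact that $\beta_j = \beta^k$ for all $j \in N^k$. The second equality follows from Lemma 2.2.2 and the fact that $G^k$ is a generating function. Since our choice of the entry $i$ is arbitrary, it follows that $\hat B(\boldsymbol q)\, \mathrm{diag}(\boldsymbol\beta)^{-1} \boldsymbol e = \boldsymbol 0$, showing the first equality in the lemma. The other two equalities in the lemma follow by using the same approach after replacing $1/\beta_j$, respectively, with $\beta_j$ and $1$ in the chain of equalities above.

In the following lemma, we use the matrix $\hat B(\boldsymbol q)$ to give an expression for the Jacobian matrix $J(\boldsymbol q)$ associated with the vector-valued mapping $\boldsymbol q \mapsto \boldsymbol p(\boldsymbol q)$, which maps any purchase probabilities $\boldsymbol q$ to the prices $\boldsymbol p(\boldsymbol q)$ that achieve these purchase probabilities.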
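Lemma A.8.3 The Jacobian matrix $J(\mathbf{q}) = \left( \frac{\partial p_i(\mathbf{q})}{\partial q_j} : i, j \in N \right)$ is given by
\[
J(\mathbf{q}) = -\mathrm{diag}(\boldsymbol{\beta})^{-1} \left( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}) \right)^{-1}.
\]

Proof: The optimization problem given at the beginning of this section is a specialized version of the one in Theorem 2.4.1, when the price sensitivities of all of the products are one. Noting that the unique optimal solution to this problem is denoted by $\hat{\mathbf{p}}(\mathbf{q})$, Lemma 2.4.4 immediately implies that the Jacobian matrix $\hat{J}(\mathbf{q}) = \left( \frac{\partial \hat{p}_i(\mathbf{q})}{\partial q_j} : i, j \in N \right)$ associated with the vector-valued mapping $\mathbf{q} \mapsto \hat{\mathbf{p}}(\mathbf{q})$ is given by $\hat{J}(\mathbf{q}) = -(\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}))^{-1}$. By the discussion at the beginning of this section, we have $\mathbf{p}(\mathbf{q}) = \mathrm{diag}(\boldsymbol{\beta})^{-1} \hat{\mathbf{p}}(\mathbf{q})$, in which case, by the chain rule, we obtain $J(\mathbf{q}) = \mathrm{diag}(\boldsymbol{\beta})^{-1} \hat{J}(\mathbf{q}) = -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}))^{-1}$, which is the desired result.

In the following lemma, we consider a matrix that will appear in the computation of the Hessian matrix of the expected revenue function. We define
\[
\Delta(\mathbf{q}) = \frac{1}{1 - \mathbf{e}^\top \mathbf{q}} \left( \mathbf{e}\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} + \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e}\mathbf{e}^\top \right) + \frac{\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q}}{(1 - \mathbf{e}^\top \mathbf{q})^2} \, \mathbf{e}\mathbf{e}^\top.
\]
One can verify that $\Delta(\mathbf{q})$ is not necessarily positive semidefinite, but as we show next, if we add the matrix $\mathrm{diag}(\boldsymbol{\beta})^{-1} \mathrm{diag}(\mathbf{q})^{-1}$ to $\Delta(\mathbf{q})$, then we obtain a positive definite matrix.

Lemma A.8.4 For all $\mathbf{q} \in \mathbb{R}^n_+$ such that $q_i > 0$ for all $i$ and $\sum_{i \in N} q_i < 1$, the matrix $C(\mathbf{q}) = \Delta(\mathbf{q}) + \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathrm{diag}(\mathbf{q})^{-1}$ is positive definite.

Proof: Define $g(\mathbf{q}) = \sum_{k \in M} \frac{1}{\beta^k} \sum_{i \in N^k} q_i \left( \log q_i - \log\left(1 - \sum_{j \in N} q_j\right) \right)$. By direct differentiation, it can be verified that $\nabla^2 g(\mathbf{q}) = C(\mathbf{q})$. So, it is enough to show that $g$ is strictly convex. Let $H : \mathbb{R}^{2n}_+ \to \mathbb{R}$ be defined by: for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n_+$, $H(\mathbf{x}, \mathbf{y}) = \sum_{k \in M} \frac{1}{\beta^k} \sum_{i \in N^k} x_i (\log x_i - \log y_i)$. The relative entropy function $f(x, y) = x (\log x - \log y)$ is strictly convex in $(x, y) \in \mathbb{R}^2_+$; see Section 3.1.5 in [16]. Therefore, $H(\mathbf{x}, \mathbf{y})$ is strictly convex in $(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^{2n}_+$. Let the vector-valued mapping $\mathbf{h} : \mathbb{R}^n_+ \to \mathbb{R}^{2n}_+$ be defined by: for all $\mathbf{q} \in \mathbb{R}^n_+$, $\mathbf{h}(\mathbf{q}) = (h_1(\mathbf{q}), \ldots, h_{2n}(\mathbf{q}))$ with $h_i(\mathbf{q}) = q_i$ for all $i = 1, \ldots, n$, and $h_i(\mathbf{q}) = 1 - \sum_{j \in N} q_j$ for all $i = n+1, \ldots, 2n$.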
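Then, the definitions of $g$, $H$ and $\mathbf{h}$ imply that $g(\mathbf{q}) = H(\mathbf{h}(\mathbf{q}))$. Thus, $g$ is the composition of a strictly convex function with an affine mapping, which only implies that $g$ is convex. To show that $g$ is strictly convex, we let $J_h(\mathbf{q})$ be the Jacobian matrix of the vector-valued mapping $\mathbf{h}$. Note that $J_h(\mathbf{q})$ is a $2n$-by-$n$ matrix. For $i = 1, \ldots, n$, the $(i,i)$-th entry of $J_h(\mathbf{q})$ is $1$. For $i = n+1, \ldots, 2n$ and $j = 1, \ldots, n$, the $(i,j)$-th entry of $J_h(\mathbf{q})$ is $-1$. The other entries are zero. Therefore, the Jacobian matrix $J_h(\mathbf{q})$ includes the $n$-by-$n$ identity matrix as a submatrix. Noting that $g(\mathbf{q}) = H(\mathbf{h}(\mathbf{q}))$ and $\mathbf{h}$ is affine, by the chain rule, we have $\nabla^2 g(\mathbf{q}) = J_h(\mathbf{q})^\top \nabla^2 H(\mathbf{h}(\mathbf{q})) J_h(\mathbf{q})$. Since $J_h(\mathbf{q})$ includes the $n$-by-$n$ identity matrix as a submatrix, for any $\mathbf{x} \in \mathbb{R}^n$ with $\mathbf{x} \neq \mathbf{0}$, we have $J_h(\mathbf{q}) \mathbf{x} \neq \mathbf{0}$. In this case, for any $\mathbf{x} \in \mathbb{R}^n$ with $\mathbf{x} \neq \mathbf{0}$, we obtain $\mathbf{x}^\top \nabla^2 g(\mathbf{q}) \mathbf{x} = (J_h(\mathbf{q}) \mathbf{x})^\top \nabla^2 H(\mathbf{h}(\mathbf{q})) (J_h(\mathbf{q}) \mathbf{x}) > 0$, where the inequality follows from the fact that $H$ is strictly convex and $J_h(\mathbf{q}) \mathbf{x} \neq \mathbf{0}$.

Here is the proof of Theorem A.8.1.

Proof of Theorem A.8.1: First, we show the expression for $\nabla R(\mathbf{q})$. In the proof of Theorem 2.4.3, we use the Sherman-Morrison formula to obtain $(\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + B(\mathbf{q}))^{-1} = (\mathrm{diag}(\mathbf{q}) + B(\mathbf{q}))^{-1} + \mathbf{e}\mathbf{e}^\top / (1 - \mathbf{e}^\top \mathbf{q})$. Using the same argument, it follows that we have $(\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}))^{-1} = (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} + \mathbf{e}\mathbf{e}^\top / (1 - \mathbf{e}^\top \mathbf{q})$. Furthermore, since $\hat{B}(\mathbf{q})$ is positive semidefinite, the inverse of $\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q})$ exists, in which case, we have $(\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q})) \, \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e} = \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e}$. Noting that $\hat{B}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e} = \mathbf{0}$ by Lemma A.8.2, the last equality implies that $(\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e} = \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e}$.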
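Lemma A.8.4 lends itself to a quick numerical sanity check. The sketch below is not part of the original argument: it builds $C(\mathbf{q})$ entry by entry from the definition of $\Delta(\mathbf{q})$, confirms positive definiteness with a hand-written Cholesky factorization, and compares $C(\mathbf{q})$ with a finite-difference Hessian of $g$. The function names, the step size `h`, and the sample values of $\boldsymbol{\beta}$ and $\mathbf{q}$ are illustrative assumptions.

```python
import math

def build_C(beta, q):
    """C(q) = Delta(q) + diag(beta)^{-1} diag(q)^{-1}, built entry by entry."""
    n = len(q)
    slack = 1.0 - sum(q)                           # 1 - e'q, assumed positive
    w = sum(qi / bi for qi, bi in zip(q, beta))    # e' diag(beta)^{-1} q
    C = [[(1.0 / beta[i] + 1.0 / beta[j]) / slack + w / slack ** 2
          for j in range(n)] for i in range(n)]
    for i in range(n):
        C[i][i] += 1.0 / (beta[i] * q[i])
    return C

def cholesky(A):
    """Plain Cholesky factorization; raises ValueError if A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def g(beta, q):
    """g(q) = sum_i (1/beta_i) q_i (log q_i - log(1 - sum_j q_j))."""
    slack = 1.0 - sum(q)
    return sum(qi * (math.log(qi) - math.log(slack)) / bi
               for qi, bi in zip(q, beta))

def numerical_hessian(f, q, h=1e-5):
    """Central finite-difference Hessian of f at q (illustrative step size)."""
    n = len(q)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                x = list(q)
                x[i] += di * h
                x[j] += dj * h
                return f(x)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

# Two partitions with sensitivities 1.0 and 2.5 (hypothetical values).
beta = [1.0, 1.0, 2.5, 2.5]
q = [0.2, 0.1, 0.3, 0.15]
C = build_C(beta, q)
cholesky(C)                                        # succeeds only if C is PD
H = numerical_hessian(lambda x: g(beta, x), q)
gap = max(abs(C[i][j] - H[i][j]) for i in range(4) for j in range(4))
```

A check like this only probes one point $\mathbf{q}$; it illustrates the statement and is no substitute for the proof.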
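Since $R(\mathbf{q}) = \sum_{i \in N} p_i(\mathbf{q}) \, q_i$, we get
\[
\begin{aligned}
\nabla R(\mathbf{q}) &= \mathbf{p}(\mathbf{q}) + J(\mathbf{q})^\top \mathbf{q} \\
&= \mathbf{p}(\mathbf{q}) - \left( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}) \right)^{-1} \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q} \\
&= \mathbf{p}(\mathbf{q}) - \left( (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} + \frac{1}{1 - \mathbf{e}^\top \mathbf{q}} \, \mathbf{e}\mathbf{e}^\top \right) \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q} \\
&= \mathbf{p}(\mathbf{q}) - \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e} - \frac{\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q}}{1 - \mathbf{e}^\top \mathbf{q}} \, \mathbf{e},
\end{aligned}
\]
where the second equality follows from Lemma A.8.3 and the last equality uses the fact that $(\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q} = (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e} = \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e}$. Thus, we have the desired expression for $\nabla R(\mathbf{q})$.

Second, we show that the inverse of $\nabla^2 R(\mathbf{q})$ is negative definite, which implies that $\nabla^2 R(\mathbf{q})$ is negative definite as well. Differentiating the last expression above for the gradient of the expected revenue function, we have
\[
\begin{aligned}
\nabla^2 R(\mathbf{q}) &= J(\mathbf{q}) - \frac{1}{1 - \mathbf{e}^\top \mathbf{q}} \, \mathbf{e}\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} - \frac{\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q}}{(1 - \mathbf{e}^\top \mathbf{q})^2} \, \mathbf{e}\mathbf{e}^\top \\
&= -\mathrm{diag}(\boldsymbol{\beta})^{-1} \left( \mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}) \right)^{-1} - \frac{1}{1 - \mathbf{e}^\top \mathbf{q}} \, \mathbf{e}\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} - \frac{\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q}}{(1 - \mathbf{e}^\top \mathbf{q})^2} \, \mathbf{e}\mathbf{e}^\top \\
&= -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} - \frac{1}{1 - \mathbf{e}^\top \mathbf{q}} \left[ \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{e}\mathbf{e}^\top + \mathbf{e}\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} \right] - \frac{\mathbf{e}^\top \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathbf{q}}{(1 - \mathbf{e}^\top \mathbf{q})^2} \, \mathbf{e}\mathbf{e}^\top \\
&= -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} - \Delta(\mathbf{q}) \\
&= -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \left[ I + (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q})) \, \mathrm{diag}(\boldsymbol{\beta}) \, \Delta(\mathbf{q}) \right],
\end{aligned}
\]
where the third equality follows from the fact that $(\mathrm{diag}(\mathbf{q}) - \mathbf{q}\mathbf{q}^\top + \hat{B}(\mathbf{q}))^{-1} = (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} + \mathbf{e}\mathbf{e}^\top / (1 - \mathbf{e}^\top \mathbf{q})$. Noting the definition of $\Delta(\mathbf{q})$, multiplying both sides of the equality in this definition by $\hat{B}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta})$ from the left, Lemma A.8.2 implies that $\hat{B}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) \, \Delta(\mathbf{q}) = 0$. In this case, considering the term $I + (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q})) \, \mathrm{diag}(\boldsymbol{\beta}) \, \Delta(\mathbf{q})$ on the right side of the chain of equalities above, we have $I + (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q})) \, \mathrm{diag}(\boldsymbol{\beta}) \, \Delta(\mathbf{q}) = I + \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) \, \Delta(\mathbf{q})$, in which case, the last chain of equalities above implies that $\nabla^2 R(\mathbf{q})$ is given by
\[
\begin{aligned}
\nabla^2 R(\mathbf{q}) &= -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \left( I + \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) \, \Delta(\mathbf{q}) \right) \\
&= -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) \left[ \mathrm{diag}(\boldsymbol{\beta})^{-1} \mathrm{diag}(\mathbf{q})^{-1} + \Delta(\mathbf{q}) \right].
\end{aligned}
\]
For notational brevity, we define the diagonal matrix $\Lambda = \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta})$, in which case, we write the Hessian of $R(\mathbf{q})$ as
\[
\nabla^2 R(\mathbf{q}) = -\mathrm{diag}(\boldsymbol{\beta})^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q}))^{-1} \Lambda \left( \Lambda^{-1} + \Delta(\mathbf{q}) \right).
\]
Noting Lemma A.8.4, the inverse of $\Lambda^{-1} + \Delta(\mathbf{q})$ exists.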
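By the Sherman-Morrison formula, it is given by
\[
(\Lambda^{-1} + \Delta(\mathbf{q}))^{-1} = \Lambda - \Lambda (I + \Delta(\mathbf{q}) \Lambda)^{-1} \Delta(\mathbf{q}) \Lambda.
\]
By Lemma A.8.2, noting the definition of $\Delta(\mathbf{q})$, we have $\hat{B}(\mathbf{q}) \Delta(\mathbf{q}) = 0$. Since $\hat{B}(\mathbf{q})$ and $\Delta(\mathbf{q})$ are both symmetric, taking the transpose, it follows that $\Delta(\mathbf{q}) \hat{B}(\mathbf{q}) = 0$. Using the last expression for the Hessian matrix of $R(\mathbf{q})$ and taking the inverse, we obtain
\[
\begin{aligned}
\nabla^2 R(\mathbf{q})^{-1} &= -(\Lambda^{-1} + \Delta(\mathbf{q}))^{-1} \Lambda^{-1} (\mathrm{diag}(\mathbf{q}) + \hat{B}(\mathbf{q})) \, \mathrm{diag}(\boldsymbol{\beta}) \\
&= -(\Lambda^{-1} + \Delta(\mathbf{q}))^{-1} \Lambda^{-1} \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) - (\Lambda^{-1} + \Delta(\mathbf{q}))^{-1} \Lambda^{-1} \hat{B}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) \\
&= -(\Lambda^{-1} + \Delta(\mathbf{q}))^{-1} - \left[ \Lambda - \Lambda (I + \Delta(\mathbf{q}) \Lambda)^{-1} \Delta(\mathbf{q}) \Lambda \right] \Lambda^{-1} \hat{B}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}) \\
&= -(\Lambda^{-1} + \Delta(\mathbf{q}))^{-1} - \hat{B}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta}),
\end{aligned}
\]
where the third equality uses the fact that $\Lambda = \mathrm{diag}(\mathbf{q}) \, \mathrm{diag}(\boldsymbol{\beta})$, whereas the last equality holds since $\Delta(\mathbf{q}) \hat{B}(\mathbf{q}) = 0$. By Lemma A.8.4, $(\Lambda^{-1} + \Delta(\mathbf{q}))^{-1}$ is positive definite. Also, recall that $\hat{B}(\mathbf{q})$ is positive semidefinite. Therefore, the inverse of $\nabla^2 R(\mathbf{q})$ is negative definite.

Appendix B

Supporting Arguments for Chapter 3

B.1 Computational Complexity

In this section, we give a proof for Theorem 3.2.1 by using a polynomial-time reduction from the max-cut problem. In the max-cut problem, we have an undirected graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges. We denote the edge between vertices $i$ and $j$ as $(i, j)$. The goal is to find a subset of vertices $S$ such that the number of edges in the set $\{(i, j) \in E : i \in S, \ j \in V \setminus S\}$ is maximized. In other words, the goal is to find a subset of vertices $S$ such that the number of edges that connect a vertex in $S$ and a vertex in $V \setminus S$ is maximized. We refer to a subset of vertices $S$ as a cut and $|\{(i, j) \in E : i \in S, \ j \in V \setminus S\}|$ as the objective value provided by the cut $S$. The max-cut problem is strongly NP-hard; see Section A.2.2 in [54]. Here, we focus on graphs where the degrees of all vertices are even. We show that the max-cut problem over graphs with even vertex degrees continues to be strongly NP-hard. For any $\delta > 0$, [39] show that the max-cut problem over the graphs $G = (V, E)$ with $|E| = \Omega(|V|^{2 - \delta})$ is hard to approximate within a constant factor.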
Therefore, defining the class of graphs $\mathcal{G}_1(\beta) = \{G = (V, E) : |E| \geq \beta |V|^{1.5}\}$, there exist constants $\alpha \in (0, 1)$ and $\beta > 0$ such that it is hard to approximate the max-cut problem over the graphs in $\mathcal{G}_1(\beta)$ within a factor of $1 - \alpha$. Defining the class of graphs $\mathcal{G}_2(\beta) = \{G = (V, E) : |E| \geq \beta |V|^{1.5} \text{ and all vertices in } G \text{ have even degrees}\}$, in the next lemma, we show that the same result holds over the graphs in $\mathcal{G}_2(\beta)$.

Lemma B.1.1 There exist constants $\alpha \in (0, 1)$ and $\beta > 0$ such that it is hard to approximate the max-cut problem over the graphs in $\mathcal{G}_2(\beta)$ within a factor of $1 - \alpha$.

Proof: Let $\alpha_1 \in (0, 1)$ and $\beta_1 > 0$ be such that it is hard to approximate the max-cut problem over the graphs in $\mathcal{G}_1(\beta_1)$ within a factor of $1 - \alpha_1$. Note that the existence of $\alpha_1$ and $\beta_1$ is guaranteed by the discussion right before the lemma. Fix any $\epsilon < 2\alpha_1 / 3$. Set $\alpha = \alpha_1 - 3\epsilon/2$ and $\beta = \beta_1 / 2^{1.5}$. Assume that there exists a $(1 - \alpha)$-approximation algorithm for the max-cut problem over the graphs in $\mathcal{G}_2(\beta)$. We will show that the existence of this approximation algorithm directly implies the existence of a $(1 - \alpha_1)$-approximation algorithm for the max-cut problem over the graphs in $\mathcal{G}_1(\beta_1)$, which is a contradiction.

Choose any graph $G_1 = (V_1, E_1) \in \mathcal{G}_1(\beta_1)$ and let $n = |V_1|$. If we have $n \leq (2/(\beta_1 \epsilon))^2$, then we can enumerate all of the cuts in $G_1$ to solve the max-cut problem over $G_1$ in polynomial time, since the number of vertices is bounded by a constant. Also, if all of the vertices in $G_1$ have even degrees, noting that $G_1 \in \mathcal{G}_1(\beta_1)$ and $\beta_1 \geq \beta$, we have $|E_1| \geq \beta_1 |V_1|^{1.5} \geq \beta |V_1|^{1.5}$, which implies that $G_1 \in \mathcal{G}_2(\beta)$. Therefore, we can use the $(1 - \alpha)$-approximation algorithm for the max-cut problem over the graphs in $\mathcal{G}_2(\beta)$ to find a $(1 - \alpha)$-approximate cut in $G_1$. Since $\alpha_1 \geq \alpha$, a $(1 - \alpha)$-approximate cut is also a $(1 - \alpha_1)$-approximate cut. Therefore, we can obtain a $(1 - \alpha_1)$-approximate cut in $G_1$ in polynomial time, which is a contradiction.
In the rest of the proof, we assume that $n > (2/(\beta_1 \epsilon))^2$ and some of the vertices in $G_1$ have odd degrees. Let $k$ be the number of vertices in $G_1$ with odd degrees. Note that $k$ must be an even number, otherwise the sum of the degrees of all vertices in $G_1$ would be an odd number, but we know that this sum is equal to twice the number of edges. We add $k$ auxiliary vertices to $G_1$. Using $k$ additional edges, we connect each one of these auxiliary vertices to one of the $k$ vertices with odd degrees. Since $k$ is an even number, we also use $k/2$ additional edges to form a perfect matching between the auxiliary vertices. We denote this new graph by $\bar{G} = (\bar{V}, \bar{E})$. By our construction, all of the vertices in $\bar{G}$ have even degrees. Furthermore, we have
\[
|\bar{E}| \geq |E_1| \geq \beta_1 |V_1|^{1.5} = \frac{\beta_1}{2^{1.5}} (2 |V_1|)^{1.5} \geq \frac{\beta_1}{2^{1.5}} (|V_1| + k)^{1.5} = \frac{\beta_1}{2^{1.5}} |\bar{V}|^{1.5} = \beta |\bar{V}|^{1.5},
\]
which implies that $\bar{G} \in \mathcal{G}_2(\beta)$.

We use $\mathrm{OPT}_1$ and $\mathrm{OPT}$ to, respectively, denote the optimal objective values of the max-cut problems in $G_1$ and $\bar{G}$. We have $\mathrm{OPT} \geq \mathrm{OPT}_1$, because $\bar{V} \supseteq V_1$ and $\bar{E} \supseteq E_1$. Since $\bar{G} \in \mathcal{G}_2(\beta)$, we use the $(1 - \alpha)$-approximation algorithm for the max-cut problem over the graphs in $\mathcal{G}_2(\beta)$ to find a $(1 - \alpha)$-approximate cut in $\bar{G}$. That is, letting $\mathrm{CUT}$ be the objective value provided by the cut, we have $\mathrm{CUT} \geq (1 - \alpha) \, \mathrm{OPT}$. By removing the auxiliary vertices in the $(1 - \alpha)$-approximate cut in $\bar{G}$, we obtain the corresponding cut in $G_1$. We use $\mathrm{CUT}_1$ to denote the objective value provided by the cut in $G_1$. Since the graphs $G_1$ and $\bar{G}$ differ in $3k/2$ edges, the objective values of the two cuts cannot differ by more than $3k/2$, so $\mathrm{CUT}_1 \geq \mathrm{CUT} - 3k/2$. Therefore, we obtain
\[
\mathrm{CUT}_1 \geq \mathrm{CUT} - \tfrac{3}{2} k \geq (1 - \alpha) \, \mathrm{OPT} - \tfrac{3}{2} k \geq (1 - \alpha) \, \mathrm{OPT}_1 - \tfrac{3}{2} k \geq (1 - \alpha) \, \mathrm{OPT}_1 - \tfrac{3}{2} |V_1|.
\]
It is well-known that the optimal objective value of the max-cut problem over any graph is at least half of the number of edges; see Section 12.4 in [82].
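The parity-fixing construction used in this proof is easy to make concrete. The following is a minimal sketch, not code from the dissertation: it adds one auxiliary vertex per odd-degree vertex and pairs the auxiliary vertices in a perfect matching, so exactly $3k/2$ edges are added. The graph representation (a vertex list plus a list of edge tuples) and the `("aux", s)` labels are assumptions made for illustration.

```python
from collections import defaultdict

def degrees(vertices, edges):
    """Degree of every vertex in an undirected simple graph."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {v: deg[v] for v in vertices}

def make_even(vertices, edges):
    """Return (vertices, edges) of a supergraph in which all degrees are even.

    Each odd-degree vertex gets its own auxiliary neighbor, and the auxiliary
    vertices are paired up by a perfect matching, as in the proof.
    """
    deg = degrees(vertices, edges)
    odd = [v for v in vertices if deg[v] % 2 == 1]
    k = len(odd)                              # even, by the handshake argument
    aux = [("aux", s) for s in range(k)]      # hypothetical labels
    new_edges = list(edges)
    new_edges += [(odd[s], aux[s]) for s in range(k)]               # k edges
    new_edges += [(aux[s], aux[s + 1]) for s in range(0, k, 2)]     # k/2 edges
    return list(vertices) + aux, new_edges

# A star with three leaves: every vertex has odd degree, so k = 4.
star_vertices = [0, 1, 2, 3]
star_edges = [(0, 1), (0, 2), (0, 3)]
even_vertices, even_edges = make_even(star_vertices, star_edges)
```

Each original odd-degree vertex gains one edge, and each auxiliary vertex ends up with degree two, which is why the augmented graph has only even degrees.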
Thus, noting that $n > (2/(\beta_1 \epsilon))^2$, we have $|V_1| = n \leq \frac{1}{2} \epsilon \beta_1 n^{1.5} = \frac{1}{2} \epsilon \beta_1 |V_1|^{1.5} \leq \frac{1}{2} \epsilon |E_1| \leq \epsilon \, \mathrm{OPT}_1$, where the second inequality uses the fact that $G_1 \in \mathcal{G}_1(\beta_1)$. In this case, we get $(1 - \alpha) \, \mathrm{OPT}_1 - \frac{3}{2} |V_1| \geq (1 - \alpha - \frac{3}{2} \epsilon) \, \mathrm{OPT}_1 = (1 - \alpha_1) \, \mathrm{OPT}_1$, so using the chain of inequalities above, it follows that $\mathrm{CUT}_1 \geq (1 - \alpha_1) \, \mathrm{OPT}_1$. Therefore, if we use the $(1 - \alpha)$-approximation algorithm to find a $(1 - \alpha)$-approximate cut in $\bar{G}$ and drop the auxiliary vertices, then we obtain a $(1 - \alpha_1)$-approximate cut in $G_1$ in polynomial time, which is a contradiction.

By the lemma above, the max-cut problem is strongly NP-hard when the degrees of all vertices are even. In the proof of Theorem 3.2.1, we will need the fact that the max-cut problem is strongly NP-hard when the degrees of all vertices are divisible by four. We define the class of graphs $\mathcal{G}_4(\beta) = \{G = (V, E) : |E| \geq \beta |V|^{1.5} \text{ and all vertices in } G \text{ have degrees divisible by four}\}$. In the next lemma, we repeat an argument similar to the one in the proof of Lemma B.1.1 to show that an analogue of the result in Lemma B.1.1 holds for the graphs in $\mathcal{G}_4(\beta)$.

Lemma B.1.2 There exist constants $\alpha \in (0, 1)$ and $\beta > 0$ such that it is hard to approximate the max-cut problem over the graphs in $\mathcal{G}_4(\beta)$ within a factor of $1 - \alpha$.

Proof: Let $\alpha_2 \in (0, 1)$ and $\beta_2 > 0$ be such that it is hard to approximate the max-cut problem over the graphs in $\mathcal{G}_2(\beta_2)$ within a factor of $1 - \alpha_2$. The existence of $\alpha_2$ and $\beta_2$ is guaranteed by Lemma B.1.1. Fix any $\epsilon < \alpha_2 / 11$. Set $\alpha = \alpha_2 - 11\epsilon$ and $\beta = \beta_2 / 6^{1.5}$. To get a contradiction, we assume that there exists a $(1 - \alpha)$-approximation algorithm for the max-cut problem over the graphs in $\mathcal{G}_4(\beta)$. We will show that the existence of this approximation algorithm implies the existence of a $(1 - \alpha_2)$-approximation algorithm for the max-cut problem over the graphs in $\mathcal{G}_2(\beta_2)$. Choose any graph $G_2 = (V_2, E_2) \in \mathcal{G}_2(\beta_2)$ and let $n = |V_2|$.
If $n \leq (2/(\beta_2 \epsilon))^2$ or all of the vertices in $G_2$ have degrees divisible by four, then we reach a contradiction by the same argument in the second paragraph of the proof of Lemma B.1.1. Therefore, we assume that $n > (2/(\beta_2 \epsilon))^2$ and the degrees of some of the vertices in $G_2$ are not divisible by four. Let $k$ be the number of vertices in $G_2$ with degrees not divisible by four. Since $G_2 \in \mathcal{G}_2(\beta_2)$, these vertices must have even degrees.

If $k \geq 3$, then we can add $k$ vertices and $3k$ edges to $G_2$ to obtain a graph $\bar{G} = (\bar{V}, \bar{E})$ with all the vertices having degrees divisible by four. In particular, let $\{i_1, \ldots, i_k\}$ be the vertices with even degrees that are not divisible by four. We add $k$ auxiliary vertices $\{j_1, \ldots, j_k\}$ to the graph $G_2$. Using additional edges, for $s = 1, \ldots, k-1$, we connect $i_s$ to $j_s$ and $j_{s+1}$. Also, we connect $i_k$ to $j_k$ and $j_1$. Finally, we add the edges $(j_1, j_2), (j_2, j_3), \ldots, (j_{k-1}, j_k), (j_k, j_1)$. We denote this new graph by $\bar{G} = (\bar{V}, \bar{E})$. In Figure B.1.a, we show the $k$ vertices $\{i_1, \ldots, i_k\}$ in $G_2$ with even degrees that are not divisible by four, along with the $k$ auxiliary vertices $\{j_1, \ldots, j_k\}$. The solid edges are the ones that we add to $G_2$ to get $\bar{G}$. The dotted edges are already in $G_2$. By our construction, all of the vertices in $\bar{G}$ have degrees divisible by four.

If $k = 1$, then we can add 5 vertices and 11 edges to $G_2$ to get a graph $\bar{G} = (\bar{V}, \bar{E})$ with all vertices having degrees divisible by four. In Figure B.1.b, we show the only vertex in $G_2$ with even degree that is not divisible by four, along with the 5 vertices and 11 edges that we add to get the graph $\bar{G}$. If $k = 2$, then we can add 5 vertices and 12 edges to $G_2$ to get a graph $\bar{G} = (\bar{V}, \bar{E})$ with all vertices having degrees divisible by four. In Figure B.1.c, we show the two vertices in $G_2$ with even degrees that are not divisible by four, along with the 5 vertices and 12 edges that we add to get the graph $\bar{G}$.
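For the case $k \geq 3$, the ring construction described above can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: the graph representation and the `("aux", s)` labels are assumptions, and the small cases $k = 1$ and $k = 2$, which rely on the gadgets of Figure B.1, are deliberately left out.

```python
from collections import defaultdict

def degrees(vertices, edges):
    """Degree of every vertex in an undirected simple graph."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {v: deg[v] for v in vertices}

def make_divisible_by_four(vertices, edges):
    """Ring construction for k >= 3 on a graph whose degrees are all even.

    Vertex i_s is joined to j_s and j_{s+1} (indices taken cyclically), and
    the auxiliary vertices j_1, ..., j_k are joined in a cycle. This adds
    k vertices and 3k edges.
    """
    deg = degrees(vertices, edges)
    if any(d % 2 for d in deg.values()):
        raise ValueError("all degrees must be even")
    bad = [v for v in vertices if deg[v] % 4 != 0]     # degrees equal 2 mod 4
    k = len(bad)
    if k == 0:
        return list(vertices), list(edges)
    if k < 3:
        raise ValueError("k = 1 and k = 2 need the gadgets of Figure B.1")
    aux = [("aux", s) for s in range(k)]               # hypothetical labels
    new_edges = list(edges)
    for s in range(k):
        nxt = (s + 1) % k
        new_edges.append((bad[s], aux[s]))
        new_edges.append((bad[s], aux[nxt]))
        new_edges.append((aux[s], aux[nxt]))
    return list(vertices) + aux, new_edges

# A triangle: three vertices of degree two, so k = 3.
tri_vertices = [0, 1, 2]
tri_edges = [(0, 1), (1, 2), (0, 2)]
four_vertices, four_edges = make_divisible_by_four(tri_vertices, tri_edges)
```

Every vertex $i_s$ gains two edges, which moves its degree from $2 \bmod 4$ to $0 \bmod 4$, and every auxiliary vertex has degree exactly four; the requirement $k \geq 3$ is what keeps the auxiliary cycle free of parallel edges.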
Collecting the three cases discussed in the previous two paragraphs, if $k$ is the number of vertices in $G_2$ with even degrees that are not divisible by four, then we can add at most $5k$ vertices and $11k$ edges to $G_2$ to obtain $\bar{G}$. Note that
\[
|\bar{E}| \geq |E_2| \geq \beta_2 |V_2|^{1.5} = \frac{\beta_2}{6^{1.5}} (6 |V_2|)^{1.5} \geq \frac{\beta_2}{6^{1.5}} (|V_2| + 5k)^{1.5} \geq \frac{\beta_2}{6^{1.5}} |\bar{V}|^{1.5} = \beta |\bar{V}|^{1.5},
\]
which implies that $\bar{G} \in \mathcal{G}_4(\beta)$. In this case, we can use precisely the same argument in the last two paragraphs of the proof of Lemma B.1.1 to show that if we have a $(1 - \alpha)$-approximate cut in $\bar{G}$, then we can obtain a $(1 - \alpha_2)$-approximate cut in $G_2$ in polynomial time. Since our choice of the graph $G_2 \in \mathcal{G}_2(\beta_2)$ is arbitrary, we obtain a contradiction to Lemma B.1.1.

[Figure B.1: Constructing a graph with all vertices having degrees divisible by four. Panels (a), (b) and (c).]

Therefore, by Lemma B.1.2, the max-cut problem is strongly NP-hard when the degrees of the vertices are divisible by four. We use the decision variable $y_i \in \{-1, +1\}$ to capture whether vertex $i$ is included in the cut, where $y_i = +1$ if and only if the vertex is included. If vertex $i$ is included in the cut and vertex $j$ is not, then we have $y_i y_j = -1$. Thus, we can formulate the max-cut problem over the graph $G = (V, E)$ as
\[
\max_{\mathbf{y} \in \{-1, +1\}^{|V|}} \left\{ \frac{1}{2} \sum_{(i,j) \in E} (1 - y_i y_j) \right\},
\]
where we use the decision variables $\mathbf{y} = \{y_i : i \in V\}$. Using the change of variables $y_i = 2x_i - 1$ with $x_i \in \{0, 1\}$ and letting $d_i$ be the degree of vertex $i$, the objective function of the problem above is
\[
\frac{1}{2} \sum_{(i,j) \in E} \left( 1 - (2x_i - 1)(2x_j - 1) \right) = \frac{1}{2} \sum_{(i,j) \in E} (2x_i + 2x_j - 4 x_i x_j) = \sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j,
\]
which implies that the problem $\max_{\mathbf{x} \in \{0,1\}^{|V|}} \left\{ \sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j \right\}$ is strongly NP-hard when $d_i$ is divisible by four for all $i \in V$.

To show Theorem 3.2.1, we use a feasibility version of the max-cut problem. Given an undirected graph $G = (V, E)$, we assume that $d_i$ is divisible by four for all $i \in V$. For a fixed target objective value $K$, we consider the problem of whether there exists $\mathbf{x} = \{x_i : i \in V\} \in \{0, 1\}^{|V|}$ such that $\sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j \geq K$. We refer to this problem as the max-cut feasibility problem. By the discussion in the previous paragraph, the max-cut feasibility problem is strongly NP-complete. Below is the proof of Theorem 3.2.1.

Proof of Theorem 3.2.1: Throughout the proof, we use the formulation of the PCL model, where the set of nests is given by $M = \{(i, j) \in N^2 : i < j\}$. We observe that the formulation that we use in the paper is equivalent to the formulation with the set of nests $M = \{(i, j) \in N^2 : i < j\}$. In particular, if we are given an assortment problem with the set of nests $M = \{(i, j) \in N^2 : i < j\}$, the dissimilarity parameters $\{\gamma_{ij} : (i, j) \in M\}$ and the no purchase preference weight $v_0$, then we can define an assortment problem with the set of nests $M' = \{(i, j) \in N^2 : i \neq j\}$, the dissimilarity parameters $\{\gamma'_{ij} : (i, j) \in M'\}$ and the no purchase preference weight $v'_0$, where $\gamma'_{ij} = \gamma'_{ji} = \gamma_{ij}$ and $v'_0 = 2 v_0$. In this case, the expected revenues obtained by any subset of products are identical in the two problems.

Assume that we have an instance of the max-cut feasibility problem over the graph $G = (V, E)$ with target objective value $K$. Letting $d_i$ be the degree of vertex $i$, we assume that $d_i$ is divisible by four for all $i \in V$. We construct an instance of our assortment problem in such a way that there exists $\mathbf{x} = \{x_i : i \in V\} \in \{0, 1\}^{|V|}$ that satisfies $\sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j \geq K$ if and only if there exists a subset of products in our assortment problem that provides an expected revenue of $K + 8(|V| - 1)^2$ or more. Thus, an instance of the max-cut feasibility problem can be reduced to an instance of the feasibility version of our assortment problem, in which case, the desired result follows.
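The max-cut feasibility problem that the reduction starts from can be stated directly in code. The sketch below, which is illustrative and not part of the original proof, evaluates a cut both by counting crossing edges and through the quadratic form $\sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j$, and answers the feasibility question by exhaustive enumeration. The function names and the 0-based vertex labels are assumptions, and the enumeration is exponential in $|V|$, so it is only meant for tiny graphs.

```python
from itertools import combinations, product

def cut_value(edges, x):
    """Number of edges with exactly one endpoint in the cut {i : x[i] = 1}."""
    return sum(1 for i, j in edges if x[i] != x[j])

def quadratic_objective(n, edges, x):
    """sum_i d_i x_i - 2 sum_{(i,j) in E} x_i x_j for vertices 0, ..., n-1."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    linear = sum(deg[i] * x[i] for i in range(n))
    return linear - 2 * sum(x[i] * x[j] for i, j in edges)

def max_cut_feasible(n, edges, K):
    """Brute-force answer to the max-cut feasibility problem with target K."""
    return any(quadratic_objective(n, edges, x) >= K
               for x in product((0, 1), repeat=n))

# Complete graph on five vertices: every degree equals four.
k5_edges = list(combinations(range(5), 2))
agree = all(cut_value(k5_edges, x) == quadratic_objective(5, k5_edges, x)
            for x in product((0, 1), repeat=5))
```

On the complete graph with five vertices, a cut with two vertices on one side crosses six edges, so targets up to six are feasible.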
We construct the instance of our assortment problem as follows. Let $n = |V|$. In the instance of our assortment problem, there are $2n - 1$ products. We partition the products into two subsets $V$ and $W$ so that the set of products is $N = V \cup W$. Since $|N| = 2n - 1$ and $|V| = n$, we have $|W| = n - 1$. We index the products in $V$ by $\{1, \ldots, n\}$ and the products in $W$ by $\{n+1, \ldots, 2n-1\}$. The set of nests is $M = \{(i, j) \in N^2 : i < j\}$. Letting $T = K + 8(n-1)^2$, the revenues of the products are given by $p_i = 1 + T$ for all $i \in V$ and $p_i = 4 + T$ for all $i \in W$. The preference weights of the products are given by $v_i = 2$ for all $i \in V$ and $v_i = 1$ for all $i \in W$. The preference weight of the no purchase option is $v_0 = 1$. Since $E$ is the set of edges in an undirected graph, we follow the convention that $i < j$ for all $(i, j) \in E$. The dissimilarity parameters of the nests are
\[
\gamma_{ij} =
\begin{cases}
0 & \text{if } i \in V, \ j \in V, \ (i, j) \in E \\
1 & \text{if } i \in V, \ j \in V, \ (i, j) \notin E \\
1 & \text{if } i \in V, \ j \in W, \ j \in \{n+1, \ldots, n + d_i/4\} \\
0 & \text{if } i \in V, \ j \in W, \ j \in \{n + 1 + d_i/4, \ldots, 2n-1\} \\
1 & \text{if } i \in W, \ j \in W.
\end{cases}
\]
In the feasibility version of the assortment problem, we are interested in whether there exists a vector $\mathbf{x} \in \{0, 1\}^{2n-1}$ such that $\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} R_{ij}(\mathbf{x}) / (v_0 + \sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}}) \geq T$. In other words, arranging the terms in this inequality, we are interested in whether there exists a vector $\mathbf{x} \in \{0, 1\}^{2n-1}$ such that $\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - T) \geq v_0 T$. By the definition of $R_{ij}(\mathbf{x})$ and $V_{ij}(\mathbf{x})$, we observe that we have
\[
R_{ij}(\mathbf{x}) - T = \frac{p_i v_i^{1/\gamma_{ij}} x_i + p_j v_j^{1/\gamma_{ij}} x_j}{V_{ij}(\mathbf{x})} - T = \frac{(p_i - T) v_i^{1/\gamma_{ij}} x_i + (p_j - T) v_j^{1/\gamma_{ij}} x_j}{V_{ij}(\mathbf{x})}.
\]
Note that $p_i - T = 1$ or $p_i - T = 4$ for all $i \in N$. For notational brevity, let $\hat{R}_{ij}(\mathbf{x}) = R_{ij}(\mathbf{x}) - T$. In this case, we are interested in whether there exists a vector $\mathbf{x} \in \{0, 1\}^{2n-1}$ such that $\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) \geq v_0 T$.
It is simple to check that we can offer all of the products with the largest revenue without degrading the expected revenue from a subset of products. Noting that the products in the set $W$ have the largest revenue, we can set $x_i = 1$ for all $i \in W$ in the feasibility version of the assortment problem. Therefore, the only question is which of the products in $V$ to include in the subset of products $\mathbf{x}$ such that $\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) \geq v_0 T$.

We proceed to computing $V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x})$ for all $(i, j) \in M$. Since we offer the products in the set $W$, if $i \in W$ and $j \in W$, then noting that $\gamma_{ij} = 1$, $p_i = p_j = 4 + T$ and $v_i = v_j = 1$, we have $V_{ij}(\mathbf{x})^{\gamma_{ij}} = 2$ and $\hat{R}_{ij}(\mathbf{x}) = 4$. Therefore, we have $\sum_{(i,j) \in M} \mathbb{1}(i \in W, \ j \in W) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) = 8 (n-1)(n-2)/2$, where we use the fact that $|W| = n - 1$ and we have $i < j$ for all $(i, j) \in M$. Similarly, considering the dissimilarity parameters of the other nests, along with the revenues and the preference weights of the products in these nests, if $i \in V$ and $j \in W$, then we have
\[
V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) =
\begin{cases}
6 & \text{if } x_i = 1, \ j \in \{n+1, \ldots, n + d_i/4\} \\
2 & \text{if } x_i = 1, \ j \in \{n + 1 + d_i/4, \ldots, 2n-1\} \\
4 & \text{if } x_i = 0,
\end{cases}
\]
where, once again, we use the fact that we offer all of the products in the set $W$. We can write the expression above succinctly as $4 + 2x_i$ for all $j \in \{n+1, \ldots, n + d_i/4\}$ and $4 - 2x_i$ for all $j \in \{n + 1 + d_i/4, \ldots, 2n-1\}$. Therefore, we have
\[
\begin{aligned}
\sum_{(i,j) \in M} \mathbb{1}(i \in V, \ j \in W) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x})
&= \sum_{i \in V} \sum_{j \in W} \left\{ \mathbb{1}(j \in \{n+1, \ldots, n + d_i/4\}) (4 + 2x_i) + \mathbb{1}(j \in \{n + 1 + d_i/4, \ldots, 2n-1\}) (4 - 2x_i) \right\} \\
&= \sum_{i \in V} \left\{ \frac{d_i}{4} (4 + 2x_i) + \left( n - 1 - \frac{d_i}{4} \right) (4 - 2x_i) \right\}.
\end{aligned}
\]
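The constructed instance is small enough to check by brute force on toy graphs. The sketch below is an illustration under stated assumptions, not the authors' code: it evaluates $\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x})$ nest by nest with all products in $W$ offered, and compares it with $8(n-1)^2$ plus the max-cut objective. The one modeling assumption is that $\gamma_{ij} = 0$ is read as the limit $\gamma_{ij} \to 0^+$, in which only the offered products with the largest preference weight in the nest contribute; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def nest_term(gamma, offered):
    """V_ij(x)^gamma * (R_ij(x) - T) for one nest.

    offered holds (v, p - T) for the offered products of the nest. gamma is
    0 or 1 here; gamma = 0 is treated as the limit gamma -> 0+ (assumption).
    """
    if not offered:
        return Fraction(0)
    if gamma == 1:
        return Fraction(sum(v * r for v, r in offered))
    vmax = max(v for v, _ in offered)
    top = [r for v, r in offered if v == vmax]
    return Fraction(vmax * sum(top), len(top))

def nest_gamma(i, j, n, edge_set, deg):
    """Dissimilarity parameter of nest (i, j), i < j, products 1..2n-1."""
    if j <= n:                                   # both products in V
        return 0 if (i, j) in edge_set else 1
    if i <= n:                                   # i in V, j in W
        return 1 if j <= n + deg[i] // 4 else 0
    return 1                                     # both products in W

def lhs(x, n, edges):
    """Sum over all nests of V_ij^gamma * (R_ij - T), with W fully offered."""
    edge_set = set(edges)
    deg = {i: 0 for i in range(1, n + 1)}
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1

    def info(i):                                 # (offered?, v_i, p_i - T)
        return (x[i - 1] == 1, 2, 1) if i <= n else (True, 1, 4)

    total = Fraction(0)
    for i in range(1, 2 * n):
        for j in range(i + 1, 2 * n):
            offered = [(v, r) for on, v, r in (info(i), info(j)) if on]
            total += nest_term(nest_gamma(i, j, n, edge_set, deg), offered)
    return total

def cut_objective(x, n, edges):
    """sum_i d_i x_i - 2 sum_{(i,j) in E} x_i x_j for vertices 1..n."""
    deg_sum = sum(x[i - 1] + x[j - 1] for i, j in edges)
    return deg_sum - 2 * sum(x[i - 1] * x[j - 1] for i, j in edges)

def reduction_holds(n, edges):
    """Check lhs(x) = 8(n-1)^2 + cut objective for every x in {0,1}^n."""
    return all(lhs(x, n, edges) == 8 * (n - 1) ** 2 + cut_objective(x, n, edges)
               for x in product((0, 1), repeat=n))

# Two graphs in which every degree equals four.
k5 = list(combinations(range(1, 6), 2))
octahedron = [e for e in combinations(range(1, 7), 2)
              if e not in {(1, 2), (3, 4), (5, 6)}]
```

Since $v_0 T = K + 8(n-1)^2$, equality of the two sides for every $\mathbf{x}$ is exactly what makes the assortment feasibility question equivalent to the max-cut feasibility question on these instances.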
Next, if $i \in V$ and $j \in V$, then we have
\[
V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) =
\begin{cases}
2 & \text{if } x_i = 1, \ x_j = 1, \ (i, j) \in E \\
2 & \text{if } x_i = 1, \ x_j = 0, \ (i, j) \in E \\
2 & \text{if } x_i = 0, \ x_j = 1, \ (i, j) \in E \\
0 & \text{if } x_i = 0, \ x_j = 0, \ (i, j) \in E \\
4 & \text{if } x_i = 1, \ x_j = 1, \ (i, j) \notin E \\
2 & \text{if } x_i = 1, \ x_j = 0, \ (i, j) \notin E \\
2 & \text{if } x_i = 0, \ x_j = 1, \ (i, j) \notin E \\
0 & \text{if } x_i = 0, \ x_j = 0, \ (i, j) \notin E.
\end{cases}
\]
If $(i, j) \in E$, then we can write the expression above succinctly as $2x_i + 2x_j - 2 x_i x_j$. So, we have $\sum_{(i,j) \in M} \mathbb{1}(i \in V, \ j \in V, \ (i, j) \in E) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) = \sum_{(i,j) \in E} (2x_i + 2x_j - 2 x_i x_j)$, where we use the fact that having $(i, j) \in E$ implies that $i \in V$, $j \in V$ and $i < j$, which, in turn, implies that $(i, j) \in M$. On the other hand, if $(i, j) \notin E$, then we can write the expression above succinctly as $2x_i + 2x_j$. Note that
\[
\sum_{i \in V} \sum_{j \in V} \mathbb{1}(i < j, \ (i, j) \notin E) (x_i + x_j) = \sum_{i \in V} x_i \sum_{j \in V} \left\{ \mathbb{1}(i < j, \ (i, j) \notin E) + \mathbb{1}(j < i, \ (j, i) \notin E) \right\} = \sum_{i \in V} x_i (n - 1 - d_i),
\]
where the last equality uses the fact that $\sum_{j \in V} \{ \mathbb{1}(i < j, \ (i, j) \notin E) + \mathbb{1}(j < i, \ (j, i) \notin E) \}$ corresponds to the number of vertices that are not connected to vertex $i$ with an edge, which is given by $n - 1 - d_i$. Thus, we obtain $\sum_{(i,j) \in M} \mathbb{1}(i \in V, \ j \in V, \ (i, j) \notin E) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) = \sum_{i \in V} \sum_{j \in V} \mathbb{1}(i < j, \ (i, j) \notin E) (2x_i + 2x_j) = \sum_{i \in V} 2 (n - 1 - d_i) x_i$.

Putting the discussion so far together, we have
\[
\begin{aligned}
\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x})
&= \sum_{(i,j) \in M} \mathbb{1}(i \in W, \ j \in W) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) + \sum_{(i,j) \in M} \mathbb{1}(i \in V, \ j \in W) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) \\
&\quad + \sum_{(i,j) \in M} \mathbb{1}(i \in V, \ j \in V, \ (i, j) \in E) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) + \sum_{(i,j) \in M} \mathbb{1}(i \in V, \ j \in V, \ (i, j) \notin E) \, V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) \\
&= 4(n-1)(n-2) + \sum_{i \in V} \left\{ \frac{d_i}{4} (4 + 2x_i) + \left( n - 1 - \frac{d_i}{4} \right) (4 - 2x_i) \right\} + \sum_{(i,j) \in E} (2x_i + 2x_j - 2 x_i x_j) + \sum_{i \in V} 2 (n - 1 - d_i) x_i \\
&= 4(n-1)(n-2) + \sum_{i \in V} d_i + \sum_{i \in V} \frac{d_i}{2} x_i + 4n(n-1) - \sum_{i \in V} 2(n-1) x_i - \sum_{i \in V} d_i + \sum_{i \in V} \frac{d_i}{2} x_i \\
&\quad + \sum_{i \in V} 2 d_i x_i - \sum_{(i,j) \in E} 2 x_i x_j + \sum_{i \in V} 2(n-1) x_i - \sum_{i \in V} 2 d_i x_i \\
&= 8(n-1)^2 + \sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j.
\end{aligned}
\]
Therefore, there exists $\mathbf{x} \in \{0, 1\}^{|V|}$ that satisfies $\sum_{i \in V} d_i x_i - 2 \sum_{(i,j) \in E} x_i x_j \geq K$ if and only if there exists $\mathbf{x} \in \{0, 1\}^{2n-1}$ that satisfies $\sum_{(i,j) \in M} V_{ij}(\mathbf{x})^{\gamma_{ij}} \hat{R}_{ij}(\mathbf{x}) \geq K + 8(n-1)^2$.

B.2 Simplifying the Upper Bound

In the next lemma, we show that we can drop some of the decision variables from the Upper Bound problem without changing the optimal objective value of this problem. This lemma becomes useful to obtain the Compact Upper Bound problem.

Lemma B.2.1 There exists an optimal solution $\mathbf{x}^* = \{x^*_i : i \in N\}$ and $\mathbf{y}^* = \{y^*_{ij} : (i, j) \in M\}$ to the Upper Bound problem such that $x^*_i = 0$ for all $i \notin N(z)$ and $y^*_{ij} = 0$ for all $(i, j) \notin M(z)$.

Proof: Letting $(\mathbf{x}^*, \mathbf{y}^*)$ be an optimal solution to the Upper Bound problem, we define the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ as follows. We set $\hat{x}_i = x^*_i$ for all $i \in N(z)$, $\hat{x}_i = 0$ for all $i \notin N(z)$, $\hat{y}_{ij} = y^*_{ij}$ for all $(i, j) \in N(z)^2$ with $i \neq j$ and $\hat{y}_{ij} = 0$ for all $(i, j) \notin N(z)^2$ with $i \neq j$. Observe that $\hat{x}_i \leq x^*_i$ and $\hat{y}_{ij} \leq y^*_{ij}$. We claim that the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is feasible to the Upper Bound problem. To establish the claim, we note that if $(i, j) \in N(z)^2$, then we have $\hat{y}_{ij} = y^*_{ij} \geq x^*_i + x^*_j - 1 = \hat{x}_i + \hat{x}_j - 1$, where the two equalities follow from the definition of $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ and the inequality follows from the fact that $(\mathbf{x}^*, \mathbf{y}^*)$ is a feasible solution to the Upper Bound problem, so it satisfies the first constraint in this problem. Also, if $(i, j) \notin N(z)^2$, then we have $\hat{y}_{ij} = 0 \geq \hat{x}_i + \hat{x}_j - 1$, where the equality follows from the definition of $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ and the inequality follows from the fact that if $(i, j) \notin N(z)^2$, then we have $\hat{x}_i = 0$ or $\hat{x}_j = 0$, along with $\hat{x}_i \leq x^*_i \leq 1$ and $\hat{x}_j \leq x^*_j \leq 1$.
Therefore, the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ satisfies the first constraint in the Upper Bound problem. On the other hand, if $(i, j) \in N(z)^2$, then we have $\hat{y}_{ij} = y^*_{ij} \leq x^*_i = \hat{x}_i$, where the two equalities follow from the definition of $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ and the inequality follows from the fact that $(\mathbf{x}^*, \mathbf{y}^*)$ is a feasible solution to the Upper Bound problem. If $(i, j) \notin N(z)^2$, then we have $\hat{y}_{ij} = 0 \leq \hat{x}_i$, where the equality is by the definition of $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ and the inequality is simply by the fact that $\hat{x}_i \geq 0$. Therefore, the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ satisfies the second constraint in the Upper Bound problem. We can use the same approach to show that the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ satisfies the third constraint in the Upper Bound problem. Finally, since $\hat{x}_i \leq x^*_i$ for all $i \in N$, we have $\sum_{i \in N} \hat{x}_i \leq \sum_{i \in N} x^*_i \leq c$. Thus, the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is feasible to the Upper Bound problem.

Next, we claim that the objective value provided by the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ for the Upper Bound problem is at least as large as the objective value provided by the solution $(\mathbf{x}^*, \mathbf{y}^*)$. Nest $(i, j)$ contributes the quantity $\mu_{ij}(z) \, y_{ij} + \theta_i(z) \, x_i + \theta_j(z) \, x_j$ to the objective function of the Upper Bound problem. To establish the claim, we show that the contribution of each nest under the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is at least as large as the contribution under the solution $(\mathbf{x}^*, \mathbf{y}^*)$. If $i \in N(z)$ and $j \in N(z)$, then we have $\mu_{ij}(z) \, y^*_{ij} + \theta_i(z) \, x^*_i + \theta_j(z) \, x^*_j = \mu_{ij}(z) \, \hat{y}_{ij} + \theta_i(z) \, \hat{x}_i + \theta_j(z) \, \hat{x}_j$. If $i \notin N(z)$ and $j \notin N(z)$, then by the definition of $N(z)$, we have $\rho_{ij}(z) \leq 0$, $\theta_i(z) \leq 0$ and $\theta_j(z) \leq 0$. In this case, we obtain $\mu_{ij}(z) \, y^*_{ij} + \theta_i(z) \, x^*_i + \theta_j(z) \, x^*_j = \rho_{ij}(z) \, y^*_{ij} + \theta_i(z) (x^*_i - y^*_{ij}) + \theta_j(z) (x^*_j - y^*_{ij}) \leq 0 = \mu_{ij}(z) \, \hat{y}_{ij} + \theta_i(z) \, \hat{x}_i + \theta_j(z) \, \hat{x}_j$, where the first equality uses $\mu_{ij}(z) = \rho_{ij}(z) - \theta_i(z) - \theta_j(z)$, the inequality follows by noting that $(\mathbf{x}^*, \mathbf{y}^*)$ is a feasible solution to the Upper Bound problem so that $y^*_{ij} \leq x^*_i$ and $y^*_{ij} \leq x^*_j$, whereas the last equality is by the fact that $\hat{y}_{ij} = 0$, $\hat{x}_i = 0$ and $\hat{x}_j = 0$ whenever $i \notin N(z)$ and $j \notin N(z)$.
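If $i \in N(z)$ and $j \notin N(z)$, then we have
\[
\left( \frac{v_i^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{\gamma_{ij}} (p_i - z) \geq \frac{v_i^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} (p_i - z) \geq \frac{(p_i - z) v_i^{1/\gamma_{ij}} + (p_j - z) v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}},
\]
where the first inequality follows by noting that $a^\gamma \geq a$ for $a \in [0, 1]$ and $\gamma \in [0, 1]$, along with the fact that $i \in N(z)$ so that $p_i \geq z$, whereas the second inequality follows from the fact that $j \notin N(z)$ so that $p_j < z$. Focusing on the first and last expressions in the chain of inequalities above and noting the definitions of $\rho_{ij}(z)$ and $\theta_i(z)$, we obtain $\theta_i(z) \geq \rho_{ij}(z)$. In this case, we have $\mu_{ij}(z) \, y^*_{ij} + \theta_i(z) \, x^*_i + \theta_j(z) \, x^*_j = (\rho_{ij}(z) - \theta_i(z)) \, y^*_{ij} + \theta_i(z) \, x^*_i + \theta_j(z) (x^*_j - y^*_{ij}) \leq \theta_i(z) \, x^*_i = \mu_{ij}(z) \, \hat{y}_{ij} + \theta_i(z) \, \hat{x}_i + \theta_j(z) \, \hat{x}_j$, where the inequality follows from the fact that $\rho_{ij}(z) - \theta_i(z) \leq 0$ and $j \notin N(z)$ so that $\theta_j(z) < 0$, whereas the second equality follows from the fact that $i \in N(z)$ and $j \notin N(z)$, in which case, we have $\hat{y}_{ij} = 0$, $\hat{x}_i = x^*_i$ and $\hat{x}_j = 0$. If $i \notin N(z)$ and $j \in N(z)$, then we can use the same approach to show that $\mu_{ij}(z) \, y^*_{ij} + \theta_i(z) \, x^*_i + \theta_j(z) \, x^*_j \leq \mu_{ij}(z) \, \hat{y}_{ij} + \theta_i(z) \, \hat{x}_i + \theta_j(z) \, \hat{x}_j$. The only difference is that we need to interchange the roles of the decision variables $x^*_i$ and $x^*_j$. Therefore, in all of the four cases considered, the contribution of nest $(i, j)$ under the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is at least as large as the contribution of nest $(i, j)$ under the solution $(\mathbf{x}^*, \mathbf{y}^*)$.

In the next lemma, we show that $\mu_{ij}(z) \leq 0$ for all $(i, j) \in M(z)$. This lemma is used at multiple places throughout the paper.

Lemma B.2.2 For all $z \in \mathbb{R}$ and $(i, j) \in M(z)$, we have $\mu_{ij}(z) \leq 0$.

Proof: Since $(i, j) \in M(z)$, we have $i \in N(z)$ and $j \in N(z)$, which implies that $p_i - z \geq 0$ and $p_j - z \geq 0$. Using the fact that $a \leq a^\gamma$ for $a \in [0, 1]$ and $\gamma \in [0, 1]$, we obtain
\[
\frac{(p_i - z) v_i^{1/\gamma_{ij}} + (p_j - z) v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \leq (p_i - z) \left( \frac{v_i^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{\gamma_{ij}} + (p_j - z) \left( \frac{v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{\gamma_{ij}}.
\]
Multiplying both sides of this inequality with $(v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}})^{\gamma_{ij}}$ and using the definitions of $\rho_{ij}(z)$ and $\theta_i(z)$, we get $\rho_{ij}(z) \leq \theta_i(z) + \theta_j(z)$. Thus, we have $\mu_{ij}(z) = \rho_{ij}(z) - \theta_i(z) - \theta_j(z) \leq 0$.

B.3 Improving the Performance Guarantee

We give the full proof for Theorem 3.4.1 with the 0.6-approximation guarantee. The proof is lengthy, so we begin with an outline. In particular, the proof uses the following four steps.

Step 1: Using the vector $\boldsymbol{\theta} = \{\theta_i : i \in \hat{N}\}$, we will construct a function $F : \mathbb{R}^{|\hat{N}|}_+ \to \mathbb{R}_+$ that satisfies $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \geq f^R - \frac{1}{2} F(\boldsymbol{\theta})$.

Step 2: We will also construct a function $G : \mathbb{R}^{|\hat{N}|}_+ \times \mathbb{Z}^3_+ \to \mathbb{R}_+$ that satisfies $f^R \geq G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|)$ for any $1 \leq k_1 \leq k_2 \leq |\hat{N}|$.

To establish Step 1, we observe that $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\} \geq f^R + \frac{1}{4} \sum_{(i,j) \in \hat{M}} \mu_{ij}$ by the discussion in the proof of Theorem 3.4.1 in the main body of the paper. Thus, it will be enough to give a lower bound on $\mu_{ij}$ as a function of $\theta_i$ and $\theta_j$. To establish Step 2, we construct a feasible solution to the problem that computes $f^R$ at the beginning of Section 3.4. In particular, recalling that $\theta_i \geq 0$ for all $i \in \hat{N}$, we index the products so that $\theta_1 \geq \theta_2 \geq \ldots \geq \theta_{|\hat{N}|} \geq 0$. The solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ obtained by setting $\hat{x}_i = 1$ for all $i \in \{1, \ldots, k_1\}$, $\hat{x}_i = \frac{1}{2}$ for all $i \in \{k_1 + 1, \ldots, k_2\}$ and $\hat{x}_i = 0$ for all $i \in \{k_2 + 1, \ldots, |\hat{N}|\}$ along with $\hat{y}_{ij} = [\hat{x}_i + \hat{x}_j - 1]^+$ for all $(i, j) \in \hat{M}$ is feasible to the problem that computes $f^R$ at the beginning of Section 3.4. In this case, $f^R$ will be lower bounded by the objective function of the problem evaluated at this solution. Using Steps 1 and 2, we get
\[
\begin{aligned}
\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat{z})\}
&\geq f^R - \frac{1}{2} F(\boldsymbol{\theta}) = \left( 1 - \frac{F(\boldsymbol{\theta})}{2 f^R} \right) f^R \\
&\geq \left( 1 - \frac{F(\boldsymbol{\theta})}{2 \max_{(k_1, k_2) : 1 \leq k_1 \leq k_2 \leq |\hat{N}|} G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|)} \right) f^R \\
&= \left( 1 - \frac{1}{2} \min_{(k_1, k_2) : 1 \leq k_1 \leq k_2 \leq |\hat{N}|} \left\{ \frac{F(\boldsymbol{\theta})}{G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|)} \right\} \right) f^R \\
&\geq \left( 1 - \frac{1}{2} \min_{(k_1, k_2) : 1 \leq k_1 \leq k_2 \leq |\hat{N}|} \left\{ \max_{(\theta_1, \ldots, \theta_{|\hat{N}|}) : \theta_1 \geq \ldots \geq \theta_{|\hat{N}|} \geq 0} \left\{ \frac{F(\boldsymbol{\theta})}{G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|)} \right\} \right\} \right) f^R. \qquad \text{(B.1)}
\end{aligned}
\]

Step 3: Letting $a \vee b = \max\{a, b\}$, we will construct functions $\Gamma_1 : \mathbb{Z}^3_+ \to \mathbb{R}_+$, $\Gamma_2 : \mathbb{Z}^3_+ \to \mathbb{R}_+$ and $\Gamma_3 : \mathbb{Z}^3_+ \to \mathbb{R}_+$ that, for any $1 \leq k_1 \leq k_2 \leq |\hat{N}|$, satisfy
\[
\max_{(\theta_1, \ldots, \theta_{|\hat{N}|}) : \theta_1 \geq \ldots \geq \theta_{|\hat{N}|} \geq 0} \left\{ \frac{F(\boldsymbol{\theta})}{G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|)} \right\} \leq \Gamma_1(k_1, k_2, |\hat{N}|) \vee \Gamma_2(k_1, k_2, |\hat{N}|) \vee \Gamma_3(k_1, k_2, |\hat{N}|).
\]

Step 4: For any $|\hat{N}|$, we will show that there exist $k_1$ and $k_2$ such that $1 \leq k_1 \leq k_2 \leq |\hat{N}|$, $\Gamma_1(k_1, k_2, |\hat{N}|) \leq 0.8$, $\Gamma_2(k_1, k_2, |\hat{N}|) \leq 0.8$ and $\Gamma_3(k_1, k_2, |\hat{N}|) \leq 0.8$.

To establish Step 3, we note that $F(\boldsymbol{\theta})$ and $G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|)$ are linear in $\boldsymbol{\theta}$ in our construction. Thus, the objective function of the problem $\max_{(\theta_1, \ldots, \theta_{|\hat{N}|}) : \theta_1 \geq \ldots \geq \theta_{|\hat{N}|} \geq 0} \{ F(\boldsymbol{\theta}) / G(\boldsymbol{\theta}, k_1, k_2, |\hat{N}|) \}$ is quasi-linear in $\boldsymbol{\theta}$, so an optimal solution occurs at an extreme point of the set of feasible solutions. In this case, we construct the functions $\Gamma_1(\cdot, \cdot, \cdot)$, $\Gamma_2(\cdot, \cdot, \cdot)$ and $\Gamma_3(\cdot, \cdot, \cdot)$ by checking the objective value of the last maximization problem at the possible extreme points of the set of feasible solutions. To establish Step 4, we show that if $|\hat{N}|$ is large enough, then we can choose $k_1$ and $k_2$ as fixed fractions of $|\hat{N}|$ to obtain $\Gamma_1(k_1, k_2, |\hat{N}|) \leq 0.8$, $\Gamma_2(k_1, k_2, |\hat{N}|) \leq 0.8$ and $\Gamma_3(k_1, k_2, |\hat{N}|) \leq 0.8$. In particular, using $\lceil \cdot \rceil$ to denote the round up function and fixing $\hat{\beta}_1 = 0.088302$ and $\hat{\beta}_2 = 0.614542$, we show that if $|\hat{N}| \geq 786$, then we have $\Gamma_1(\lceil \hat{\beta}_1 |\hat{N}| \rceil, \lceil \hat{\beta}_2 |\hat{N}| \rceil, |\hat{N}|) \leq 0.8$, $\Gamma_2(\lceil \hat{\beta}_1 |\hat{N}| \rceil, \lceil \hat{\beta}_2 |\hat{N}| \rceil, |\hat{N}|) \leq 0.8$ and $\Gamma_3(\lceil \hat{\beta}_1 |\hat{N}| \rceil, \lceil \hat{\beta}_2 |\hat{N}| \rceil, |\hat{N}|) \leq 0.8$. On the other hand, if $|\hat{N}| < 786$, then we enumerate all values of $(k_1, k_2) \in \mathbb{Z}^2$ with $1 \leq k_1 \leq k_2 \leq |\hat{N}|$ to numerically check that $\Gamma_1(k_1, k_2, |\hat{N}|) \leq 0.8$, $\Gamma_2(k_1, k_2, |\hat{N}|) \leq 0.8$ and $\Gamma_3(k_1, k_2, |\hat{N}|) \leq 0.8$.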
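The half-integral candidate solution used in Step 2 can be written down explicitly. The sketch below is illustrative only: it builds $\hat{x}_i \in \{1, \frac{1}{2}, 0\}$ from $(k_1, k_2)$, sets $\hat{y}_{ij} = [\hat{x}_i + \hat{x}_j - 1]^+$ over ordered pairs $i \neq j$, and checks the constraints $y_{ij} \geq x_i + x_j - 1$, $0 \leq x_i \leq 1$ and $y_{ij} \geq 0$. The function names are assumptions, and the functions $G$ and $\Gamma_1, \Gamma_2, \Gamma_3$ themselves are not reproduced here because their explicit forms are given elsewhere in the appendix.

```python
import math
from fractions import Fraction

def step2_solution(m, k1, k2):
    """x_i = 1 for i <= k1, 1/2 for k1 < i <= k2, 0 otherwise; y = [x_i + x_j - 1]^+."""
    half = Fraction(1, 2)
    x = [Fraction(1) if i <= k1 else half if i <= k2 else Fraction(0)
         for i in range(1, m + 1)]
    y = {(i, j): max(x[i - 1] + x[j - 1] - 1, Fraction(0))
         for i in range(1, m + 1) for j in range(1, m + 1) if i != j}
    return x, y

def is_feasible(x, y):
    """Check 0 <= x_i <= 1, y_ij >= 0 and y_ij >= x_i + x_j - 1."""
    ok_x = all(0 <= xi <= 1 for xi in x)
    ok_y = all(yij >= 0 and yij >= x[i - 1] + x[j - 1] - 1
               for (i, j), yij in y.items())
    return ok_x and ok_y

def fixed_fraction_choice(m, b1=0.088302, b2=0.614542):
    """k1 and k2 as rounded-up fixed fractions of m, as in Step 4."""
    return math.ceil(b1 * m), math.ceil(b2 * m)

x_hat, y_hat = step2_solution(6, 2, 4)
k1, k2 = fixed_fraction_choice(786)

# The 0.8 bound of Step 4 yields the guarantee 1 - 0.8 / 2 = 0.6.
guarantee = 1 - Fraction(4, 5) / 2
```

With this choice, $\hat{y}_{ij}$ equals one only when both products are fully offered, one half when exactly one of them is fully offered and the other is offered at level one half, and zero otherwise.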
Using Steps 3 and 4, we get

$$
\begin{aligned}
\min_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le |\hat N|} \left\{ \max_{(\theta_1,\ldots,\theta_{|\hat N|}):\, \theta_1 \ge \ldots \ge \theta_{|\hat N|} \ge 0} \left\{ \frac{F(\boldsymbol{\theta})}{G(\boldsymbol{\theta}, k_1, k_2, |\hat N|)} \right\} \right\}
&\le \min_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le |\hat N|} \left\{ \Gamma_1(k_1, k_2, |\hat N|) \vee \Gamma_2(k_1, k_2, |\hat N|) \vee \Gamma_3(k_1, k_2, |\hat N|) \right\} \\
&\le 0.8. \tag{B.2}
\end{aligned}
$$

Therefore, by (B.1) and (B.2), we get $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\} \ge 0.6\, f^R$, which is the desired result. In Appendix B.3.1, we establish Steps 1 and 2. In Appendix B.3.2, we establish Step 3. In Appendix B.3.3, we establish Step 4.

B.3.1 Preliminary Bounds

In this section, we establish Steps 1 and 2 in our outline of the proof of Theorem 3.4.1. Throughout this section, we let $(\mathbf{x}^*, \mathbf{y}^*)$ be an optimal solution to the LP that computes $f^R$ at the beginning of Section 3.4. Also, we recall that the random subset of products $\hat{\mathbf{X}} = \{\hat X_i : i \in N\}$ is defined as follows. For all $i \in \hat N$, we have $\hat X_i = 1$ with probability $x_i^*$ and $\hat X_i = 0$ with probability $1 - x_i^*$. Lastly, we have $\hat X_i = 0$ for all $i \in N \setminus \hat N$. Different components of the vector $\hat{\mathbf{X}}$ are independent of each other. For notational brevity, we let $m = |\hat N|$. In this case, since $n = |N|$, we write the LP that computes $f^R$ at the beginning of Section 3.4 as

$$
\begin{aligned}
f^R = \max \Bigg\{ & \sum_{(i,j) \in \hat M} (\mu_{ij}\, y_{ij} + \theta_i\, x_i + \theta_j\, x_j) + 2(n - m) \sum_{i \in \hat N} \theta_i\, x_i : \\
& y_{ij} \ge x_i + x_j - 1 \;\; \forall (i,j) \in \hat M, \quad 0 \le x_i \le 1 \;\; \forall i \in \hat N, \quad y_{ij} \ge 0 \;\; \forall (i,j) \in \hat M \Bigg\}. \tag{B.3}
\end{aligned}
$$

We index the elements of $\hat N$ as $\{1,\ldots,m\}$ and the elements of $N \setminus \hat N$ as $\{m+1,\ldots,n\}$. Without loss of generality, we assume that the products in $\hat N$ are indexed such that $\theta_1 \ge \theta_2 \ge \ldots \ge \theta_m$. In the next lemma, we give a lower bound on $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\}$.

Lemma B.3.1 We have

$$
\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\} \ge \left(1 - \frac{\sum_{i=1}^m (m - i)\, \theta_i}{2 f^R}\right) f^R.
$$

Proof: Noting the discussion in the proof of Theorem 3.4.1 in the main body of the paper, we have $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\} \ge f^R + \tfrac{1}{4} \sum_{(i,j) \in \hat M} \mu_{ij}$. Lemma B.3.2 given below shows that $\mu_{ij} \ge -\max\{\theta_i, \theta_j\}$ for all $(i,j) \in \hat M$.
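In this case, we obtain

$$
\begin{aligned}
\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\}
&\ge f^R + \frac{1}{4} \sum_{(i,j) \in \hat M} \mu_{ij} \ge f^R - \frac{1}{4} \sum_{(i,j) \in \hat M} \max\{\theta_i, \theta_j\} \\
&= f^R - \frac{1}{4} \sum_{i \in \hat N} \sum_{j \in \hat N} \mathbf{1}(i < j) \max\{\theta_i, \theta_j\} - \frac{1}{4} \sum_{i \in \hat N} \sum_{j \in \hat N} \mathbf{1}(i > j) \max\{\theta_i, \theta_j\} \\
&= f^R - \frac{1}{4} \sum_{i \in \hat N} \sum_{j \in \hat N} \mathbf{1}(i < j)\, \theta_i - \frac{1}{4} \sum_{i \in \hat N} \sum_{j \in \hat N} \mathbf{1}(i > j)\, \theta_j \\
&= f^R - \frac{1}{4} \sum_{i \in \hat N} \Big( \sum_{j \in \hat N} \mathbf{1}(i < j) \Big) \theta_i - \frac{1}{4} \sum_{j \in \hat N} \Big( \sum_{i \in \hat N} \mathbf{1}(i > j) \Big) \theta_j \\
&= f^R - \frac{1}{4} \sum_{i \in \hat N} (m - i)\, \theta_i - \frac{1}{4} \sum_{j \in \hat N} (m - j)\, \theta_j = \left(1 - \frac{\sum_{i \in \hat N} (m - i)\, \theta_i}{2 f^R}\right) f^R,
\end{aligned}
$$

where the second equality uses the fact that $\theta_1 \ge \theta_2 \ge \ldots \ge \theta_m$ and the fourth equality uses the fact that $|\hat N| = m$.

We use the next lemma in the proof of Lemma B.3.1.

Lemma B.3.2 For all $z \in \mathbb{R}$ and $(i,j) \in M(z)$, we have $\mu_{ij}(z) \ge -\max\{\theta_i(z), \theta_j(z)\}$.

Proof: For $(i,j) \in M(z)$, we have $i \in N(z)$ and $j \in N(z)$, which implies that $p_i \ge z$ and $p_j \ge z$. Using the definitions of $\rho_{ij}(z)$ and $\theta_i(z)$, we get

$$
\begin{aligned}
\rho_{ij}(z) &= (v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}})^{\gamma_{ij}}\; \frac{(p_i - z)\, v_i^{1/\gamma_{ij}} + (p_j - z)\, v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}}
= \frac{(p_i - z)\, v_i^{1/\gamma_{ij}} + (p_j - z)\, v_j^{1/\gamma_{ij}}}{(v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}})^{1 - \gamma_{ij}}} \\
&= (p_i - z)\, v_i \left( \frac{v_i^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{1 - \gamma_{ij}} + (p_j - z)\, v_j \left( \frac{v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{1 - \gamma_{ij}} \\
&= \theta_i(z) \left( \frac{v_i^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{1 - \gamma_{ij}} + \theta_j(z) \left( \frac{v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right)^{1 - \gamma_{ij}} \\
&\ge \theta_i(z) \left( \frac{v_i^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right) + \theta_j(z) \left( \frac{v_j^{1/\gamma_{ij}}}{v_i^{1/\gamma_{ij}} + v_j^{1/\gamma_{ij}}} \right) \ge \min\{\theta_i(z), \theta_j(z)\},
\end{aligned}
$$

where the first inequality uses the fact that $a^{1 - \gamma} \ge a$ for $a \in [0,1]$ and $\gamma \in [0,1]$. In this case, we get $\mu_{ij}(z) = \rho_{ij}(z) - \theta_i(z) - \theta_j(z) \ge \min\{\theta_i(z), \theta_j(z)\} - \theta_i(z) - \theta_j(z) = -\max\{\theta_i(z), \theta_j(z)\}$.

Using the vector $\boldsymbol{\theta} = (\theta_1,\ldots,\theta_m)$, the function $\sum_{i=1}^m (m - i)\, \theta_i$ in Lemma B.3.1 corresponds to the function $F(\boldsymbol{\theta})$ in Step 1 in our outline of the proof of Theorem 3.4.1. Therefore, Lemma B.3.1 establishes Step 1. Next, we focus on establishing Step 2. In particular, we construct a function $G : \mathbb{R}_+^m \times \mathbb{Z}_+^3 \to \mathbb{R}_+$ that satisfies $f^R \ge G(\boldsymbol{\theta}, k_1, k_2, m)$ for any $1 \le k_1 \le k_2 \le m$.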
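Lemma B.3.3 We have

$$
f^R \ge \max_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le m} \left\{ \sum_{i=1}^{k_1} (2n - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (n - 1)\, \theta_i \right\}.
$$

Proof: Consider the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}}) \in \mathbb{R}_+^{|\hat N|} \times \mathbb{R}_+^{|\hat M|}$ to problem (B.3) that is obtained by letting $\hat x_i = 1$ for all $i \in \{1,\ldots,k_1\}$, $\hat x_i = \tfrac{1}{2}$ for all $i \in \{k_1+1,\ldots,k_2\}$ and $\hat x_i = 0$ for all $i \in \{k_2+1,\ldots,m\}$, and $\hat y_{ij} = [\hat x_i + \hat x_j - 1]^+$. The solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is feasible but not necessarily optimal to problem (B.3), in which case, noting that the optimal objective value of problem (B.3) is $f^R$, we get

$$
\begin{aligned}
f^R &\ge \sum_{(i,j) \in \hat M} (\mu_{ij}\, \hat y_{ij} + \theta_i\, \hat x_i + \theta_j\, \hat x_j) + 2(n - m) \sum_{i \in \hat N} \theta_i\, \hat x_i \\
&= \sum_{(i,j) \in \hat M} \mu_{ij}\, \hat y_{ij} + \sum_{i \in \hat N} \sum_{j \in \hat N} \mathbf{1}(i \ne j)\, (\theta_i\, \hat x_i + \theta_j\, \hat x_j) + 2(n - m) \sum_{i \in \hat N} \theta_i\, \hat x_i.
\end{aligned}
$$

Since $|\hat N| = m$, we have $\sum_{i \in \hat N} \sum_{j \in \hat N} \mathbf{1}(i \ne j)\, \theta_i\, \hat x_i = (m - 1) \sum_{i \in \hat N} \theta_i\, \hat x_i$. Similarly, we have $\sum_{j \in \hat N} \sum_{i \in \hat N} \mathbf{1}(i \ne j)\, \theta_j\, \hat x_j = (m - 1) \sum_{j \in \hat N} \theta_j\, \hat x_j$. Thus, the chain of inequalities above yields

$$
\begin{aligned}
f^R &\ge \sum_{(i,j) \in \hat M} \mu_{ij}\, \hat y_{ij} + 2(m - 1) \sum_{i \in \hat N} \theta_i\, \hat x_i + 2(n - m) \sum_{i \in \hat N} \theta_i\, \hat x_i
= \sum_{(i,j) \in \hat M} \mu_{ij}\, \hat y_{ij} + 2(n - 1) \sum_{i \in \hat N} \theta_i\, \hat x_i \\
&\ge -\sum_{(i,j) \in \hat M} \max\{\theta_i, \theta_j\}\, \hat y_{ij} + 2(n - 1) \sum_{i \in \hat N} \theta_i\, \hat x_i,
\end{aligned}
$$

where the last inequality follows from the fact that we have $\mu_{ij} \ge -\max\{\theta_i, \theta_j\}$ for all $(i,j) \in \hat M$ by Lemma B.3.2. We compute each one of the two sums on the right side of the inequality above separately. Considering the sum $\sum_{i \in \hat N} \theta_i\, \hat x_i$, the definition of $\hat x_i$ implies that

$$
\sum_{i \in \hat N} \theta_i\, \hat x_i = \sum_{i=1}^{k_1} \theta_i + \sum_{i=k_1+1}^{k_2} \frac{\theta_i}{2}.
$$

On the other hand, considering the sum $\sum_{(i,j) \in \hat M} \max\{\theta_i, \theta_j\}\, \hat y_{ij}$, we have $\sum_{(i,j) \in \hat M} \max\{\theta_i, \theta_j\}\, \hat y_{ij} = \sum_{(i,j) \in \hat M} \mathbf{1}(i < j)\, \theta_i\, \hat y_{ij} + \sum_{(i,j) \in \hat M} \mathbf{1}(i > j)\, \theta_j\, \hat y_{ij}$, where we use the fact that $\theta_1 \ge \theta_2 \ge \ldots \ge \theta_m$. Noting the definition of $\hat{\mathbf{x}}$ and using the fact that $\hat y_{ij} = [\hat x_i + \hat x_j - 1]^+$ at the beginning of the proof, we have $\hat y_{ij} = 1$ for all $i \in \{1,\ldots,k_1\}$ and $j \in \{1,\ldots,k_1\}$.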
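Similarly, we have $\hat y_{ij} = \tfrac{1}{2}$ for all $i \in \{1,\ldots,k_1\}$ and $j \in \{k_1+1,\ldots,k_2\}$. Lastly, we have $\hat y_{ij} = \tfrac{1}{2}$ for all $i \in \{k_1+1,\ldots,k_2\}$ and $j \in \{1,\ldots,k_1\}$. For the other cases not considered by the preceding three conditions, we have $\hat y_{ij} = 0$. Thus, we get

$$
\begin{aligned}
\sum_{(i,j) \in \hat M} \mathbf{1}(i < j)\, \theta_i\, \hat y_{ij}
&= \sum_{(i,j) \in \hat M} \mathbf{1}(i < j \le k_1)\, \theta_i + \sum_{(i,j) \in \hat M} \mathbf{1}(i \le k_1 < j \le k_2)\, \theta_i\, \frac{1}{2} \\
&= \sum_{i \in \hat N} \theta_i \sum_{j \in \hat N} \mathbf{1}(i < j \le k_1) + \frac{1}{2} \sum_{i \in \hat N} \theta_i \sum_{j \in \hat N} \mathbf{1}(i \le k_1 < j \le k_2) \\
&= \sum_{i \in \hat N} \theta_i\, \mathbf{1}(i \le k_1)\, (k_1 - i) + \frac{1}{2} \sum_{i \in \hat N} \theta_i\, \mathbf{1}(i \le k_1)\, (k_2 - k_1) \\
&= \sum_{i \in \hat N} \mathbf{1}(i \le k_1) \left( \frac{1}{2} k_1 + \frac{1}{2} k_2 - i \right) \theta_i = \sum_{i=1}^{k_1} \left( \frac{1}{2} k_1 + \frac{1}{2} k_2 - i \right) \theta_i.
\end{aligned}
$$

By the same computation in the chain of equalities above, we also have $\sum_{(i,j) \in \hat M} \mathbf{1}(i > j)\, \theta_j\, \hat y_{ij} = \sum_{j=1}^{k_1} \left( \tfrac{k_1}{2} + \tfrac{k_2}{2} - j \right) \theta_j$. Therefore, we obtain

$$
\begin{aligned}
f^R &\ge -\sum_{(i,j) \in \hat M} \max\{\theta_i, \theta_j\}\, \hat y_{ij} + 2(n - 1) \sum_{i \in \hat N} \theta_i\, \hat x_i \\
&= -\sum_{i=1}^{k_1} \left( \frac{1}{2} k_1 + \frac{1}{2} k_2 - i \right) \theta_i - \sum_{j=1}^{k_1} \left( \frac{1}{2} k_1 + \frac{1}{2} k_2 - j \right) \theta_j + 2(n - 1) \sum_{i=1}^{k_1} \theta_i + (n - 1) \sum_{i=k_1+1}^{k_2} \theta_i \\
&= \sum_{i=1}^{k_1} (2n - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (n - 1)\, \theta_i.
\end{aligned}
$$

The inequality above holds for all choices of $k_1$ and $k_2$ such that $1 \le k_1 \le k_2 \le m$. In this case, the desired result follows by taking the maximum of the expression on the right side above over all $(k_1, k_2)$ that satisfy $1 \le k_1 \le k_2 \le m$.

Viewing the objective function of the maximization problem in Lemma B.3.3 as a function of $\boldsymbol{\theta} = (\theta_1,\ldots,\theta_m)$, $k_1$, $k_2$ and $m$, this objective function corresponds to the function $G(\boldsymbol{\theta}, k_1, k_2, m)$ in Step 2. Thus, Lemma B.3.3 establishes Step 2. In the proof of Lemma B.3.3, we construct a feasible solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}}) \in \mathbb{R}_+^{|\hat N|} \times \mathbb{R}_+^{|\hat M|}$ to problem (B.3) by setting $\hat x_i = 1$ for all $i \in \{1,\ldots,k_1\}$, $\hat x_i = \tfrac{1}{2}$ for all $i \in \{k_1+1,\ldots,k_2\}$ and $\hat x_i = 0$ for all $i \in \{k_2+1,\ldots,m\}$, and $\hat y_{ij} = [\hat x_i + \hat x_j - 1]^+$ for all $(i,j) \in \hat M$.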
Our choice of this solution is motivated by the fact that if we maximize the function $-\sum_{(i,j) \in \hat M} \max\{\theta_i, \theta_j\}\, \hat y_{ij} + 2(n - 1) \sum_{i \in \hat N} \theta_i\, \hat x_i$ over the feasible set of problem (B.3), then there exists an optimal solution of this form for some choices of $k_1$ and $k_2$. Our development does not require showing this result explicitly, so we do not dwell on it further.

B.3.2 Removing Dependence on Product Revenues and Preference Weights

In this section, we establish Step 3. In Lemma B.3.1, $\sum_{i=1}^m (m - i)\, \theta_i$ is a function of $(\theta_1,\ldots,\theta_m)$. Similarly, the optimal objective value of the maximization problem on the right side of the inequality in Lemma B.3.3 also depends on $(\theta_1,\ldots,\theta_m)$. Next, we remove the dependence of these bounds on $(\theta_1,\ldots,\theta_m)$. In particular, by Lemma B.3.3, we have

$$
\begin{aligned}
\frac{\sum_{i=1}^m (m - i)\, \theta_i}{f^R}
&\le \frac{\sum_{i=1}^m (m - i)\, \theta_i}{\max_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le m} \left\{ \sum_{i=1}^{k_1} (2n - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (n - 1)\, \theta_i \right\}} \\
&= \min_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le m} \left\{ \frac{\sum_{i=1}^m (m - i)\, \theta_i}{\sum_{i=1}^{k_1} (2n - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (n - 1)\, \theta_i} \right\} \\
&\le \min_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le m} \left\{ \frac{\sum_{i=1}^m (m - i)\, \theta_i}{\sum_{i=1}^{k_1} (2m - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (m - 1)\, \theta_i} \right\} \\
&\le \min_{(k_1,k_2):\, 1 \le k_1 \le k_2 \le m} \left\{ \max_{(\theta_1,\ldots,\theta_m):\, \theta_1 \ge \ldots \ge \theta_m \ge 0} \left\{ \frac{\sum_{i=1}^m (m - i)\, \theta_i}{\sum_{i=1}^{k_1} (2m - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (m - 1)\, \theta_i} \right\} \right\}, \tag{B.4}
\end{aligned}
$$

where the second inequality is by the fact that $n \ge m$ and $2m - k_1 - k_2 + 2i - 2 \ge 0$ whenever $k_1 \le k_2 \le m$. There are two features of the maximization problem on the right side of (B.4). First, if $(\theta_1,\ldots,\theta_m)$ is an optimal solution to this problem, then $(\alpha \theta_1,\ldots,\alpha \theta_m)$ is also an optimal solution for any $\alpha > 0$. Thus, we can assume that $\theta_1 \le 1$. Second, the objective function of the maximization problem on the right side above is quasi-linear. Thus, an optimal solution occurs at an extreme point of the polyhedron $\{(\theta_1,\ldots,\theta_m) \in \mathbb{R}^m : 1 \ge \theta_1 \ge \theta_2 \ge \ldots \ge \theta_m \ge 0\}$.
It is simple to check that an extreme point $(\hat\theta_1,\ldots,\hat\theta_m)$ of this polyhedron is of the form $\hat\theta_i = 1$ for all $i = 1,\ldots,\ell$ and $\hat\theta_i = 0$ for all $i = \ell+1,\ldots,m$ for some $\ell \in \{0,\ldots,m\}$. In particular, if we have $0 < \hat\theta_i < 1$ for some $i \in \{1,\ldots,m\}$, then we can express $(\hat\theta_1,\ldots,\hat\theta_m)$ as a convex combination of two points in the polyhedron. This argument shows that an optimal solution $(\hat\theta_1,\ldots,\hat\theta_m)$ to the maximization problem is of the form $\hat\theta_i = 1$ for all $i = 1,\ldots,\ell$ and $\hat\theta_i = 0$ for all $i = \ell+1,\ldots,m$ for some $\ell \in \{0,\ldots,m\}$. Building on these observations, we give an upper bound on the optimal objective value of the maximization problem on the right side of (B.4). Throughout the rest of our discussion, we will use the functions

$$
\begin{aligned}
\Gamma_1(k_1, k_2, m) &= \frac{m(m - 1)/2}{m k_1 + m k_2 - k_1 k_2 - k_2}, \\
\Gamma_2(k_1, k_2, m) &= \frac{m - 1}{2m - k_1 - k_2}, \\
\Gamma_3(k_1, k_2, m) &= \max_{q \in \mathbb{R}_+} \left\{ \frac{m q - q(q + 1)/2}{m k_1 - k_1 k_2 + m q - q} \right\}.
\end{aligned} \tag{B.5}
$$

In the next lemma, we use $\Gamma_1(k_1, k_2, m)$, $\Gamma_2(k_1, k_2, m)$ and $\Gamma_3(k_1, k_2, m)$ to give an upper bound on the optimal objective value of the maximization problem on the right side of (B.4).

Lemma B.3.4 We have

$$
\max_{(\theta_1,\ldots,\theta_m):\, \theta_1 \ge \ldots \ge \theta_m \ge 0} \left\{ \frac{\sum_{i=1}^m (m - i)\, \theta_i}{\sum_{i=1}^{k_1} (2m - k_1 - k_2 + 2i - 2)\, \theta_i + \sum_{i=k_1+1}^{k_2} (m - 1)\, \theta_i} \right\} \le \Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m). \tag{B.6}
$$

Proof: By the discussion right before the lemma, there exists an optimal solution $(\hat\theta_1,\ldots,\hat\theta_m)$ to the maximization problem in (B.6) such that $\hat\theta_i = 1$ for all $i = 1,\ldots,\ell$ and $\hat\theta_i = 0$ for all $i = \ell+1,\ldots,m$ for some $\ell \in \{0,\ldots,m\}$. The denominator of the objective function of the maximization problem in (B.6) does not depend on $\{\theta_i : i = k_2+1,\ldots,m\}$, which implies that if $\ell > k_2$, then we can assume that $\ell = m$. In particular, if $\ell > k_2$, then setting $\ell = m$ increases the numerator without changing the denominator. So, we assume that $\ell \le k_2$ or $\ell = m$.
If $\ell = m$ so that $\hat\theta_1 = \ldots = \hat\theta_m = 1$, then the objective value of the maximization problem in (B.6) at the optimal solution $(\hat\theta_1,\ldots,\hat\theta_m)$ is

$$
\begin{aligned}
\frac{\sum_{i=1}^m (m - i)}{\sum_{i=1}^{k_1} (2m - k_1 - k_2 + 2i - 2) + \sum_{i=k_1+1}^{k_2} (m - 1)}
&= \frac{m(m - 1)/2}{k_1 (2m - k_1 - k_2) + k_1 (k_1 - 1) + (k_2 - k_1)(m - 1)} \\
&= \frac{m(m - 1)/2}{m k_1 + m k_2 - k_1 k_2 - k_2} = \Gamma_1(k_1, k_2, m). \tag{B.7}
\end{aligned}
$$

On the other hand, if $\ell \le k_1$ so that $\hat\theta_1 = \ldots = \hat\theta_\ell = 1$ and $\hat\theta_{\ell+1} = \ldots = \hat\theta_m = 0$, then the objective value of the maximization problem in (B.6) is

$$
\frac{\sum_{i=1}^{\ell} (m - i)}{\sum_{i=1}^{\ell} (2m - k_1 - k_2 + 2i - 2)} = \frac{m \ell - \ell(\ell + 1)/2}{\ell (2m - k_1 - k_2) + \ell(\ell - 1)} = \frac{m - (\ell + 1)/2}{2m - k_1 - k_2 + \ell - 1},
$$

which is decreasing in $\ell$. Therefore, if we maximize the expression above over all $\ell$ satisfying $1 \le \ell \le k_1$, then the maximizer occurs at $\ell = 1$. Thus, we get

$$
\frac{\sum_{i=1}^{\ell} (m - i)}{\sum_{i=1}^{\ell} (2m - k_1 - k_2 + 2i - 2)} \le \frac{m - 1}{2m - k_1 - k_2} = \Gamma_2(k_1, k_2, m). \tag{B.8}
$$

Finally, if $k_1 + 1 \le \ell \le k_2$ so that $\hat\theta_1 = \ldots = \hat\theta_{k_1} = \hat\theta_{k_1+1} = \ldots = \hat\theta_\ell = 1$ and $\hat\theta_{\ell+1} = \ldots = \hat\theta_m = 0$, then the objective value of the maximization problem in (B.6) is

$$
\begin{aligned}
\frac{\sum_{i=1}^{\ell} (m - i)}{\sum_{i=1}^{k_1} (2m - k_1 - k_2 + 2i - 2) + \sum_{i=k_1+1}^{\ell} (m - 1)}
&= \frac{m \ell - \ell(\ell + 1)/2}{k_1 (2m - k_1 - k_2) + k_1 (k_1 - 1) + (\ell - k_1)(m - 1)} \\
&= \frac{m \ell - \ell(\ell + 1)/2}{m k_1 - k_1 k_2 + m \ell - \ell} \le \Gamma_3(k_1, k_2, m). \tag{B.9}
\end{aligned}
$$

Putting (B.7), (B.8) and (B.9) together, the optimal objective value of the maximization problem in (B.6) is no larger than $\Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m)$.

In the objective function of the maximization problem in (B.6), recalling that $F(\boldsymbol{\theta})$ corresponds to the function in the numerator and $G(\boldsymbol{\theta}, k_1, k_2, m)$ corresponds to the function in the denominator, Lemma B.3.4 establishes Step 3. Note that $\Gamma_1(k_1, k_2, m)$ and $\Gamma_2(k_1, k_2, m)$ in (B.5) have explicit expressions. Next, we give an explicit expression for $\Gamma_3(k_1, k_2, m)$ as well. Since we are interested in the values of $k_2$ satisfying $k_2 \le m$, the denominator of the fraction in the definition of $\Gamma_3(k_1, k_2, m)$ is non-negative. Furthermore, the numerator in this fraction is concave in $q$.
Therefore, the objective function of the maximization problem in the definition of $\Gamma_3(k_1, k_2, m)$ is quasi-concave, which implies that we can use the first order condition to characterize the optimal solution to this problem. In particular, differentiating the fraction in the maximization problem in the definition of $\Gamma_3(k_1, k_2, m)$ with respect to $q$, the first order condition is

$$
\frac{(m - q - \tfrac{1}{2})(m k_1 - k_1 k_2 + m q - q) - (m q - q(q + 1)/2)(m - 1)}{(m k_1 - k_1 k_2 + m q - q)^2}
= \frac{-(m - 1)\, q^2 / 2 - k_1 (m - k_2)\, q + k_1 (m - k_2)(m - \tfrac{1}{2})}{(m k_1 - k_1 k_2 + m q - q)^2} = 0.
$$

There is only one positive solution to the second equality above. Using $q(k_1, k_2, m)$ to denote this positive solution, we have

$$
q(k_1, k_2, m) = \frac{\sqrt{k_1^2 (m - k_2)^2 + 2 k_1 (m - k_2)(m - 1)(m - \tfrac{1}{2})} - k_1 (m - k_2)}{m - 1}. \tag{B.10}
$$

To obtain an explicit expression for $\Gamma_3(k_1, k_2, m)$, consider the function $h(q) = f(q)/g(q)$. Assume that the derivative of $h(q)$ at $\hat q$ is zero. In other words, using $f'(\hat q)$ and $g'(\hat q)$ to, respectively, denote the derivatives of $f(\cdot)$ and $g(\cdot)$ evaluated at $\hat q$, we have $f'(\hat q)\, g(\hat q) - f(\hat q)\, g'(\hat q) = 0$. In this case, we obtain $f(\hat q)/g(\hat q) = f'(\hat q)/g'(\hat q)$, which implies that $h(\hat q) = f(\hat q)/g(\hat q) = f'(\hat q)/g'(\hat q)$. To use this observation, we note that $\Gamma_3(k_1, k_2, m)$ is given by the value of the fraction in the definition of $\Gamma_3(k_1, k_2, m)$ evaluated at $q(k_1, k_2, m)$. Furthermore, the derivative of this fraction with respect to $q$ evaluated at $q(k_1, k_2, m)$ is zero. Since the derivatives of the numerator and denominator of this fraction with respect to $q$ are, respectively, $m - q - \tfrac{1}{2}$ and $m - 1$, it follows that

$$
\Gamma_3(k_1, k_2, m) = \frac{m - q(k_1, k_2, m) - \tfrac{1}{2}}{m - 1} = 1 + \frac{1}{2(m - 1)} - \frac{q(k_1, k_2, m)}{m - 1}, \tag{B.11}
$$

which, noting (B.10), yields an explicit expression for $\Gamma_3(k_1, k_2, m)$. Therefore, we have explicit expressions for $\Gamma_1(k_1, k_2, m)$, $\Gamma_2(k_1, k_2, m)$ and $\Gamma_3(k_1, k_2, m)$.

B.3.3 Uniform Bounds

In this section, we establish Step 4.
In particular, for any value of $m$, we show that there exist $k_1$ and $k_2$ satisfying $1 \le k_1 \le k_2 \le m$, $\Gamma_1(k_1, k_2, m) \le 0.8$, $\Gamma_2(k_1, k_2, m) \le 0.8$ and $\Gamma_3(k_1, k_2, m) \le 0.8$. Since we have explicit expressions for $\Gamma_1(k_1, k_2, m)$, $\Gamma_2(k_1, k_2, m)$ and $\Gamma_3(k_1, k_2, m)$, if $m$ is small, then we can enumerate over all possible values of $k_1$ and $k_2$ that satisfy $1 \le k_1 \le k_2 \le m$ to ensure that there exist $k_1$ and $k_2$ such that $\Gamma_1(k_1, k_2, m) \le 0.8$, $\Gamma_2(k_1, k_2, m) \le 0.8$ and $\Gamma_3(k_1, k_2, m) \le 0.8$. In particular, through complete enumeration, it is simple to numerically verify that if $m < 786$, then there exist $k_1$ and $k_2$ satisfying $1 \le k_1 \le k_2 \le m$, $\Gamma_1(k_1, k_2, m) \le 0.8$, $\Gamma_2(k_1, k_2, m) \le 0.8$ and $\Gamma_3(k_1, k_2, m) \le 0.8$. Thus, we only need to consider the case where $m \ge 786$.

We begin with some intuition for our approach. Assume that we always choose $k_1$ and $k_2$ as a fixed fraction of $m$. In particular, we always choose $k_1$ and $k_2$ as $k_1 = \hat\beta_1 m$ and $k_2 = \hat\beta_2 m$ for some $\hat\beta_1 \in (0,1]$, $\hat\beta_2 \in (0,1]$ and $\hat\beta_1 \le \hat\beta_2$. Recall that we want to find some $k_1$ and $k_2$ satisfying $1 \le k_1 \le k_2 \le m$, $\Gamma_1(k_1, k_2, m) \le 0.8$, $\Gamma_2(k_1, k_2, m) \le 0.8$ and $\Gamma_3(k_1, k_2, m) \le 0.8$. Thus, there is no harm in trying to choose $k_1 = \hat\beta_1 m$ and $k_2 = \hat\beta_2 m$. Naturally, $k_1$ and $k_2$ need to be integers and we shortly address this issue. By (B.5) and (B.11), if we choose $k_1$ and $k_2$ as $k_1 = \hat\beta_1 m$ and $k_2 = \hat\beta_2 m$, then we have

$$
\begin{aligned}
\Gamma_1(\hat\beta_1 m, \hat\beta_2 m, m) &= \frac{m(m - 1)/2}{\hat\beta_1 m^2 + \hat\beta_2 m^2 - \hat\beta_1 \hat\beta_2 m^2 - \hat\beta_2 m}, \\
\Gamma_2(\hat\beta_1 m, \hat\beta_2 m, m) &= \frac{m - 1}{2m - \hat\beta_1 m - \hat\beta_2 m}, \\
\Gamma_3(\hat\beta_1 m, \hat\beta_2 m, m) &= 1 + \frac{1}{2(m - 1)} + \frac{\hat\beta_1 m (m - \hat\beta_2 m)}{(m - 1)^2} - \frac{\sqrt{\hat\beta_1^2 m^2 (m - \hat\beta_2 m)^2 + 2 \hat\beta_1 m (m - \hat\beta_2 m)(m - 1)(m - \tfrac{1}{2})}}{(m - 1)^2}.
\end{aligned}
$$

We let $\gamma_1(\hat\beta_1, \hat\beta_2) = \lim_{m \to \infty} \Gamma_1(\hat\beta_1 m, \hat\beta_2 m, m)$, $\gamma_2(\hat\beta_1, \hat\beta_2) = \lim_{m \to \infty} \Gamma_2(\hat\beta_1 m, \hat\beta_2 m, m)$ and $\gamma_3(\hat\beta_1, \hat\beta_2) = \lim_{m \to \infty} \Gamma_3(\hat\beta_1 m, \hat\beta_2 m, m)$.
Thus, taking limits in the expressions above, we get

$$
\begin{aligned}
\gamma_1(\hat\beta_1, \hat\beta_2) &= \frac{1}{2(\hat\beta_1 + \hat\beta_2 - \hat\beta_1 \hat\beta_2)}, \\
\gamma_2(\hat\beta_1, \hat\beta_2) &= \frac{1}{2 - \hat\beta_1 - \hat\beta_2}, \\
\gamma_3(\hat\beta_1, \hat\beta_2) &= 1 + \hat\beta_1 (1 - \hat\beta_2) - \sqrt{\hat\beta_1^2 (1 - \hat\beta_2)^2 + 2 \hat\beta_1 (1 - \hat\beta_2)}.
\end{aligned}
$$

Roughly speaking, if $m$ is large and we choose $k_1$ and $k_2$ as $k_1 = \hat\beta_1 m$ and $k_2 = \hat\beta_2 m$, then $\Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m)$ behaves similarly to $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2)$. We want to find some $k_1$ and $k_2$ with $1 \le k_1 \le k_2 \le m$ such that $\Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m) \le 0.8$. Using $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2)$ as an approximation to $\Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m)$, we choose $\hat\beta_1$ and $\hat\beta_2$ to ensure that $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2)$ is as small as possible. In particular, we choose $\hat\beta_1$ and $\hat\beta_2$ as the solution to the system of equations $\gamma_1(\hat\beta_1, \hat\beta_2) = \gamma_2(\hat\beta_1, \hat\beta_2)$ and $\gamma_2(\hat\beta_1, \hat\beta_2) = \gamma_3(\hat\beta_1, \hat\beta_2)$. Solving this system of equations numerically, we obtain $\hat\beta_1 \approx 0.088302$ and $\hat\beta_2 \approx 0.614542$, yielding $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2) \approx 0.770917$. Since we need to ensure that $k_1 \le k_2$, we also need to ensure that $\hat\beta_1 \le \hat\beta_2$. Fortunately, the solution to the system of equations $\gamma_1(\hat\beta_1, \hat\beta_2) = \gamma_2(\hat\beta_1, \hat\beta_2)$ and $\gamma_2(\hat\beta_1, \hat\beta_2) = \gamma_3(\hat\beta_1, \hat\beta_2)$ already satisfies this requirement. Also, we do not need the precise solution to the last system of equations, since our goal is to find some upper bound on $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2)$. An imprecise solution simply yields a slightly looser upper bound.
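The limiting values above, and their finite-$m$ counterparts from (B.5), (B.10) and (B.11), are easy to evaluate numerically. The following is a minimal sketch, not part of the original development; the function names `gamma_limits` and `finite_gammas` are hypothetical and chosen only for illustration.

```python
import math

def gamma_limits(b1, b2):
    """Limits gamma_1, gamma_2, gamma_3 of Gamma_1, Gamma_2, Gamma_3
    when k1 = b1 * m and k2 = b2 * m and m grows large."""
    a = b1 * (1.0 - b2)
    g1 = 1.0 / (2.0 * (b1 + b2 - b1 * b2))
    g2 = 1.0 / (2.0 - b1 - b2)
    g3 = 1.0 + a - math.sqrt(a * a + 2.0 * a)
    return g1, g2, g3

def finite_gammas(k1, k2, m):
    """Gamma_1 and Gamma_2 from (B.5); Gamma_3 from (B.10) and (B.11).
    Assumes m >= 2 and that the denominators below are positive."""
    a = k1 * (m - k2)
    G1 = (m * (m - 1) / 2.0) / (m * k1 + m * k2 - k1 * k2 - k2)
    G2 = (m - 1.0) / (2.0 * m - k1 - k2)
    # Positive root of the first order condition, as in (B.10).
    q = (math.sqrt(a * a + 2.0 * a * (m - 1) * (m - 0.5)) - a) / (m - 1)
    G3 = 1.0 + 1.0 / (2.0 * (m - 1)) - q / (m - 1)
    return G1, G2, G3

B1, B2 = 0.088302, 0.614542
print(gamma_limits(B1, B2))

m = 786
print(finite_gammas(math.ceil(B1 * m), math.ceil(B2 * m), m))
```

With the fractions 0.088302 and 0.614542, the three limits agree to several decimal places, which is what solving the system $\gamma_1 = \gamma_2 = \gamma_3$ is meant to achieve. Evaluating `finite_gammas` with rounded-up values of $k_1$ and $k_2$ shows how close the finite-$m$ quantities are to these limits.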
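Lastly, since the best upper bound we can find on $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2)$ is roughly equal to 0.77, which is below 0.8, we will be able to show that there exist $k_1$ and $k_2$ with $1 \le k_1 \le k_2 \le m$ and $\Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m) \le 0.8$.

The preceding discussion provides some intuition, but it is not precise. We need an upper bound on $\Gamma_1(k_1, k_2, m) \vee \Gamma_2(k_1, k_2, m) \vee \Gamma_3(k_1, k_2, m)$, not on $\gamma_1(\hat\beta_1, \hat\beta_2) \vee \gamma_2(\hat\beta_1, \hat\beta_2) \vee \gamma_3(\hat\beta_1, \hat\beta_2)$. These two quantities are different for finite values of $m$. Also, $k_1$ and $k_2$ need to be integers, but choosing $k_1 = \hat\beta_1 m$ and $k_2 = \hat\beta_2 m$ does not necessarily provide integer values for $k_1$ and $k_2$. To address these issues, we choose $k_1$ and $k_2$ as $k_1 = \lceil \hat\beta_1 m \rceil$ and $k_2 = \lceil \hat\beta_2 m \rceil$. In this case, setting $\hat\beta_1 = 0.088302$ and $\hat\beta_2 = 0.614542$, we proceed to showing that $\Gamma_1(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$, $\Gamma_2(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$ and $\Gamma_3(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$, as long as $m \ge 786$. Therefore, for any value of $m \ge 786$, there exist values of $k_1$ and $k_2$ satisfying $1 \le k_1 \le k_2 \le m$, $\Gamma_1(k_1, k_2, m) \le 0.8$, $\Gamma_2(k_1, k_2, m) \le 0.8$ and $\Gamma_3(k_1, k_2, m) \le 0.8$, which establishes Step 4.

Throughout this section, we fix $\hat\beta_1 = 0.088302$ and $\hat\beta_2 = 0.614542$. In the next lemma, we give a bound on $\Gamma_1(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)$.

Lemma B.3.5 If $m \ge 786$, then we have $\Gamma_1(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$.

Proof: We have $\hat\beta_1 m \le \lceil \hat\beta_1 m \rceil \le \hat\beta_1 m + 1$ and $\hat\beta_2 m \le \lceil \hat\beta_2 m \rceil \le \hat\beta_2 m + 1$. In this case, noting the definition of $\Gamma_1(k_1, k_2, m)$, it follows that

$$
\begin{aligned}
\Gamma_1(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)
&= \frac{m(m - 1)/2}{m \lceil \hat\beta_1 m \rceil + m \lceil \hat\beta_2 m \rceil - \lceil \hat\beta_1 m \rceil \lceil \hat\beta_2 m \rceil - \lceil \hat\beta_2 m \rceil} \\
&\le \frac{m^2/2}{\hat\beta_1 m^2 + \hat\beta_2 m^2 - (\hat\beta_1 m + 1)(\hat\beta_2 m + 1) - \hat\beta_2 m - 1}
\le \frac{1/2}{\hat\beta_1 + \hat\beta_2 - \hat\beta_1 \hat\beta_2 - (\hat\beta_1 + 2 \hat\beta_2 + 2)/m}.
\end{aligned}
$$

The expression on the right side above is decreasing in $m$. Computing this expression with $\hat\beta_1 = 0.088302$, $\hat\beta_2 = 0.614542$ and $m = 786$, we get a value that does not exceed 0.78.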
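Therefore, we have $\Gamma_1(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$ for all $m \ge 786$.

In the proof of Lemma B.3.5, we can check that $\Gamma_1(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$ for all $m \ge 141$, but we need to impose a lower bound of 786 on $m$ anyway when dealing with $\Gamma_3(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)$ shortly. In the next lemma, we give an upper bound on $\Gamma_2(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)$.

Lemma B.3.6 If $m \ge 786$, then we have $\Gamma_2(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$.

Proof: Similar to our approach in the proof of Lemma B.3.5, we have $\hat\beta_1 m \le \lceil \hat\beta_1 m \rceil \le \hat\beta_1 m + 1$ and $\hat\beta_2 m \le \lceil \hat\beta_2 m \rceil \le \hat\beta_2 m + 1$. Noting the definition of $\Gamma_2(k_1, k_2, m)$, it follows that

$$
\Gamma_2(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) = \frac{m - 1}{2m - \lceil \hat\beta_1 m \rceil - \lceil \hat\beta_2 m \rceil} \le \frac{m}{2m - (\hat\beta_1 m + 1) - (\hat\beta_2 m + 1)} = \frac{1}{2 - \hat\beta_1 - \hat\beta_2 - 2/m}.
$$

The expression on the right side above is decreasing in $m$. If we compute this expression with $\hat\beta_1 = 0.088302$, $\hat\beta_2 = 0.614542$ and $m = 786$, then we get a value that does not exceed 0.78.

In the next lemma, we come up with an upper bound on $\Gamma_3(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)$.

Lemma B.3.7 If $m \ge 786$, then we have $\Gamma_3(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \le 0.8$.

Proof: We begin by providing bounds for several quantities. These bounds become useful later in the proof. For $m \ge 786$, we bound $\lceil \hat\beta_1 m \rceil / (m - 1)$ and $(m - \lceil \hat\beta_2 m \rceil)/(m - 1)$ as

$$
\hat\beta_1 \le \frac{\lceil \hat\beta_1 m \rceil}{m - 1} \le \hat\beta_1 + \frac{2}{m}, \tag{B.12}
$$

$$
1 - \hat\beta_2 - \frac{1}{m} \le \frac{m - \lceil \hat\beta_2 m \rceil}{m - 1} \le 1 - \hat\beta_2 + \frac{1}{m}. \tag{B.13}
$$

In particular, we have $\lceil \hat\beta_1 m \rceil/(m - 1) \le (\hat\beta_1 m + 1)/(m - 1) = \hat\beta_1 + (\hat\beta_1 + 1)/(m - 1) \le \hat\beta_1 + 2/m$, where the last inequality uses the fact that $\hat\beta_1 = 0.088302$ so that we have $(\hat\beta_1 + 1)/(m - 1) \le 2/m$ for all $m \ge 3$, but we already assume that $m \ge 786$. Also, we have $\lceil \hat\beta_1 m \rceil/(m - 1) \ge \hat\beta_1 m/(m - 1) \ge \hat\beta_1$. Therefore, the chain of inequalities in (B.12) holds. On the other hand, we have $(m - \lceil \hat\beta_2 m \rceil)/(m - 1) \le (m - \hat\beta_2 m)/(m - 1) = 1 - \hat\beta_2 + (1 - \hat\beta_2)/(m - 1) \le 1 - \hat\beta_2 + 1/m$, where the last inequality uses the fact that $\hat\beta_2 = 0.614542$ so that we have $(1 - \hat\beta_2)/(m - 1) \le 1/m$ for all $m \ge 2$, but once again, we already assume that $m \ge 786$.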
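Also, we have $(m - \lceil \hat\beta_2 m \rceil)/(m - 1) \ge (m - \hat\beta_2 m - 1)/(m - 1) = 1 - \hat\beta_2 - \hat\beta_2/(m - 1) \ge 1 - \hat\beta_2 - 1/m$, where the last inequality uses the fact that $\hat\beta_2/(m - 1) \le 1/m$ for all $m \ge 3$. Therefore, the chain of inequalities in (B.13) holds as well.

Next, we define the function $\Lambda(k_1, k_2, m)$ and the constant $\lambda(\hat\beta_1, \hat\beta_2)$ as

$$
\Lambda(k_1, k_2, m) = \frac{k_1^2 (m - k_2)^2 + 2 k_1 (m - k_2)(m - 1)(m - \tfrac{1}{2})}{(m - 1)^4}, \qquad
\lambda(\hat\beta_1, \hat\beta_2) = \hat\beta_1^2 (1 - \hat\beta_2)^2 + 2 \hat\beta_1 (1 - \hat\beta_2).
$$

Relating $\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)$ to $\lambda(\hat\beta_1, \hat\beta_2)$, we will relate $\Gamma_3(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)$ to $\gamma_3(\hat\beta_1, \hat\beta_2)$. We claim that $\sqrt{\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)} \ge \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - 8/m$. To see the claim, note that

$$
\begin{aligned}
\frac{\lceil \hat\beta_1 m \rceil^2 (m - \lceil \hat\beta_2 m \rceil)^2}{(m - 1)^4}
&\ge \hat\beta_1^2 \left(1 - \hat\beta_2 - \frac{1}{m}\right)^2 = \hat\beta_1^2 \left( (1 - \hat\beta_2)^2 - \frac{2}{m}(1 - \hat\beta_2) + \frac{1}{m^2} \right) \\
&\ge \hat\beta_1^2 (1 - \hat\beta_2)^2 - \frac{2}{m}\, \hat\beta_1^2 (1 - \hat\beta_2) \ge \hat\beta_1^2 (1 - \hat\beta_2)^2 - \frac{1}{m},
\end{aligned}
$$

where the first inequality uses (B.12) and (B.13), whereas the third inequality uses the fact that $\hat\beta_1^2 (1 - \hat\beta_2) \le \tfrac{1}{2}$. Furthermore, we have

$$
\frac{\lceil \hat\beta_1 m \rceil (m - \lceil \hat\beta_2 m \rceil)(m - 1)(m - \tfrac{1}{2})}{(m - 1)^4}
\ge \hat\beta_1 \left(1 - \hat\beta_2 - \frac{1}{m}\right) \left( \frac{m - \tfrac{1}{2}}{m - 1} \right)
\ge \hat\beta_1 \left(1 - \hat\beta_2 - \frac{1}{m}\right) \ge \hat\beta_1 (1 - \hat\beta_2) - \frac{1}{m},
$$

where the first inequality uses (B.12) and (B.13) and the third inequality uses the fact that $\hat\beta_1 \le 1$. Multiplying the inequality above by two, adding the last two inequalities and noting the definition of $\Lambda(k_1, k_2, m)$, we get $\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \ge \lambda(\hat\beta_1, \hat\beta_2) - 3/m$. Since $\hat\beta_1 = 0.088302$ and $\hat\beta_2 = 0.614542$, we can compute the value of $\lambda(\hat\beta_1, \hat\beta_2)$ to check that $\sqrt{\lambda(\hat\beta_1, \hat\beta_2)} \ge 1/4$. Therefore, for all $m \ge 64$, we have $16 \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - 64/m \ge 4 - 64/m \ge 3$, in which case, we obtain

$$
\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m) \ge \lambda(\hat\beta_1, \hat\beta_2) - \frac{3}{m} \ge \lambda(\hat\beta_1, \hat\beta_2) - \frac{16}{m} \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} + \frac{64}{m^2} = \left( \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - \frac{8}{m} \right)^2.
$$

Taking the square root above, we obtain $\sqrt{\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)} \ge \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - 8/m$, which establishes the desired claim.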
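Noting (B.10) and the definition of $\Lambda(k_1, k_2, m)$, we have

$$
\begin{aligned}
\frac{q(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)}{m - 1}
&= \sqrt{\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)} - \frac{\lceil \hat\beta_1 m \rceil (m - \lceil \hat\beta_2 m \rceil)}{(m - 1)^2} \\
&\ge \sqrt{\Lambda(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)} - \left( \hat\beta_1 + \frac{2}{m} \right) \left( 1 - \hat\beta_2 + \frac{1}{m} \right) \\
&\ge \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - \frac{8}{m} - \left( \hat\beta_1 + \frac{2}{m} \right) \left( 1 - \hat\beta_2 + \frac{1}{m} \right) \\
&= \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - \hat\beta_1 (1 - \hat\beta_2) - \frac{1}{m} \left( 8 + \hat\beta_1 + 2(1 - \hat\beta_2) \right) - \frac{2}{m^2} \\
&\ge \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} - \hat\beta_1 (1 - \hat\beta_2) - \frac{10}{m},
\end{aligned}
$$

where the first inequality uses (B.12) and (B.13), whereas the third inequality holds since $8 + \hat\beta_1 + 2(1 - \hat\beta_2) \le 9$ and $2/m^2 \le 1/m$ for all $m \ge 2$. Using (B.11) and the inequality above, we get

$$
\begin{aligned}
\Gamma_3(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)
&= 1 + \frac{1}{2(m - 1)} - \frac{q(\lceil \hat\beta_1 m \rceil, \lceil \hat\beta_2 m \rceil, m)}{m - 1} \\
&\le 1 + \frac{1}{2(m - 1)} - \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} + \hat\beta_1 (1 - \hat\beta_2) + \frac{10}{m} \\
&\le 1 - \sqrt{\lambda(\hat\beta_1, \hat\beta_2)} + \hat\beta_1 (1 - \hat\beta_2) + \frac{11}{m} \le 0.786 + \frac{11}{m},
\end{aligned}
$$

where the second inequality uses the fact that $1/(2(m - 1)) \le 1/m$ for all $m \ge 2$ and the third inequality follows from the fact that $\sqrt{\lambda(\hat\beta_1, \hat\beta_2)} \ge 1/4$, $\hat\beta_1 \le 0.09$ and $\hat\beta_2 \ge 0.6$. The result follows by noting that $0.786 + 11/m \le 0.8$ for all $m \ge 786$.

Putting Lemmas B.3.5, B.3.6 and B.3.7 together establishes Step 4.

B.4 Method of Conditional Expectations for the Uncapacitated Problem

Assume that we have a random subset of products $\hat{\mathbf{X}} = \{\hat X_i : i \in N\}$ that satisfies the inequality $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\} \ge 0.6\, f^R$. In the method of conditional expectations, we inductively construct a subset of products $\mathbf{x}^{(k)} = (\hat x_1,\ldots,\hat x_k, \hat X_{k+1},\ldots,\hat X_n)$ for all $k \in N$, where the first $k$ products in this subset are deterministic and the last $n - k$ products are random variables. Each one of these subsets of products is constructed to ensure that we have $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\} \ge 0.6\, f^R$ for all $k \in N$. In this case, the subset of products $\mathbf{x}^{(n)} = (\hat x_1,\ldots,\hat x_n)$ is a deterministic subset of products that satisfies $\sum_{(i,j) \in M} V_{ij}(\mathbf{x}^{(n)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(n)}) - \hat z) \ge 0.6\, f^R$, as desired.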
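The derandomization idea in this section can be sketched in a few lines of code. The sketch below is illustrative only and makes several assumptions that are not in the original text: each nest is an unordered pair $i < j$, the contribution of a nest for a binary assortment is $\rho_{ij} x_i x_j + \theta_i x_i (1 - x_j) + \theta_j x_j (1 - x_i)$, and the names `expected_value` and `derandomize` are hypothetical. For simplicity, it recomputes the full expectation at every step instead of updating it incrementally, so it does not attain the running time discussed in this section.

```python
def expected_value(p, rho, theta):
    """Expected objective when product i is offered independently with
    probability p[i]. Because the objective is multilinear in the offer
    decisions, the expectation is the objective evaluated at p."""
    n = len(p)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (rho[i][j] * p[i] * p[j]
                      + theta[i] * p[i] * (1.0 - p[j])
                      + theta[j] * p[j] * (1.0 - p[i]))
    return total

def derandomize(p, rho, theta):
    """Fix the products one at a time, keeping whichever of the two
    conditional expectations (offer or do not offer) is larger."""
    x = list(p)
    for k in range(len(x)):
        x[k] = 1.0
        value_if_offered = expected_value(x, rho, theta)
        x[k] = 0.0
        value_if_not_offered = expected_value(x, rho, theta)
        x[k] = 1.0 if value_if_offered >= value_if_not_offered else 0.0
    return [int(v) for v in x]

# Hypothetical data for three products.
theta = [3.0, 2.0, 1.0]
rho = [[0.0, 2.5, 1.5],
       [2.5, 0.0, 1.2],
       [1.5, 1.2, 0.0]]
p = [0.5, 0.5, 0.5]
x = derandomize(p, rho, theta)
print(x, expected_value([float(v) for v in x], rho, theta), expected_value(p, rho, theta))
```

Since the expectation at each step is a convex combination of the two conditional expectations, the larger of the two is never below it, so the final deterministic assortment is at least as good as the expected value of the random one.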
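To inductively construct the subset of products $\mathbf{x}^{(k)} = (\hat x_1,\ldots,\hat x_k, \hat X_{k+1},\ldots,\hat X_n)$ for all $k \in N$, we start with $\mathbf{x}^{(0)} = \hat{\mathbf{X}}$. By Theorem 3.4.1, we have $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(0)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(0)}) - \hat z)\} \ge 0.6\, f^R$. Assuming that we have a subset of products $\mathbf{x}^{(k)}$ that satisfies $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\} \ge 0.6\, f^R$, we show how to construct a subset of products $\mathbf{x}^{(k+1)}$ that satisfies $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k+1)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k+1)}) - \hat z)\} \ge 0.6\, f^R$. By the induction assumption, we have $0.6\, f^R \le \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\}$. Conditioning on $\hat X_{k+1}$, we write the last inequality as

$$
\begin{aligned}
0.6\, f^R \le\; & \mathbb{P}\{\hat X_{k+1} = 1\} \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z) \mid \hat X_{k+1} = 1\} \\
& + \mathbb{P}\{\hat X_{k+1} = 0\} \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z) \mid \hat X_{k+1} = 0\}.
\end{aligned}
$$

We define the two subsets of products as $\tilde{\mathbf{x}}^{(k)} = (\hat x_1,\ldots,\hat x_k, 1, \hat X_{k+2},\ldots,\hat X_n)$ and as $\bar{\mathbf{x}}^{(k)} = (\hat x_1,\ldots,\hat x_k, 0, \hat X_{k+2},\ldots,\hat X_n)$. By the definition of $\mathbf{x}^{(k)}$, given that $\hat X_{k+1} = 1$, we have $\mathbf{x}^{(k)} = \tilde{\mathbf{x}}^{(k)}$. Given that $\hat X_{k+1} = 0$, we have $\mathbf{x}^{(k)} = \bar{\mathbf{x}}^{(k)}$. So, we write the inequality above as

$$
\begin{aligned}
0.6\, f^R \le\; & \mathbb{P}\{\hat X_{k+1} = 1\} \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\tilde{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\tilde{\mathbf{x}}^{(k)}) - \hat z)\}
+ \mathbb{P}\{\hat X_{k+1} = 0\} \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\bar{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\bar{\mathbf{x}}^{(k)}) - \hat z)\} \\
\le\; & \max \Big\{ \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\tilde{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\tilde{\mathbf{x}}^{(k)}) - \hat z)\},\; \sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\bar{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\bar{\mathbf{x}}^{(k)}) - \hat z)\} \Big\}.
\end{aligned}
$$

Thus, either $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\tilde{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\tilde{\mathbf{x}}^{(k)}) - \hat z)\}$ or $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\bar{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\bar{\mathbf{x}}^{(k)}) - \hat z)\}$ is at least $0.6\, f^R$, indicating that we can use $\tilde{\mathbf{x}}^{(k)}$ or $\bar{\mathbf{x}}^{(k)}$ as $\mathbf{x}^{(k+1)}$. In both $\tilde{\mathbf{x}}^{(k)}$ and $\bar{\mathbf{x}}^{(k)}$, the first $k + 1$ products are deterministic and the last $n - k - 1$ products are random variables, as desired.

Considering the computational effort for the method of conditional expectations, we can compute $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(0)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(0)}) - \hat z)\}$ in $O(n^2)$ operations.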
The subset of products $\tilde{\mathbf{x}}^{(k)}$ differs from the subset of products $\mathbf{x}^{(k)}$ only in product $k + 1$, which implies that the quantity $\mathbb{E}\{V_{ij}(\tilde{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\tilde{\mathbf{x}}^{(k)}) - \hat z)\}$ differs from $\mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\}$ only for the nests that include product $k + 1$. There are $O(n)$ such nests. Therefore, if we know the value of $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\}$, then we can compute $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\tilde{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\tilde{\mathbf{x}}^{(k)}) - \hat z)\}$ in $O(n)$ operations. Similarly, if we know the value of $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\}$, then we can compute $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\bar{\mathbf{x}}^{(k)})^{\gamma_{ij}} (R_{ij}(\bar{\mathbf{x}}^{(k)}) - \hat z)\}$ in $O(n)$ operations. Therefore, given the subset of products $\mathbf{x}^{(k)}$ and the value of $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\mathbf{x}^{(k)})^{\gamma_{ij}} (R_{ij}(\mathbf{x}^{(k)}) - \hat z)\}$, we can construct the subset of products $\mathbf{x}^{(k+1)}$ in $O(n)$ operations. In the method of conditional expectations, we construct $O(n)$ subsets of products of the form $\mathbf{x}^{(k)} = (\hat x_1,\ldots,\hat x_k, \hat X_{k+1},\ldots,\hat X_n)$. Thus, the method of conditional expectations takes $O(n^2)$ operations.

B.5 Semidefinite Programming Relaxation

We describe an approximation algorithm for the uncapacitated problem that provides an $\alpha$-approximate solution with $\alpha = \frac{2}{\pi} \min_{\theta \in [0, \arccos(-1/3)]} (2\pi - 3\theta)/(1 + 3 \cos\theta) \ge 0.79$. Our development generally follows the one in the main text. We develop an upper bound $f^R(\cdot)$ on $f(\cdot)$. This upper bound is based on an SDP relaxation of the Function Evaluation problem. Next, we show how to obtain a random subset of products $\hat{\mathbf{X}}$ such that $\sum_{(i,j) \in M} \mathbb{E}\{V_{ij}(\hat{\mathbf{X}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{X}}) - \hat z)\} \ge 0.79\, f^R(\hat z)$, where $\hat z$ satisfies $f^R(\hat z) = v_0 \hat z$. We also show how to find the value of $\hat z$ that satisfies $f^R(\hat z) = v_0 \hat z$. Lastly, we discuss how to de-randomize the random subset of products $\hat{\mathbf{X}}$.

B.5.1 Constructing an Upper Bound

We build on an approximation algorithm for quadratic optimization problems given in [57].
Recall that we can represent $V_{ij}(\mathbf{x})^{\gamma_{ij}} (R_{ij}(\mathbf{x}) - z)$ in the objective function of the Function Evaluation problem by $\rho_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i (1 - x_j) + \theta_j(z)\, x_j (1 - x_i)$. Instead of using the decision variables $\mathbf{x} = (x_1,\ldots,x_n) \in \{0,1\}^n$ to capture the subset of offered products, we use $\mathbf{y} = (y_0, y_1,\ldots,y_n) \in \{-1,1\}^{n+1}$, where we have $y_0 y_i = 1$ if we offer product $i$, whereas $y_0 y_i = -1$ if we do not offer product $i$. In this case, the decision variable $x_i$ is captured by $(1 + y_0 y_i)/2$. Thus, the expression $\rho_{ij}(z)\, x_i x_j + \theta_i(z)\, x_i (1 - x_j) + \theta_j(z)\, x_j (1 - x_i)$ is equivalent to

$$
\begin{aligned}
& \rho_{ij}(z)\, \frac{1 + y_0 y_i}{2}\, \frac{1 + y_0 y_j}{2} + \theta_i(z)\, \frac{1 + y_0 y_i}{2}\, \frac{1 - y_0 y_j}{2} + \theta_j(z)\, \frac{1 - y_0 y_i}{2}\, \frac{1 + y_0 y_j}{2} \\
&\quad = \frac{\rho_{ij}(z)}{4} (1 + y_0 y_i + y_0 y_j + y_i y_j) + \frac{\theta_i(z)}{4} (1 + y_0 y_i - y_0 y_j - y_i y_j) + \frac{\theta_j(z)}{4} (1 - y_0 y_i + y_0 y_j - y_i y_j),
\end{aligned}
$$

where we use the fact that $y_0^2 = 1$. We define the function $q(y_0, y_i, y_j) = 1 + y_0 y_i + y_0 y_j + y_i y_j$ so that the expression above can be written as $\rho_{ij}(z)\, q(y_0, y_i, y_j)/4 + \theta_i(z)\, q(y_0, y_i, -y_j)/4 + \theta_j(z)\, q(y_0, -y_i, y_j)/4$. In this case, if there is no capacity constraint, then the Function Evaluation problem is equivalent to

$$
f(z) = \max_{\mathbf{y} \in \{-1,1\}^{n+1}:\; y_i = -y_0 \;\forall i \in N \setminus N(z)} \left\{ \frac{1}{4} \sum_{(i,j) \in M} \Big( \rho_{ij}(z)\, q(y_0, y_i, y_j) + \theta_i(z)\, q(y_0, y_i, -y_j) + \theta_j(z)\, q(y_0, -y_i, y_j) \Big) \right\}, \tag{B.14}
$$

where the constraint $y_i = -y_0$ for all $i \in N \setminus N(z)$ follows from the fact that we can use an argument similar to the one in the proof of Lemma B.2.1 to show that if $i \notin N(z)$, then there exists an optimal solution to the Function Evaluation problem that does not offer product $i$.

To construct an upper bound on $f(\cdot)$, letting $\mathbf{a} \cdot \mathbf{b}$ denote the scalar product of the two vectors $\mathbf{a}$ and $\mathbf{b}$, for $(\mathbf{u}, \mathbf{v}, \mathbf{w}) \in \mathbb{R}^{n+1} \times \mathbb{R}^{n+1} \times \mathbb{R}^{n+1}$, we define the function $p : \mathbb{R}^{n+1} \times \mathbb{R}^{n+1} \times \mathbb{R}^{n+1} \to \mathbb{R}$ as $p(\mathbf{u}, \mathbf{v}, \mathbf{w}) = 1 + \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$.
In this case, using the decision variables $v=(v^0,v^1,\ldots,v^n)$ with $v^i\in\mathbb{R}^{n+1}$ for all $i=0,1,\ldots,n$, we define $f^R(z)$ as

$$\begin{aligned}
f^R(z)=\max\quad&\frac14\sum_{(i,j)\in M}\Big(\rho_{ij}(z)\,p(v^0,v^i,v^j)+\theta_i(z)\,p(v^0,v^i,-v^j)+\theta_j(z)\,p(v^0,-v^i,v^j)\Big)\\
\text{s.t.}\quad& v^i\cdot v^i=1\quad\forall\, i\in N\cup\{0\},\\
& v^i=-v^0\quad\forall\, i\in N\setminus N(z),\\
& p(v^0,v^i,v^j)\ge 0\quad\forall\,(i,j)\in M,\\
& p(v^0,v^i,-v^j)\ge 0\quad\forall\,(i,j)\in M,\\
& p(v^0,-v^i,v^j)\ge 0\quad\forall\,(i,j)\in M.
\end{aligned}\tag{B.15}$$

Using a feasible solution $y\in\{-1,1\}^{n+1}$ to problem (B.14), we can come up with a feasible solution $v=(v^0,v^1,\ldots,v^n)$ to problem (B.15) such that the two solutions provide the same objective values. In particular, we can set $v^i_k=y_i/\sqrt{n+1}$ for all $i,k\in N\cup\{0\}$. Thus, we have $f^R(z)\ge f(z)$.

Next, we formulate problem (B.15) as an SDP. We define the $(n+1)$-by-$(n+1)$ symmetric matrix $\Lambda(z)=\{\Lambda_{ij}(z):(i,j)\in(N\cup\{0\})\times(N\cup\{0\})\}$ as

$$\Lambda_{ij}(z)=\begin{cases}
0 & \text{if } i=j,\\[2pt]
\frac14\big(\rho_{ij}(z)-\theta_i(z)-\theta_j(z)\big) & \text{if } (i,j)\in N^2 \text{ and } i<j,\\[2pt]
\sum_{k\in N\setminus\{j\}}\frac14\big(\rho_{kj}(z)+\theta_j(z)-\theta_k(z)\big) & \text{if } i=0 \text{ and } j\in N.
\end{cases}$$

Since $\Lambda(z)$ is symmetric, we give only the entries that are above the diagonal. We use $\mathbb{S}^{n+1}_+$ to denote the set of $(n+1)$-by-$(n+1)$ symmetric positive semidefinite matrices. In this case, using the decision variables $X=\{X_{ij}:(i,j)\in(N\cup\{0\})\times(N\cup\{0\})\}\in\mathbb{R}^{(n+1)\times(n+1)}$, we can equivalently formulate problem (B.15) as the SDP given by

$$\begin{aligned}
f^R(z)=\max_{X\in\mathbb{S}^{n+1}_+}\Big\{&\mathrm{tr}(\Lambda(z)X)+\frac14\sum_{(i,j)\in M}\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)\ :\\
& X_{ii}=1\quad\forall\, i\in N\cup\{0\},\\
& X_{i0}=X_{0i}=-1\quad\forall\, i\in N\setminus N(z),\\
& X_{0i}+X_{0j}+X_{ij}\ge -1\quad\forall\,(i,j)\in N^2 \text{ with } i<j,\\
& X_{0i}-X_{0j}-X_{ij}\ge -1\quad\forall\,(i,j)\in N^2 \text{ with } i<j,\\
& -X_{0i}+X_{0j}-X_{ij}\ge -1\quad\forall\,(i,j)\in N^2 \text{ with } i<j,\\
& X_{i0}+X_{j0}+X_{ij}\ge -1\quad\forall\,(i,j)\in N^2 \text{ with } i>j,\\
& X_{i0}-X_{j0}-X_{ij}\ge -1\quad\forall\,(i,j)\in N^2 \text{ with } i>j,\\
& -X_{i0}+X_{j0}-X_{ij}\ge -1\quad\forall\,(i,j)\in N^2 \text{ with } i>j\ \Big\}.
\end{aligned}\tag{B.16}$$

Problem (B.16) is useful to demonstrate that we can compute the upper bound $f^R(z)$ at any point $z$ by solving an SDP, but to show the performance guarantee for the approximation algorithm we propose, we primarily work with problem (B.15).
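The change of variables $x_i=(1+y_0y_i)/2$ and the embedding $v^i_k=y_i/\sqrt{n+1}$ used to argue $f^R(z)\ge f(z)$ can be checked by brute force on a small instance. The following is only an illustrative sketch: the helper names (`q`, `p`, `embed`, `check_encoding`) are assumptions introduced here, not part of the original development.

```python
import itertools
import math


def q(y0, yi, yj):
    """q(y0, yi, yj) = 1 + y0*yi + y0*yj + yi*yj for scalars in {-1, 1}."""
    return 1 + y0 * yi + y0 * yj + yi * yj


def dot(a, b):
    """Scalar product of two equal-length vectors."""
    return sum(s * t for s, t in zip(a, b))


def p(u, v, w):
    """Vector analogue of q: p(u, v, w) = 1 + u.v + u.w + v.w."""
    return 1 + dot(u, v) + dot(u, w) + dot(v, w)


def neg(v):
    """Return the vector -v."""
    return [-t for t in v]


def embed(y):
    """Map y in {-1,1}^(n+1) to n+1 vectors whose k-th component is y_i / sqrt(n+1)."""
    m = len(y)
    return [[yi / math.sqrt(m)] * m for yi in y]


def check_encoding(n):
    """Exhaustively verify the identities for every y in {-1,1}^(n+1)."""
    for y in itertools.product((-1, 1), repeat=n + 1):
        x = [(1 + y[0] * yi) // 2 for yi in y]  # x[i] = 1 iff product i is offered
        v = embed(y)
        for i, j in itertools.permutations(range(1, n + 1), 2):
            # q reproduces the three offer patterns, scaled by 4
            assert q(y[0], y[i], y[j]) == 4 * x[i] * x[j]
            assert q(y[0], y[i], -y[j]) == 4 * x[i] * (1 - x[j])
            assert q(y[0], -y[i], y[j]) == 4 * (1 - x[i]) * x[j]
            # the embedded vectors have unit norm and give p equal to q
            assert abs(dot(v[i], v[i]) - 1.0) < 1e-12
            assert abs(p(v[0], v[i], v[j]) - q(y[0], y[i], y[j])) < 1e-12
            assert abs(p(v[0], v[i], neg(v[j])) - q(y[0], y[i], -y[j])) < 1e-12
            assert abs(p(v[0], neg(v[i]), v[j]) - q(y[0], -y[i], y[j])) < 1e-12
    return True
```

Because $p$ agrees with $q$ on the embedded vectors, every feasible solution to problem (B.14) maps to a feasible solution to problem (B.15) with the same objective value, which is exactly the relaxation argument.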
Later in our discussion, we use the dual of problem (B.16) to find the value of $\hat{z}$ that satisfies $f^R(\hat{z})=v_0\hat{z}$.

B.5.2 Randomized Rounding and Performance Guarantee

Fix any $z\in\mathbb{R}_+$. To obtain a random assortment $\hat{X}$ such that $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\hat{X})^{\gamma_{ij}}(R_{ij}(\hat{X})-z)\}\ge 0.79\,f^R(z)$, we study the following randomized rounding algorithm. Using $\|\cdot\|$ to denote the Euclidean norm, the inputs of the algorithm are $v=(v^0,v^1,\ldots,v^n)$ and $u=(u_0,u_1,\ldots,u_n)\in\mathbb{R}^{n+1}$, where we have $v^i\in\mathbb{R}^{n+1}$ and $\|v^i\|=1$ for all $i=0,1,\ldots,n$.

Randomized Rounding
Step 1: If $v^0\cdot u\ge 0$, then set $y_0=1$. Otherwise, set $y_0=-1$.
Step 2: For all $i\in N\setminus N(z)$, set $y_i=-y_0$.
Step 3: For all $i\in N(z)$, if $v^i\cdot u\ge 0$, then set $y_i=1$; otherwise, set $y_i=-1$.
Step 4: Let $X=(X_1,\ldots,X_n)\in\{0,1\}^n$ be such that $X_i=1$ if $y_0y_i=1$; otherwise, $X_i=0$.

As a function of its input $(v,u)$, we let $X^{RR}(v,u)$ be the output of the randomized rounding algorithm. To get a random subset of products $\hat{X}$ satisfying $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\hat{X})^{\gamma_{ij}}(R_{ij}(\hat{X})-z)\}\ge 0.79\,f^R(z)$, we will use the input $(\hat{v},\hat{u})$, where $\hat{v}$ is an optimal solution to problem (B.15) and the components of $\hat{u}$ are independent and have the standard normal distribution.

Considering the input $(v,u)$ for the randomized rounding algorithm, we will write $(v,u)\in\mathcal{I}$ if and only if for each $i\in\{0,1,\ldots,n\}$, there exists some $k\in\{0,1,\ldots,n\}$ such that $v^i_k\neq 0$, $u_k$ has a normal distribution with non-zero variance and $u_k$ is independent of $\{u_j:j\in(N\cup\{0\})\setminus\{k\}\}$. Observe that if we have $(v,u)\in\mathcal{I}$, then for each $i\in\{0,1,\ldots,n\}$, there exists some $k\in\{0,1,\ldots,n\}$ such that $v^i_ku_k$ has a normal distribution with non-zero variance and $v^i_ku_k$ is independent of $\{v^i_ju_j:j\in(N\cup\{0\})\setminus\{k\}\}$, in which case, it follows that $v^i\cdot u=\sum_{j\in N\cup\{0\}}v^i_ju_j$ is non-zero with probability one. Letting $\hat{v}$ be an optimal solution to problem (B.15), by the first constraint in this problem, we have $\|\hat{v}^i\|=1$ for each $i=0,1,\ldots,n$.
Therefore, for each $i\in\{0,1,\ldots,n\}$, there exists some $k\in\{0,1,\ldots,n\}$ such that $\hat{v}^i_k\neq 0$. In this case, letting $\hat{u}$ be a vector with all components being independent and having the standard normal distribution, we have $(\hat{v},\hat{u})\in\mathcal{I}$. In this section, we show that if we execute the randomized rounding algorithm with the input $(\hat{v},\hat{u})$, then its output $X^{RR}(\hat{v},\hat{u})$ satisfies $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(\hat{v},\hat{u}))^{\gamma_{ij}}(R_{ij}(X^{RR}(\hat{v},\hat{u}))-z)\}\ge 0.79\,f^R(z)$. We use the following two lemmas from [57].

Lemma B.5.1 For all $y\in[-1,1]$, we have $\frac{1}{\pi}\arccos(y)\ge\chi\,(1-y)/2$ for some fixed $\chi\in[0.87,\infty)$.

The lemma above is from Lemmas 3.4 and 3.5 in [57]. For any $(v,u)\in\mathcal{I}$, we define $S_u(v^i,v^j)$ as $S_u(v^i,v^j)=\mathbb{P}\{\mathrm{sign}(v^i\cdot u)=\mathrm{sign}(v^j\cdot u)\}$, where $\mathrm{sign}(x)=1$ if $x>0$, whereas $\mathrm{sign}(x)=-1$ if $x<0$. Since $(v,u)\in\mathcal{I}$, for all $i=0,1,\ldots,n$, $v^i\cdot u$ is non-zero with probability one. Therefore, we do not specify $\mathrm{sign}(x)$ for $x=0$. For $(v,u)\in\mathcal{I}$, an elementary computation in probability yields the identity

$$\mathbb{P}\{\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(v^i\cdot u)=\mathrm{sign}(v^j\cdot u)\}=\frac12\Big[S_u(v^0,v^i)+S_u(v^0,v^j)+S_u(v^i,v^j)-1\Big].\tag{B.17}$$

[57] show this identity in the proof of Lemma 7.3.1 in their paper. In the next lemma, letting $p(\cdot,\cdot,\cdot)$ be as defined right before problem (B.15), we give a lower bound on the probability above when the components of the vector $u$ are standard normal.

Lemma B.5.2 Assume that the components of the vector $u$ are independent and have the standard normal distribution and $\|v^0\|=1$, $\|v^i\|=1$ and $\|v^j\|=1$. For some fixed $\alpha\in[0.79,0.87]$, we have

$$\frac12\Big[S_u(v^0,v^i)+S_u(v^0,v^j)+S_u(v^i,v^j)-1\Big]=1-\frac{1}{2\pi}\Big(\arccos(v^0\cdot v^i)+\arccos(v^0\cdot v^j)+\arccos(v^i\cdot v^j)\Big)\ge\frac{\alpha}{4}\,p(v^0,v^i,v^j).$$

The lemma above is from Lemmas 7.3.1 and 7.3.2 in [57]. In the next lemma, we give an equivalent expression for $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(v,u))^{\gamma_{ij}}(R_{ij}(X^{RR}(v,u))-z)\}$.
Lemma B.5.3 For any input $(v,u)\in\mathcal{I}$ of the randomized rounding algorithm, the output of the algorithm $X^{RR}(v,u)$ satisfies

$$\begin{aligned}
&\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(v,u))^{\gamma_{ij}}(R_{ij}(X^{RR}(v,u))-z)\}\\
&\quad=\frac12\sum_{(i,j)\in M(z)}\Big\{\rho_{ij}(z)\big[S_u(v^0,v^i)+S_u(v^0,v^j)+S_u(v^i,v^j)-1\big]\\
&\qquad\qquad+\theta_i(z)\big[S_u(v^0,v^i)+S_u(v^0,-v^j)+S_u(v^i,-v^j)-1\big]\\
&\qquad\qquad+\theta_j(z)\big[S_u(v^0,-v^i)+S_u(v^0,v^j)+S_u(-v^i,v^j)-1\big]\Big\}\\
&\qquad+2\,|N\setminus N(z)|\sum_{i\in N(z)}\theta_i(z)\,S_u(v^0,v^i).
\end{aligned}\tag{B.18}$$

Proof: Fixing the input $(v,u)$, for notational brevity, we use $\tilde{X}$ to denote the output of the randomized rounding algorithm for the fixed input. By the discussion at the beginning of Section 3.3.2, we have $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\tilde{X})^{\gamma_{ij}}(R_{ij}(\tilde{X})-z)\}=\sum_{(i,j)\in M}\big(\rho_{ij}(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=1\}+\theta_i(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}+\theta_j(z)\,\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\}\big)$. We consider four cases.

Case 1: Suppose $i\in N(z)$ and $j\in N(z)$ with $i\neq j$. By Steps 3 and 4 of the randomized rounding algorithm, to have $\tilde{X}_i=1$, we need to have $y_0y_i=1$, which, in turn, requires that we have $\mathrm{sign}(y_0)=\mathrm{sign}(y_i)$. The last equality holds if and only if $\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(v^i\cdot u)$. In this case, by (B.17), it follows that $\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=1\}=\mathbb{P}\{\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(v^i\cdot u)=\mathrm{sign}(v^j\cdot u)\}=\frac12\big(S_u(v^0,v^i)+S_u(v^0,v^j)+S_u(v^i,v^j)-1\big)$. Similarly, to have $\tilde{X}_j=0$, we need to have $y_0y_j=-1$, which, in turn, requires that we have $\mathrm{sign}(y_0)=-\mathrm{sign}(y_j)$. The last equality holds if and only if $\mathrm{sign}(v^0\cdot u)=-\mathrm{sign}(v^j\cdot u)$, which we can equivalently write as $\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(-v^j\cdot u)$. Thus, by (B.17), we get $\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}=\mathbb{P}\{\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(v^i\cdot u)=\mathrm{sign}(-v^j\cdot u)\}=\frac12\big(S_u(v^0,v^i)+S_u(v^0,-v^j)+S_u(v^i,-v^j)-1\big)$. Interchanging the roles of $\tilde{X}_i$ and $\tilde{X}_j$ in the last chain of equalities, we also have $\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\}=\mathbb{P}\{\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(-v^i\cdot u)=\mathrm{sign}(v^j\cdot u)\}=\frac12\big(S_u(v^0,-v^i)+S_u(v^0,v^j)+S_u(-v^i,v^j)-1\big)$.

Case 2: Suppose $i\in N(z)$ and $j\notin N(z)$.
By Steps 2 and 4 of the randomized rounding algorithm, we have $\tilde{X}_j=0$. By an argument similar to the one in Case 1, we also have $\mathbb{P}\{\tilde{X}_i=1\}=\mathbb{P}\{\mathrm{sign}(v^0\cdot u)=\mathrm{sign}(v^i\cdot u)\}=S_u(v^0,v^i)$. So, we get $\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}=S_u(v^0,v^i)$.

Case 3: Suppose $i\notin N(z)$ and $j\in N(z)$. By the same argument as in Case 2, we have $\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\}=S_u(v^0,v^j)$.

Case 4: Suppose $i\notin N(z)$ and $j\notin N(z)$ with $i\neq j$. In this case, we have $\tilde{X}_i=0$ and $\tilde{X}_j=0$.

Putting all of the cases together, under Case 1, if $\tilde{X}_i$ or $\tilde{X}_j$ is non-zero, then we may have $\tilde{X}_i=1$, $\tilde{X}_j=1$, or $\tilde{X}_i=1$, $\tilde{X}_j=0$, or $\tilde{X}_i=0$, $\tilde{X}_j=1$. Under Case 2, if $\tilde{X}_i$ or $\tilde{X}_j$ is non-zero, then we must have $\tilde{X}_i=1$ and $\tilde{X}_j=0$. Under Case 3, if $\tilde{X}_i$ or $\tilde{X}_j$ is non-zero, then we must have $\tilde{X}_i=0$ and $\tilde{X}_j=1$. Collecting these observations, we obtain

$$\begin{aligned}
&\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\tilde{X})^{\gamma_{ij}}(R_{ij}(\tilde{X})-z)\}\\
&\quad=\sum_{(i,j)\in M}\Big\{\rho_{ij}(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=1\}+\theta_i(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}+\theta_j(z)\,\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\}\Big\}\\
&\quad=\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\in N(z))\Big\{\rho_{ij}(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=1\}+\theta_i(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}+\theta_j(z)\,\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\}\Big\}\\
&\qquad+\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\notin N(z))\,\theta_i(z)\,\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}\\
&\qquad+\sum_{(i,j)\in M}\mathbf{1}(i\notin N(z),\,j\in N(z))\,\theta_j(z)\,\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\},
\end{aligned}$$

in which case, plugging in the expressions for the probabilities $\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=1\}$, $\mathbb{P}\{\tilde{X}_i=1,\tilde{X}_j=0\}$ and $\mathbb{P}\{\tilde{X}_i=0,\tilde{X}_j=1\}$ that we have under Cases 1, 2 and 3 above yields the desired result.

In the next theorem, we give a performance guarantee for the subset of products obtained by the randomized rounding algorithm. Throughout our discussion, $\alpha$ is as given in Lemma B.5.2.
Theorem B.5.4 For a fixed value of $z\in\mathbb{R}_+$, let the subset of products $X^{RR}(\hat{v},\hat{u})$ be the output of the randomized rounding algorithm with the input $(\hat{v},\hat{u})$, where we have $\hat{v}=(\hat{v}^0,\hat{v}^1,\ldots,\hat{v}^n)$ with $\hat{v}^i\in\mathbb{R}^{n+1}$ and $\|\hat{v}^i\|=1$ for all $i=0,1,\ldots,n$ and the components of the vector $\hat{u}$ are independent and have the standard normal distribution. If $\hat{v}^i=-\hat{v}^0$ for all $i\in N\setminus N(z)$, then we have

$$\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(\hat{v},\hat{u}))^{\gamma_{ij}}(R_{ij}(X^{RR}(\hat{v},\hat{u}))-z)\}\ge\frac{\alpha}{4}\sum_{(i,j)\in M}\Big\{\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big\}.$$

In particular, if we choose $\hat{v}$ in the input $(\hat{v},\hat{u})$ as an optimal solution to problem (B.15), then we have $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(\hat{v},\hat{u}))^{\gamma_{ij}}(R_{ij}(X^{RR}(\hat{v},\hat{u}))-z)\}\ge 0.79\,f^R(z)$.

Proof: For notational brevity, we use $\hat{X}$ to denote the output of the randomized rounding algorithm with the input $(\hat{v},\hat{u})$. Note that $(\hat{v},\hat{u})\in\mathcal{I}$. We consider four cases.

Case 1: Suppose $i\in N(z)$ and $j\in N(z)$ with $i\neq j$. By Lemma B.5.2, we have the inequality $\frac12\big(S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,\hat{v}^j)-1\big)\ge\frac{\alpha}{4}\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)$, in which case, since $\rho_{ij}(z)\ge 0$ whenever $i\in N(z)$ and $j\in N(z)$, we obtain $\frac12\,\rho_{ij}(z)\big(S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,\hat{v}^j)-1\big)\ge\frac{\alpha}{4}\,\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)$. We also get $\frac12\,\theta_i(z)\big(S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,-\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,-\hat{v}^j)-1\big)\ge\frac{\alpha}{4}\,\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)$ and $\frac12\,\theta_j(z)\big(S_{\hat{u}}(\hat{v}^0,-\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(-\hat{v}^i,\hat{v}^j)-1\big)\ge\frac{\alpha}{4}\,\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)$ by following the same reasoning.
Adding the last three inequalities, we have

$$\begin{aligned}
&\frac12\Big\{\rho_{ij}(z)\big[S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,\hat{v}^j)-1\big]+\theta_i(z)\big[S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,-\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,-\hat{v}^j)-1\big]\\
&\qquad+\theta_j(z)\big[S_{\hat{u}}(\hat{v}^0,-\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(-\hat{v}^i,\hat{v}^j)-1\big]\Big\}\\
&\quad\ge\frac{\alpha}{4}\Big\{\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big\}.
\end{aligned}\tag{B.19}$$

Case 2: Suppose $i\in N(z)$ and $j\notin N(z)$. By Lemma B.5.1, for all $y\in[-1,1]$, we have $1-\frac{1}{\pi}\arccos(y)=\frac{1}{\pi}(\pi-\arccos(y))=\arccos(-y)/\pi\ge\chi\,(1+y)/2$. The definition of $S_u(v^i,v^j)$ implies that $S_{\hat{u}}(\hat{v}^i,\hat{v}^i)=1$. In this case, by Lemma B.5.2, we obtain

$$\begin{aligned}
S_{\hat{u}}(\hat{v}^0,\hat{v}^i)&=\frac12\Big[S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^i,\hat{v}^i)-1\Big]\\
&=1-\frac{1}{2\pi}\Big(2\arccos(\hat{v}^0\cdot\hat{v}^i)+\arccos(\hat{v}^i\cdot\hat{v}^i)\Big)\\
&=1-\frac{1}{\pi}\arccos(\hat{v}^0\cdot\hat{v}^i)\ \ge\ \frac{\chi}{2}\,(1+\hat{v}^0\cdot\hat{v}^i),
\end{aligned}$$

where the third equality uses the fact that $\|\hat{v}^i\|=1$. On the other hand, since $j\in N\setminus N(z)$, we have $\hat{v}^j=-\hat{v}^0$. Therefore, we obtain $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,\hat{v}^i,-\hat{v}^0)=1-\hat{v}^0\cdot\hat{v}^0=0$, where the second equality uses the definition of $p(\cdot,\cdot,\cdot)$ and the last equality uses the fact that $\|\hat{v}^0\|=1$. By the same argument, we have $p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)=0$. Also, we have $p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)=p(\hat{v}^0,\hat{v}^i,\hat{v}^0)=2+2\,\hat{v}^0\cdot\hat{v}^i\ge 0$, where the inequality uses the fact that $\|\hat{v}^0\|=\|\hat{v}^i\|=1$ so that $\hat{v}^0\cdot\hat{v}^i\ge -1$. Since $i\in N(z)$, we have $\theta_i(z)\ge 0$. In this case, multiplying the chain of inequalities above by $\theta_i(z)$, we get

$$\begin{aligned}
\theta_i(z)\,S_{\hat{u}}(\hat{v}^0,\hat{v}^i)&\ge\theta_i(z)\,\frac{\chi}{2}\,(1+\hat{v}^0\cdot\hat{v}^i)\ \ge\ \theta_i(z)\,\frac{\alpha}{2}\,(1+\hat{v}^0\cdot\hat{v}^i)\\
&=\frac{\alpha}{4}\Big\{\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big\},
\end{aligned}\tag{B.20}$$

where the second inequality holds since $\chi\in[0.87,\infty)$, $\alpha\in[0.79,0.87]$ and $\hat{v}^0\cdot\hat{v}^i\ge -1$, whereas the equality uses the fact that $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=0=p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)$ and $p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)=2+2\,\hat{v}^0\cdot\hat{v}^i$.

Case 3: Suppose $i\notin N(z)$ and $j\in N(z)$.
Interchanging the roles of products $i$ and $j$ in Case 2, the same reasoning as in Case 2 yields

$$\theta_j(z)\,S_{\hat{u}}(\hat{v}^0,\hat{v}^j)\ge\frac{\alpha}{4}\Big\{\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big\}.\tag{B.21}$$

Case 4: Suppose $i\notin N(z)$ and $j\notin N(z)$ with $i\neq j$. Since $i\notin N(z)$ and $j\notin N(z)$, we have $\hat{v}^i=\hat{v}^j=-\hat{v}^0$. In this case, we obtain $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,-\hat{v}^0,-\hat{v}^0)=1-\hat{v}^0\cdot\hat{v}^0=0$, where the second equality uses the definition of $p(\cdot,\cdot,\cdot)$ and the third equality uses the fact that $\|\hat{v}^0\|=1$. Similarly, we have $p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)=0$ and $p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)=0$ as well. Therefore, it follows that $\frac{\alpha}{4}\big(\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\big)=0$.

To put the four cases considered above together, recalling that we use $\hat{X}$ to denote the output of the randomized rounding algorithm with the input $(\hat{v},\hat{u})$, by Lemma B.5.3, we have

$$\begin{aligned}
&\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\hat{X})^{\gamma_{ij}}(R_{ij}(\hat{X})-z)\}\\
&\quad=\frac12\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\in N(z))\Big\{\rho_{ij}(z)\big[S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,\hat{v}^j)-1\big]\\
&\qquad\qquad+\theta_i(z)\big[S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,-\hat{v}^j)+S_{\hat{u}}(\hat{v}^i,-\hat{v}^j)-1\big]\\
&\qquad\qquad+\theta_j(z)\big[S_{\hat{u}}(\hat{v}^0,-\hat{v}^i)+S_{\hat{u}}(\hat{v}^0,\hat{v}^j)+S_{\hat{u}}(-\hat{v}^i,\hat{v}^j)-1\big]\Big\}\\
&\qquad+\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\notin N(z))\,\theta_i(z)\,S_{\hat{u}}(\hat{v}^0,\hat{v}^i)+\sum_{(i,j)\in M}\mathbf{1}(i\notin N(z),\,j\in N(z))\,\theta_j(z)\,S_{\hat{u}}(\hat{v}^0,\hat{v}^j)\\
&\quad\ge\frac{\alpha}{4}\sum_{(i,j)\in M}\Big\{\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big\},
\end{aligned}$$

where the inequality follows from (B.19), (B.20) and (B.21), along with the fact that $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)=p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)=0$ when $i\notin N(z)$ and $j\notin N(z)$. The chain of inequalities above establishes the first inequality in the theorem.
To see the second inequality in the theorem, choosing $\hat{v}$ as an optimal solution to problem (B.15) in the chain of inequalities above and noting the objective function of problem (B.15), we obtain $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\hat{X})^{\gamma_{ij}}(R_{ij}(\hat{X})-z)\}\ge\alpha\,f^R(z)$. The optimal objective value of problem (B.15) is non-negative since setting $\hat{v}^i=-\hat{v}^0$ for all $i\in N$ provides a feasible solution to this problem with an objective value of zero. Therefore, noting that $f^R(z)\ge 0$ and $\alpha\ge 0.79$, the last inequality yields $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(\hat{X})^{\gamma_{ij}}(R_{ij}(\hat{X})-z)\}\ge 0.79\,f^R(z)$.

Thus, by Theorem B.5.4, letting $\hat{v}$ be an optimal solution to problem (B.15) and $\hat{u}$ be a vector whose components are independent and have the standard normal distribution, if we use the randomized rounding algorithm with the input $(\hat{v},\hat{u})$, then the output of the algorithm $X^{RR}(\hat{v},\hat{u})$ satisfies $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(\hat{v},\hat{u}))^{\gamma_{ij}}(R_{ij}(X^{RR}(\hat{v},\hat{u}))-z)\}\ge 0.79\,f^R(z)$. The vector $\hat{u}$ is a random variable, so the subset of products $X^{RR}(\hat{v},\hat{u})$ is a random variable as well, but to use Theorem 3.3.1 to get an approximate solution, we need a deterministic subset of products $\hat{x}$ that satisfies the Sufficient Condition. Since we construct the upper bound $f^R(\cdot)$ by using an SDP relaxation, the method of conditional expectations discussed in Section 3.4 does not work. Nevertheless, [96] give a procedure to de-randomize the solutions that are obtained through SDP relaxations. We shortly adopt their de-randomization procedure to de-randomize the subset of products $X^{RR}(\hat{v},\hat{u})$. This de-randomization procedure is rather involved. As an alternative, we can simply simulate many realizations of the random variable $X^{RR}(\hat{v},\hat{u})$. In particular, since we know the distribution of $\hat{u}$, we can simulate many realizations of the random variable $\hat{u}$ and compute $X^{RR}(\hat{v},\hat{u})$ for each realization. Therefore, simulating many realizations of $X^{RR}(\hat{v},\hat{u})$ is straightforward.
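The simulation alternative described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the dissertation: the unit vectors `v` are taken as given (for example, from an SDP solver, which is not shown), `in_Nz` is a hypothetical set holding the indices of $N(z)$, and `value` is a hypothetical callable that returns $\sum_{(i,j)\in M}V_{ij}(x)^{\gamma_{ij}}(R_{ij}(x)-z)$ for a 0/1 list `x`.

```python
import random


def dot(a, b):
    """Scalar product of two equal-length vectors."""
    return sum(s * t for s, t in zip(a, b))


def randomized_rounding(v, u, in_Nz):
    """Steps 1-4 of the randomized rounding algorithm.

    v     : list of n+1 vectors; v[0] is the reference vector
    u     : vector with len(v[0]) components
    in_Nz : set of product indices (1..n) that belong to N(z)
    Returns the list [X_1, ..., X_n] of 0/1 offer decisions.
    """
    n = len(v) - 1
    y0 = 1 if dot(v[0], u) >= 0 else -1              # Step 1
    x = []
    for i in range(1, n + 1):
        if i not in in_Nz:
            yi = -y0                                  # Step 2: never offered
        else:
            yi = 1 if dot(v[i], u) >= 0 else -1       # Step 3
        x.append(1 if y0 * yi == 1 else 0)            # Step 4
    return x


def best_of_samples(v, in_Nz, value, num_samples=1000, seed=0):
    """Draw standard normal vectors u, round each, and keep the best subset."""
    rng = random.Random(seed)
    best_x, best_val = None, float("-inf")
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(len(v[0]))]
        x = randomized_rounding(v, u, in_Nz)
        val = value(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

Dividing the best value found by $f^R(z)$ gives the approximation ratio actually achieved by that realization, which may exceed 0.79.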
Since we have $\sum_{(i,j)\in M}\mathbb{E}\{V_{ij}(X^{RR}(\hat{v},\hat{u}))^{\gamma_{ij}}(R_{ij}(X^{RR}(\hat{v},\hat{u}))-z)\}\ge 0.79\,f^R(z)$, there must be realizations $\hat{x}$ of the random variable $X^{RR}(\hat{v},\hat{u})$ with strictly positive probability that satisfy $\sum_{(i,j)\in M}V_{ij}(\hat{x})^{\gamma_{ij}}(R_{ij}(\hat{x})-z)\ge 0.79\,f^R(z)$. Also, since we know the value of $f^R(z)$, if we find a realization $\hat{x}$ that satisfies $\sum_{(i,j)\in M}V_{ij}(\hat{x})^{\gamma_{ij}}(R_{ij}(\hat{x})-z)\ge\alpha\,f^R(z)$ for some $\alpha$ other than 0.79, then we can be sure that this subset is an $\alpha$-approximate solution. Therefore, it is entirely possible that simulating many realizations of the random variable $X^{RR}(\hat{v},\hat{u})$ may provide a deterministic subset of products $\hat{x}$ that satisfies $\sum_{(i,j)\in M}V_{ij}(\hat{x})^{\gamma_{ij}}(R_{ij}(\hat{x})-z)\ge\alpha\,f^R(z)$ for a value of $\alpha$ that is larger than 0.79. Furthermore, since we know the values of both $\sum_{(i,j)\in M}V_{ij}(\hat{x})^{\gamma_{ij}}(R_{ij}(\hat{x})-z)$ and $f^R(z)$, we can compute the value of $\alpha$. In the next section, we show how to compute the fixed point of $f^R(\cdot)/v_0$ by solving an SDP. After this discussion, we show how to de-randomize the output $X^{RR}(\hat{v},\hat{u})$ of the randomized rounding algorithm, if desired.

B.5.3 Computing the Fixed Point

In Section 3.3.3, we use the dual of the Upper Bound problem to find the value of $\hat{z}$ satisfying $f^R(\hat{z})=v_0\hat{z}$, where $f^R(z)$ is given by the optimal objective value of the Upper Bound problem. In this section, we use the dual of problem (B.16) to find the value of $\hat{z}$ satisfying $f^R(\hat{z})=v_0\hat{z}$, where $f^R(z)$ is given by the optimal objective value of the SDP in (B.16). To formulate the dual of problem (B.16), we let $\beta=\{\beta_i:i\in N\cup\{0\}\}$ be the dual variables associated with the first constraint in problem (B.16). Writing the second and third constraints as $X_{i0}=-1$ and $X_{0i}=-1$, we let $\{\psi_{i0}:i\in N\setminus N(z)\}$ and $\{\psi_{0i}:i\in N\setminus N(z)\}$ be the dual variables associated with the second and third constraints in problem (B.16).
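Also, we let $\{\gamma^1_{ij}:(i,j)\in N^2\text{ with }i<j\}$, $\{\gamma^2_{ij}:(i,j)\in N^2\text{ with }i<j\}$ and $\{\gamma^3_{ij}:(i,j)\in N^2\text{ with }i<j\}$ be the dual variables associated with the fourth, fifth and sixth constraints. Similarly, we let $\{\gamma^1_{ij}:(i,j)\in N^2\text{ with }i>j\}$, $\{\gamma^2_{ij}:(i,j)\in N^2\text{ with }i>j\}$ and $\{\gamma^3_{ij}:(i,j)\in N^2\text{ with }i>j\}$ be the dual variables associated with the last three constraints. We define the $(n+1)$-by-$(n+1)$ symmetric matrix of decision variables $\Gamma=\{\Gamma_{ij}:(i,j)\in(N\cup\{0\})\times(N\cup\{0\})\}$ as

$$\Gamma_{ij}=\begin{cases}
0 & \text{if } i=j,\\[2pt]
\gamma^1_{ij}-\gamma^2_{ij}-\gamma^3_{ij} & \text{if } (i,j)\in N^2 \text{ and } i<j,\\[2pt]
\sum_{k\in N\setminus\{j\}}\big(\gamma^1_{jk}+\gamma^2_{jk}-\gamma^3_{jk}\big) & \text{if } i=0 \text{ and } j\in N.
\end{cases}$$

Shortly, we restrict $\Gamma$ to be a symmetric matrix. Therefore, we give only the entries that are above the diagonal. Also, we define the $(n+1)$-by-$(n+1)$ matrix of decision variables $\Psi=\{\psi_{ij}:(i,j)\in(N\cup\{0\})\times(N\cup\{0\})\}$, where all entries other than $\{\psi_{i0}:i\in N\setminus N(z)\}$ and $\{\psi_{0i}:i\in N\setminus N(z)\}$ are set to zero. For a fixed value of $z$, the dual of problem (B.16) is given by

$$\begin{aligned}
\min\quad&\sum_{i\in N\cup\{0\}}\beta_i-\sum_{i\in N\setminus N(z)}(\psi_{i0}+\psi_{0i})+\sum_{(i,j)\in M}(\gamma^1_{ij}+\gamma^2_{ij}+\gamma^3_{ij})+\frac14\sum_{(i,j)\in M}\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)\\
\text{s.t.}\quad&\mathrm{diag}(\beta)+\Psi-\Gamma-\Lambda(z)\in\mathbb{S}^{n+1}_+,\\
&\beta\in\mathbb{R}^{n+1},\quad\Psi\in\mathbb{R}^{(n+1)\times(n+1)},\quad\gamma^1_{ij},\gamma^2_{ij},\gamma^3_{ij}\ge 0\ \ \forall\,(i,j)\in M,
\end{aligned}$$

where we use $\mathrm{diag}(\beta)$ to denote the diagonal matrix with diagonal entries $\{\beta_i:i\in N\cup\{0\}\}$. Similar to our approach in Section 3.3.3, to find the value of $\hat{z}$ that satisfies $f^R(\hat{z})=v_0\hat{z}$, we solve

$$\begin{aligned}
\min\quad&\sum_{i\in N\cup\{0\}}\beta_i-\sum_{i\in N\setminus N(z)}(\psi_{i0}+\psi_{0i})+\sum_{(i,j)\in M}(\gamma^1_{ij}+\gamma^2_{ij}+\gamma^3_{ij})+\frac14\sum_{(i,j)\in M}\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)\\
\text{s.t.}\quad&\mathrm{diag}(\beta)+\Psi-\Gamma-\Lambda(z)\in\mathbb{S}^{n+1}_+,\\
&\sum_{i\in N\cup\{0\}}\beta_i-\sum_{i\in N\setminus N(z)}(\psi_{i0}+\psi_{0i})+\sum_{(i,j)\in M}(\gamma^1_{ij}+\gamma^2_{ij}+\gamma^3_{ij})+\frac14\sum_{(i,j)\in M}\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)=v_0\,z,\\
&\beta\in\mathbb{R}^{n+1},\quad\Psi\in\mathbb{R}^{(n+1)\times(n+1)},\quad\gamma^1_{ij},\gamma^2_{ij},\gamma^3_{ij}\ge 0\ \ \forall\,(i,j)\in M,\quad z\in\mathbb{R}.
\end{aligned}$$

By using precisely the same argument as in the proof of Theorem 3.3.3, we can show that if $(\hat{\beta},\hat{\Psi},\hat{\Gamma},\hat{z})$ is an optimal solution to the SDP above, then $f^R(\hat{z})=v_0\hat{z}$.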
B.5.4 Preliminary Bounds for De-Randomizing the Subset of Products

In this section and the next, we discuss how to de-randomize the output of our randomized rounding algorithm. In this section, we provide preliminary bounds that will be useful in the analysis of the de-randomization approach. In the next section, we give the de-randomization approach and its analysis. In our de-randomization approach, we follow [96], where the authors de-randomize an SDP relaxation-based approximation algorithm for the 3-vertex coloring problem. We adopt the approach in [96] for our assortment optimization setting. Letting $\hat{v}=(\hat{v}^0,\hat{v}^1,\ldots,\hat{v}^n)$ be an optimal solution to problem (B.15), the starting point in the de-randomization approach is to compute a so-called discretized version of $\hat{v}$. The discretized version is discussed in Section 3.1 and Appendix 1 in [96]. The next lemma summarizes this discussion. Here, for any vector $v=(v_0,v_1,\ldots,v_n)\in\mathbb{R}^{n+1}$, we use $v[k,\ldots,\ell]\in\mathbb{R}^{\ell-k+1}$ to denote the vector $(v_k,v_{k+1},\ldots,v_\ell)$.

Lemma B.5.5 Letting $\hat{v}=(\hat{v}^0,\hat{v}^1,\ldots,\hat{v}^n)$ be an optimal solution to problem (B.15), in polynomial time, we can obtain the solution $\bar{v}=(\bar{v}^0,\bar{v}^1,\ldots,\bar{v}^n)$ that satisfies the following properties.
(a) We have $\|\bar{v}^i\|=1$ for all $i\in N\cup\{0\}$ and $\bar{v}^i=-\bar{v}^0$ for all $i\in N\setminus N(z)$.
(b) We have $|\bar{v}^i\cdot\bar{v}^j-\hat{v}^i\cdot\hat{v}^j|=O(\frac{1}{n})$ for all $i,j\in N\cup\{0\}$; that is, the scalar product of any pair of vectors changes by $O(\frac{1}{n})$.
(c) Letting $\bar{v}^i=(\bar{v}^i_0,\bar{v}^i_1,\ldots,\bar{v}^i_n)$, we have $|\bar{v}^i_j|=\Omega(\frac{1}{n^2})$ for all $i,j\in N\cup\{0\}$.
(d) For all $i,j\in N(z)\cup\{0\}$ and $h\in N\cup\{0\}$, if we rotate the coordinate system so that $\bar{v}^i[h,\ldots,n]=(b_1,0,\ldots,0)$ and $\bar{v}^j[h,\ldots,n]=(b'_1,b'_2,0,\ldots,0)$, then we have $|b_1|=\Omega(\frac{1}{n^2})$ and $|b'_2|=\Omega(\frac{1}{n^4})$.

Throughout our discussion, we use $\bar{v}$ to denote the discretized version of $\hat{v}$ as discussed in the lemma above, where $\hat{v}$ is an optimal solution to problem (B.15).
We define $C(z)$ as

$$C(z)=\frac14\left\{\sum_{(i,j)\in M(z)}\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)+2\,|N\setminus N(z)|\sum_{i\in N(z)}\theta_i(z)\right\}.$$

In the next lemma, we give a simple bound on $f^R(z)$.

Lemma B.5.6 We have $\frac14 f^R(z)\le C(z)\le f^R(z)$.

Proof: Using $e_i\in\mathbb{R}^{n+1}$ to denote the unit vector with a one in the $i$-th component, we define the solution $\tilde{v}=(\tilde{v}^0,\tilde{v}^1,\ldots,\tilde{v}^n)$ to problem (B.15) as follows. For all $i\in N(z)\cup\{0\}$, we set $\tilde{v}^i=e_i$. For all $i\in N\setminus N(z)$, we set $\tilde{v}^i=-\tilde{v}^0$. If $i\in N(z)$ and $j\in N(z)$, then we have $\tilde{v}^0\cdot\tilde{v}^i=0$, $\tilde{v}^0\cdot\tilde{v}^j=0$ and $\tilde{v}^i\cdot\tilde{v}^j=0$, so that $p(\tilde{v}^0,\tilde{v}^i,\tilde{v}^j)=p(\tilde{v}^0,\tilde{v}^i,-\tilde{v}^j)=p(\tilde{v}^0,-\tilde{v}^i,\tilde{v}^j)=1$. On the other hand, if $i\in N(z)$ and $j\in N\setminus N(z)$, then we have $\tilde{v}^0\cdot\tilde{v}^i=0$, $\tilde{v}^0\cdot\tilde{v}^j=-1$ and $\tilde{v}^i\cdot\tilde{v}^j=0$, so we get $p(\tilde{v}^0,\tilde{v}^i,\tilde{v}^j)=0$, $p(\tilde{v}^0,\tilde{v}^i,-\tilde{v}^j)=2$ and $p(\tilde{v}^0,-\tilde{v}^i,\tilde{v}^j)=0$. Similarly, if $i\in N\setminus N(z)$ and $j\in N(z)$, then we have $p(\tilde{v}^0,\tilde{v}^i,\tilde{v}^j)=0$, $p(\tilde{v}^0,\tilde{v}^i,-\tilde{v}^j)=0$ and $p(\tilde{v}^0,-\tilde{v}^i,\tilde{v}^j)=2$. Lastly, if $i\in N\setminus N(z)$ and $j\in N\setminus N(z)$, then we have $p(\tilde{v}^0,\tilde{v}^i,\tilde{v}^j)=p(\tilde{v}^0,\tilde{v}^i,-\tilde{v}^j)=p(\tilde{v}^0,-\tilde{v}^i,\tilde{v}^j)=0$. Thus, the solution $\tilde{v}$ is feasible to problem (B.15). Also, it is simple to check that this solution provides an objective value of $\frac14\{\sum_{(i,j)\in M(z)}(\rho_{ij}(z)+\theta_i(z)+\theta_j(z))+4\,|N\setminus N(z)|\sum_{i\in N(z)}\theta_i(z)\}$ for problem (B.15). Since $\tilde{v}$ is a feasible but not necessarily an optimal solution to problem (B.15), we obtain

$$f^R(z)\ge\frac14\left\{\sum_{(i,j)\in M(z)}\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)+4\,|N\setminus N(z)|\sum_{i\in N(z)}\theta_i(z)\right\}\ge C(z).$$

Let $\hat{v}=(\hat{v}^0,\hat{v}^1,\ldots,\hat{v}^n)$ be an optimal solution to problem (B.15). Since $\|\hat{v}^i\|=1$ for all $i\in N\cup\{0\}$, we have $\hat{v}^i\cdot\hat{v}^j\le 1$ for all $i,j\in N\cup\{0\}$, so we get $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)\le 4$, $p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)\le 4$ and $p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\le 4$.
Also, if $i\in N(z)$ and $j\in N\setminus N(z)$, then $\hat{v}^0\cdot\hat{v}^0=1$ and $\hat{v}^j=-\hat{v}^0$ by the first two constraints in problem (B.15), which imply that $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,\hat{v}^i,-\hat{v}^0)=0$ and $p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,-\hat{v}^i,-\hat{v}^0)=0$. Similarly, if $i\in N\setminus N(z)$ and $j\in N(z)$, then we have $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,-\hat{v}^0,\hat{v}^j)=0$ and $p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)=p(\hat{v}^0,-\hat{v}^0,-\hat{v}^j)=0$. Lastly, if $i\in N\setminus N(z)$ and $j\in N\setminus N(z)$, then we have $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)=p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)=p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)=0$. So, we get

$$\begin{aligned}
f^R(z)&=\frac14\sum_{(i,j)\in M}\Big(\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big)\\
&=\frac14\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\in N(z))\Big\{\rho_{ij}(z)\,p(\hat{v}^0,\hat{v}^i,\hat{v}^j)+\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\Big\}\\
&\quad+\frac14\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\notin N(z))\,\theta_i(z)\,p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)+\frac14\sum_{(i,j)\in M}\mathbf{1}(i\notin N(z),\,j\in N(z))\,\theta_j(z)\,p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\\
&\le\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\in N(z))\big(\rho_{ij}(z)+\theta_i(z)+\theta_j(z)\big)\\
&\quad+\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\notin N(z))\,\theta_i(z)+\sum_{(i,j)\in M}\mathbf{1}(i\notin N(z),\,j\in N(z))\,\theta_j(z)\\
&=4\,C(z),
\end{aligned}\tag{B.22}$$

where the inequality holds since $p(\hat{v}^0,\hat{v}^i,\hat{v}^j)\le 4$, $p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)\le 4$ and $p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)\le 4$ for all $(i,j)\in M$, along with $\rho_{ij}(z)\ge 0$ for all $(i,j)\in M(z)$ and $\theta_i(z)\ge 0$ for all $i\in N(z)$.

In the next lemma, we use $C(z)$ to bound the loss in the objective value of problem (B.15) when we use the discretized solution $\bar{v}$ instead of the optimal solution $\hat{v}$. We define $g^R(z)$ as

$$g^R(z)=\frac14\sum_{(i,j)\in M}\Big(\rho_{ij}(z)\,p(\bar{v}^0,\bar{v}^i,\bar{v}^j)+\theta_i(z)\,p(\bar{v}^0,\bar{v}^i,-\bar{v}^j)+\theta_j(z)\,p(\bar{v}^0,-\bar{v}^i,\bar{v}^j)\Big),$$

which is the objective value of problem (B.15) evaluated at $\bar{v}$.

Lemma B.5.7 We have $g^R(z)\ge f^R(z)-O(\frac{1}{n})\,C(z)$.

Proof: By the second part of Lemma B.5.5, we have $\bar{v}^i\cdot\bar{v}^j\ge\hat{v}^i\cdot\hat{v}^j-O(\frac{1}{n})$, which implies that $p(\bar{v}^0,\bar{v}^i,\bar{v}^j)\ge p(\hat{v}^0,\hat{v}^i,\hat{v}^j)-O(\frac{1}{n})$, $p(\bar{v}^0,\bar{v}^i,-\bar{v}^j)\ge p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)-O(\frac{1}{n})$ and $p(\bar{v}^0,-\bar{v}^i,\bar{v}^j)\ge p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)-O(\frac{1}{n})$.
Furthermore, noting that $\|\bar{v}^i\|=1$ for all $i\in N\cup\{0\}$ and $\bar{v}^i=-\bar{v}^0$ for all $i\in N\setminus N(z)$ by the first part of Lemma B.5.5, using the same argument that we use to obtain the second equality in (B.22), we obtain

$$\begin{aligned}
g^R(z)&=\frac14\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\in N(z))\Big\{\rho_{ij}(z)\,p(\bar{v}^0,\bar{v}^i,\bar{v}^j)+\theta_i(z)\,p(\bar{v}^0,\bar{v}^i,-\bar{v}^j)+\theta_j(z)\,p(\bar{v}^0,-\bar{v}^i,\bar{v}^j)\Big\}\\
&\quad+\frac14\sum_{(i,j)\in M}\mathbf{1}(i\in N(z),\,j\notin N(z))\,\theta_i(z)\,p(\bar{v}^0,\bar{v}^i,-\bar{v}^j)+\frac14\sum_{(i,j)\in M}\mathbf{1}(i\notin N(z),\,j\in N(z))\,\theta_j(z)\,p(\bar{v}^0,-\bar{v}^i,\bar{v}^j).
\end{aligned}$$

Since $\rho_{ij}(z)\ge 0$ for all $(i,j)\in M(z)$ and $\theta_i(z)\ge 0$ for all $i\in N(z)$, the desired result follows by noting that $p(\bar{v}^0,\bar{v}^i,\bar{v}^j)\ge p(\hat{v}^0,\hat{v}^i,\hat{v}^j)-O(\frac{1}{n})$, $p(\bar{v}^0,\bar{v}^i,-\bar{v}^j)\ge p(\hat{v}^0,\hat{v}^i,-\hat{v}^j)-O(\frac{1}{n})$ and $p(\bar{v}^0,-\bar{v}^i,\bar{v}^j)\ge p(\hat{v}^0,-\hat{v}^i,\hat{v}^j)-O(\frac{1}{n})$, along with the definition of $C(z)$ and the equivalent definition of $f^R(z)$ given by the second equality in (B.22).

B.5.5 De-Randomization Algorithm and Analysis

In this section, we give the de-randomization algorithm and show that we can use this algorithm to de-randomize the output of our randomized rounding algorithm. For any vector $w=(w_0,w_1,\ldots,w_n,w_{n+1},\ldots,w_{2n+1})\in\mathbb{R}^{2n+2}$, we define $w(1)\in\mathbb{R}^{n+1}$ and $w(2)\in\mathbb{R}^{n+1}$ as $w(1)=(w_0,w_1,\ldots,w_n)$ and $w(2)=(w_{n+1},w_{n+2},\ldots,w_{2n+1})$. Thus, the vectors $w(1)$ and $w(2)$, respectively, correspond to the first and last $n+1$ components of $w$. For a vector $w\in\mathbb{R}^{2n+2}$, note that we can express $w(1)-w(2)$ as $Dw$ for an appropriate matrix $D\in\mathbb{R}^{(n+1)\times(2n+2)}$. In particular, indexing the elements of $D$ by $\{d_{ij}:i=0,1,\ldots,n,\ j=0,1,\ldots,2n+1\}$, it is enough to set $d_{ij}=1$ when $i=j$, $d_{ij}=-1$ when $i+n+1=j$ and $d_{ij}=0$ otherwise. Throughout this section, for notational brevity, we will write $Dw$ instead of $w(1)-w(2)$.

Consider using the input $(\bar{v},\hat{u})$ in the randomized rounding algorithm, where $\bar{v}$ is the discretized version of the optimal solution $\hat{v}$ to problem (B.15) and the components of the vector $\hat{u}$ are independent and have the standard normal distribution.
If we multiply $\hat{u}$ by a positive constant, then the output of the randomized rounding algorithm does not change. Let $W$ be a vector taking values in $\mathbb{R}^{2n+2}$ such that its components are independent and have the standard normal distribution. In this case, the components of the vector $W(1)-W(2)=DW$ are independent and have normal distribution with mean zero and variance 2. Therefore, using the input $(\bar{v},DW)$ in the randomized rounding algorithm is equivalent to using the input $(\bar{v},\hat{u})$.

In our de-randomization approach, we start with the vector $W$ taking values in $\mathbb{R}^{2n+2}$, where the components of $W$ are independent and have the standard normal distribution. Iteratively, we fix one additional component of this vector. Therefore, after $2n+2$ iterations, we obtain a deterministic vector. Using $\bar{w}\in\mathbb{R}^{2n+2}$ to denote the deterministic vector obtained after $2n+2$ iterations, we use the input $(\bar{v},D\bar{w})$ in the randomized rounding algorithm. Since $D\bar{w}$ is deterministic, the output of the randomized rounding algorithm is also deterministic. We will show that $\sum_{(i,j)\in M}V_{ij}(X^{RR}(\bar{v},D\bar{w}))^{\gamma_{ij}}(R_{ij}(X^{RR}(\bar{v},D\bar{w}))-z)\ge(0.79-O(\frac{1}{n}))\,f^R(z)$, so the subset of products $X^{RR}(\bar{v},D\bar{w})$ satisfies the Sufficient Condition with $\alpha=0.79-O(\frac{1}{n})$.

We give the de-randomization algorithm below. In this algorithm, for any random vector $W$ taking values in $\mathbb{R}^{2n+2}$, we let $W(\ell,\delta)$ be the vector also taking values in $\mathbb{R}^{2n+2}$ that is obtained by fixing the $\ell$-th component of $W$ at $\delta$. Also, for any random vector $W$ taking values in $\mathbb{R}^{2n+2}$, we define

$$F(W)=\sum_{(i,j)\in M}\mathbb{E}\Big\{V_{ij}(X^{RR}(\bar{v},DW))^{\gamma_{ij}}\big(R_{ij}(X^{RR}(\bar{v},DW))-z\big)\Big\},$$

where $\bar{v}$ is the discretized version of the optimal solution to problem (B.15). Lastly, we use the operator precedence $DW(\ell,\delta)=D(W(\ell,\delta))$, not $DW(\ell,\delta)=(DW)(\ell,\delta)$.

De-Randomization
Step 1: Set $\ell=0$. Define the random vector $W^{(0)}=(W_0,W_1,\ldots,W_n,W_{n+1},\ldots,W_{2n+1})$, where $W_i$ has the standard normal distribution for all $i=0,1,\ldots,2n+1$ and $\{W_i:i=0,1,\ldots,2n+1\}$ are independent.
Step 2: If $\ell<2n+1$, then define the set $S$ as
$$S=\Big\{\delta\in[-3\sqrt{\ln n},\,3\sqrt{\ln n}]:\delta\text{ is a multiple of }\tfrac{1}{n^9}\Big\}\cup\Big\{\delta\in[-3\sqrt{\ln n},\,3\sqrt{\ln n}]:F(W^{(\ell)}(\ell,\delta))\text{ is not differentiable in }\delta\Big\}.$$
If $\ell=2n+1$, then letting $\mathcal{D}=\{\delta\in\mathbb{R}:\bar{v}^i\cdot DW^{(\ell)}(\ell,\delta)=0\text{ for some }i\in N\cup\{0\}\}$, $\delta_{\max}=\max_{\delta\in\mathcal{D}}\delta$ and $\delta_{\min}=\min_{\delta\in\mathcal{D}}\delta$, define the set $S$ as $S=\mathcal{D}\cup\{\delta_{\min}-\epsilon,\ \delta_{\max}+\epsilon\}$ for any $\epsilon>0$.
Step 3: For each $\delta\in S$, find $f(\delta)$ such that $|f(\delta)-F(W^{(\ell)}(\ell,\delta))|=O(\frac{1}{n^5})\,C(z)$. Set $\bar{w}_\ell=\arg\max_{\delta\in S}f(\delta)$.
Step 4: Define the random vector $W^{(\ell+1)}=(\bar{w}_0,\bar{w}_1,\ldots,\bar{w}_\ell,W_{\ell+1},\ldots,W_{2n+1})$. Increase $\ell$ by one. If $\ell\le 2n+1$, then go to Step 2; otherwise, return $\bar{w}=(\bar{w}_0,\bar{w}_1,\ldots,\bar{w}_{2n+1})$.

If we have $\ell=2n+1$, then the vector $W^{(\ell)}(\ell,\delta)$ is of the form $(\bar{w}_0,\bar{w}_1,\ldots,\bar{w}_{2n},\delta)$, which is deterministic. In this case, we can compute the elements of $\mathcal{D}$ in Step 2 by solving the linear equation $\bar{v}^i\cdot DW^{(\ell)}(\ell,\delta)=0$ for $\delta$ for all $i\in N\cup\{0\}$. Thus, we can obtain the elements of $\mathcal{D}$ explicitly and $|\mathcal{D}|=O(n)$. Therefore, we can execute Step 2 in the de-randomization algorithm efficiently when $\ell=2n+1$. On the other hand, if $\ell<2n+1$, then the vector $W^{(\ell)}(\ell,\delta)$ is of the form $(\bar{w}_0,\bar{w}_1,\ldots,\bar{w}_{\ell-1},\delta,W_{\ell+1},\ldots,W_{2n+1})$ and the last $2n+1-\ell$ components of this vector are independent and have the standard normal distribution. By the third part of Lemma B.5.5, we have $\bar{v}^i_k\neq 0$ for all $i,k\in N\cup\{0\}$. Therefore, we have $(\bar{v},DW^{(\ell)}(\ell,\delta))\in\mathcal{I}$. In Section 5 in [96], the authors discuss how to compute the points of non-differentiability for $F(W^{(\ell)}(\ell,\delta))$ efficiently. At the end of this section, we argue that the number of points of non-differentiability for $F(W^{(\ell)}(\ell,\delta))$ is polynomial in $n$. Thus, we can execute Step 2 in the de-randomization algorithm efficiently when $\ell<2n+1$ as well. At the end of this section, we also discuss how to construct $f(\delta)$ for each $\delta\in S$. In this case, we can find $\bar{w}_\ell$ in Step 3 of the de-randomization algorithm by checking the value of $f(\delta)$ for each $\delta\in S$.
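The matrix $D$ and the precedence convention $DW(\ell,\delta)=D(W(\ell,\delta))$ introduced in this section can be made concrete with a small sketch. The function names below (`build_D`, `apply_matrix`, `fix_component`) are hypothetical helpers introduced only for illustration.

```python
def build_D(n):
    """Return the (n+1) x (2n+2) matrix D with d_ij = 1 if i == j,
    d_ij = -1 if i + n + 1 == j, and d_ij = 0 otherwise."""
    D = [[0] * (2 * n + 2) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][i] = 1
        D[i][i + n + 1] = -1
    return D


def apply_matrix(D, w):
    """Matrix-vector product D w, which equals w(1) - w(2)."""
    return [sum(d * t for d, t in zip(row, w)) for row in D]


def fix_component(w, ell, delta):
    """Return a copy of w with its ell-th component fixed at delta, i.e. w(ell, delta)."""
    out = list(w)
    out[ell] = delta
    return out


# Small example with n = 1, so w has 2n + 2 = 4 components.
D = build_D(1)
w = [1, 2, 3, 4]
first_minus_last = apply_matrix(D, w)                   # w(1) - w(2)
fix_then_map = apply_matrix(D, fix_component(w, 0, 5))  # D(w(0, 5)): the convention used
map_then_fix = fix_component(apply_matrix(D, w), 0, 5)  # (Dw)(0, 5): not what is meant
```

In the example, fixing a component before applying $D$ and fixing it afterwards give different vectors, which is why the text spells out the precedence.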
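Next, we establish the performance guarantee for the output of the de-randomization algorithm. In the random vector $\mathbf{W}^{(\ell)}(\ell,\delta)$, the $\ell$-th component is fixed at $\delta$, whereas in the random vector $\mathbf{W}^{(\ell)}$, the $\ell$-th component has the standard normal distribution. In the next lemma, we show that $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ is not too much smaller than $F(\mathbf{W}^{(\ell)})$, as long as we choose some $\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]$ to maximize the former quantity. After the next lemma, we build on this result to show that $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ is not too much smaller than $F(\mathbf{W}^{(\ell)})$, as long as we choose some $\delta \in S$.

Lemma B.5.8 For any $\ell < 2n+1$, we have
$$\max_{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} \;\ge\; F(\mathbf{W}^{(\ell)}) - O\Big(\frac{1}{n^{4.5}}\Big)\, C(z).$$

Proof: Letting $\bar{\mathbf{v}}$ be the discretized version of the optimal solution to problem (B.15), for any random vector $\mathbf{W}$ taking values in $\mathbb{R}^{2n+2}$ such that $(\bar{\mathbf{v}}, D\mathbf{W}) \in \mathcal{I}$, we define $T(\mathbf{W})$ as
$$\begin{aligned} T(\mathbf{W}) = \frac{1}{2} \sum_{(i,j)\in M(z)} \Big( & \rho_{ij}(z) \big[ S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_i) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_j) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_i, -\bar{\mathbf{v}}_j) \big] \\ & + \theta_i(z) \big[ S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_i) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_j) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) \big] \\ & + \theta_j(z) \big[ S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_j) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) \big] \Big) \\ & + 2\,|N \setminus N(z)| \sum_{i \in N(z)} \theta_i(z)\, S_{D\mathbf{W}}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_i). \end{aligned} \tag{B.23}$$
Note that $F(\mathbf{W})$ can be obtained by setting $\mathbf{u} = D\mathbf{W}$ and $\mathbf{v}_i = \bar{\mathbf{v}}_i$ for all $i \in N \cup \{0\}$ in the expression on the right side of (B.18). Furthermore, since $(\bar{\mathbf{v}}, D\mathbf{W}) \in \mathcal{I}$, the definition of $S_{\mathbf{u}}(\mathbf{v}_i, \mathbf{v}_j)$ implies that $S_{D\mathbf{W}}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) + S_{D\mathbf{W}}(\bar{\mathbf{v}}_i, -\bar{\mathbf{v}}_j) = 1$ for all $i, j \in N \cup \{0\}$ with $i \ne j$. In this case, noting the definition of $C(z)$, it follows that $T(\mathbf{W}) + F(\mathbf{W}) = 4\,C(z)$. Also, since $\rho_{ij}(z) \ge 0$ for all $(i,j) \in M(z)$ and $\theta_i(z) \ge 0$ for all $i \in N(z)$, we have $T(\mathbf{W}) \ge 0$.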
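In this case, noting that $(\bar{\mathbf{v}}, D\mathbf{W}^{(\ell)}(\ell,\delta)) \in \mathcal{I}$ by the discussion right after the de-randomization algorithm, we obtain
$$\begin{aligned} \min_{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]} T(\mathbf{W}^{(\ell)}(\ell,\delta)) &\le \frac{\int_{-3\sqrt{\ln n}}^{3\sqrt{\ln n}} T(\mathbf{W}^{(\ell)}(\ell,\delta))\, \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta}{\int_{-3\sqrt{\ln n}}^{3\sqrt{\ln n}} \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta} \le \frac{\int_{-\infty}^{\infty} T(\mathbf{W}^{(\ell)}(\ell,\delta))\, \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta}{\int_{-3\sqrt{\ln n}}^{3\sqrt{\ln n}} \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta} \\ &= \frac{T(\mathbf{W}^{(\ell)})}{\int_{-3\sqrt{\ln n}}^{3\sqrt{\ln n}} \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta} \le \frac{T(\mathbf{W}^{(\ell)})}{1 - O\big(\frac{1}{n^{4.5}}\big)}. \end{aligned} \tag{B.24}$$
In the chain of inequalities above, the second inequality holds since $T(\mathbf{W}) \ge 0$. To see the equality, consider each term in the sum on the right side of (B.23) when we compute $T(\mathbf{W}^{(\ell)})$. By the definition of $S_{\mathbf{u}}(\mathbf{v}_i, \mathbf{v}_j)$, we have $S_{D\mathbf{W}^{(\ell)}}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i) = \mathbb{P}\{\mathrm{sign}(D\mathbf{W}^{(\ell)} \cdot \bar{\mathbf{v}}_0) = \mathrm{sign}(D\mathbf{W}^{(\ell)} \cdot \bar{\mathbf{v}}_i)\}$. The corresponding term is given by $\mathbb{P}\{\mathrm{sign}(D\mathbf{W}^{(\ell)}(\ell,\delta) \cdot \bar{\mathbf{v}}_0) = \mathrm{sign}(D\mathbf{W}^{(\ell)}(\ell,\delta) \cdot \bar{\mathbf{v}}_i)\}$, when we compute $T(\mathbf{W}^{(\ell)}(\ell,\delta))$. The vectors $\mathbf{W}^{(\ell)}$ and $\mathbf{W}^{(\ell)}(\ell,\delta)$ agree in all components except for the $\ell$-th component. The $\ell$-th component of $\mathbf{W}^{(\ell)}$ has the standard normal distribution, whereas the $\ell$-th component of $\mathbf{W}^{(\ell)}(\ell,\delta)$ is fixed at $\delta$. Therefore, by conditioning, we get
$$\mathbb{P}\{\mathrm{sign}(D\mathbf{W}^{(\ell)} \cdot \bar{\mathbf{v}}_0) = \mathrm{sign}(D\mathbf{W}^{(\ell)} \cdot \bar{\mathbf{v}}_i)\} = \int_{-\infty}^{\infty} \mathbb{P}\{\mathrm{sign}(D\mathbf{W}^{(\ell)}(\ell,\delta) \cdot \bar{\mathbf{v}}_0) = \mathrm{sign}(D\mathbf{W}^{(\ell)}(\ell,\delta) \cdot \bar{\mathbf{v}}_i)\}\, \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta.$$
Using the same argument for each term in the sum on the right side of (B.23), we get $T(\mathbf{W}^{(\ell)}) = \int_{-\infty}^{\infty} T(\mathbf{W}^{(\ell)}(\ell,\delta))\, \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta$. The last inequality in (B.24) holds since $1 - \int_{-3\sqrt{\ln n}}^{3\sqrt{\ln n}} \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta = O(\tfrac{1}{n^{4.5}})$, which is shown in the proof of Lemma 4.2 in [96].

We can compute $F(\mathbf{W}^{(\ell)})$ by replacing $\mathbf{u}$ with $D\mathbf{W}^{(\ell)}$ and $\mathbf{v}$ with $\bar{\mathbf{v}}$ on the right side of (B.18). By (B.17), each expression delineated with square brackets on the right side of (B.18) corresponds to a probability. Thus, $F(\mathbf{W}^{(\ell)}) \ge 0$. Since $T(\mathbf{W}^{(\ell)}(\ell,\delta)) + F(\mathbf{W}^{(\ell)}(\ell,\delta)) = 4\,C(z)$, we get
$$\begin{aligned} \max_{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]} F(\mathbf{W}^{(\ell)}(\ell,\delta)) &= 4\,C(z) - \min_{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]} T(\mathbf{W}^{(\ell)}(\ell,\delta)) \ge 4\,C(z) - \frac{T(\mathbf{W}^{(\ell)})}{1 - O\big(\frac{1}{n^{4.5}}\big)} \\ &= \frac{4\,C(z) - T(\mathbf{W}^{(\ell)}) - O\big(\frac{1}{n^{4.5}}\big)\, C(z)}{1 - O\big(\frac{1}{n^{4.5}}\big)} = \frac{F(\mathbf{W}^{(\ell)}) - O\big(\frac{1}{n^{4.5}}\big)\, C(z)}{1 - O\big(\frac{1}{n^{4.5}}\big)} \ge F(\mathbf{W}^{(\ell)}) - O\Big(\frac{1}{n^{4.5}}\Big)\, C(z), \end{aligned}$$
where the first inequality uses (B.24) and the second inequality holds since $F(\mathbf{W}^{(\ell)}) \ge 0$.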
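The tail bound behind the last inequality in (B.24) can be checked numerically. The sketch below is only a sanity check, not part of the proof: it evaluates the standard normal mass outside $[-3\sqrt{\ln n},\, 3\sqrt{\ln n}]$ with the complementary error function and compares it against $n^{-4.5}$. The function name `tail_mass` and the sample values of $n$ are illustrative choices.

```python
import math


def tail_mass(n):
    """P{|Z| > 3 * sqrt(ln n)} for a standard normal random variable Z."""
    cutoff = 3.0 * math.sqrt(math.log(n))
    # For a standard normal Z, P{|Z| > c} = erfc(c / sqrt(2)).
    return math.erfc(cutoff / math.sqrt(2.0))


# Ratio of the tail mass to n^(-4.5); a bounded ratio is consistent with O(1 / n^4.5).
ratios = {n: tail_mass(n) * n ** 4.5 for n in (2, 10, 100, 1000)}
```

The ratio stays below one for these values of $n$, in line with the Gaussian tail estimate $\mathbb{P}\{|Z| > c\} \le \frac{2}{c\sqrt{2\pi}} e^{-c^2/2}$ evaluated at $c = 3\sqrt{\ln n}$.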
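Next, we will show that the lemma above holds when $\delta \in S$ instead of $\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]$. We need the following lemma from Appendix 2 in [96].

Lemma B.5.9 Letting $h^{(\ell)}_\delta(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) = S_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)$ for $i, j \in N \cup \{0\}$ with $i \ne j$, we have
$$\frac{d\, h^{(\ell)}_\delta(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)}{d\delta} = O(n^4)$$
whenever the derivative exists.

In the next lemma, we build on Lemmas B.5.8 and B.5.9 to show that the inequality in Lemma B.5.8 continues to hold when we choose $\delta \in S$ and consider any $\ell \le 2n+1$.

Lemma B.5.10 For any $\ell \le 2n+1$, we have
$$\max_{\delta \in S} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} \;\ge\; F(\mathbf{W}^{(\ell)}) - O\Big(\frac{1}{n^{4.5}}\Big)\, C(z).$$

Proof: First, fix $\ell < 2n+1$. Let $\delta^* = \arg\max_{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]} F(\mathbf{W}^{(\ell)}(\ell,\delta))$. By the definition of $S$ in the de-randomization algorithm, there exists $\hat{\delta} \in S$ such that $|\delta^* - \hat{\delta}| = O(\tfrac{1}{n^9})$ and $S_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)$ is differentiable in $\delta$ over the interval $(\min\{\delta^*, \hat{\delta}\},\, \max\{\delta^*, \hat{\delta}\})$. So, by Lemma B.5.9, we get
$$\big| S_{D\mathbf{W}^{(\ell)}(\ell,\delta^*)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) - S_{D\mathbf{W}^{(\ell)}(\ell,\hat{\delta})}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) \big| = O\Big(\frac{1}{n^5}\Big).$$
Since $(\bar{\mathbf{v}}, D\mathbf{W}^{(\ell)}(\ell,\delta^*)) \in \mathcal{I}$, we can compute $F(\mathbf{W}^{(\ell)}(\ell,\delta^*))$ by replacing $\mathbf{u}$ with $D\mathbf{W}^{(\ell)}(\ell,\delta^*)$ and $\mathbf{v}$ with $\bar{\mathbf{v}}$ on the right side of (B.18). Similarly, we can compute $F(\mathbf{W}^{(\ell)}(\ell,\hat{\delta}))$ by replacing $\mathbf{u}$ with $D\mathbf{W}^{(\ell)}(\ell,\hat{\delta})$ and $\mathbf{v}$ with $\bar{\mathbf{v}}$ on the right side of (B.18). So, using the equality above, we get
$$\begin{aligned} \big| F(\mathbf{W}^{(\ell)}(\ell,\delta^*)) - F(\mathbf{W}^{(\ell)}(\ell,\hat{\delta})) \big| &\le \frac{1}{2} \sum_{(i,j)\in M(z)} 3\,\big\{ \rho_{ij}(z) + \theta_i(z) + \theta_j(z) \big\}\, O\Big(\frac{1}{n^5}\Big) + 2\,|N \setminus N(z)| \sum_{i \in N(z)} \theta_i(z)\, O\Big(\frac{1}{n^5}\Big) \\ &= O\Big(\frac{1}{n^5}\Big)\, C(z), \end{aligned} \tag{B.25}$$
where the last equality uses the fact that $6\,C(z) \ge \frac{3}{2} \sum_{(i,j)\in M(z)} (\rho_{ij}(z) + \theta_i(z) + \theta_j(z)) + 2\,|N \setminus N(z)| \sum_{i \in N(z)} \theta_i(z)$. Thus, it follows that
$$\begin{aligned} \max_{\delta \in S} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} &\ge F(\mathbf{W}^{(\ell)}(\ell,\hat{\delta})) \ge F(\mathbf{W}^{(\ell)}(\ell,\delta^*)) - O\Big(\frac{1}{n^5}\Big)\, C(z) \\ &= \max_{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}]} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} - O\Big(\frac{1}{n^5}\Big)\, C(z) \ge F(\mathbf{W}^{(\ell)}) - O\Big(\frac{1}{n^{4.5}}\Big)\, C(z), \end{aligned}$$
where the second inequality uses (B.25) and the third inequality uses Lemma B.5.8 along with the fact that $O(\tfrac{1}{n^5}) + O(\tfrac{1}{n^{4.5}}) = O(\tfrac{1}{n^{4.5}})$.

Second, fix $\ell = 2n+1$.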
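Noting the discussion right after the de-randomization algorithm, $\mathbf{W}^{(\ell)}(\ell,\delta)$ is a deterministic vector of the form $(\bar{w}_0, \bar{w}_1, \ldots, \bar{w}_{2n}, \delta)$. Furthermore, by the definition of $S$, for any $i \in N \cup \{0\}$, the sign of $\bar{\mathbf{v}}_i \cdot D\mathbf{W}^{(\ell)}(\ell,\delta)$ does not change when $\delta$ takes values between two consecutive elements of $S$. Observe that if we execute the randomized rounding algorithm with the input $(\bar{\mathbf{v}}, D\mathbf{W}^{(\ell)}(\ell,\delta))$, the output of the algorithm depends only on the signs of $\{\bar{\mathbf{v}}_i \cdot D\mathbf{W}^{(\ell)}(\ell,\delta) : i \in N \cup \{0\}\}$. Therefore, as $\delta$ takes values between two consecutive elements of $S$, the value of $X^{RR}(\bar{\mathbf{v}}, D\mathbf{W}^{(\ell)}(\ell,\delta))$ does not change. Noting the definition of $F(\mathbf{W})$, it follows that the value of $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ does not change either as $\delta$ takes values between two consecutive elements of $S$. Furthermore, as $\delta$ takes values smaller than $\delta_{\min} - \epsilon$ or larger than $\delta_{\max} + \epsilon$, the value of $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ does not change. Therefore, we obtain
$$\max_{\delta \in S} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} = \max_{\delta \in \mathbb{R}} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} \ge \int_{-\infty}^{\infty} F(\mathbf{W}^{(\ell)}(\ell,\delta))\, \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta = F(\mathbf{W}^{(\ell)}),$$
where the last equality follows from the same argument that we use to show that $\int_{-\infty}^{\infty} T(\mathbf{W}^{(\ell)}(\ell,\delta))\, \frac{1}{\sqrt{2\pi}} e^{-\delta^2/2}\, d\delta = T(\mathbf{W}^{(\ell)})$ in the proof of Lemma B.5.8.

In the next theorem, we give the performance guarantee for the output of the de-randomization algorithm.

Theorem B.5.11 Letting $\bar{\mathbf{w}}$ be the output of the de-randomization algorithm and $\hat{\mathbf{x}} = X^{RR}(\bar{\mathbf{v}}, D\bar{\mathbf{w}})$, $\hat{\mathbf{x}}$ is a deterministic subset of products that satisfies
$$\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} \big(R_{ij}(\hat{\mathbf{x}}) - z\big) \;\ge\; \Big(0.79 - O\Big(\frac{1}{n}\Big)\Big)\, f^R(z).$$

Proof: Since the vector $\bar{\mathbf{w}}$ in Step 4 of the de-randomization algorithm is deterministic, it follows that $\hat{\mathbf{x}}$ is a deterministic subset of products. Noting Step 4 of the de-randomization algorithm, we have $\mathbf{W}^{(\ell+1)} = \mathbf{W}^{(\ell)}(\ell, \bar{w}_\ell)$. In this case, since $F(\mathbf{W}^{(\ell)}(\ell,\delta)) \ge f(\delta) - O(\tfrac{1}{n^5})\, C(z)$ for each $\delta \in S$ in Step 3 of the de-randomization algorithm, we get $F(\mathbf{W}^{(\ell+1)}) = F(\mathbf{W}^{(\ell)}(\ell, \bar{w}_\ell)) \ge f(\bar{w}_\ell) - O(\tfrac{1}{n^5})\, C(z)$.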
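Furthermore, since we also have $f(\delta) \ge F(\mathbf{W}^{(\ell)}(\ell,\delta)) - O(\tfrac{1}{n^5})\, C(z)$ for each $\delta \in S$ in Step 3 of the de-randomization algorithm, we get $\max_{\delta \in S} f(\delta) \ge \max_{\delta \in S} F(\mathbf{W}^{(\ell)}(\ell,\delta)) - O(\tfrac{1}{n^5})\, C(z)$. In this case, since $\bar{w}_\ell = \arg\max_{\delta \in S} f(\delta)$ in Step 3, we obtain
$$\begin{aligned} F(\mathbf{W}^{(\ell+1)}) = F(\mathbf{W}^{(\ell)}(\ell, \bar{w}_\ell)) &\ge f(\bar{w}_\ell) - O\Big(\frac{1}{n^5}\Big)\, C(z) = \max_{\delta \in S} \{ f(\delta) \} - O\Big(\frac{1}{n^5}\Big)\, C(z) \\ &\ge \max_{\delta \in S} \Big\{ F(\mathbf{W}^{(\ell)}(\ell,\delta)) \Big\} - O\Big(\frac{1}{n^5}\Big)\, C(z). \end{aligned}$$
Noting Lemma B.5.10 and the fact that $O(\tfrac{1}{n^5}) + O(\tfrac{1}{n^{4.5}}) = O(\tfrac{1}{n^{4.5}})$, the chain of inequalities above implies that $F(\mathbf{W}^{(\ell+1)}) \ge F(\mathbf{W}^{(\ell)}) - O(\tfrac{1}{n^{4.5}})\, C(z)$. There are $2n+2$ iterations in the de-randomization algorithm. Adding the last inequality over $\ell = 0, 1, \ldots, 2n+1$, we obtain $F(\mathbf{W}^{(2n+2)}) \ge F(\mathbf{W}^{(0)}) - O(\tfrac{1}{n^{3.5}})\, C(z)$. We have $\mathbf{W}^{(2n+2)} = \bar{\mathbf{w}}$ at the last iteration of the de-randomization algorithm. Also, $\mathbf{W}^{(0)}$ is the random vector taking values in $\mathbb{R}^{2n+2}$, where the components are independent and have the standard normal distribution. As discussed at the beginning of this section, using the input $(\bar{\mathbf{v}}, D\mathbf{W}^{(0)})$ in the randomized rounding algorithm is equivalent to using the input $(\bar{\mathbf{v}}, \hat{\mathbf{u}})$, where $\hat{\mathbf{u}}$ is a vector taking values in $\mathbb{R}^{n+1}$ with the components being independent and having the standard normal distribution. Therefore, we obtain
$$\begin{aligned} \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} \big(R_{ij}(\hat{\mathbf{x}}) - z\big) &= \sum_{(i,j)\in M} V_{ij}\big(X^{RR}(\bar{\mathbf{v}}, D\bar{\mathbf{w}})\big)^{\gamma_{ij}} \Big(R_{ij}\big(X^{RR}(\bar{\mathbf{v}}, D\bar{\mathbf{w}})\big) - z\Big) \\ &= F(\bar{\mathbf{w}}) = F(\mathbf{W}^{(2n+2)}) \ge F(\mathbf{W}^{(0)}) - O\Big(\frac{1}{n^{3.5}}\Big)\, C(z) \\ &= \sum_{(i,j)\in M} \mathbb{E}\Big\{ V_{ij}\big(X^{RR}(\bar{\mathbf{v}}, D\mathbf{W}^{(0)})\big)^{\gamma_{ij}} \Big(R_{ij}\big(X^{RR}(\bar{\mathbf{v}}, D\mathbf{W}^{(0)})\big) - z\Big) \Big\} - O\Big(\frac{1}{n^{3.5}}\Big)\, C(z) \\ &= \sum_{(i,j)\in M} \mathbb{E}\Big\{ V_{ij}\big(X^{RR}(\bar{\mathbf{v}}, \hat{\mathbf{u}})\big)^{\gamma_{ij}} \Big(R_{ij}\big(X^{RR}(\bar{\mathbf{v}}, \hat{\mathbf{u}})\big) - z\Big) \Big\} - O\Big(\frac{1}{n^{3.5}}\Big)\, C(z) \\ &\ge \frac{\alpha}{4} \sum_{(i,j)\in M} \Big\{ \rho_{ij}(z)\, p(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) + \theta_i(z)\, p(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i, -\bar{\mathbf{v}}_j) + \theta_j(z)\, p(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) \Big\} - O\Big(\frac{1}{n^{3.5}}\Big)\, C(z) \\ &= \alpha\, g^R(z) - O\Big(\frac{1}{n^{3.5}}\Big)\, C(z), \end{aligned}$$
where the second and fourth equalities follow from the definition of $F(\mathbf{W})$ and the second inequality is by Theorem B.5.4.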
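Thus, we have $\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - z) \ge \alpha\, g^R(z) - O(\tfrac{1}{n^{3.5}})\, C(z)$, in which case, noting Lemma B.5.7 and the fact that $O(\tfrac{1}{n}) + O(\tfrac{1}{n^{3.5}}) = O(\tfrac{1}{n})$, we get
$$\begin{aligned} \sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} \big(R_{ij}(\hat{\mathbf{x}}) - z\big) &\ge \alpha\, g^R(z) - O\Big(\frac{1}{n^{3.5}}\Big)\, C(z) \ge \alpha\, f^R(z) - O\Big(\frac{1}{n}\Big)\, C(z) \\ &\ge 0.79\, f^R(z) - O\Big(\frac{1}{n}\Big)\, C(z) \ge \Big(0.79 - O\Big(\frac{1}{n}\Big)\Big)\, f^R(z). \end{aligned}$$
Here, the third inequality uses the fact that $\alpha \ge 0.79$ and the fact that the optimal objective value of problem (B.15) is non-negative as discussed at the end of the proof of Theorem B.5.4. The last inequality holds since $f^R(z) \ge C(z)$ by Lemma B.5.6.

By the theorem above, for any $\epsilon > 0$, there exists a constant $K$ such that if $n \ge K/\epsilon$, then we can use the de-randomization algorithm to obtain a deterministic subset of products that satisfies $\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - z) \ge (0.79 - \epsilon)\, f^R(z)$. If $n < K/\epsilon$, then the number of products is bounded by a constant that depends only on $\epsilon$, so we can enumerate all possible subsets in constant time. Thus, we have a $(0.79 - \epsilon)$-approximation algorithm for any $\epsilon > 0$.

Closing this section, we consider any iteration $\ell < 2n+1$ in the de-randomization algorithm and argue that the set $S$ in Step 2 includes a polynomial number of elements and discuss how to construct $f(\delta)$ for each $\delta \in S$ in Step 3. By Lemma 3.4 in [96], $S_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)$ is non-differentiable in $\delta$ at no more than two values of $\delta$. Furthermore, in Section 5 in [96], the authors show how to compute the points of non-differentiability for $S_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)$ efficiently. Noting (B.18), we can express $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ as a linear combination of $\{ S_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) : i, j \in N \cup \{0\} \text{ with } i \ne j \}$. Since there are $O(n^2)$ elements in the set $\{(i,j) : i, j \in N \text{ with } i \ne j\}$, $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ is non-differentiable in $\delta$ at no more than $O(n^2)$ values of $\delta$. Also, the set $\{\delta \in [-3\sqrt{\ln n},\, 3\sqrt{\ln n}] : \delta \text{ is a multiple of } \tfrac{1}{n^9}\}$ has $O(n^9 \sqrt{\ln n})$ elements.

Next, we focus on constructing $f(\delta)$ such that $|f(\delta) - F(\mathbf{W}^{(\ell)}(\ell,\delta))| = O(\tfrac{1}{n^5})\, C(z)$ for each $\delta \in S$. In Section 7 in [96], the authors give an algorithm to construct an approximation to $S_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)$ with an error of $O(\tfrac{1}{n^5})$.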
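Using $\tilde{S}_{D\mathbf{W}^{(\ell)}(\ell,\delta)}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j)$ to denote the approximation and writing $\tilde{S}$ for $\tilde{S}_{D\mathbf{W}^{(\ell)}(\ell,\delta)}$ in the next display for brevity, we construct $f(\delta)$ in Step 3 of the de-randomization algorithm as
$$\begin{aligned} f(\delta) = \frac{1}{2} \sum_{(i,j)\in M(z)} \Big( & \rho_{ij}(z) \big[ \tilde{S}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i) + \tilde{S}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_j) + \tilde{S}(\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) - 1 \big] \\ & + \theta_i(z) \big[ \tilde{S}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i) + \tilde{S}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_j) + \tilde{S}(\bar{\mathbf{v}}_i, -\bar{\mathbf{v}}_j) - 1 \big] \\ & + \theta_j(z) \big[ \tilde{S}(\bar{\mathbf{v}}_0, -\bar{\mathbf{v}}_i) + \tilde{S}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_j) + \tilde{S}(-\bar{\mathbf{v}}_i, \bar{\mathbf{v}}_j) - 1 \big] \Big) \\ & + 2\,|N \setminus N(z)| \sum_{i \in N(z)} \theta_i(z)\, \tilde{S}(\bar{\mathbf{v}}_0, \bar{\mathbf{v}}_i). \end{aligned}$$
We can compute $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ by replacing $\mathbf{u}$ with $D\mathbf{W}^{(\ell)}(\ell,\delta)$ and $\mathbf{v}$ with $\bar{\mathbf{v}}$ on the right side of (B.18). Therefore, we obtain
$$\big| f(\delta) - F(\mathbf{W}^{(\ell)}(\ell,\delta)) \big| \le \frac{1}{2} \sum_{(i,j)\in M(z)} 3\,\big\{ \rho_{ij}(z) + \theta_i(z) + \theta_j(z) \big\}\, O\Big(\frac{1}{n^5}\Big) + 2\,|N \setminus N(z)| \sum_{i \in N(z)} \theta_i(z)\, O\Big(\frac{1}{n^5}\Big).$$
By the same reasoning that we use to obtain the equality in (B.25), the right side of the inequality above is $O(\tfrac{1}{n^5})\, C(z)$. Lastly, in the proof of Lemma B.5.10, we show that we can compute $F(\mathbf{W}^{(\ell)}(\ell,\delta))$ exactly when $\ell = 2n+1$. Thus, when $\ell = 2n+1$, we can use $f(\delta) = F(\mathbf{W}^{(\ell)}(\ell,\delta))$.

B.6 Structural Properties of the Extreme Points

We focus on the extreme points of the polyhedron given by the set of feasible solutions to the LP that computes $f^R$ at the beginning of Section 3.5.1. This polyhedron is given by
$$\mathcal{P} = \Big\{ (\mathbf{x}, \mathbf{y}) \in [0,1]^{|\hat{N}|} \times \mathbb{R}_+^{|\hat{M}|} : y_{ij} \ge x_i + x_j - 1 \;\; \forall (i,j) \in \hat{M}, \;\; \sum_{i \in \hat{N}} x_i \le c \Big\}.$$
If we have $c \ge n$ so that there is no capacity constraint, then $\mathcal{P}$ is the boolean quadric polytope studied by [109]. By Theorem 7 in [109], all components of any extreme point of the boolean quadric polytope take values in $\{0, \tfrac{1}{2}, 1\}$. Also, [65] studies optimization problems over the feasible set $\mathcal{P} \cap (\{0,1\}^{|\hat{N}|} \times \mathbb{R}_+^{|\hat{M}|})$ with $c \ge n$ and constructs half-integral solutions with objective values exceeding the optimal, in which case, she can obtain 0.5-approximate solutions when the objective function coefficients are all positive.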
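By Theorem 7 in [109], if $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is an extreme point of $\mathcal{P}$ with $c \ge n$, then $\hat{x}_i \in \{0, \tfrac{1}{2}, 1\}$ for all $i \in \hat{N}$. In the next counterexample, we show that this property does not hold when there is a capacity constraint.

Example B.6.1 (Dense Extreme Points in Capacitated Problem) Consider the polyhedron $\mathcal{P}$ for the case where we have $c = 3$ and $|\hat{N}| = 7$ with $\hat{N} = \{1, \ldots, 7\}$. Let $\hat{\mathbf{x}} = (\tfrac{2}{5}, \tfrac{2}{5}, \tfrac{2}{5}, \tfrac{2}{5}, \tfrac{2}{5}, \tfrac{2}{5}, \tfrac{3}{5})$ and $\hat{\mathbf{y}} = \mathbf{0} \in \mathbb{R}_+^{|\hat{M}|}$. Note that we have $\sum_{i \in \hat{N}} \hat{x}_i = c$ and $\hat{x}_i + \hat{x}_j - 1 \le 0 = \hat{y}_{ij}$ for all $(i,j) \in \hat{M}$, which implies that $(\hat{\mathbf{x}}, \hat{\mathbf{y}}) \in \mathcal{P}$. We claim that $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is an extreme point of $\mathcal{P}$. Assume on the contrary that there exist $(\hat{\mathbf{x}} + \boldsymbol{\epsilon}, \hat{\mathbf{y}} + \boldsymbol{\delta}) \in \mathcal{P}$ and $(\hat{\mathbf{x}} - \boldsymbol{\epsilon}, \hat{\mathbf{y}} - \boldsymbol{\delta}) \in \mathcal{P}$ with $(\boldsymbol{\epsilon}, \boldsymbol{\delta})$ non-zero so that we have $(\hat{\mathbf{x}}, \hat{\mathbf{y}}) = \tfrac{1}{2} (\hat{\mathbf{x}} + \boldsymbol{\epsilon}, \hat{\mathbf{y}} + \boldsymbol{\delta}) + \tfrac{1}{2} (\hat{\mathbf{x}} - \boldsymbol{\epsilon}, \hat{\mathbf{y}} - \boldsymbol{\delta})$. Since we have $\hat{\mathbf{y}} = \mathbf{0}$, $\hat{\mathbf{y}} + \boldsymbol{\delta} \in \mathbb{R}_+^{|\hat{M}|}$ and $\hat{\mathbf{y}} - \boldsymbol{\delta} \in \mathbb{R}_+^{|\hat{M}|}$, it must be the case that $\boldsymbol{\delta} = \mathbf{0}$. Noting that $(\hat{\mathbf{x}} + \boldsymbol{\epsilon}, \hat{\mathbf{y}} + \boldsymbol{\delta}) \in \mathcal{P}$ and $(\hat{\mathbf{x}} - \boldsymbol{\epsilon}, \hat{\mathbf{y}} - \boldsymbol{\delta}) \in \mathcal{P}$, for each $i \in \{1, \ldots, 6\}$, the constraint $y_{i7} \ge x_i + x_7 - 1$ that defines $\mathcal{P}$ yields
$$0 = \hat{y}_{i7} + \delta_{i7} \ge \hat{x}_i + \epsilon_i + \hat{x}_7 + \epsilon_7 - 1 \quad \text{and} \quad 0 = \hat{y}_{i7} - \delta_{i7} \ge \hat{x}_i - \epsilon_i + \hat{x}_7 - \epsilon_7 - 1.$$
In this case, since $\hat{x}_i + \hat{x}_7 = 1$ by the definition of $\hat{\mathbf{x}}$, the inequalities above imply that $\epsilon_7 = -\epsilon_i$ for all $i \in \{1, \ldots, 6\}$. Also, the constraint $\sum_{i \in \hat{N}} x_i \le c$ yields $\sum_{i=1}^7 \hat{x}_i + \sum_{i=1}^7 \epsilon_i \le c$ and $\sum_{i=1}^7 \hat{x}_i - \sum_{i=1}^7 \epsilon_i \le c$, in which case, noting that $\sum_{i=1}^7 \hat{x}_i = 3 = c$ by the definition of $\hat{\mathbf{x}}$, we obtain $\sum_{i=1}^7 \epsilon_i = 0$. Combining the last equality with the fact that $\epsilon_i = -\epsilon_7$ for all $i \in \{1, \ldots, 6\}$, we get $-5\,\epsilon_7 = 0$, so it follows that $\epsilon_i = 0$ for all $i \in \{1, \ldots, 7\}$. So, $(\boldsymbol{\epsilon}, \boldsymbol{\delta})$ is the zero vector, which is a contradiction.

Next, we give the proof of Lemma 3.5.1. Throughout the proof, we use the fact that if $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is an extreme point of $\mathcal{P}(H)$, then we must have $\hat{y}_{ij} = [\hat{x}_i + \hat{x}_j - 1]^+$ for all $(i,j) \in \hat{M}$. This result holds because if $\hat{y}_{ij} > [\hat{x}_i + \hat{x}_j - 1]^+$ for some $(i,j) \in \hat{M}$, we can perturb only this component of $\hat{\mathbf{y}}$ by $+\epsilon$ and $-\epsilon$ for a small enough $\epsilon > 0$ while keeping the other components of $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ constant.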
In this case, the two points that we obtain in this fashion are in $\mathcal{P}(H)$ and $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ can be written as a convex combination of the two points, which contradicts the fact that $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is an extreme point of $\mathcal{P}(H)$. Therefore, it indeed holds that $\hat{y}_{ij} = [\hat{x}_i + \hat{x}_j - 1]^+$ for all $(i,j) \in \hat{M}$ for any extreme point $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ of $\mathcal{P}(H)$. Below is the proof of Lemma 3.5.1.

Proof of Lemma 3.5.1: To get a contradiction, assume that we have an extreme point $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ such that $\hat{x}_i \notin (\tfrac{1}{2}, 1)$ for all $i \in \hat{N}$ and $\hat{x}_k \notin \{0, \tfrac{1}{2}, 1\}$ for some $k \in \hat{N}$. We define $F = \{k \in \hat{N} : \hat{x}_k \notin \{0, \tfrac{1}{2}, 1\}\}$. Consider some $k \in F$. Since we assume that $\hat{x}_i \notin (\tfrac{1}{2}, 1)$ for all $i \in \hat{N}$ and we have $\hat{x}_k \notin \{0, \tfrac{1}{2}, 1\}$, it follows that $\hat{x}_k \in (0, \tfrac{1}{2})$. Therefore, we have $\hat{x}_k \in (0, \tfrac{1}{2})$ for all $k \in F$. Also, by the definition of $F$, we have $\hat{x}_i \in \{0, \tfrac{1}{2}, 1\}$ for all $i \notin F$. Thus, we can partition $\hat{N}$ into three subsets $F = \{k \in \hat{N} : \hat{x}_k \in (0, \tfrac{1}{2})\}$, $S = \{i \in \hat{N} : \hat{x}_i \in \{0, \tfrac{1}{2}\}\}$ and $L = \{i \in \hat{N} : \hat{x}_i = 1\}$ so that $\hat{N} = F \cup S \cup L$. Since we assume that $\hat{x}_k \notin \{0, \tfrac{1}{2}, 1\}$ for some $k \in \hat{N}$, $|F| = |\hat{N} \setminus (S \cup L)| \ge 1$.

First, we consider the case $|F| = 1$. We use $k$ to denote the single element of $F$. Given the extreme point $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$, we define the solution $(\mathbf{x}^1, \mathbf{y}^1)$ as follows. For a small enough $\epsilon > 0$, we set $x^1_k = \hat{x}_k + \epsilon$, $x^1_i = \hat{x}_i$ for all $i \in S \cup L$ and $y^1_{ij} = [x^1_i + x^1_j - 1]^+$ for all $(i,j) \in \hat{M}$. Since we have $\hat{x}_k \in (0, \tfrac{1}{2})$ and $\hat{x}_i \in \{0, \tfrac{1}{2}, 1\}$ for all $i \in \hat{N} \setminus \{k\}$, the sum $\sum_{i \in \hat{N}} \hat{x}_i$ cannot be an integer. So, the constraint $\sum_{i \in \hat{N}} x_i \le c$ is not tight at the extreme point $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$, which implies that $\sum_{i \in \hat{N}} x^1_i = \sum_{i \in \hat{N}} \hat{x}_i + \epsilon \le c$ for a small enough $\epsilon$. Also, since $\hat{x}_k \in (0, \tfrac{1}{2})$, we obtain $x^1_k = \hat{x}_k + \epsilon \le 1$ for a small enough $\epsilon$. In this case, $(\mathbf{x}^1, \mathbf{y}^1) \in \mathcal{P}(H)$. Similarly, we define the solution $(\mathbf{x}^2, \mathbf{y}^2)$ as follows.
We set $x^2_k = \hat{x}_k - \epsilon$, $x^2_i = \hat{x}_i$ for all $i \in S \cup L$ and $y^2_{ij} = [x^2_i + x^2_j - 1]^+$ for all $(i,j) \in \hat{M}$. By using an argument similar to the one earlier in this paragraph, we can verify that $(\mathbf{x}^2, \mathbf{y}^2) \in \mathcal{P}(H)$.

Consider $(k,j) \in \hat{M}$ with $j \in S$. We have $y^1_{kj} = [x^1_k + x^1_j - 1]^+ = [\hat{x}_k + \epsilon + \hat{x}_j - 1]^+ = 0 = [\hat{x}_k + \hat{x}_j - 1]^+ = \hat{y}_{kj}$, where the third and fourth equalities follow from the fact that $\hat{x}_k \in (0, \tfrac{1}{2})$ and $\hat{x}_j \in \{0, \tfrac{1}{2}\}$ for all $j \in S$, whereas the last equality follows from the fact that $\hat{y}_{ij} = [\hat{x}_i + \hat{x}_j - 1]^+$ in the extreme point $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$. Similarly, we have $y^1_{jk} = \hat{y}_{jk}$ for all $(j,k) \in \hat{M}$ with $j \in S$. Consider $(k,j) \in \hat{M}$ with $j \in L$. We have $y^1_{kj} = [x^1_k + x^1_j - 1]^+ = [\hat{x}_k + \epsilon + \hat{x}_j - 1]^+ = [\hat{x}_k + \hat{x}_j - 1]^+ + \epsilon = \hat{y}_{kj} + \epsilon$, where the third equality follows from the fact that $\hat{x}_j = 1$ for all $j \in L$. Similarly, we have $y^1_{jk} = \hat{y}_{jk} + \epsilon$ for all $(j,k) \in \hat{M}$ with $j \in L$. For the other cases not considered by the preceding four conditions, we have $y^1_{ij} = \hat{y}_{ij}$. By following precisely the same line of reasoning used in this paragraph, we can also show that $y^2_{kj} = \hat{y}_{kj}$ for all $(k,j) \in \hat{M}$ with $j \in S$, $y^2_{jk} = \hat{y}_{jk}$ for all $(j,k) \in \hat{M}$ with $j \in S$, $y^2_{kj} = \hat{y}_{kj} - \epsilon$ for all $(k,j) \in \hat{M}$ with $j \in L$ and $y^2_{jk} = \hat{y}_{jk} - \epsilon$ for all $(j,k) \in \hat{M}$ with $j \in L$. Also, we have $y^2_{ij} = \hat{y}_{ij}$ for the other cases not considered by the preceding four conditions. In this case, we get $y^1_{ij} + y^2_{ij} = 2\,\hat{y}_{ij}$ for all $(i,j) \in \hat{M}$. The discussion in this and the previous paragraph shows that $\hat{\mathbf{x}} = \tfrac{1}{2} \mathbf{x}^1 + \tfrac{1}{2} \mathbf{x}^2$ and $\hat{\mathbf{y}} = \tfrac{1}{2} \mathbf{y}^1 + \tfrac{1}{2} \mathbf{y}^2$ for $(\mathbf{x}^1, \mathbf{y}^1) \in \mathcal{P}(H)$ and $(\mathbf{x}^2, \mathbf{y}^2) \in \mathcal{P}(H)$, so $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ cannot be an extreme point of $\mathcal{P}(H)$. Thus, we get a contradiction and the desired result follows.

Second, we consider the case $|F| \ge 2$. We use $k$ and $k'$ to denote any two elements of $F$. Given the extreme point $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$, we define the solution $(\mathbf{x}^1, \mathbf{y}^1)$ as follows.
For a small enough $\epsilon > 0$, we set $x^1_k = \hat{x}_k + \epsilon$, $x^1_{k'} = \hat{x}_{k'} - \epsilon$, $x^1_i = \hat{x}_i$ for all $i \in (F \setminus \{k, k'\}) \cup S \cup L$ and $y^1_{ij} = [x^1_i + x^1_j - 1]^+$ for all $(i,j) \in \hat{M}$. Since $\sum_{i \in \hat{N}} x^1_i = \sum_{i \in \hat{N}} \hat{x}_i$, it follows that $(\mathbf{x}^1, \mathbf{y}^1)$ satisfies the constraint $\sum_{i \in \hat{N}} x^1_i \le c$. Also, since $\hat{x}_k \in (0, \tfrac{1}{2})$ and $\hat{x}_{k'} \in (0, \tfrac{1}{2})$, we have $x^1_k \le 1$ and $x^1_{k'} \ge 0$. In this case, it follows that $(\mathbf{x}^1, \mathbf{y}^1) \in \mathcal{P}(H)$. Similarly, we define the solution $(\mathbf{x}^2, \mathbf{y}^2)$ as follows. We set $x^2_k = \hat{x}_k - \epsilon$, $x^2_{k'} = \hat{x}_{k'} + \epsilon$, $x^2_i = \hat{x}_i$ for all $i \in (F \setminus \{k, k'\}) \cup S \cup L$ and $y^2_{ij} = [x^2_i + x^2_j - 1]^+$ for all $(i,j) \in \hat{M}$. Using an argument similar to the one earlier in this paragraph, we have $(\mathbf{x}^2, \mathbf{y}^2) \in \mathcal{P}(H)$. Lastly, following an argument similar to the one in the previous paragraph, we can show that $\hat{\mathbf{x}} = \tfrac{1}{2} \mathbf{x}^1 + \tfrac{1}{2} \mathbf{x}^2$ and $\hat{\mathbf{y}} = \tfrac{1}{2} \mathbf{y}^1 + \tfrac{1}{2} \mathbf{y}^2$, in which case, we, once more, reach a contradiction.

B.7 Generalized Nested Logit Model with at Most Two Products per Nest

In this section, we give extensions of our results to the generalized nested logit model with at most two products in each nest.

B.7.1 Assortment Problem

We index the set of products by $N = \{1, \ldots, n\}$. We use $\mathbf{x} = (x_1, \ldots, x_n) \in \{0,1\}^n$ to capture the subset of products that we offer to the customers, where $x_i = 1$ if and only if we offer product $i$. We denote the collection of nests by $M$. For each nest $q \in M$, we let $\gamma_q \in [0,1]$ be the dissimilarity parameter of the nest. For each product $i$ and nest $q$, we let $\alpha_{iq} \in [0,1]$ be the membership parameter of product $i$ for nest $q$. We have $\sum_{q \in M} \alpha_{iq} = 1$ for all $i \in N$, so the membership parameters for a particular product add up to one. Letting $B_q = \{i \in N : \alpha_{iq} > 0\}$, $B_q$ corresponds to the set of products with strictly positive membership parameters for nest $q$. We use $v_i$ to denote the preference weight of product $i$ and $v_0$ to denote the preference weight of the no purchase option.
Letting $V_q(\mathbf{x}) = \sum_{i \in B_q} (\alpha_{iq} v_i)^{1/\gamma_q}\, x_i$, under the generalized nested logit model, if we offer the subset of products $\mathbf{x}$, then a customer decides to make a purchase in nest $q$ with probability $V_q(\mathbf{x})^{\gamma_q} / (v_0 + \sum_{\ell \in M} V_\ell(\mathbf{x})^{\gamma_\ell})$. If the customer decides to make a purchase in nest $q$, then she chooses product $i \in B_q$ with probability $(\alpha_{iq} v_i)^{1/\gamma_q}\, x_i / V_q(\mathbf{x})$. In this case, if we offer the subset of products $\mathbf{x}$ and a customer has already decided to make a purchase in nest $q$, then the expected revenue that we obtain from the customer is $R_q(\mathbf{x}) = \sum_{i \in B_q} p_i\, (\alpha_{iq} v_i)^{1/\gamma_q}\, x_i / V_q(\mathbf{x})$, where $p_i$ is the revenue of product $i$. Letting $\pi(\mathbf{x})$ be the expected revenue that we obtain from a customer when we offer the subset of products $\mathbf{x}$, we have
$$\pi(\mathbf{x}) = \sum_{q \in M} \frac{V_q(\mathbf{x})^{\gamma_q}}{v_0 + \sum_{\ell \in M} V_\ell(\mathbf{x})^{\gamma_\ell}}\, R_q(\mathbf{x}) = \frac{\sum_{q \in M} V_q(\mathbf{x})^{\gamma_q}\, R_q(\mathbf{x})}{v_0 + \sum_{q \in M} V_q(\mathbf{x})^{\gamma_q}}.$$
Defining the set of feasible subsets of products $\mathcal{F}$ that we can offer in the same way that we define in the main body of the paper, we formulate our assortment problem as
$$z^* = \max_{\mathbf{x} \in \mathcal{F}} \pi(\mathbf{x}) = \max_{\mathbf{x} \in \mathcal{F}} \left\{ \frac{\sum_{q \in M} V_q(\mathbf{x})^{\gamma_q}\, R_q(\mathbf{x})}{v_0 + \sum_{q \in M} V_q(\mathbf{x})^{\gamma_q}} \right\}. \tag{B.26}$$
The PCL model is a special case of the generalized nested logit model. Since the assortment problem under the PCL model is strongly NP-hard, the problem above is strongly NP-hard as well.

B.7.2 A Framework for Approximation Algorithms

To relate problem (B.26) to the problem of computing the fixed point of a function, we define the function $f : \mathbb{R} \to \mathbb{R}$ as
$$f(z) = \max_{\mathbf{x} \in \mathcal{F}} \Big\{ \sum_{q \in M} V_q(\mathbf{x})^{\gamma_q} \big(R_q(\mathbf{x}) - z\big) \Big\}. \tag{B.27}$$
We can show that $f(\cdot)$ is decreasing and continuous with $f(0) \ge 0$. Therefore, there exists a unique $\hat{z} \ge 0$ that satisfies $f(\hat{z}) = v_0\, \hat{z}$. In our approximation framework, we will construct an upper bound $f^R(\cdot)$ on $f(\cdot)$ so that $f^R(z) \ge f(z)$ for all $z \in \mathbb{R}$. This upper bound will be decreasing and continuous with $f^R(0) \ge 0$, so that there also exists a unique $\hat{z} \ge 0$ that satisfies $f^R(\hat{z}) = v_0\, \hat{z}$.
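The expected revenue under the generalized nested logit model and the fixed-point relation for $f(\cdot)$ can be illustrated on a small instance. The sketch below is a brute-force illustration under stated assumptions, not an efficient method: it assumes there is no capacity constraint (so every subset is feasible), uses strictly positive dissimilarity parameters, and the instance data and the names `nest_terms`, `revenue` and `f` are hypothetical. It checks numerically that the optimal expected revenue $z^*$ of (B.26) satisfies $f(z^*) = v_0 z^*$.

```python
from itertools import product


def nest_terms(x, nest, v, p):
    """Return (V_q(x)^gamma_q, R_q(x)) for one nest, given as (gamma_q, {i: alpha_iq})."""
    gamma, alpha = nest
    weights = {i: (a * v[i]) ** (1.0 / gamma) * x[i] for i, a in alpha.items()}
    total = sum(weights.values())
    if total == 0.0:  # no product of this nest is offered
        return 0.0, 0.0
    r = sum(p[i] * w for i, w in weights.items()) / total
    return total ** gamma, r


def revenue(x, nests, v, p, v0):
    """Expected revenue pi(x) under the generalized nested logit model."""
    terms = [nest_terms(x, nest, v, p) for nest in nests]
    return sum(vg * r for vg, r in terms) / (v0 + sum(vg for vg, _ in terms))


def f(z, nests, v, p, n):
    """f(z) as in (B.27), by enumeration over all subsets (uncapacitated case)."""
    best = float("-inf")
    for x in product((0, 1), repeat=n):
        terms = [nest_terms(x, nest, v, p) for nest in nests]
        best = max(best, sum(vg * (r - z) for vg, r in terms))
    return best


# Hypothetical instance: three products, at most two products per nest,
# membership parameters of each product summing to one across nests.
v, p, v0 = [1.0, 0.8, 1.2], [10.0, 8.0, 5.0], 1.0
nests = [(0.5, {0: 0.6, 1: 0.4}), (0.8, {0: 0.4, 2: 1.0}), (1.0, {1: 0.6})]

z_star = max(revenue(x, nests, v, p, v0) for x in product((0, 1), repeat=3))
fixed_point_gap = abs(f(z_star, nests, v, p, 3) - v0 * z_star)
```

The gap is zero up to floating-point error because $\pi(\mathbf{x}) \le z^*$ for every subset is equivalent to $\sum_{q} V_q(\mathbf{x})^{\gamma_q} (R_q(\mathbf{x}) - z^*) \le v_0 z^*$, with equality at an optimal subset.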
Theorem 3.3.1 continues to hold, as long as we replace $\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$ in the Sufficient Condition with $\sum_{q \in M} V_q(\hat{\mathbf{x}})^{\gamma_q} (R_q(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$. The approximation framework given in Section 3.3.1 continues to hold as well, as long as we replace $\sum_{(i,j)\in M} V_{ij}(\hat{\mathbf{x}})^{\gamma_{ij}} (R_{ij}(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$ in Step 3 with $\sum_{q \in M} V_q(\hat{\mathbf{x}})^{\gamma_q} (R_q(\hat{\mathbf{x}}) - \hat{z}) \ge \alpha\, f^R(\hat{z})$. The preceding discussion holds with an arbitrary number of products in each nest. In the rest of the discussion, we focus on the case where there are at most two products in each nest so that $|B_q| \le 2$ for all $q \in M$.

We proceed to constructing an upper bound $f^R(\cdot)$ on $f(\cdot)$. To construct this upper bound, we use an LP relaxation of problem (B.27). In particular, we define $\rho_q(z)$ and $\theta_{iq}(z)$ as
$$\rho_q(z) = \Big( \sum_{i \in B_q} (\alpha_{iq} v_i)^{1/\gamma_q} \Big)^{\gamma_q}\, \frac{\sum_{i \in B_q} (p_i - z)\, (\alpha_{iq} v_i)^{1/\gamma_q}}{\sum_{i \in B_q} (\alpha_{iq} v_i)^{1/\gamma_q}} \quad \text{and} \quad \theta_{iq}(z) = \alpha_{iq}\, v_i\, (p_i - z).$$
Following the same discussion at the beginning of Section 3.3.2, if we have $x_i = 1$ for all $i \in B_q$, then $V_q(\mathbf{x})^{\gamma_q} (R_q(\mathbf{x}) - z) = \rho_q(z)$, whereas if we have $x_i = 1$ for exactly one $i \in B_q$, then $V_q(\mathbf{x})^{\gamma_q} (R_q(\mathbf{x}) - z) = \theta_{iq}(z)$. Note that since $|B_q| \le 2$, if we offer some product in nest $q$, then we must have either $x_i = 1$ for all $i \in B_q$ or $x_i = 1$ for exactly one $i \in B_q$. In this case, letting $\mu_q(z) = \rho_q(z) - \sum_{i \in B_q} \theta_{iq}(z)$ for notational brevity, we can express $V_q(\mathbf{x})^{\gamma_q} (R_q(\mathbf{x}) - z)$ in the objective function of problem (B.27) succinctly as
$$V_q(\mathbf{x})^{\gamma_q} \big(R_q(\mathbf{x}) - z\big) = \rho_q(z) \prod_{j \in B_q} x_j + \sum_{i \in B_q} \theta_{iq}(z) \Big( 1 - \prod_{j \in B_q \setminus \{i\}} x_j \Big) x_i = \mu_q(z) \prod_{i \in B_q} x_i + \sum_{i \in B_q} \theta_{iq}(z)\, x_i. \tag{B.28}$$
Thus, we can use the expression $\sum_{q \in M} (\mu_q(z) \prod_{i \in B_q} x_i + \sum_{i \in B_q} \theta_{iq}(z)\, x_i)$ to equivalently write the objective function of problem (B.27), in which case, this problem becomes
$$f(z) = \max \Big\{ \sum_{q \in M} \Big( \mu_q(z) \prod_{i \in B_q} x_i + \sum_{i \in B_q} \theta_{iq}(z)\, x_i \Big) : \sum_{i \in N} x_i \le c, \;\; x_i \in \{0,1\} \;\; \forall i \in N \Big\}.$$
To linearize the term $\prod_{i \in B_q} x_i$ in the objective function above, we define the decision variable $y_q \in \{0,1\}$ with the interpretation that $y_q = \prod_{i \in B_q} x_i$. To ensure that $y_q$ takes the value $\prod_{i \in B_q} x_i$, we impose the constraints $y_q \ge \sum_{i \in B_q} x_i - |B_q| + 1$ and $y_q \le x_i$ for all $i \in B_q$. In this case, noting that $|B_q| \le 2$, if $x_i = 1$ for all $i \in B_q$, then the constraints $y_q \ge \sum_{i \in B_q} x_i - |B_q| + 1$ and $y_q \le x_i$ for all $i \in B_q$ ensure that $y_q = 1$. If $x_i = 0$ for some $i \in B_q$, then the constraint $y_q \le x_i$ ensures that $y_q = 0$. Using the decision variables $\{y_q : q \in M\}$, we can formulate the problem above as an integer program. We use the LP relaxation of this integer program to construct the upper bound $f^R(\cdot)$ on $f(\cdot)$. In particular, we define the upper bound $f^R(\cdot)$ as
$$\begin{aligned} f^R(z) = \max \;\; & \sum_{q \in M} \Big( \mu_q(z)\, y_q + \sum_{i \in B_q} \theta_{iq}(z)\, x_i \Big) \\ \text{s.t.} \;\; & y_q \ge \sum_{i \in B_q} x_i - |B_q| + 1 \quad \forall q \in M \\ & y_q \le x_i \quad \forall i \in B_q,\; q \in M \\ & \sum_{i \in N} x_i \le c \\ & 0 \le x_i \le 1 \;\; \forall i \in N, \quad y_q \ge 0 \;\; \forall q \in M. \end{aligned} \tag{B.29}$$
In the formulation of the LP above, we use the fact that if we offer some product in a nest, then we offer either all or one of the products in this nest, which holds when the number of products in this nest is at most two. Therefore, our development uses the assumption that $|B_q| \le 2$ for all $q \in M$. Lemma 3.3.2 continues to hold so that $f^R(z)$ is decreasing in $z$. In this case, by using the same argument right before Lemma 3.3.2, it follows that $f^R(\cdot)$ is decreasing and continuous with $f^R(0) \ge 0$. Therefore, there exists a unique $\hat{z} \ge 0$ satisfying $f^R(\hat{z}) = v_0\, \hat{z}$, corresponding to the fixed point of $f^R(\cdot)/v_0$. Furthermore, using the same approach in the proof of Theorem 3.3.3, we can show that we can solve an LP to compute the fixed point of $f^R(\cdot)/v_0$.

We define $N(z) = \{i \in N : p_i \ge z\}$ and $M(z) = \{q \in M : p_i \ge z \;\; \forall i \in B_q\}$.
Using the same approach in the proof of Lemma B.2.1, we can show that there exists an optimal solution $\mathbf{x}^* = \{x^*_i : i \in N\}$ and $\mathbf{y}^* = \{y^*_q : q \in M\}$ to problem (B.29) with $x^*_i = 0$ for all $i \notin N(z)$ and $y^*_q = 0$ for all $q \notin M(z)$. Therefore, without loss of generality, we assume that if $(\mathbf{x}^*, \mathbf{y}^*)$ is an optimal solution to problem (B.29), then $x^*_i = 0$ for all $i \notin N(z)$ and $y^*_q = 0$ for all $q \notin M(z)$. Also, using the same approach in the proof of Lemma B.2.2, we can show that $\mu_q(z) \le 0$ for all $q \in M(z)$, in which case, the decision variables $\{y_q : q \in M(z)\}$ take their smallest possible value in an optimal solution to problem (B.29). Thus, the constraint $y_q \le x_i$ is redundant. In this case, problem (B.29) is equivalent to the problem
$$\begin{aligned} f^R(z) = \max \;\; & \sum_{q \in M} \Big( \mu_q(z)\, y_q + \sum_{i \in B_q} \theta_{iq}(z)\, x_i \Big) \\ \text{s.t.} \;\; & y_q \ge \sum_{i \in B_q} x_i - |B_q| + 1 \quad \forall q \in M \\ & \sum_{i \in N} x_i \le c \\ & 0 \le x_i \le 1 \;\; \forall i \in N, \quad y_q \ge 0 \;\; \forall q \in M. \end{aligned} \tag{B.30}$$
Working with problem (B.30), rather than problem (B.29), will be more convenient. In the next two sections, we focus on the uncapacitated and capacitated problems separately.

B.7.3 Uncapacitated Problem

We consider the case where $c \ge n$ so that there is no capacity constraint. We let $\hat{z}$ be such that $f^R(\hat{z}) = v_0\, \hat{z}$. Throughout this section, since the value of $\hat{z}$ is fixed, as is done in the main body of the paper, we exclude the reference to $\hat{z}$. In particular, we let $\mu_q = \mu_q(\hat{z})$, $\theta_{iq} = \theta_{iq}(\hat{z})$, $\rho_q = \rho_q(\hat{z})$, $\hat{N} = N(\hat{z})$, $\hat{M} = M(\hat{z})$ and $f^R = f^R(\hat{z})$. We let $(\mathbf{x}^*, \mathbf{y}^*)$ be an optimal solution to problem (B.30) with $z = \hat{z}$. As discussed at the end of the previous section, without loss of generality, we assume that $x^*_i = 0$ for all $i \notin \hat{N}$ and $y^*_q = 0$ for all $q \notin \hat{M}$. We define the random subset of products $\hat{\mathbf{X}} = \{\hat{X}_i : i \in N\}$ as follows. For each $i \in N$, we have $\hat{X}_i = 1$ with probability $x^*_i$, whereas $\hat{X}_i = 0$ with probability $1 - x^*_i$. Different components of the vector $\hat{\mathbf{X}}$ are independent of each other.
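Two facts about a single nest with two products can be checked by enumeration: the multilinear form in (B.28), and the observation that, under independent rounding, the expectation of that form is obtained by plugging in the rounding probabilities. The sketch below is an illustrative check on hypothetical data (the values in `alpha_v`, `prices`, `gamma`, `z` and `probs` are made up), not a proof.

```python
from itertools import product


def nest_value(x, alpha_v, prices, gamma, z):
    """V_q(x)^gamma_q * (R_q(x) - z), computed directly from the definitions."""
    w = [av ** (1.0 / gamma) * xi for av, xi in zip(alpha_v, x)]
    total = sum(w)
    if total == 0.0:  # nothing offered in this nest
        return 0.0
    r = sum(pi * wi for pi, wi in zip(prices, w)) / total
    return total ** gamma * (r - z)


alpha_v = [0.6, 0.9]        # hypothetical values of alpha_iq * v_i for the two products
prices = [10.0, 4.0]        # hypothetical revenues p_i
gamma, z = 0.7, 3.0

# rho_q(z), theta_iq(z) and mu_q(z) as defined before (B.28).
w_all = [av ** (1.0 / gamma) for av in alpha_v]
rho = sum(w_all) ** gamma * sum((pi - z) * wi for pi, wi in zip(prices, w_all)) / sum(w_all)
theta = [av * (pi - z) for av, pi in zip(alpha_v, prices)]
mu = rho - sum(theta)


def multilinear(x):
    """Right side of (B.28) for a nest with two products."""
    return mu * x[0] * x[1] + theta[0] * x[0] + theta[1] * x[1]


subsets = list(product((0, 1), repeat=2))
identity_gap = max(abs(nest_value(x, alpha_v, prices, gamma, z) - multilinear(x))
                   for x in subsets)

# Independent rounding: product i is offered with probability probs[i].
probs = (0.3, 0.8)
expected = sum(
    nest_value(x, alpha_v, prices, gamma, z)
    * (probs[0] if x[0] else 1 - probs[0])
    * (probs[1] if x[1] else 1 - probs[1])
    for x in subsets
)
rounding_gap = abs(expected - multilinear(probs))
```

On this instance both gaps vanish up to floating-point error and `mu` is negative, consistent with $\mu_q \le 0$ when both prices are at least $z$.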
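Through minor modifications in the proof of Theorem 3.4.1, we can show that
$$\mathbb{E}\Big\{ \sum_{q \in M} V_q(\hat{\mathbf{X}})^{\gamma_q} \big(R_q(\hat{\mathbf{X}}) - \hat{z}\big) \Big\} \ge 0.5\, f^R. \tag{B.31}$$
In particular, using the same argument in the proof of Lemma B.2.2, we can show that $\mu_q \le 0$ for all $q \in \hat{M}$. In this case, for $q \in \hat{M}$, the decision variable $y_q$ takes its smallest possible value in an optimal solution to problem (B.30). Thus, without loss of generality, we can assume that the optimal solution $(\mathbf{x}^*, \mathbf{y}^*)$ to problem (B.30) satisfies $y^*_q = [\sum_{i \in B_q} x^*_i - |B_q| + 1]^+$ for all $q \in \hat{M}$. Furthermore, by the definition of $\hat{\mathbf{X}}$, we have $\mathbb{E}\{\hat{X}_i\} = x^*_i$ for all $i \in N$ and the different components of $\hat{\mathbf{X}}$ are independent of each other. In this case, noting (B.28), we get
$$\begin{aligned} \sum_{q \in M} \mathbb{E}\big\{ V_q(\hat{\mathbf{X}})^{\gamma_q} (R_q(\hat{\mathbf{X}}) - \hat{z}) \big\} &= \sum_{q \in M} \Big( \mu_q \prod_{i \in B_q} \mathbb{E}\{\hat{X}_i\} + \sum_{i \in B_q} \theta_{iq}\, \mathbb{E}\{\hat{X}_i\} \Big) = \sum_{q \in M} \Big( \mu_q \prod_{i \in B_q} x^*_i + \sum_{i \in B_q} \theta_{iq}\, x^*_i \Big) \\ &= \sum_{q \in M} \mathbf{1}(q \in \hat{M}) \Big( \mu_q \prod_{i \in B_q} x^*_i + \sum_{i \in B_q} \theta_{iq}\, x^*_i \Big) + \sum_{q \in M} \mathbf{1}(q \notin \hat{M}) \Big( \sum_{i \in B_q} \theta_{iq}\, x^*_i \Big) \\ &= \sum_{q \in M} \mathbf{1}(q \in \hat{M}) \Big( \mu_q \Big[ \sum_{i \in B_q} x^*_i - |B_q| + 1 \Big]^+ + \sum_{i \in B_q} \theta_{iq}\, x^*_i \Big) + \sum_{q \in M} \mathbf{1}(q \notin \hat{M}) \Big( \sum_{i \in B_q} \theta_{iq}\, x^*_i \Big) \\ &\quad + \sum_{q \in M} \mathbf{1}(q \in \hat{M})\, \mu_q \Big( \prod_{i \in B_q} x^*_i - \Big[ \sum_{i \in B_q} x^*_i - |B_q| + 1 \Big]^+ \Big) \\ &= f^R + \sum_{q \in \hat{M}} \mu_q \Big( \prod_{i \in B_q} x^*_i - \Big[ \sum_{i \in B_q} x^*_i - |B_q| + 1 \Big]^+ \Big) \ge f^R + \frac{1}{4} \sum_{q \in \hat{M}} \mu_q. \end{aligned} \tag{B.32}$$
In the chain of inequalities above, the third equality uses the fact that if $q \notin \hat{M}$, then there exists some $j \in B_q$ such that $p_j < \hat{z}$, in which case, we get $j \notin \hat{N}$. Having $j \notin \hat{N}$ implies that $x^*_j = 0$. Thus, there exists some $j \in B_q$ such that $x^*_j = 0$, which yields $\prod_{i \in B_q} x^*_i = 0$. The fifth equality is by the fact that $y^*_q = [\sum_{i \in B_q} x^*_i - |B_q| + 1]^+$ for all $q \in \hat{M}$ and $y^*_q = 0$ for all $q \notin \hat{M}$. To see that the inequality holds, if $|B_q| = 2$ for nest $q$, then we use the fact that $0 \le ab - [a + b - 1]^+ \le 1/4$ for all $a, b \in [0,1]$, whereas if $|B_q| = 1$ for nest $q$, then we use the fact that $a - [a]^+ = 0$ for all $a \in [0,1]$, along with the fact that $\mu_q \le 0$ for all $q \in \hat{M}$.

Next, we give a feasible solution to problem (B.30), which, in turn, allows us to construct a lower bound on $f^R$.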
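We define the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ to problem (B.30) as
$$\hat{x}_i = \begin{cases} \tfrac{1}{2} & \text{if } i \in \hat{N} \\ 0 & \text{if } i \notin \hat{N}, \end{cases} \qquad \hat{y}_q = \begin{cases} 0 & \text{if } |B_q| = 2 \\ 1 & \text{if } |B_q| = 1. \end{cases}$$
It is straightforward to check that the solution $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ is feasible to problem (B.30) when we do not have a capacity constraint. Therefore, the objective value of problem (B.30) evaluated at $(\hat{\mathbf{x}}, \hat{\mathbf{y}})$ provides a lower bound on $f^R$, so we have
$$\begin{aligned} f^R &\ge \sum_{q \in M} \Big( \mu_q\, \hat{y}_q + \sum_{i \in B_q} \theta_{iq}\, \hat{x}_i \Big) = \sum_{q \in M} \mathbf{1}(q \in \hat{M}) \sum_{i \in B_q} \theta_{iq}\, \hat{x}_i + \sum_{q \in M} \mathbf{1}(q \notin \hat{M}) \sum_{i \in B_q} \theta_{iq}\, \hat{x}_i \\ &= \frac{1}{2} \sum_{q \in M} \mathbf{1}(q \in \hat{M}) \sum_{i \in B_q} \theta_{iq} + \frac{1}{2} \sum_{q \in M} \mathbf{1}(q \notin \hat{M}) \sum_{i \in B_q} \mathbf{1}(i \in \hat{N})\, \theta_{iq} \ge \frac{1}{2} \sum_{q \in M} \mathbf{1}(q \in \hat{M}) \sum_{i \in B_q} \theta_{iq} \\ &\ge \frac{1}{2} \sum_{q \in M} \mathbf{1}(q \in \hat{M}) \Big( \sum_{i \in B_q} \theta_{iq} - \rho_q \Big) = -\frac{1}{2} \sum_{q \in \hat{M}} \mu_q. \end{aligned}$$
In the chain of inequalities above, the first equality holds because the definition of $\mu_q$ immediately implies that $\mu_q = 0$ when $|B_q| = 1$. Also, we have $\hat{y}_q = 0$ when $|B_q| = 2$. Therefore, we have $\mu_q\, \hat{y}_q = 0$ for all $q \in M$. The second equality holds since having $q \in \hat{M}$ implies having $i \in \hat{N}$ for all $i \in B_q$, in which case, we have $\hat{x}_i = \tfrac{1}{2}$ for all $i \in B_q$. The second inequality holds because if we have $i \in \hat{N}$, then $\theta_{iq} \ge 0$ by the definition of $\hat{N}$ and $\theta_{iq}$. The last inequality follows from the fact that if we have $q \in \hat{M}$, then the definition of $\rho_q$ implies that $\rho_q \ge 0$. The chain of inequalities above yields the inequality $f^R + \tfrac{1}{2} \sum_{q \in \hat{M}} \mu_q \ge 0$, in which case, by the chain of inequalities in (B.32), we get $\sum_{q \in M} \mathbb{E}\{V_q(\hat{\mathbf{X}})^{\gamma_q} (R_q(\hat{\mathbf{X}}) - \hat{z})\} \ge \tfrac{1}{2} f^R + \tfrac{1}{2} ( f^R + \tfrac{1}{2} \sum_{q \in \hat{M}} \mu_q ) \ge \tfrac{1}{2} f^R$, establishing (B.31).

The subset of products $\hat{\mathbf{X}}$ is a random variable but we can use the method of conditional expectations to de-randomize the subset of products $\hat{\mathbf{X}}$ so that we obtain a deterministic subset of products $\hat{\mathbf{x}}$ that satisfies $\sum_{q \in M} V_q(\hat{\mathbf{x}})^{\gamma_q} (R_q(\hat{\mathbf{x}}) - \hat{z}) \ge 0.5\, f^R$. In this case, $\hat{\mathbf{x}}$ is a 0.5-approximate solution to the uncapacitated problem under the generalized nested logit model with at most two products in each nest.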
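The elementary bound $0 \le ab - [a + b - 1]^+ \le 1/4$ used for the inequality in (B.32) can be checked on a grid. The sketch below is a sanity check in exact rational arithmetic, not a proof; the grid resolution of $1/20$ and the names `gap`, `grid` and `maximizers` are arbitrary choices.

```python
from fractions import Fraction


def gap(a, b):
    """a * b - [a + b - 1]^+ for a, b in [0, 1]."""
    return a * b - max(a + b - 1, Fraction(0))


grid = [Fraction(k, 20) for k in range(21)]
values = {(a, b): gap(a, b) for a in grid for b in grid}
largest = max(values.values())
smallest = min(values.values())
maximizers = [ab for ab, g in values.items() if g == largest]
```

The maximum of $1/4$ is attained only at $a = b = \tfrac{1}{2}$. When $a + b \le 1$ the expression equals $ab$, and when $a + b \ge 1$ it equals $(1-a)(1-b)$, so in both cases it is a product of two nonnegative numbers whose sum is at most one.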
Unfortunately, our approach in Appendix B.3 to obtain a 0.6-approximate solution does not extend to the generalized nested logit model. In Appendix B.3, letting $\hat N=\{1,\ldots,m\}$, we index the products such that $\theta_1\ge\theta_2\ge\ldots\ge\theta_m$. Under the generalized nested logit model, it is not possible to ensure that we have $\theta_{1q}\ge\theta_{2q}\ge\ldots\ge\theta_{mq}$ for all $q\in M$.

B.7.4 Capacitated Problem

Considering the assortment problem with the capacity constraint, we let $\hat z$ be such that $f_R(\hat z)=v_0\,\hat z$, where $f_R(z)$ is the optimal objective value of problem (B.30). Our goal is to find a subset of products $\hat x$ such that $\sum_{q\in M}V_q(\hat x)^{\gamma_q}(R_q(\hat x)-\hat z)\ge 0.25\,f_R(\hat z)$ and $\sum_{i\in N}\hat x_i\le c$.

B.7.4.1 Half-Integral Solutions Through Iterative Rounding

Similar to our approach for the capacitated problem, since $\hat z$ is fixed, we exclude the reference to $\hat z$. In particular, we let $\mu_q=\mu_q(\hat z)$, $\theta_{iq}=\theta_{iq}(\hat z)$, $\rho_{iq}=\rho_{iq}(\hat z)$, $\hat N=N(\hat z)$, $\hat M=M(\hat z)$ and $f_R=f_R(\hat z)$. For any $H\subseteq\hat N$, we use the polyhedron $\mathcal{P}(H)$ to denote the set of feasible solutions to problem (B.30) after we fix the values of the decision variables $\{x_i:i\in H\}$ at $\tfrac12$ and the values of the decision variables $\{x_i:i\notin\hat N\}$ and $\{y_q:q\notin\hat M\}$ at zero. Therefore, the polyhedron $\mathcal{P}(H)$ is given by

$$\mathcal{P}(H)=\Big\{(x,y)\in[0,1]^{|N|}\times\mathbb{R}_+^{|M|}:\ y_q\ge\sum_{i\in B_q}x_i-|B_q|+1\ \ \forall q\in M,\ \ \sum_{i\in N}x_i\le c,\ \ x_i=\tfrac12\ \ \forall i\in H,\ \ x_i=0\ \ \forall i\notin\hat N,\ \ y_q=0\ \ \forall q\notin\hat M\Big\}. \tag{B.33}$$

An analogue of Lemma 3.5.1 holds for the polyhedron $\mathcal{P}(H)$ given above. In particular, for any $H\subseteq\hat N$, letting $(\hat x,\hat y)$ be an extreme point of $\mathcal{P}(H)$, we can use the same approach as in the proof of Lemma 3.5.1 to show that if there is no product $i\in\hat N$ such that $\tfrac12<\hat x_i<1$, then we have $\hat x_i\in\{0,\tfrac12,1\}$ for all $i\in\hat N$. We use the same iterative rounding algorithm given in Section 3.5.1, as long as we modify the objective function of the Variable Fixing problem to reflect the objective function of problem (B.30).
Thus, we replace the Variable Fixing problem in Step 2 with

$$f^k=\max\Big\{\sum_{q\in M}\Big(\mu_q(z)\,y_q+\sum_{i\in B_q}\theta_{iq}(z)\,x_i\Big):\ (x,y)\in\mathcal{P}(H^k)\Big\}. \tag{B.34}$$

At the first iteration of the iterative rounding algorithm, we have $H^1=\emptyset$. Also, as discussed earlier, there exists an optimal solution $(x^*,y^*)$ to problem (B.30), where we have $x^*_i=0$ for all $i\notin\hat N$ and $y^*_q=0$ for all $q\notin\hat M$. Therefore, we have $f^1=f_R$. As the iterations of the iterative rounding algorithm progress, we fix additional variables at the value $\tfrac12$. Therefore, the optimal objective value of problem (B.34) degrades from iteration $k$ to $k+1$. We can use the same approach as in the proof of Lemma 3.5.3 to upper bound the degradation in the optimal objective value. In particular, we can show that $f^k-f^{k+1}\le\tfrac12\sum_{q\in M}\mathbf{1}(i_k\in B_q)\,\theta_{i_k,q}$, where $i_k$ is the product that we choose in Step 3 of the iterative rounding algorithm at iteration $k$.

To see this result, we define the solution $(\tilde x,\tilde y)$ to problem (B.34) at iteration $k+1$ as follows. Letting $(x^k,y^k)$ be an optimal solution to problem (B.34) at iteration $k$ and $i_k$ be the product that we choose in Step 3 of the iterative rounding algorithm at iteration $k$, we set $\tilde x_i=x^k_i$ for all $i\in\hat N\setminus\{i_k\}$, $\tilde x_{i_k}=\tfrac12$ and $\tilde x_i=0$ for all $i\notin\hat N$. Also, we set $\tilde y_q=\big[\sum_{i\in B_q}\tilde x_i-|B_q|+1\big]^+$ for all $q\in\hat M$ and $\tilde y_q=0$ for all $q\notin\hat M$. Using the same approach as in the proof of Lemma 3.5.3, we can show that the solution $(\tilde x,\tilde y)$ is feasible to problem (B.34) at iteration $k+1$. Furthermore, we can show that $y^k_q\ge\tilde y_q$ for all $q\in M$. In this case, since $(\tilde x,\tilde y)$ is a feasible but not necessarily an optimal solution to problem (B.34) at iteration $k+1$, we obtain

$$\begin{aligned}
f^k-f^{k+1}&\le f^k-\sum_{q\in M}\Big(\mu_q\,\tilde y_q+\sum_{i\in B_q}\theta_{iq}\,\tilde x_i\Big)
 = f^k-\sum_{q\in M}\mu_q\,\tilde y_q-\sum_{i\in N}\sum_{q\in M}\mathbf{1}(i\in B_q)\,\theta_{iq}\,\tilde x_i\\
&= \sum_{q\in M}\mu_q\,(y^k_q-\tilde y_q)+\sum_{q\in M}\mathbf{1}(i_k\in B_q)\,\theta_{i_k,q}\Big(x^k_{i_k}-\frac12\Big)+\sum_{i\in N\setminus\{i_k\}}\sum_{q\in M}\mathbf{1}(i\in B_q)\,\theta_{iq}\,(x^k_i-\tilde x_i)\\
&\le \frac12\sum_{q\in M}\mathbf{1}(i_k\in B_q)\,\theta_{i_k,q}.
\end{aligned}$$

In the chain of inequalities above, the last equality holds since $(x^k,y^k)$ is an optimal solution to problem (B.34) at iteration $k$. To see the last inequality, note that $\mu_q\le 0$ and $y^k_q-\tilde y_q\ge 0$ for all $q\in\hat M$, whereas we have $y^k_q=0$ for all $q\notin\hat M$ by the definition of $\mathcal{P}(H)$ and $\tilde y_q=0$ for all $q\notin\hat M$. Similarly, we have $\tilde x_i=x^k_i$ for all $i\in\hat N\setminus\{i_k\}$ and $x^k_i=0=\tilde x_i$ for all $i\notin\hat N$. Lastly, we have $x^k_{i_k}\le 1$ and $i_k\in\hat N$ in the iterative rounding algorithm, so $\theta_{i_k,q}\ge 0$ for all $q\in M$ such that $i_k\in B_q$. Building on the fact that $f^k-f^{k+1}\le\tfrac12\sum_{q\in M}\mathbf{1}(i_k\in B_q)\,\theta_{i_k,q}$, we can use the same approach as in the proof of Lemma 3.5.4 to show that if $(x^*,y^*)$ is an optimal solution to problem (B.34) at the last iteration of the iterative rounding algorithm, then $\sum_{q\in M}\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)\ge\tfrac12 f_R$.

B.7.4.2 Feasible Subsets Through Coupled Randomized Rounding

We let $(x^*,y^*)$ be an optimal solution to problem (B.34) at the last iteration of the iterative rounding algorithm. At the last iteration, there is no product $i\in\hat N$ such that $\tfrac12<x^*_i<1$, in which case, by the analogue of Lemma 3.5.1 for the polyhedron $\mathcal{P}(H)$ in (B.33), we have $x^*_i\in\{0,\tfrac12,1\}$. We apply the coupled randomized rounding approach in Section 3.5.2 without any modifications to obtain a random subset of products $\hat X=\{\hat X_i:i\in\hat N\}$. In this case, through minor modifications in the proof of Theorem 3.5.2, we can show that $\sum_{q\in M}\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}\ge\tfrac12\sum_{q\in M}\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$. Therefore, noting the discussion at the end of the previous section, we get

$$\sum_{q\in M}\mathbb{E}\big\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\big\}\;\ge\;\frac12\sum_{q\in M}\Big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\Big)\;\ge\;\frac14 f_R.$$

We describe the modifications in the proof of Theorem 3.5.2.
For each nest $q$, we will show that $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}\ge\tfrac12\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$. By the construction of the coupled randomized rounding approach, $\mathbb{E}\{\hat X_i\}=x^*_i$. Noting (B.28), we have $V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)=\mu_q\prod_{i\in B_q}\hat X_i+\sum_{i\in B_q}\theta_{iq}\,\hat X_i$. If $|B_q|=1$, then the definition of $\mu_q$ implies that $\mu_q=0$, in which case we obtain

$$\mathbb{E}\big\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\big\}=\sum_{i\in B_q}\theta_{iq}\,x^*_i\;\ge\;\frac12\sum_{i\in B_q}\theta_{iq}\,x^*_i=\frac12\Big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\Big).$$

To see the inequality, note that if $i\in\hat N$, then $\theta_{iq}\ge 0$, but if $i\notin\hat N$, then $x^*_i=0$. Thus, we have $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}\ge\tfrac12\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$ whenever $|B_q|=1$. In the rest of the discussion, we assume that $|B_q|=2$ and consider three cases.

Case 1: Suppose that the two products in $B_q$ are paired in the coupled randomized rounding approach. Therefore, we have $\prod_{i\in B_q}\hat X_i=0$. Since the products in $B_q$ are paired, we have $x^*_i=\tfrac12$ for all $i\in B_q$, in which case, noting problem (B.34), we must have $i\in\hat N$ for all $i\in B_q$. Thus, we obtain $q\in\hat M$. In this case, we have $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}=\sum_{i\in B_q}\theta_{iq}\,x^*_i\ge\tfrac12\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$, where the inequality holds because $\mu_q\le 0$ and $\theta_{iq}\ge 0$ for all $i\in B_q$ for any nest $q\in\hat M$.

Case 2: Suppose that the two products in $B_q$ are not paired in the coupled randomized rounding approach and $q\in\hat M$. In this case, $\{\hat X_i:i\in B_q\}$ are independent, so that $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}=\mu_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\,x^*_i$. Furthermore, since $q\in\hat M$, we have $\mu_q\le 0$, in which case $y^*_q=\big[\sum_{i\in B_q}x^*_i-|B_q|+1\big]^+=\big[\sum_{i\in B_q}x^*_i-1\big]^+$. Noting that $\sum_{i\in B_q}x^*_i\le 2$, if $x^*_j=0$ for some $j\in B_q$, then we have $y^*_q=\big[\sum_{i\in B_q}x^*_i-1\big]^+=0=\prod_{i\in B_q}x^*_i$.
Thus, it follows that

$$\mathbb{E}\big\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\big\}=\mu_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\,x^*_i\;\ge\;\frac12\Big(\mu_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\,x^*_i\Big)=\frac12\Big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\Big), \tag{B.35}$$

where the inequality holds because if $q\in\hat M$, then we have $\rho_q\ge 0$ and $\theta_{iq}\ge 0$ for all $i\in B_q$, so we get $\mu_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\,x^*_i=\rho_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\big(1-\prod_{j\in B_q\setminus\{i\}}x^*_j\big)x^*_i\ge 0$. Similarly, if $x^*_j=1$ for some $j\in B_q$, then we can show that $y^*_q=\big[\sum_{i\in B_q}x^*_i-1\big]^+=\prod_{i\in B_q}x^*_i$, and we can follow the same line of reasoning as in (B.35) to get $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}\ge\tfrac12\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$. Lastly, if $x^*_i=\tfrac12$ for all $i\in B_q$, then we have $y^*_q=\big[\sum_{i\in B_q}x^*_i-|B_q|+1\big]^+=\big[\sum_{i\in B_q}x^*_i-1\big]^+=0$, so

$$\begin{aligned}
\mathbb{E}\big\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\big\}
&=\mu_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\,x^*_i
 =\rho_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\Big(1-\prod_{j\in B_q\setminus\{i\}}x^*_j\Big)x^*_i\\
&\ge\sum_{i\in B_q}\theta_{iq}\Big(1-\prod_{j\in B_q\setminus\{i\}}x^*_j\Big)x^*_i
 =\frac12\Big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\Big),
\end{aligned}$$

where the last equality in the chain of inequalities above holds because we have $y^*_q=0$ and $x^*_j=\tfrac12$ for all $j\in B_q$, along with the fact that $|B_q|=2$.

Case 3: Suppose that the two products in $B_q$ are not paired in the coupled randomized rounding approach and $q\notin\hat M$. Since $q\notin\hat M$, noting problem (B.34), we get $y^*_q=0$. Furthermore, since $q\notin\hat M$, we have $p_j<\hat z$ for some $j\in B_q$, in which case $j\notin\hat N$. Therefore, we have $x^*_j=0$, which implies that $\prod_{i\in B_q}x^*_i=0$. In this case, we get $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}=\mu_q\prod_{i\in B_q}x^*_i+\sum_{i\in B_q}\theta_{iq}\,x^*_i=\sum_{i\in B_q}\theta_{iq}\,x^*_i\ge\tfrac12\sum_{i\in B_q}\theta_{iq}\,x^*_i=\tfrac12\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$, where the inequality holds because if $i\in\hat N$, then $\theta_{iq}\ge 0$, but if $i\notin\hat N$, then $x^*_i=0$.

In all of the three cases considered above, we have $\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}\ge\tfrac12\big(\mu_q\,y^*_q+\sum_{i\in B_q}\theta_{iq}\,x^*_i\big)$, as desired. Therefore, the random subset of products $\hat X$ satisfies $\sum_{q\in M}\mathbb{E}\{V_q(\hat X)^{\gamma_q}(R_q(\hat X)-\hat z)\}\ge 0.25\,f_R$. Also, we have $\sum_{i\in N}\hat X_i\le c$ by the construction of the coupled randomized rounding approach.
As in Section 3.5.2, we can de-randomize the subset of products $\hat X$ by using the method of conditional expectations to obtain a deterministic subset of products $\hat x$ that satisfies $\sum_{q\in M}V_q(\hat x)^{\gamma_q}(R_q(\hat x)-\hat z)\ge 0.25\,f_R$ and $\sum_{i\in N}\hat x_i\le c$.

Appendix C

Supporting Arguments for Chapter 4

The additional material in this appendix has two parts. In the first part, Sections C.1 to C.6, we provide additional theoretical results to complement the discussions in the main text. In particular, in Section C.1, we characterize the consumer optimal stopping problem when the platform provides customized product sorting, through which the consumer can sort the items using prices or other metrics before the search. We show that a generalized threshold policy is optimal. The results generalize Lemma 4.2.1. In Section C.2, we extend the base model considered in Section 4.2 to the satisficing choice model. In this model, there is an exogenous threshold such that the consumer stops the search whenever she finds an item with utility higher than this threshold, and an exogenous probability with which the consumer abandons the search after viewing an item. We show that we can generalize our results for the base model to this setting as well. In Section C.3, we show that the well-known Vickrey-Clarke-Groves (VCG) mechanism cannot be used to sell top slots, because it violates the zero-payment property we specified in Section 4.5. In Section C.4, we discuss an alternative, constructive approach to specify the payment function for SOR, of which we gave an integral form representation in Theorem 4.5.1. In Section C.5, we show that when the platform sells all slots, our mechanism can be implemented as a Nash equilibrium in a modified generalized second price (GSP) auction. In Section C.6, we consider the optimal ranking and position ranking if, instead of maximizing the weighted surplus, the platform is interested in maximizing its direct revenue.
The approach used in the main text can be extended in this case to derive optimal ranking rules and robust auction techniques. In the second part, starting from Section C.7, we present proofs of the theoretical results in both Chapter 4 and this chapter.

C.1 Consumer Search with Item Sorting

The base model assumes that the consumer believes that $(u_i,p_i)$ are ex-ante i.i.d. distributed before evaluating each product in detail. However, e-commerce platforms can provide tools that allow the consumer to rank the search results before the search, for example by price, relevance, or the number or quality of reviews. We now discuss how the consumer would search with such sortings. For this purpose, let us assume that the joint distribution of $u_i$ and $p_i$ is $G_J(u,p)$. Note that the distribution of $u_i-p_i$, $G(\cdot)$, is determined by $G_J(\cdot,\cdot)$. Let us suppose without loss of generality that with a sorting rule $F$, the items are indexed in decreasing order of their indexes. Also, the consumer still searches downward in the list, and evaluating each item costs her search cost $s$. Denote the history after viewing item $i$ as $H_i=\{(u_k,p_k)\}_{k=1}^{i}$. Given $H_i$, the consumer may change her belief of the distribution of $(u_j,p_j)$ for $j\ge i+1$, which we denote by $G_{J,j}(u,p\mid F,H_i)$. Note that the history does not include $\{\epsilon_k\}_{k=1}^{i}$, nor does it impact the consumer's belief of $\epsilon_j$ for $j\ge i+1$, since the $\epsilon_i$'s are the idiosyncratic terms of consumer utility that are unobservable to the platform, and hence not reflected in the sorting. Let us denote the utility of the current best option after viewing $i$ items as $V_{\max}$. Note that given the above assumption, the optimal stopping problem is a standard Markovian decision process with state variables described by the triple $(V_{\max},H_i,F)$. With standard dynamic programming techniques, we can use backward induction to recursively solve for the optimal policy [114].
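To make the backward induction concrete, the following is a minimal sketch for a deliberately simplified special case: utilities are i.i.d. draws from a small discrete distribution, there is no sorting information, and the consumer can recall the best item seen so far, so the state reduces to the number of inspected items and $V_{\max}$. The distribution, the search cost, and all names are illustrative assumptions, not taken from the thesis.

```python
from fractions import Fraction

# Hypothetical primitives: utility uniform on {0, 1, 2, 3, 4}, search cost 1/2.
VALUES = [Fraction(v) for v in range(5)]
PROB = Fraction(1, 5)
SEARCH_COST = Fraction(1, 2)
N_ITEMS = 4


def solve_by_backward_induction(n_items):
    """J[i][v] is the optimal value after inspecting i items with current best v."""
    J = {n_items: {v: v for v in VALUES}}  # nothing left to inspect: accept v
    keep_searching = {}
    for i in range(n_items - 1, -1, -1):
        J[i], keep_searching[i] = {}, {}
        for v in VALUES:
            # Pay the search cost, observe the next item, keep the better option.
            cont = -SEARCH_COST + sum(PROB * J[i + 1][max(v, x)] for x in VALUES)
            keep_searching[i][v] = cont > v
            J[i][v] = max(v, cont)
    return J, keep_searching


def expected_gain(v):
    """E[(X - v)^+], the discrete analogue of integrating 1 - Psi(x) over [v, inf)."""
    return sum(PROB * max(x - v, Fraction(0)) for x in VALUES)


J, keep_searching = solve_by_backward_induction(N_ITEMS)
reservation_value = Fraction(9, 4)  # solves expected_gain(v) = SEARCH_COST here
```

In this toy instance the recursion continues searching exactly when the current best option is below the reservation value, at every stage, which is the threshold structure that the results below establish in much greater generality. With sorting, the state would also have to carry the history $H_i$, which this sketch omits.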
However, here we can show that the optimal stopping policy is characterized by a generalized threshold policy.

Proposition C.1.1 (Threshold Policy with Customer Sorting) There exists an optimal policy such that, given $H_i$ and $F$, the policy is characterized by a threshold $\tilde v_i(H_i,F)$ such that the consumer stops searching and accepts the current best option whenever $V_{\max}>\tilde v_i(H_i,F)$.

In general, to find the threshold $\tilde v_i(H_i,F)$ mentioned in Proposition C.1.1, one needs to recursively solve a series of Bellman equations. Next, we show that in some special cases, the calculations can be greatly simplified. Let $F_1(\cdot)*F_2(\cdot)$ denote the convolution of the distributions $F_1(\cdot)$ and $F_2(\cdot)$. We use $G(\cdot\mid H_i,F)$ to denote the distribution of $u_{i+1}-p_{i+1}$ given the history and let $\Psi(\cdot\mid H_i,F)=G(\cdot\mid H_i,F)*H(\cdot)$ denote the distribution of $u_{i+1}-p_{i+1}+\epsilon_{i+1}$. Furthermore, we say that $H_{i+1}=\{(u'_k,p'_k)\}_{k=1}^{i+1}$ covers $H_i=\{(u_k,p_k)\}_{k=1}^{i}$ if $(u'_k,p'_k)=(u_k,p_k)$ for $k\le i$.

Proposition C.1.2 (Simplified Threshold Calculation) If for any $i$, and any $H_i$ and $H_{i+1}$ such that $H_{i+1}$ covers $H_i$, it holds that $G(\cdot\mid H_i,F)\succeq G(\cdot\mid H_{i+1},F)$, then $\tilde v_i(H_i,F)$ is the unique solution to the equation

$$s=\int_{\tilde v}^{\infty}\big(1-\Psi(x\mid H_i,F)\big)\,dx.$$

Intuitively, $G(\cdot\mid H_i,F)\succeq G(\cdot\mid H_{i+1},F)$ says that for an item viewed first, its intrinsic utility is stochastically larger than that of an item subsequently viewed. This is the case if the sorting system benefits the consumer. Furthermore, note that this result includes Lemma 4.2.1 as a special case. Next, we discuss some other special cases of Proposition C.1.2. These results are straightforward to verify from the definitions, so we skip the proofs.

Proposition C.1.3 (Special Cases to Product Sorting) The following results hold. For parts (i) and (ii), we assume that $u_k$ and $p_k$ are independent, i.e., $G_J(u,p)=G_u(u)\,G_p(p)$ for some distribution functions $G_u(\cdot)$ and $G_p(\cdot)$.

(i) Assume that items are sorted in increasing order of $p_k$. It follows that $G(\cdot\mid H_i,F)=G_u(\cdot)*G_{p_i}(\cdot\mid p_i)$, where

$$G_{p_i}(x\mid p_i)=\begin{cases}\Big(\dfrac{\bar G_p(x)}{\bar G_p(p_i)}\Big)^{n-i}, & \text{if } x\ge p_i,\\[2mm] 1, & \text{otherwise,}\end{cases}$$

and $G(\cdot\mid H_i,F)\succeq G(\cdot\mid H_{i+1},F)$ for all $i$, $H_i$, and $H_{i+1}$ such that $H_{i+1}$ covers $H_i$.

(ii) Assume that items are sorted in decreasing order of $u_k$. It follows that $G(\cdot\mid H_i,F)=G_{u_i}(\cdot\mid u_i)*G_p(\cdot)$, where

$$G_{u_i}(x\mid u_i)=\begin{cases}\Big(\dfrac{G_u(x)}{G_u(u_i)}\Big)^{n-i}, & \text{if } x\le u_i,\\[2mm] 1, & \text{otherwise,}\end{cases}$$

and $G(\cdot\mid H_i,F)\succeq G(\cdot\mid H_{i+1},F)$ for all $i$, $H_i$, and $H_{i+1}$ such that $H_{i+1}$ covers $H_i$.

(iii) Assume that items are sorted in decreasing order of $u_k-p_k$. It follows that $G(\cdot\mid H_i,F)\succeq G(\cdot\mid H_{i+1},F)$ for all $i$, $H_i$, and $H_{i+1}$ such that $H_{i+1}$ covers $H_i$.

In particular, part (i) corresponds to the case in which the consumer sorts items by price, and the qualities are independent of prices. This is plausible when, for example, the quality difference of the products is very small. If the consumer sorts the items by relevance, then arguably part (ii) can be a good approximation. It is also possible that the consumer sorts the items by the number of reviews, which might take into account both $u_i$ and $p_i$, and be a good proxy for the intrinsic utilities.

C.2 The Satisficing Choice Model

In the discussion of the base model, we assume that the consumer is fully rational and adopts the optimal sequential search strategy to find the best item. In practice, the modeler might sacrifice this rationality assumption to some extent and propose more flexible models to fit the application. Along these lines, we propose the following satisficing choice model, which is closely related to the base model. We parameterize the model as follows. The consumer still searches downward in the list. Given some exogenous parameter $\hat v$, we assume that whenever the consumer finds an item $i$ such that $V_i>\hat v$, she will accept the item. Also, we assume that with some exogenous probability the consumer abandons the search after each item, even if she has not found a desired one.
Let ˆ b i = ˆ a i Pfdoes not leavejV i ˆ v g. Also, we assume that the consumer enters the search if her outside option utility V 0 is less than ˆ v . Denote ˆ a i =PfV i ˆ v g, and ˆ b 0 =PfV 0 ˆ v g. In this fashion, given the ranking p, we can write the purchase probability of the item ranked in slot j asÕ k j1 ˆ b p(k) (1 ˆ a p( j) ). We are still interested in maximization of the weighted surplus. Denote the total expected search cost that the consumer incurs with rankingp by S p . We will solve max p ( n å j=1 Õ k j1 ˆ b p(k) (1 ˆ a p( j) ) g 1 q p( j) +g 2 q p( j) +g 3 E V p( j) V p( j) > ˆ v E V 0 V 0 ˆ v g 3 S p ) : (C.1) Let us define ˆ r i (q i ) def = 1 ˆ a i 1 ˆ b i g 1 q p( j) +g 2 q p( j) +g 3 E V i V i > ˆ v 1 1 ˆ a i E V 0 V 0 ˆ v ; which we call as the modified net surplus as item i, which serves the same role to the net surplus in the base model. Note that, when g 3 = 0, it follows that ˆ r(q i )= 1 ˆ a i 1 ˆ b i r i (q i ), where r i () is defined in (4.3), and the objective function in (C.1) reduces to n å j=1 Õ k j1 ˆ b p(k) (1 ˆ a p( j) )r p( j) (q p( j) )= n å j=1 Õ k j1 ˆ b p(k) (1 ˆ b p( j) ) 1 ˆ a p( j) 1 ˆ b p( j) r p( j) (q p( j) ) ! : 178 Comparing this formula against Lemma 4.7.2, we see that using the same argument of Theorem 4.3.1, one can justify that ranking in decreasing order of 1 ˆ a i 1 ˆ b i r i (q i ) is optimal. The ratio 1 ˆ a i 1 ˆ b i captures the fact that in the current setting, we want an item to induce an immediate purchase rather than letting the consumer abandon. Ifg 3 6= 0, the argument is a little bit more involved, but we can still show the following result. 
Proposition C.2.1 (Optimal Ranking to the Satisficing Choice Model) It follows that

$$\begin{aligned}
&\sum_{j=1}^{n}\prod_{k\le j-1}\hat\beta_{\pi(k)}\,(1-\hat\alpha_{\pi(j)})\Big(\gamma_1\theta_{\pi(j)}+\gamma_2\theta_{\pi(j)}+\gamma_3\big(\mathbb{E}[V_{\pi(j)}\mid V_{\pi(j)}>\hat v]-\mathbb{E}[V_0\mid V_0\le\hat v]\big)\Big)-\gamma_3 S_\pi\\
&\qquad=\sum_{j=1}^{n}\prod_{k\le j-1}\hat\beta_{\pi(k)}\,(1-\hat\beta_{\pi(j)})\,\hat r(\theta_{\pi(j)}).
\end{aligned} \tag{C.2}$$

Moreover, the ranking $\hat\pi^*(\cdot)$, given by decreasing order of $\hat r_i(\theta_i)$ (ties can be broken arbitrarily), is optimal to (C.1).

We can discuss selling top slots in a mechanism design framework for the satisficing choice model as well. In particular, we can define the modified surplus-ordered ranking (MSOR) using the modified net surplus, in the same way as Definition 6.

Definition 3 (Modified-Surplus-Ordered Ranking) Given a base order $\pi_0$, the modified surplus-ordered ranking (MSOR), $\pi^{MSOR}_k(\cdot)$ (with inverse mapping $\sigma^{MSOR}_k(\cdot)$), is the ranking rule that satisfies the following properties:

(i) $\pi^{MSOR}_k(\cdot)$ ranks the $k$ sellers with the highest modified net surplus in the top $k$ positions in decreasing order of their modified net surplus. (Footnote 1: For ease of exposition, we break the ties according to the base order $\pi_0$.)

(ii) The order of the sellers ranked in positions $k+1$ to $n$ is consistent with the base order: $\sigma^{MSOR}_k(\boldsymbol\theta;i)>\sigma^{MSOR}_k(\boldsymbol\theta;j)$ if and only if $\sigma_0(i)>\sigma_0(j)$, for all sellers $i$ and $j$ such that $\sigma^{MSOR}_k(\boldsymbol\theta;i),\sigma^{MSOR}_k(\boldsymbol\theta;j)\ge k+1$.

Following the same argument as for Theorem C.4.1, we can show that there exists a unique mechanism that satisfies the properties mentioned in Theorem 4.5.1.

Proposition C.2.2 (Payment Function of MSOR) There exists a unique payment function

$$t^{MSOR}_k(\mathbf b;i)=q_{\sigma^{MSOR}_k(\mathbf b),i}\,b_i-\int_0^{b_i}q_{\sigma^{MSOR}_k(\eta,\mathbf b_{-i}),i}\,d\eta\qquad\text{for all } i\in N$$

such that the MSOR mechanism $M^{MSOR}_k(\mathbf b)=\{\pi^{MSOR}_k(\mathbf b),t^{MSOR}_k(\mathbf b)\}$ satisfies the following four desired properties: (i) dominant-strategy incentive compatibility; (ii) ex-post individual rationality; (iii) non-negative payment; and (iv) (zero-payment property) if $\sigma^{MSOR}_k(\mathbf b;i)\ge k+1$, then $t^{MSOR}_k(\mathbf b;i)=0$.
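The optimality claim in Proposition C.2.1 rests on an adjacent-interchange argument applied to the right-hand side of (C.2): swapping two neighbouring items $i$ and $j$ changes the objective by a positive multiple of $(1-\hat\beta_i)(1-\hat\beta_j)(\hat r_i-\hat r_j)$. The sketch below checks this by brute force on a toy instance. It is an illustration under stated assumptions: the numbers are made up, the leading factor `beta0` stands for $\hat\beta_0$, and the function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def weighted_surplus(order, beta, r_hat, beta0=Fraction(1)):
    """Right-hand side of (C.2) for a given ranking.

    `reach` is the probability that the consumer gets to the current slot,
    i.e. beta0 times the product of beta over the items ranked above it.
    """
    reach, total = beta0, Fraction(0)
    for i in order:
        total += reach * (1 - beta[i]) * r_hat[i]
        reach *= beta[i]
    return total


# Toy instance (assumed numbers): continuation probabilities and modified net surplus.
beta = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(3, 4), 4: Fraction(1, 3)}
r_hat = {1: Fraction(2), 2: Fraction(5), 3: Fraction(1), 4: Fraction(3)}

# Ranking from the proposition: decreasing order of modified net surplus.
sorted_order = tuple(sorted(beta, key=lambda i: r_hat[i], reverse=True))
# Exhaustive search over all 4! rankings.
best_order = max(permutations(beta), key=lambda o: weighted_surplus(o, beta, r_hat))
```

With distinct $\hat r_i$ and all $\hat\beta_i<1$, the exhaustive search returns the same ranking as sorting by $\hat r_i$. Note that the continuation probabilities do not enter the ordering at all; they only affect the value attained.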
Moreover, let us define ˆ w= min 1in g 2 q p( j) +g 3 E V i V i > ˆ v 1 1 ˆ a i E V 0 V 0 ˆ v . Then in view of (C.2), a similar line of analysis to that of Theorem 4.4.4 can justify the following. Proposition C.2.3 (Performance Guarantee of MSOR) Assume that ˆ w 0. Suppose that without loss of generality ˆ b 1 ˆ b 2 ˆ b n . For any given k, the ratio of the expected weighted surplus of MSOR and that of the optimal ranking with complete information satisfies E å n j=1 Õ t j1 ˆ b p MSOR k (t) (1 ˆ a p MSOR k ( j) )r p MSOR k ( j) (q p MSOR k ( j) ) E å n j=1 Õ t j1 ˆ b ˆ p (t) (1 ˆ a ˆ p ( j) )r ˆ p ( j) (q ˆ p ( j) ) 1 Õ 1ik a i : Discussions in the section also raises the point that, mathematically speaking, the bulk of our results essentially hold under the assumption that the purchase probability takes a “multiplicative” form. This may help the platform construct models that fit specific applications and enjoy the nice properties discussed in this chapter. C.3 Limitation of the VCG Mechanism in Selling k Slots In this section, we show that the VCG mechanism cannot be used in our context. We first formally define the set of feasible ranking rules that we may use to sell the top slots. Recall that a base order is a ranking of items independent of private types. Definition 4 Given a base order p 0 , define P k 2P as the set of ranking rules that are functions of, and consistent with the base order: namely, if p k ()2P k (with inverse mapping s k ()), then s k (;i)> s k (; j) if and only ifs 0 (i)>s 0 ( j), for all sellers i and j ranked in position k+1 to n byp k (). In words, P k is the collection of rankings such that the selection and ranking of the top k sellers depend on while the ranking of other sellers is consistent withp 0 . Notice that p SOR k 2P k . With Definition 4, we let p VCG k (with inverse mapping s VCG k ) be the optimal ranking in P k , which maximizes the weighted surplus for a given. We first clarify that when k = 1, p SOR 1 6=p VCG 1 . 
Thus, SOR is different from the optimal ranking in $P_k$.

Example C.3.1 (SOR is different from optimal ranking) Suppose $N=\{1,2,3\}$ and $k=1$. Furthermore, assume that $\theta_1=1$, $\theta_2=0$, and $\theta_3=\tfrac78$. Let $\alpha_0=1$, $\alpha_1=\tfrac14$, $\alpha_2=0$ and $\alpha_3=\tfrac18$. Also, let $\gamma_2=\gamma_3=0$ and $\gamma_1=1$ to focus on seller surplus. The base order and SOR are both given by $(1,2,3)$. Intuitively, because seller 2 will "steal" all of the demand for the sellers ranked below him, but generate zero profit himself, we want to place seller 2 in the last position with $\pi^{VCG}_1$. Formally, one can verify that $\pi^{VCG}_1(\boldsymbol\theta)$ is given by $(3,1,2)$, which is inconsistent with SOR. Thus, $\pi^{SOR}_1\ne\pi^{VCG}_1$. Note that $\pi^{VCG}_1(\cdot)$ only gives seller 2 purchase probability $\tfrac{1}{32}$.

Therefore, SOR is not optimal in terms of maximizing the social welfare. However, as we proved in Theorem 4.4.4, even compared with the benchmark optimal surplus, the welfare loss with SOR is marginal. A natural follow-up question is whether we can, alternatively, use VCG to sell $k$ slots. We will show that the payment function induced by $\pi^{VCG}_k$ violates the zero-payment property.

Example C.3.2 (VCG violates the zero-payment rule) We show that $\pi^{VCG}_1$ does not satisfy our desired properties. We follow the setup of Example C.3.1 here, with the exception that we change the valuation of seller 2 to $\theta'_2$ such that $\theta'_2=\theta_3=\tfrac78>\theta_2=0$. The optimal ranking $\pi^{VCG}_1(\theta'_2,\boldsymbol\theta_{-2})$ is then changed to $(1,2,3)$. This implies that although seller 2 is not ranked in the top position with $\theta'_2$, he now imposes greater externalities on others by changing the overall ranking from $(3,1,2)$ to $(1,2,3)$, so he has to pay. Formally, one can verify this as follows. Note that the purchase probability of seller 2 is now increased from $\tfrac{1}{32}$ to $\tfrac14$.
Given the optimal ranking rule $\pi^{VCG}_1$, Proposition 4.7.5 asserts that with $b_2$ as the bid from seller 2, incentive compatibility implies

$$q_{\sigma^{VCG}_1(b_2,\boldsymbol\theta_{-2}),2}\,\theta_2-t(b_2,\boldsymbol\theta_{-2};2)=\int_0^{b_2}q_{\sigma^{VCG}_1(\eta,\boldsymbol\theta_{-2}),2}\,d\eta+C_2(\boldsymbol\theta_{-2}),$$

for some constant $C_2(\boldsymbol\theta_{-2})$. (Footnote 2: The proper use of Proposition 4.7.5 requires the monotonicity of purchase probabilities determined by $\pi^{VCG}_1$ in the seller's valuations, which can be proven by contradiction.) Note that when $\theta_2=0$, seller 2 is not ranked in the top position so he pays zero, which suggests that $C_2(\boldsymbol\theta_{-2})=0$. Therefore, with $\theta'_2=\tfrac78$, one can calculate the right-hand side to be equal to $\tfrac{7}{64}$. On the left-hand side, $q_{\sigma^{VCG}_1(\theta'_2,\boldsymbol\theta_{-2}),2}\,\theta'_2=\tfrac{7}{32}$. This shows that seller 2's payment $t(\theta'_2,\boldsymbol\theta_{-2};2)$ is $\tfrac{7}{64}$, even though he is not ranked on top.

The previous example further justifies our approach with SOR. Although the VCG approach fails, inspired by the central idea of "internalizing the externalities" in its design, we present in Section C.4 a different method to construct the payment function of SOR, in which we measure a seller's marginal contribution to the social welfare in SOR to deduce the same payment function as in Theorem 4.5.1. More importantly, this method provides a simple two-step procedure (Algorithm 1) to calculate the payment function without evaluating the integral in Theorem 4.5.1.

Furthermore, on the terminology side, the VCG mechanism usually optimizes the sum of the agents' utilities in a conventional mechanism framework. In our context, we are concerned with the weighted surplus maximization, which is an affine function of the sum of sellers' utilities. The VCG mechanism can be extended to handle such cases as well, and we are adopting this extension in our discussion here. Such an extension of VCG is common in the algorithmic game theory literature. For a detailed discussion, we refer readers to the book [108], Chapter 9.
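The numbers in Examples C.3.1 and C.3.2 can be reproduced with a short script. The sketch below is an illustration under stated assumptions: it takes the purchase probability of the item in slot $j$ to be the product of the $\alpha$'s of the items ranked above it (starting from $\alpha_0$) times $1-\alpha$ of the item itself, it takes the net surplus to equal $\theta_i$ because $\gamma_1=1$ and $\gamma_2=\gamma_3=0$, and it sets $C_2=0$ as in the text. All function names are hypothetical.

```python
from fractions import Fraction as Fr

ALPHA = {1: Fr(1, 4), 2: Fr(0), 3: Fr(1, 8)}  # alpha_1, alpha_2, alpha_3 of Example C.3.1
BASE_ORDER = (1, 2, 3)


def purchase_probabilities(order, alpha0=Fr(1)):
    """Purchase probability of each seller under the assumed multiplicative form."""
    reach, q = alpha0, {}
    for i in order:
        q[i] = reach * (1 - ALPHA[i])
        reach *= ALPHA[i]
    return q


def feasible_rankings():
    """Rankings in P_1: any one seller on top, the others in base order."""
    return [(top,) + tuple(j for j in BASE_ORDER if j != top) for top in BASE_ORDER]


def surplus(order, theta):
    q = purchase_probabilities(order)
    return sum(q[i] * theta[i] for i in order)


def vcg_ranking(theta):
    return max(feasible_rankings(), key=lambda o: surplus(o, theta))


def q2(theta2):
    """Purchase probability of seller 2 under the surplus-maximizing ranking."""
    theta = {1: Fr(1), 2: theta2, 3: Fr(7, 8)}
    return purchase_probabilities(vcg_ranking(theta))[2]


# Envelope integral over [0, 7/8]. q2 is a step function whose only jump is at
# eta = 1/2 (where 3/4 + eta/4 = 55/64 + eta/32), which is a cell boundary of the
# 7-cell grid below, so the midpoint rule is exact here.
theta2_new = Fr(7, 8)
cells = 7
width = theta2_new / cells
integral = sum(q2((c + Fr(1, 2)) * width) * width for c in range(cells))
payment = q2(theta2_new) * theta2_new - integral
```

Under these assumptions the script recovers the ranking $(3,1,2)$ and purchase probability $\tfrac{1}{32}$ for $\theta_2=0$, the ranking $(1,2,3)$ and purchase probability $\tfrac14$ for $\theta'_2=\tfrac78$, and a payment of $\tfrac{7}{64}$ for seller 2 even though he is ranked second.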
The discussion in this section also partially explains why we do not pursue the optimal ranking for selling $k$ slots. In order to discuss optimality, one needs to first determine the scope of feasible solutions. Naturally, one might use $P_k$ as the feasible set of rankings, but as the discussion above shows, the optimal ranking in $P_k$ is not only complex, but also fails the zero-payment property. On the other hand, if one shrinks the feasible set and discusses "the optimal mechanism of selling $k$ slots," then the problem strikes us as a hard combinatorial optimization problem, since pointwise optimization (for each $\boldsymbol\theta$) is no longer sufficient due to the zero-payment property. Instead, the SOR mechanism, although constructive in nature, enjoys the advantages of being intuitively easy to understand, satisfying all mechanism constraints, and, above all, having near-optimal performance.

C.4 SOR Payment Function with Virtual Societies

Although Theorem 4.5.1 establishes the properties we need in theory, a compact way of determining and writing the payment function does not directly follow. In this part, connecting our problem with the classical VCG mechanism, we take a different course to show how we can easily determine the payment function with SOR. For this purpose, we define "virtual sellers" and "virtual societies," and show that the SOR mechanism $M^{SOR}_k(\mathbf b)=(\pi^{SOR}_k(\mathbf b),t^{SOR}_k(\mathbf b))$, as in Theorem 4.5.1, is equivalent to the VCG mechanism in the virtual societies. This approach also produces a simple procedure to calculate the payment function rather than using the integral form, which could be complicated to evaluate and hard to comprehend. In particular, we show that although SOR is not the optimal ranking, by modifying the net surplus of a subset of sellers to produce the corresponding virtual sellers, we can map SOR to the optimal ranking rule in the virtual societies.
When determining a seller's payment, a key property of this mapping is that it preserves the purchase probabilities of the particular seller under consideration. This allows us to apply VCG in virtual societies to obtain the payment.

Specifically, given the bids of sellers, $\mathbf b$, we assume without loss of generality that $r_i(b_i)\ge r_{i'}(b_{i'})$ if $i<i'$. For the definition of $r_i(\cdot)$, please refer to (4.3). For each seller $i$, let $D_i$ be the set consisting of the sellers ranked before seller $i$ when he reports zero, together with seller $i$ himself. Intuitively, this is the set of sellers subject to seller $i$'s externalities. Formally, $D_i\overset{\text{def}}{=}\{j:\sigma^{SOR}_k(0,\mathbf b_{-i};j)\le\sigma^{SOR}_k(0,\mathbf b_{-i};i)\}$.

As mentioned above, a key component in the design of the payment function is to apply the VCG mechanism to the virtual societies. Also, notice that in terms of implementing VCG in our context, each seller is characterized by two parameters, i.e., $\alpha_i$ and $r_i$. Therefore, at a high level, $N^i$, which is the virtual society of seller $i$, consists of virtual sellers $d^i(j)$ for $j\in D_i$, whose corresponding parameters, $\alpha_{d^i(j)}$ and $r_{d^i(j)}$, are modified from those of seller $j\in D_i$. More specifically, the virtual society $N^i$ for seller $i$ is given by the following definition.

Definition 5 (i) If $|D_i|>k$, we define the pivotal seller to seller $i$, $d(\mathbf b_{-i})$, as the seller in $D_i$ with the $k$th highest net surplus when the reported valuations are $(0,\mathbf b_{-i})$. In this case, for $j\in D_i$ and $j\ne i$, let $d^i(j)$ be a virtual seller such that $\alpha_{d^i(j)}=\alpha_j$, and

$$r_{d^i(j)}=\begin{cases}r_{d(\mathbf b_{-i})}\big(b_{d(\mathbf b_{-i})}\big), & \text{if } r_j(b_j)\le r_{d(\mathbf b_{-i})}\big(b_{d(\mathbf b_{-i})}\big),\\[1mm] r_j(b_j), & \text{if } r_j(b_j)>r_{d(\mathbf b_{-i})}\big(b_{d(\mathbf b_{-i})}\big).\end{cases}$$

Let $d^i(i)$ be a virtual seller such that $\alpha_{d^i(i)}=\alpha_i$ and $r_{d^i(i)}=r_i(b_i)$. Define the virtual society for seller $i$ as $N^i\overset{\text{def}}{=}\{d^i(j):j\in D_i\}$.

(ii) If $|D_i|\le k$, then $N^i=\{d^i(j):j\in D_i\}$, where $d^i(j)$ is a virtual seller such that $\alpha_{d^i(j)}=\alpha_j(b_j)$, and $r_{d^i(j)}=r_j(b_j)$.
Given $N^i$, we denote $N^i_{-i}=N^i\setminus\{d^i(i)\}\cup\{d^{i'}(i)\}$, where $d^{i'}(i)$ is a virtual seller such that $r_{d^{i'}(i)}=w_i$ and $\alpha_{d^{i'}(i)}=\alpha_i$. The payment function is calculated in two steps as given in Algorithm 1.

In the definition above, we map each seller $j$ in $D_i$ to a virtual seller $d^i(j)$ in the virtual society. If seller $j$'s net surplus is less than that of the pivotal seller, i.e., $r_{d(\mathbf b_{-i})}(b_{d(\mathbf b_{-i})})$, we inflate the corresponding virtual seller's net surplus to $r_{d(\mathbf b_{-i})}(b_{d(\mathbf b_{-i})})$. Also observe that for fixed $i$, the parameters of sellers in $N^i$ are independent of his bid $b_i$, except for seller $d^i(i)$. Therefore, for any possible bid of seller $i$, this inflation allows us to map SOR to the optimal ranking in the virtual society and at the same time keep the purchase probability of seller $i$ with SOR equal to that of seller $d^i(i)$ in the virtual society.

Algorithm 1: Payment Function of SOR

Input: The bids of sellers $\mathbf b$.

Step 1: Determine the virtual societies $N^i$ and $N^i_{-i}$ of seller $i$ for all $i\in N$.

Step 2 (Apply VCG in the virtual societies): Calculate

$$W^i(\mathbf b)=\max_{\sigma}\Big\{\sum_{j\in N^i}q_{\sigma,j}\,r_j\Big\},\qquad\text{and}\qquad W^i_{-i}(\mathbf b_{-i})=\max_{\sigma}\Big\{\sum_{j\in N^i_{-i}}q_{\sigma,j}\,r_j\Big\},$$

and set the payment of seller $i$ as

$$t^{SOR}_k(\mathbf b;i)=\frac{1}{\gamma_1}\Big(W^i_{-i}(\mathbf b_{-i})-W^i(\mathbf b)\Big)+q_{\sigma^{SOR}_k(\mathbf b),i}\,b_i. \tag{C.3}$$

Furthermore, $N^i_{-i}$ corresponds to the case in which seller $i$ reports zero valuation. Thus, $W^i_{-i}(\mathbf b_{-i})$ is the social surplus value of the virtual society when seller $i$'s externalities are removed. If everyone reports truthfully, seller $i$'s utility will be $(1/\gamma_1)\big(W^i(\boldsymbol\theta)-W^i_{-i}(\boldsymbol\theta_{-i})\big)+q^i_0\,\theta_i$, where the first term is his marginal contribution to the society with SOR and the last term is due to the returned demand, which is independent of the ranking. Virtual societies allow us to internalize the externalities of each seller.
The zero-payment property can be verified as follows: if, with bid $b_i$, seller $i$ is not ranked in the top $k$ positions, then virtual sellers $\delta_i(i)$ and $\delta_i^0(i)$ have the same ranking in their respective virtual societies, and therefore the same purchase probability, which is equal to $q_{\sigma^{\mathrm{SOR}}_k(\mathbf{b}),\, i}$. Therefore, $(1/\gamma_1)\big( W_i(\mathbf{b}) - W_i^{-i}(\mathbf{b}_{-i}) \big)$ is exactly $q_{\sigma^{\mathrm{SOR}}_k(\mathbf{b}),\, i}\, b_i$, and seller $i$ makes zero payment. We summarize the discussion in the following proposition.

Proposition C.4.1 The surplus-ordered-ranking (SOR) mechanism defined by $\mathcal{M}^{\mathrm{SOR}}_k = \big( \pi^{\mathrm{SOR}}_k(\mathbf{b}),\, t^{\mathrm{SOR}}_k(\mathbf{b}) \big)$, where $\pi^{\mathrm{SOR}}_k(\mathbf{b})$ is given by Definition 1 and $t^{\mathrm{SOR}}_k(\mathbf{b})$ is given by Eq. (C.3), satisfies the desired properties as stated in Theorem 4.5.1. Furthermore, $t^{\mathrm{SOR}}_k(\mathbf{b})$ is the unique payment function for SOR satisfying these conditions.

In summary, this design has both similarities to and differences from VCG, due to the greedy nature of SOR. Also, the design reduces to the familiar VCG if we sell all the slots, i.e., $k = n$. In this case, for any seller $i$, $W(\mathbf{b}) - W_i(\mathbf{b}) = W^{-i}(\mathbf{b}_{-i}) - W_i^{-i}(\mathbf{b}_{-i})$, which is equal to the constant

$$\prod_{j \in \{0\} \cup D_i} \alpha_j \sum_{j \in N \setminus D_i}\ \prod_{k \in N \setminus D_i :\, k < j} \alpha_k\, (1 - \alpha_j)\, \rho_j(b_j).$$

This term is equal to the part of the weighted surplus from sellers not in $D_i$, and it is independent of seller $i$'s bid. Therefore, Proposition C.4.1 can be used to derive Proposition C.5.1 directly.

Instead of writing the payment function in the integral form, this presentation highlights the connection of our problem to the classical VCG from the social welfare maximization perspective, together with a simple and compact way of writing the payments.

C.5 Selling All Slots

In this section, we consider the scenario in which all the positions are auctioned off. This resembles the practice of Google Shopping. Note that SOR is optimal when selling all the slots (see Theorem 4.3.1). In Section C.5.1, we briefly discuss how selling all the slots can be implemented by the VCG mechanism.
In Section C.5.2, we show that the VCG ranking and payment can be replicated as a Nash equilibrium outcome in a variation of the generalized second price (GSP) auction. (Footnote 3: We assume $\gamma_1 > 0$ for this section, since otherwise the supply-side surplus is not part of the weighted surplus and there is no need to use an auction to extract private information.)

C.5.1 The VCG Mechanism

In this part, we consider selling all positions with direct mechanisms. In this scenario, SOR is equivalent to the optimal ranking by definition. VCG is sufficient to guarantee incentive compatibility, individual rationality, and non-negative payment. Recall that the optimal ranking rule $\pi^*$ is given by decreasing order of the net surplus $\rho_i(\theta_i)$. Also recall that $\sigma^*$ is the inverse mapping to $\pi^*$. For the rest of this subsection, let us assume without loss of generality that the bids are such that $\rho_i(b_i) \ge \rho_j(b_j)$ if $i < j$, so $\pi^*(\mathbf{b}; i) = i$ for all $i$. Also recall the following definition of the optimized social net surplus (after scaling),

$$\frac{1}{\gamma_1} W(\mathbf{b}) = \frac{1}{\gamma_1} \max_{\sigma} \Big\{ \sum_{j \in N} q_{\sigma,j}\, \rho_j(b_j) \Big\} = \frac{1}{\gamma_1} \sum_{j \in N} q_{\sigma^*,j}\, \rho_j(b_j).$$

Similarly, consider the case in which seller $i$ bids zero while $\mathbf{b}_{-i}$ remains unchanged. We write the optimized social net surplus in this case as $W^{-i}(\mathbf{b}_{-i})$. Let the payment of seller $i$ be

$$t^*(\mathbf{b}; i) = \frac{1}{\gamma_1}\Big( W^{-i}(\mathbf{b}_{-i}) - W(\mathbf{b}) \Big) + q_{\sigma^*,i}\, b_i. \tag{C.4}$$

If everyone reports truthfully, seller $i$'s total payoff is $(1/\gamma_1)\big( W(\boldsymbol\theta) - W^{-i}(\boldsymbol\theta_{-i}) \big) + q_i^0 \theta_i$. The first term is his marginal contribution to the social surplus. The second term is his surplus from the returned demand, which is invariant to ranking. Following the ideas of classical VCG design [60, 59], we can show the next result.

Proposition C.5.1 (The VCG Mechanism) When the platform sells all the slots, the VCG mechanism $\mathcal{M}^*(\mathbf{b}) = (\sigma^*(\mathbf{b}), t^*(\mathbf{b}))$ satisfies incentive compatibility, individual rationality, and non-negative payment.

C.5.2 Generalized Second Price (GSP) Auction

We now consider an alternative auction method to sell all the slots.
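As a rough numerical illustration of the payment rule (C.4), the following Python sketch computes VCG payments for a small instance when all slots are sold. It rests on assumptions that are not spelled out in this appendix: the net surplus is taken to be $\rho_i(b_i) = \gamma_1 b_i + \omega_i$, the fresh-demand purchase probability of a seller is $\alpha_0$ times the product of the $\alpha$'s of the sellers ranked ahead times $(1 - \alpha_i)$, and the function names and numbers are hypothetical.

```python
from itertools import permutations


def purchase_probs(order, alpha0, alpha):
    """Fresh-demand purchase probability of each seller when sellers are
    displayed in `order` (first entry = top position)."""
    q, reach = {}, alpha0          # reach = prob. the consumer gets this far
    for i in order:
        q[i] = reach * (1.0 - alpha[i])
        reach *= alpha[i]
    return q


def surplus(order, alpha0, alpha, rho):
    """Sum of q_{sigma,j} * rho_j for a given ranking."""
    q = purchase_probs(order, alpha0, alpha)
    return sum(q[i] * rho[i] for i in order)


def optimal_order(rho):
    """Rank sellers in decreasing order of net surplus (sorting rule)."""
    return sorted(range(len(rho)), key=lambda i: -rho[i])


def vcg_payments(bids, omega, alpha0, alpha, gamma1):
    """Payments in the spirit of (C.4), assuming rho_i(b_i) = gamma1*b_i + omega_i."""
    rho = [gamma1 * b + w for b, w in zip(bids, omega)]
    order = optimal_order(rho)
    q = purchase_probs(order, alpha0, alpha)
    w_all = surplus(order, alpha0, alpha, rho)
    payments = []
    for i in range(len(bids)):
        rho_zero = list(rho)
        rho_zero[i] = omega[i]     # seller i bids zero, others unchanged
        w_minus_i = surplus(optimal_order(rho_zero), alpha0, alpha, rho_zero)
        payments.append((w_minus_i - w_all) / gamma1 + q[i] * bids[i])
    return payments, q


# Hypothetical three-seller instance.
bids, omega = [1.0, 0.8, 0.5], [0.2, 0.1, 0.3]
alpha0, alpha, gamma1 = 0.9, [0.5, 0.6, 0.7], 1.0
payments, q = vcg_payments(bids, omega, alpha0, alpha, gamma1)
print(payments)
```

In this sketch each payment lies between zero and the first-price amount $q_{\sigma^*,i}\, b_i$, which is in line with the non-negative payment property; the brute-force search over permutations is only used in testing to confirm that sorting attains the maximum.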
In this part, we will again assume that the sellers are such that $\theta_i \ge \theta_j$ for $i < j$. In online adwords auctions, such as Google Adwords, one common practice is to use the generalized second price (GSP) auction, which has been widely studied in the literature (see Section 4.1.1). In GSP, sellers are ranked in decreasing order of their bids, and a seller's payment for each consumer purchase is the bid of the seller ranked below him. Previous work on GSP in online adwords auctions has considered supply-side surplus as the objective, with the separability assumption on purchase probabilities [43, 137]. These papers show that the VCG outcome is in the set of locally envy-free equilibria in the complete information game.

We now show that, even with consumer search and seller externalities, the VCG mechanism can be implemented as a Nash equilibrium using a modified GSP auction. Specifically, compared with the standard GSP, we introduce a modified payment function for each consumer purchase, as stated in Algorithm 2. This modification guarantees incentive compatibility and individual rationality for the socially optimal ranking (Proposition C.5.2). The primary change is that we reduce the payment of seller $i$ by $\omega_i/\gamma_1$. This is necessary because we include consumer surplus and aggregate revenue in the weighted surplus maximization. We shift the bid of the seller ranked behind seller $i$ to exclude $\omega_i/\gamma_1$, which is not part of the private information. Furthermore, with the multiplicative factor $q_{j,\pi(\mathbf{b})} / \big( q_{j,\pi(\mathbf{b})} + q^0_{\pi(\mathbf{b};j)} \big)$, we are giving away some free benefits to the sellers. In fact, this is equivalent to charging the sellers only for the fresh demand. This is because the part of the sellers' surplus that comes from the returned demand does not change with the ranking, and therefore it is not part of the auction.
Notice that if we are concerned only with supply-side surplus (the objective of the online adwords auction), the modified payment function assumes the standard form of the GSP payment.

Algorithm 2: The Modified GSP Auction
Step 1: Collect the bids $\mathbf{b}$ from sellers, and rank the sellers in decreasing order of their bids. Denote this ranking by $\pi(\mathbf{b})$ (with inverse mapping $\sigma(\mathbf{b})$).
Step 2: Seller $\pi(\mathbf{b}; j)$ makes the payment

$$\frac{q_{j,\pi(\mathbf{b})}}{q_{j,\pi(\mathbf{b})} + q^0_{\pi(\mathbf{b};j)}} \Big( b_{\pi(\mathbf{b};j+1)} - \omega_{\pi(\mathbf{b};j)}/\gamma_1 \Big), \quad \text{if } j < n,$$

and seller $\pi(\mathbf{b}; n)$ pays zero.

It is readily shown that this auction format is not dominant strategy solvable. Therefore, we investigate its equilibrium in a complete information game, a common assumption in the study of the GSP auction [43, 137]. Recall that $\omega = \min_{i \in N} \omega_i$ is the minimum quality score of the sellers, and $\pi^*(\boldsymbol\theta)$ is the optimal ranking with complete information defined in Theorem 4.3.1. We have the following proposition.

Proposition C.5.2 (Nash Equilibrium of the Modified GSP Auction) In the complete information game of the GSP auction with payment adjustment (Algorithm 2), entering the auction and bidding

$$b_i = (1 - \alpha_i)\, \frac{\rho_i(\theta_i)}{\gamma_1} + \alpha_i\, b_{i+1}, \tag{C.5}$$

for seller $i \in N$ is a Nash equilibrium, where we define $b_{n+1} = \omega/\gamma_1$. Furthermore, the resulting ranking is the same as the optimal ranking, $\pi^*(\boldsymbol\theta)$.

Note that we add an artificial seller $n+1$ to simplify the presentation. This proposition suggests that incentive compatibility and individual rationality can be achieved by the auction format in Algorithm 2. [43] and [137] connect VCG in position auctions with separable purchase probabilities to the stable assignment of the assignment game studied in [121] and show that the VCG outcome is a locally envy-free equilibrium. This method is inapplicable in our context with seller externalities and weighted surplus optimization. A key challenge in establishing Proposition C.5.2 is to show that it is in the sellers' interest to submit bids that are monotone in the net surplus.
By incorporating the payment adjustment term $\omega_i/\gamma_1$, the proof of Proposition C.5.2 uses the recursive relationship in (C.5) to establish this monotonicity and to show that no one-slot deviation is profitable. We can then inductively show that the bids defined by Eq. (C.5) indeed form an equilibrium.

Modified GSP with entry fee. To make sellers' payments consistent with those of the VCG mechanism, we introduce an entry fee, which adjusts the seller's expected payoff but has no impact on the sellers' equilibrium bidding behavior. Let us define the entry fee $\eta_i$ for seller $i$ as

$$\eta_i = \begin{cases} \dfrac{1}{\gamma_1}\Big( W^{-i}(\boldsymbol\theta_{-i}) - V^{-i}(\boldsymbol\theta_{-i}) \Big), & \text{for } i \ne n, \\ 0, & \text{for } i = n, \end{cases} \tag{C.6}$$

where

$$V^{-i}(\boldsymbol\theta_{-i}) = \max_{\sigma:\, N \setminus \{i\} \to N \setminus \{i\}}\ \sum_{j \in N \setminus \{i\}} q_{\sigma,j}\, \rho_j(\theta_j) + \prod_{0 \le j \le n:\, j \ne i} \alpha_j\, (1 - \alpha_i)\, \omega.$$

We have the following result.

Proposition C.5.3 (The Modified GSP and the VCG Mechanism) The entry fees as defined in (C.6) satisfy $\eta_i \ge 0$ for all $i$. With this entry fee, in the complete information game of the modified GSP auction, accepting the entry fee and bidding $\mathbf{b}$ as defined in Eq. (C.5) is a Nash equilibrium. Furthermore, the resulting ranking and payment are the same as the outcome of the VCG mechanism $\mathcal{M}^*(\boldsymbol\theta)$.

In the proof of the above proposition, we show that the bid of seller $i$ can be equivalently expanded as

$$b_i = \frac{1}{\gamma_1\, q_{i-1,\pi^*(\boldsymbol\theta)}} \Big( V^{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - W(\boldsymbol\theta) \Big) + \frac{1}{\gamma_1}\, \rho_{i-1}(\theta_{i-1}) = \frac{1}{\gamma_1\, q_{i-1,\pi^*(\boldsymbol\theta)}} \Big( V^{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - \sum_{k \ne i-1} q_{k,\pi^*(\boldsymbol\theta)}\, \rho_k(\theta_k) \Big), \tag{C.7}$$

for $i$ such that $2 \le i \le n+1$. It is readily verified that $\eta_i + q_{i,\pi^*(\boldsymbol\theta)} \big( b_{i+1} - \omega_i/\gamma_1 \big) = t^*(\boldsymbol\theta; i)$; therefore, the payments in the equilibrium of the modified GSP match the sellers' payments in VCG.

Finally, we mention that if only supply-side surplus is of concern, i.e., $\gamma_2 = \gamma_3 = 0$, then the entry fee is zero for every seller $i$. The auction format we presented then essentially reduces to the standard GSP auction format, and the VCG outcome is just a Nash equilibrium of the standard GSP auction. These discussions provide a theoretical justification of the GSP auction format practiced in the industry.
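The recursion in (C.5) is easy to evaluate backward from the artificial seller $n+1$. The Python sketch below does so for a hypothetical three-seller instance; the function name and the numbers are illustrative assumptions, and the sellers are assumed to be indexed in decreasing order of net surplus with $\rho_n \ge \omega$.

```python
def equilibrium_bids(rho, alpha, omega_min, gamma1):
    """Evaluate b_i = (1 - alpha_i) * rho_i / gamma1 + alpha_i * b_{i+1}
    backward, starting from b_{n+1} = omega_min / gamma1.

    rho and alpha are listed in ranking order (seller 1 first).
    Returns [b_1, ..., b_n, b_{n+1}].
    """
    n = len(rho)
    bids = [0.0] * (n + 1)
    bids[n] = omega_min / gamma1            # artificial seller n + 1
    for i in range(n - 1, -1, -1):
        bids[i] = (1.0 - alpha[i]) * rho[i] / gamma1 + alpha[i] * bids[i + 1]
    return bids


# Hypothetical instance: net surpluses already sorted in decreasing order.
rho = [1.2, 0.9, 0.8]
alpha = [0.5, 0.6, 0.7]
bids = equilibrium_bids(rho, alpha, omega_min=0.1, gamma1=1.0)
print(bids)
```

Each bid is a convex combination of the seller's own scaled net surplus and the bid immediately below, so in this example the bids come out non-increasing in the seller index, matching the monotonicity that the proof of Proposition C.5.2 relies on.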
C.6 Platform Profit Maximization

The objective function we propose in (4.2) in Section 4.3 considers three components. The consumer surplus captures the welfare of the consumer. The aggregate revenue represents the business volume of the platform, which is a critical measure of the platform's business growth. The supply-side surplus measures the benefit that third-party sellers get from using the platform, including their profit, the brand-advertising effect, and so on. With this objective function, we primarily want to focus on the long-term operational goals of the platform. In other words, our objective is that, through the implementation of search result ranking, the platform can enjoy healthy growth while simultaneously sharing the benefits with consumers and sellers. From this point of view, we do not emphasize the allocation of revenues and profits in the main text. Moreover, even though the profit of sellers seems to appear twice with some choices of the weights in the objective function, we do not consider such double counting an issue, since different terms measure the welfare of different parties, as previously mentioned.

One might argue that the platform could sometimes focus on short-term objectives, such as its own profit. Commonly, the platform collects a commission fee equal to a fixed percentage of the third-party seller's revenue. Moreover, some of the items sold can be the platform's own products. Auction revenue collected from the sponsored search program could also be an important source of platform profit. We may consider the following objective

$$\gamma_4 \cdot \text{Commission Fee} + \gamma_5 \cdot \text{Auction Revenue}, \tag{C.8}$$

and call it platform profit maximization. Note that now everything is in monetary terms, so in practice we may simply take $\gamma_4 = \gamma_5 = 1$. Since the objective function includes auction revenue, we cast our problem as a mechanism design problem. More specifically, let $M$ denote the set of the platform's own items.
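For $i \in M$, we require that $\theta_i = 0$. We seek a mechanism $(\pi(\cdot), t(\cdot))$, where $\pi(\cdot)$ and $t(\cdot)$ are the ranking rule and payment function, respectively, that solves the following problem:

$$\max_{(\pi(\cdot),\, t(\cdot))}\ \mathbb{E}\Big[ \sum_{i=1}^{n} \gamma_4\, q_{\sigma(\boldsymbol\theta;i)}\, \hat{p}_i + \gamma_5\, t(\boldsymbol\theta; i) \Big] \quad \text{s.t.}\quad t(\boldsymbol\theta; i) = 0\ \ \forall i \in M;\quad t(\boldsymbol\theta; i) \ge 0\ \ \forall i \in N \setminus M;\quad \text{(IC)};\ \text{(IR)};\quad (\pi(\cdot), t(\cdot)) \in \tilde{\mathcal{F}}. \tag{C.9}$$

Here $\hat{p}_i$ is the revenue that the platform gains from selling item $i$. When $i \in M$, $\hat{p}_i$ is the profit of the item. When $i \notin M$, $\hat{p}_i = \beta p_i$, where $\beta$ is the commission fee percentage. Generically, we use $\tilde{\mathcal{F}}$ to denote the set of all feasible mechanisms. Note that $\tilde{\mathcal{F}}$ depends on the context. For example, $\tilde{\mathcal{F}}$ might be unconstrained if the platform wants to sell all slots. If the platform sells only the top $k$ slots, then we may specify $\tilde{\mathcal{F}}$ as

$$\{ (\pi(\cdot), t(\cdot)) : \pi(\cdot) \in \mathcal{P}_k,\ t(\cdot) \text{ satisfies the zero-payment property} \},$$

where $\mathcal{P}_k$ is defined in Section C.3. For $i \in M$, we require that $\theta_i = 0$ and $t(\boldsymbol\theta; i) = 0$. (IC) and (IR) stand for incentive compatibility and individual rationality, respectively. The expectation here is with respect to the randomness of $\boldsymbol\theta$. Note that $t(\boldsymbol\theta; i)$ depends on the ranking. Also, notice that we have ignored the returned demand, as we are typically considering a market with a large number of sellers, in which the returned demand is negligible.

To a large extent, what we have derived for the weighted surplus maximization still holds. In particular, we will show that a simple sorting rule similar to Theorem 4.3.1 is optimal in the current setting. We need the following lemma first.

Lemma C.6.1 (Alternative Representation of Platform Profit Maximization) Problem (C.9) can be equivalently written as

$$\max\ \mathbb{E}\Big[ \sum_{i=1}^{n} q_{\sigma(\boldsymbol\theta;i)} \Big( \gamma_4\, \hat{p}_i + \gamma_5 \Big( \theta_i - \frac{\bar{F}_i(\theta_i)}{f_i(\theta_i)} \Big) \mathbb{1}(i \notin M) \Big) \Big] \tag{C.10}$$

subject to

$$q_{\sigma(\theta_i, \boldsymbol\theta_{-i}),\, i} \ge q_{\sigma(\theta_i', \boldsymbol\theta_{-i}),\, i} \quad \forall\, \theta_i \ge \theta_i',\ \forall i \in N \setminus M,$$
$$q_{\sigma(\boldsymbol\theta),\, i}\, \theta_i - t(\boldsymbol\theta; i) = \int_0^{\theta_i} q_{\sigma(\eta, \boldsymbol\theta_{-i}),\, i}\, d\eta, \quad \forall i \in N \setminus M,$$
$$t(\boldsymbol\theta; i) = 0\ \ \forall i \in M, \qquad (\pi(\cdot), t(\cdot)) \in \tilde{\mathcal{F}}.$$

The significance of Lemma C.6.1 is that it gives us a form that is much more convenient to work with.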
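To make the coefficient multiplying each purchase probability in (C.10) concrete, here is a small Python sketch. It assumes, purely for illustration, that private valuations are exponentially distributed with a hypothetical rate parameter, in which case $\bar{F}_i(\theta)/f_i(\theta) = 1/\text{rate}$; the item data and function names are made up.

```python
def exp_virtual_valuation(theta, rate):
    """theta - Fbar(theta)/f(theta) for an Exp(rate) valuation.
    Survival is exp(-rate*theta) and density is rate*exp(-rate*theta),
    so the ratio equals 1/rate."""
    return theta - 1.0 / rate


def net_virtual_profit(theta, rate, p_hat, own_item, gamma4=1.0, gamma5=1.0):
    """Coefficient of the purchase probability in (C.10); the auction
    term is dropped for the platform's own items (i in M)."""
    auction_part = 0.0 if own_item else gamma5 * exp_virtual_valuation(theta, rate)
    return gamma4 * p_hat + auction_part


def rank_by_net_virtual_profit(items):
    """Names of the items in decreasing order of the coefficient."""
    score = lambda it: net_virtual_profit(
        it["theta"], it["rate"], it["p_hat"], it["own_item"])
    return [it["name"] for it in sorted(items, key=score, reverse=True)]


# Hypothetical data: one platform-owned item and two third-party sellers.
items = [
    {"name": "own", "theta": 0.0, "rate": 2.0, "p_hat": 0.6, "own_item": True},
    {"name": "s1", "theta": 1.0, "rate": 2.0, "p_hat": 0.3, "own_item": False},
    {"name": "s2", "theta": 0.2, "rate": 2.0, "p_hat": 0.5, "own_item": False},
]
print(rank_by_net_virtual_profit(items))
```

The exponential family is chosen only because its hazard rate is constant, so the virtual valuation is increasing in the valuation; any other distribution would need its own survival and density functions. Sorting by this coefficient anticipates the sorting rule that the text turns to next.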
This result is proved by a classical trick in mechanism design that was first used in [104]. In particular, in the setting in which we can sell all slots, $\tilde{\mathcal{F}}$ is unconstrained and we can focus solely on the ranking $\pi(\cdot)$. As long as it induces a purchase probability of seller $i$ that is monotone in $\theta_i$, we have a feasible mechanism. Let us define

$$\kappa_i \overset{\text{def}}{=} \gamma_4\, \hat{p}_i + \gamma_5 \Big( \theta_i - \frac{\bar{F}_i(\theta_i)}{f_i(\theta_i)} \Big) \mathbb{1}(i \notin M)$$

as the net virtual profit of item $i$. In the mechanism design literature, the quantity $\theta_i - \bar{F}_i(\theta_i)/f_i(\theta_i)$ is usually known as the virtual valuation. We can prove the next result, which is a close analog of Theorem 4.3.1: a simple sorting rule is also optimal in the current setting.

Proposition C.6.2 (Optimal Ranking for Platform Profit Maximization) Assume that $\theta_i - \bar{F}_i(\theta_i)/f_i(\theta_i)$ is increasing in $\theta_i$ for all $i \in N \setminus M$. When the platform sells all slots, the optimal solution to (C.9) is obtained by ranking items in decreasing order of their net virtual profit, i.e., $\kappa_i$. Let us denote this ranking by $\pi^{*,\mathrm{Profit}}(\boldsymbol\theta)$.

We skip the proof and give the key arguments here. The result can be established by first using the interchange argument adopted in the proof of Theorem 4.3.1 and then verifying the monotonicity condition on $q_{\sigma(\theta_i', \boldsymbol\theta_{-i}),\, i}$, which ensures that the ranking induces a proper mechanism. The monotonicity condition is guaranteed by the condition that $\theta_i - \bar{F}_i(\theta_i)/f_i(\theta_i)$ is increasing in $\theta_i$. Note that this is a standard assumption in the mechanism design literature and is satisfied by any distribution with an increasing failure rate (IFR).

To sell the top slots, we may consider the following analogue of SOR.

Definition 6 (Profit-Ordered Ranking) Given a base order $\pi^0$, the profit-ordered ranking (POR), $\pi^{\mathrm{POR}}_k(\boldsymbol\theta)$ (with inverse mapping $\sigma^{\mathrm{POR}}_k(\boldsymbol\theta)$), is the ranking rule that satisfies the following properties:
(i) $\pi^{\mathrm{POR}}_k(\boldsymbol\theta)$ ranks the $k$ sellers with the highest net virtual profit in the top $k$ positions in decreasing order of their net virtual profit.
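(ii) The order of the sellers ranked in positions $k+1$ to $n$ is consistent with the base order: $\sigma^{\mathrm{POR}}_k(\boldsymbol\theta; i) > \sigma^{\mathrm{POR}}_k(\boldsymbol\theta; j)$ if and only if $\sigma^0(i) > \sigma^0(j)$, for all sellers $i$ and $j$ such that $\sigma^{\mathrm{POR}}_k(\boldsymbol\theta; i),\ \sigma^{\mathrm{POR}}_k(\boldsymbol\theta; j) \ge k+1$.

One can further prove the next result.

Proposition C.6.3 (Properties of POR) Assume that $\theta_i - \bar{F}_i(\theta_i)/f_i(\theta_i)$ is increasing in $\theta_i$ for all $i \in N \setminus M$.
(i) There exists a unique payment function

$$t^{\mathrm{POR}}_k(\boldsymbol\theta; i) = q_{\sigma^{\mathrm{POR}}_k(\boldsymbol\theta),\, i}\, \theta_i - \int_0^{\theta_i} q_{\sigma^{\mathrm{POR}}_k(\eta, \boldsymbol\theta_{-i}),\, i}\, d\eta$$

for all $i \in N$ such that the POR mechanism $\mathcal{M}^{\mathrm{POR}}_k(\boldsymbol\theta) = \{ \pi^{\mathrm{POR}}_k(\boldsymbol\theta),\, t^{\mathrm{POR}}_k(\boldsymbol\theta) \}$ satisfies the following four desired properties: (a) IC; (b) IR; (c) non-negative payment; and (d) if $\sigma^{\mathrm{POR}}_k(\boldsymbol\theta; i) \ge k+1$, then $t^{\mathrm{POR}}_k(\boldsymbol\theta; i) = 0$.
(ii) Suppose that $\theta_i - \bar{F}_i(\theta_i)/f_i(\theta_i) \ge 0$ for all $i \in N \setminus M$ and that, without loss of generality, $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$. It then holds that

$$\frac{\mathbb{E}\big[ \sum_{i=1}^{n} \gamma_4\, q_{\sigma^{\mathrm{POR}}(\boldsymbol\theta),\, i}\, \hat{p}_i + \gamma_5\, t(\boldsymbol\theta; i) \big]}{\mathbb{E}\big[ \sum_{i=1}^{n} \gamma_4\, q_{\sigma^{*,\mathrm{Profit}}(\boldsymbol\theta),\, i}\, \hat{p}_i + \gamma_5\, t(\boldsymbol\theta; i) \big]} \ge 1 - \prod_{1 \le i \le k} \alpha_i.$$

Finally, we mention that if the platform simultaneously considers both the long-term objective (weighted surplus) and the short-term objective (platform revenue) by taking their weighted sum, our analysis can be extended as well.

C.7 Proof of Lemma 4.2.1

We first argue that there exists a unique solution to Eq. (4.1). We restate it as a lemma. We use $\Psi(\cdot)$ to denote the c.d.f. of $V$ and $\psi(\cdot)$ to denote its p.d.f.

Lemma C.7.1 When $s > 0$, there exists a unique solution to the following equation:

$$v = \mathbb{P}(V \le v)\, v + \mathbb{P}(V > v)\, \mathbb{E}[V \mid V > v] - s. \tag{C.11}$$

Let the solution be $v^*$. When $v < v^*$, $v < \mathbb{P}(V \le v)\, v + \mathbb{P}(V > v)\, \mathbb{E}[V \mid V > v] - s$, and when $v > v^*$, $v > \mathbb{P}(V \le v)\, v + \mathbb{P}(V > v)\, \mathbb{E}[V \mid V > v] - s$.

Proof of Lemma C.7.1: Note that the above equation is equivalent to

$$s = \mathbb{P}(V > v) \big( \mathbb{E}[V \mid V > v] - v \big) = \int_v^{+\infty} (x - v)\, \psi(x)\, dx = \int_v^{+\infty} \bar{\Psi}(x)\, dx.$$

It is easy to see that $\int_v^{\infty} \bar{\Psi}(x)\, dx$ is a decreasing function of $v$. Furthermore, we can show that $\lim_{v \to +\infty} \int_v^{+\infty} \bar{\Psi}(x)\, dx = 0$ and $\lim_{v \to -\infty} \int_v^{+\infty} \bar{\Psi}(x)\, dx = +\infty$. Therefore, the existence of the solution is guaranteed.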
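Since the tail integral $\int_v^{\infty} \bar{\Psi}(x)\, dx$ is decreasing in $v$, the threshold solving (C.11) can be found numerically by bisection. The Python sketch below illustrates this under the assumption, made only for the example, that $V$ is a standard exponential random variable, for which the tail integral has a closed form; the function names are hypothetical.

```python
import math


def reservation_value(tail_integral, s, lo=-1.0, hi=1.0, tol=1e-10):
    """Solve tail_integral(v) = s by bisection, where tail_integral is
    decreasing in v, tends to +infinity as v -> -infinity and to 0 as
    v -> +infinity. The search cost s must be positive."""
    while tail_integral(lo) < s:   # push the left end down until it brackets
        lo *= 2.0
    while tail_integral(hi) > s:   # push the right end up until it brackets
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_integral(mid) > s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exp_tail_integral(v):
    """Integral of the survival function of Exp(1) from v to infinity.
    The survival function is 1 for x < 0 and exp(-x) for x >= 0."""
    return math.exp(-v) if v >= 0 else 1.0 - v


print(reservation_value(exp_tail_integral, s=0.5))   # close to log(2)
```

For this distribution the solution is $-\log s$ when $s \le 1$ and $1 - s$ otherwise, which gives a simple check on the numerical routine.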
We now argue the uniqueness. If $v_1 < v_2$ are both solutions, then $\int_v^{\infty} \bar{\Psi}(x)\, dx = s$ for all $v \in [v_1, v_2]$. Note that $\big( \int_v^{\infty} \bar{\Psi}(x)\, dx \big)' = -\bar{\Psi}(v)$. Therefore, $\bar{\Psi}(v) = 0$ on $[v_1, v_2]$. This implies that for $v \in [v_1, +\infty)$, $\bar{\Psi}(v) = 0$ and $\int_v^{\infty} \bar{\Psi}(x)\, dx = s$. However, this is impossible, since $\lim_{v \to +\infty} \int_v^{+\infty} \bar{\Psi}(x)\, dx = 0$. Moreover,

$$\Big( \mathbb{P}(V \le v)\, v + \mathbb{P}(V > v)\, \mathbb{E}[V \mid V > v] - v \Big)' = \Big( \int_v^{+\infty} \bar{\Psi}(x)\, dx \Big)' = -\bar{\Psi}(v) \le 0$$

implies the last statement, since the solution is unique.

Proof of Lemma 4.2.1: We use backward induction to characterize the optimal policy of the consumer. Denote by $v_i$ the realization of $V_i$. Let $V^{\max}$ be the utility of the current best option, including the non-purchase option and all the items viewed so far. Also, we use $V$ for a random variable following the same distribution as the random variables $V_i$, $i = 1, 2, \ldots, n$. It is easy to see that after viewing item $n-1$, she adopts the current best option without looking at item $n$ if and only if

$$V^{\max} > \mathbb{P}(V \le V^{\max})\, V^{\max} + \mathbb{P}(V > V^{\max})\, \mathbb{E}[V \mid V > V^{\max}] - s,$$

where the left-hand side is her utility if she purchases now and the right-hand side is her expected utility if she views item $n$. This inequality is equivalent to $V^{\max} > v^*$ by Lemma C.7.1. Therefore, after viewing item $k = n-1$, her optimal policy is

$$\begin{cases} \text{purchase product } i \text{ with } v_i = V^{\max}, & \text{if } V^{\max} > v^*, \\ \text{view item } n, & \text{if } V^{\max} \le v^*. \end{cases}$$

Now suppose that the statement is true for some $k$ with $1 \le k \le n-1$; we show that the same statement is true for $k-1$. After viewing item $k-1$, suppose that $V^{\max} > v^*$. We have $V^{\max} > \mathbb{P}(V \le V^{\max})\, V^{\max} + \mathbb{P}(V > V^{\max})\, \mathbb{E}[V \mid V > V^{\max}] - s$ by the monotonicity stated in Lemma C.7.1. The left-hand side is her utility if she purchases now. The right-hand side is her expected utility if she views item $k$, since after viewing item $k$ the utility of her best option will be $\max\{V^{\max}, v_k\}$, which exceeds $v^*$, and therefore she will adopt the current best option immediately. Therefore, the customer purchases when $V^{\max} > v^*$. Suppose instead that $V^{\max} < v^*$. We have $V^{\max} < \mathbb{P}(V \le V^{\max})\, V^{\max} + \mathbb{P}(V > V^{\max})\, \mathbb{E}[V \mid V > V^{\max}] - s$ by the monotonicity.
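The left-hand side is her utility if she purchases now, and the right-hand side is her expected utility if she views item $k$ and ignores the later items, which is weakly less than her optimal expected utility. Therefore, the customer views item $k$ when $V^{\max} < v^*$. Therefore, after learning the valuation of the no-purchase option $v_0$, the customer stops after finding the first $k$ such that $V^{\max} > v^*$ for $0 \le k \le n$.

C.8 Proof of Lemma 4.7.2

Proof of Lemma 4.7.2: It can be verified that

$$\sum_{j=1}^{n} q_{j,\pi}\, \mathbb{E}[V_0 \mid V_0 \le v^*] = \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \mathbb{E}[V_0 \mid V_0 \le v^*] = \alpha_0 \Big( 1 - \prod_{1 \le i \le n} \alpha_i \Big) \mathbb{E}[V_0 \mid V_0 \le v^*].$$

Then note that

$$L_\pi(\boldsymbol\theta) + \gamma_3 \sum_{j=1}^{n} q_{j,\pi}\, \mathbb{E}[V_0 \mid V_0 \le v^*]$$
$$= \gamma_1 \sum_{j=1}^{n} q_{j,\pi}\, \theta_{\pi(j)} + \gamma_2 \sum_{j=1}^{n} q_{j,\pi}\, p_{\pi(j)} + \gamma_3 \sum_{j=1}^{n} q_{j,\pi} \Big( \mathbb{E}[V_{\pi(j)} \mid V_{\pi(j)} \ge v^*] - js \Big)$$
$$= \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \Big( \rho_{\pi(j)}(\theta_{\pi(j)}) + \frac{\gamma_3 s}{1 - \alpha_{\pi(j)}} - \gamma_3\, js \Big)$$
$$= \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \gamma_3 \Big( \frac{s}{1 - \alpha_{\pi(j)}} - js \Big)$$
$$= \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \gamma_3 s \sum_{j=1}^{n} \Big( \prod_{k=0}^{j-1} \alpha_{\pi(k)} - j \prod_{k=0}^{j-1} \alpha_{\pi(k)} + j \prod_{k=0}^{j} \alpha_{\pi(k)} \Big)$$
$$= \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \gamma_3 s n \prod_{j=0}^{n} \alpha_{\pi(j)}$$
$$= \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \gamma_3 s n \prod_{i=0}^{n} \alpha_i,$$

where the third line follows from the definition of $\rho_{\pi(j)}(\theta_{\pi(j)})$, the sixth line follows from the telescoping sum

$$\sum_{j=1}^{n} \Big( \prod_{k=0}^{j-1} \alpha_{\pi(k)} - j \prod_{k=0}^{j-1} \alpha_{\pi(k)} + j \prod_{k=0}^{j} \alpha_{\pi(k)} \Big) = \sum_{j=1}^{n} \prod_{k=0}^{j-1} \alpha_{\pi(k)} + \sum_{j=1}^{n} \Big( j \prod_{k=0}^{j} \alpha_{\pi(k)} - j \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big)$$
$$= \sum_{j=1}^{n} \prod_{k=0}^{j-1} \alpha_{\pi(k)} + \Big( -\sum_{j=1}^{n} \prod_{k=0}^{j-1} \alpha_{\pi(k)} + n \prod_{j=0}^{n} \alpha_{\pi(j)} \Big) = n \prod_{j=0}^{n} \alpha_{\pi(j)},$$

and the last line follows because $\pi(\cdot)$ is a permutation. Combining everything together, we have

$$L_\pi(\boldsymbol\theta) = \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \gamma_3\, n s \prod_{i=0}^{n} \alpha_i - \gamma_3 \sum_{j=1}^{n} q_{j,\pi}\, \mathbb{E}[V_0 \mid V_0 \le v^*]$$
$$= \sum_{j=1}^{n} \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \gamma_3\, n s \prod_{i=0}^{n} \alpha_i - \gamma_3\, \alpha_0 \Big( 1 - \prod_{1 \le i \le n} \alpha_i \Big) \mathbb{E}[V_0 \mid V_0 \le v^*],$$

as desired.

C.9 Proof of Theorem 4.3.1

Proof of Theorem 4.3.1: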
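Suppose that in the optimal ranking $\pi$ there exists $j$ ($1 \le j \le n-1$) such that $\rho_{\pi(j+1)}(\theta_{\pi(j+1)}) > \rho_{\pi(j)}(\theta_{\pi(j)})$. We show that swapping the positions of these two items weakly increases the total weighted surplus. Denote the ranking after swapping by $\pi'$. Then the net effect of swapping the items in positions $j$ and $j+1$ is

$$\Big( (1 - \alpha_{\pi(j+1)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j+1)}(\theta_{\pi(j+1)}) + \Big( \alpha_{\pi(j+1)} \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) (1 - \alpha_{\pi(j)})\, \rho_{\pi(j)}(\theta_{\pi(j)})$$
$$- \Big[ \Big( (1 - \alpha_{\pi(j)}) \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \rho_{\pi(j)}(\theta_{\pi(j)}) + \Big( \alpha_{\pi(j)} \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) (1 - \alpha_{\pi(j+1)})\, \rho_{\pi(j+1)}(\theta_{\pi(j+1)}) \Big]$$
$$= (1 - \alpha_{\pi(j)}) (1 - \alpha_{\pi(j+1)}) \Big( \prod_{k=0}^{j-1} \alpha_{\pi(k)} \Big) \Big( \rho_{\pi(j+1)}(\theta_{\pi(j+1)}) - \rho_{\pi(j)}(\theta_{\pi(j)}) \Big) \ge 0. \tag{C.12}$$

Therefore, swapping weakly increases the total weighted surplus. This implies that the sorted solution is optimal.

C.10 Proof of Corollary 4.7.3

Proof of Corollary 4.7.3: Notice that $1 - \alpha_i = \mathbb{P}(\epsilon_i \ge v^* - u_i + p_i)$ is increasing in $u_i - p_i$; thus, $-s/(1 - \alpha_i)$ is also increasing in $u_i - p_i$. Denote the distribution of $\epsilon_i$, $1 \le i \le n$, by $H(\cdot)$, with density $h(\cdot)$. We show that when $H(\cdot)$ is an IFR distribution, $\mathbb{E}[u_i - p_i + \epsilon_i \mid u_i - p_i + \epsilon_i \ge v^*]$ is increasing in $u_i - p_i$ as well. We have

$$\mathbb{E}[u_i - p_i + \epsilon_i \mid u_i - p_i + \epsilon_i \ge v^*] = \frac{\mathbb{E}\big[ (u_i - p_i + \epsilon_i)\, \mathbb{1}(u_i - p_i + \epsilon_i \ge v^*) \big]}{\mathbb{P}(u_i - p_i + \epsilon_i \ge v^*)}$$
$$= \frac{\mathbb{E}\big[ (\epsilon_i - (v^* - u_i + p_i))\, \mathbb{1}(\epsilon_i \ge v^* - u_i + p_i) \big]}{\mathbb{P}(\epsilon_i \ge v^* - u_i + p_i)} + v^* = \frac{\int_{v^* - u_i + p_i}^{\infty} \big( x - (v^* - u_i + p_i) \big)\, h(x)\, dx}{\bar{H}(v^* - u_i + p_i)} + v^*.$$

Changing variable to $z = v^* - u_i + p_i$, it suffices to show that $\int_z^{\infty} (x - z)\, h(x)\, dx \,/\, \bar{H}(z)$ is decreasing in $z$. Notice that

$$\frac{\int_z^{\infty} (x - z)\, h(x)\, dx}{\bar{H}(z)} = \frac{\int_z^{\infty} \int_z^{x} dy\, h(x)\, dx}{\bar{H}(z)} = \frac{\int_z^{\infty} \int_y^{\infty} h(x)\, dx\, dy}{\bar{H}(z)} = \frac{\int_z^{\infty} \bar{H}(y)\, dy}{\bar{H}(z)} = \frac{\int_0^{\infty} \bar{H}(z + y)\, dy}{\bar{H}(z)},$$

which can be equivalently written as $\int_0^{\infty} \big( \bar{H}(z + y) / \bar{H}(z) \big)\, dy$. It suffices to show that $\bar{H}(z + y)/\bar{H}(z)$ is decreasing in $z$ for all $y \ge 0$. This follows from Theorem 4.1 in [11]. We can also prove this from first principles as follows. Taking the derivative of $\bar{H}(z + y)/\bar{H}(z)$ with respect to $z$, we have

$$\Big( \frac{\bar{H}(z + y)}{\bar{H}(z)} \Big)'_z = \frac{-h(z + y)\, \bar{H}(z) + h(z)\, \bar{H}(z + y)}{\bar{H}(z)^2} = \frac{\bar{H}(z + y)}{\bar{H}(z)} \Big( \frac{h(z)}{\bar{H}(z)} - \frac{h(z + y)}{\bar{H}(z + y)} \Big) \le 0$$

for fixed $y \ge 0$, because $H(\cdot)$ has an IFR.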
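This concludes the proof of our claim that if $\epsilon_i$ has a distribution with an IFR, it is optimal to rank items in decreasing order of $u_i - p_i$.

Lastly, we argue that the Gumbel distribution has an IFR. We are not aware of a reference for this fact, but it is easy to prove. Notice that, under the Gumbel assumption,

$$\frac{h(x)}{\bar{H}(x)} = \frac{e^{-e^{-x}}\, e^{-x}}{1 - e^{-e^{-x}}} = \frac{e^{-x}}{e^{e^{-x}} - 1},$$

and

$$\Big( \frac{h(x)}{\bar{H}(x)} \Big)' = \frac{-e^{-x} \big( e^{e^{-x}} - 1 \big) + e^{e^{-x}}\, e^{-x}\, e^{-x}}{\big( e^{e^{-x}} - 1 \big)^2} = \frac{e^{-x} \big( e^{e^{-x}}\, e^{-x} - e^{e^{-x}} + 1 \big)}{\big( e^{e^{-x}} - 1 \big)^2}.$$

It suffices to verify that $e^{e^{-x}}\, e^{-x} - e^{e^{-x}} + 1 \ge 0$ for $x \in (-\infty, +\infty)$. Let $t = e^{-x} \ge 0$. Notice that $(t e^t - e^t + 1)'_t = t e^t + e^t - e^t = t e^t \ge 0$, so this function is increasing in $t$, and the minimum is achieved at $t = 0$ with $(t e^t - e^t + 1)\big|_{t=0} = 0$. This concludes the proof.

C.11 Proof of Proposition 4.4.1

Proof of Proposition 4.4.1: We first notice that, due to our assumption, it must be that

$$\mathbb{E}\big[ e^{\rho_i(\theta_i)} \big] \le \mathbb{E}\big[ e^{\rho_i(\theta_i) + \gamma_3 s/(1 - \alpha_i)} \big] = \mathbb{E}\big[ e^{\gamma_1 \theta_i + \gamma_2 p_i + \gamma_3 \mathbb{E}[V_i \mid V_i \ge v^*]} \big] \le \eta.$$

Let us fix $n$. We first consider the ranking $\pi^*(\boldsymbol\theta^n)$ and show that its expected surplus is on the order of $O(\log n)$. We denote by $\rho_{(i)}(\theta_{(i)})$ the $i$th order statistic of all $\rho_i(\theta_i)$ for $1 \le i \le n$. We will use the following well-known result on the log-sum-exp function:

$$\rho_{(1)}(\theta_{(1)}) = \max_{i = 1, 2, \ldots, n} \{\rho_i(\theta_i)\} = \log e^{\max_{i = 1, 2, \ldots, n} \{\rho_i(\theta_i)\}} \le \log \Big( \sum_{i=1}^{n} e^{\rho_i(\theta_i)} \Big).$$

Note that before the observation of $u_i$ and $p_i$, the $\alpha_i$ are i.i.d. random variables (as functions of $u_i$ and $p_i$) for $i$ such that $1 \le i \le n$. We denote their common expectation by $\mathbb{E}[\alpha_i] = \alpha$. Then $\mathbb{E}[L_{\pi^*}(\boldsymbol\theta^n)]$ can be bounded as follows:

$$\mathbb{E}[L_{\pi^*}(\boldsymbol\theta^n)] = \mathbb{E}\Big[ \alpha_0 \sum_{i=1}^{n} (1 - \alpha_{(i)}) \prod_{j=1}^{i-1} \alpha_{(j)}\, \rho_{(i)}(\theta_{(i)}) - \alpha_0 \Big( 1 - \prod_{1 \le i \le n} \alpha_i \Big) \gamma_3\, \mathbb{E}[V_0 \mid V_0 \le v^*] + \prod_{0 \le i \le n} \alpha_i\, \gamma_3\, n s \Big]$$
$$= \mathbb{E}\Big[ \alpha_0 \sum_{i=1}^{n} (1 - \alpha_{(i)}) \prod_{j=1}^{i-1} \alpha_{(j)}\, \rho_{(i)}(\theta_{(i)}) \Big] - \mathbb{E}\Big[ 1 - \prod_{1 \le i \le n} \alpha_i \Big] \gamma_3\, \mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)] + \mathbb{E}\Big[ \prod_{0 \le i \le n} \alpha_i\, \gamma_3\, n s \Big]$$
$$= \mathbb{E}\Big[ \alpha_0 \sum_{i=1}^{n} (1 - \alpha_{(i)}) \prod_{j=1}^{i-1} \alpha_{(j)}\, \rho_{(i)}(\theta_{(i)}) \Big] - (1 - \alpha^n)\, \gamma_3\, \mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)] + \gamma_3\, n s\, \alpha_0 \alpha^n$$
$$\le \mathbb{E}\Big[ \rho_{(1)}(\theta_{(1)})\, \alpha_0 \sum_{i=1}^{n} (1 - \alpha_{(i)}) \prod_{j=1}^{i-1} \alpha_{(j)} \Big] - (1 - \alpha^n)\, \gamma_3\, \mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)] + \gamma_3\, n s\, \alpha_0 \alpha^n$$
$$\le \mathbb{E}\big[ \rho_{(1)}(\theta_{(1)}) \big] - (1 - \alpha^n)\, \gamma_3\, \mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)] + \gamma_3\, n s\, \alpha_0 \alpha^n = O(\log n).$$

The second line follows from the linearity of expectations and the fact that $\alpha_0\, \mathbb{E}[V_0 \mid V_0 \le v^*] = \mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)]$. The third line follows from the independence of the $\alpha_i$ and from $\mathbb{E}[\alpha_i] = \alpha$. The fourth line holds because $\rho_{(1)}(\theta_{(1)})$ is the first order statistic. The fifth line holds because the sum of the probabilities in the first expectation is less than one. As to the final bound, we know that $\lim_{n \to \infty} n \alpha^n = 0$, that $\mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)]$ is a constant, and, by Jensen's inequality, that

$$\mathbb{E}\big[ \rho_{(1)}(\theta_{(1)}) \big] \le \mathbb{E}\Big[ \log \Big( \sum_{i=1}^{n} e^{\rho_i(\theta_i)} \Big) \Big] \le \log \Big( \mathbb{E}\Big[ \sum_{i=1}^{n} e^{\rho_i(\theta_i)} \Big] \Big) = \log \Big( \sum_{i=1}^{n} \mathbb{E}\big[ e^{\rho_i(\theta_i)} \big] \Big) \le \log \eta + \log n.$$

This takes care of the fresh demand part. Similarly, we notice that

$$\mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \big( \gamma_1 \theta_i + \gamma_3\, \mathbb{E}[V_i - V_0 \mid A_i] - \gamma_3\, n s + \gamma_2 p_i \big) \Big]$$
$$= \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \big( \gamma_1 \theta_i + \gamma_3\, \mathbb{E}[V_i \mid A_i] + \gamma_2 p_i \big) \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \Big] n s$$
$$\le \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \big( \gamma_1 \theta_i + \gamma_3 v^* + \gamma_2 p_i \big) \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \Big] n s$$
$$\le \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \big( \gamma_1 \theta_i + \gamma_3\, \mathbb{E}[V_i \mid V_i \ge v^*] + \gamma_2 p_i \big) \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \Big] n s$$
$$= \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \Big( \rho_i(\theta_i) + \frac{\gamma_3 s}{1 - \alpha_i} \Big) \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \Big] n s$$
$$\le \mathbb{E}\Big[ \rho_{[1]}(\theta_{[1]}) + \frac{\gamma_3 s}{1 - \alpha_{[1]}} \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \Big] - \gamma_3\, \mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0 \Big] n s = O(\log n),$$

where the third line holds because, conditional on $A_i$, $V_i \le v^*$, and the fourth line holds because $v^* \le \mathbb{E}[V_i \mid V_i \ge v^*]$. The fifth line follows from the definition of $\rho_i(\theta_i)$.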
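For the last line, we define $[1]$ as the index of the item that attains the greatest value of $\rho_{[1]}(\theta_{[1]}) + \gamma_3 s/(1 - \alpha_{[1]})$ among all items. Thus, the inequality holds. To see the final $O(\log n)$ bound, notice first that we can bound the first term by $O(\log n)$ using the same technique as above. Also notice that $0 \le \mathbb{E}\big[ \sum_{i=1}^{n} q_i^{0,n} \big]\, n s \le \alpha^n n s \to 0$. For the term $\mathbb{E}\big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \big]$, we have

$$\mathbb{E}\Big[ \sum_{i=1}^{n} q_i^0\, \mathbb{E}[V_0 \mid A_i] \Big] = \mathbb{E}\Big[ V_0\, \mathbb{1}\Big( V_0 \le \max_{1 \le i \le n} \{V_i\} \le v^* \Big) \Big] \ge \mathbb{E}\Big[ V_0^-\, \mathbb{1}\Big( V_0 \le \max_{1 \le i \le n} \{V_i\} \le v^* \Big) \Big] \ge \mathbb{E}\Big[ V_0^-\, \mathbb{1}\Big( \max_{1 \le i \le n} \{V_i\} \le v^* \Big) \Big] = \mathbb{E}[V_0^-]\, \alpha^n \to 0,$$

where $V_0^- = \min\{V_0, 0\}$. This shows that the expected surplus is $O(\log n)$.

For $\mathbb{E}[W_{\hat\pi}(\boldsymbol\theta^n)]$, we reason as follows. Consider the ranking $\tilde\pi$ which ranks sellers by their index. Given the realization of $(\mathbf{u}, \mathbf{p})$, it must be that $\mathbb{E}[W_{\hat\pi}(\boldsymbol\theta^n) \mid \mathbf{u}, \mathbf{p}] \ge \mathbb{E}[W_{\tilde\pi}(\boldsymbol\theta^n) \mid \mathbf{u}, \mathbf{p}]$ by the optimality of $\hat\pi$ in the incomplete information setting. Therefore, $\mathbb{E}[W_{\hat\pi}(\boldsymbol\theta^n)] \ge \mathbb{E}[W_{\tilde\pi}(\boldsymbol\theta^n)]$. Moreover, if $\tilde\pi$ is adopted, the consumer's belief about $(\mathbf{u}, \mathbf{p})$ is consistent with the realization of $(\mathbf{u}, \mathbf{p})$. Whenever the consumer is willing to search, the expected consumer surplus must be non-negative. Note also that when the consumer is not willing to search, the consumer surplus is zero by definition. Therefore, the expected consumer surplus with ranking $\tilde\pi$ is non-negative. Also, $\mathbb{E}[(1 - \alpha_1)\theta_1] > 0$, since the expectation of a non-negative random variable is positive unless the variable is equal to zero almost surely. Note that $\mathbb{E}[(1 - \alpha_1)\theta_1]$ is the expected supply-side surplus from the first seller when ranking $\tilde\pi$ is used. Therefore, the expected total supply-side surplus when ranking $\tilde\pi$ is used can be lower bounded by $\mathbb{E}[(1 - \alpha_1)\theta_1]$. Also, we notice that the expected aggregate revenue is always non-negative. Therefore, $\mathbb{E}[W_{\tilde\pi}(\boldsymbol\theta^n)] \ge \mathbb{E}[(1 - \alpha_1)\theta_1]$ and $\mathbb{E}[W_{\hat\pi}(\boldsymbol\theta^n)] \ge \mathbb{E}[(1 - \alpha_1)\theta_1]$, which is a constant that does not change with $n$. Therefore, $\mathbb{E}[W_{\hat\pi}(\boldsymbol\theta^n)] = \Omega(1)$ and $\mathbb{E}[W_{\pi^*}(\boldsymbol\theta^n)] = O(\log n)$. We now have the conclusion.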
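The proof above controls the largest net surplus through the log-sum-exp function. As a quick sanity check of the two elementary bounds involved, namely that the maximum is at most the log-sum-exp, which in turn is at most the maximum plus $\log n$, the following Python sketch evaluates them on arbitrary illustrative numbers (the values are not taken from the text).

```python
import math


def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


# Arbitrary illustrative values standing in for the net surpluses.
values = [0.3, 1.7, -2.0, 1.1]
lse = log_sum_exp(values)

# max <= log-sum-exp <= max + log(n)
print(max(values), lse, max(values) + math.log(len(values)))
```

The upper bound $\max + \log n$ is the deterministic counterpart of the $\log \eta + \log n$ bound obtained in expectation via Jensen's inequality.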
C.12 Proof of Proposition 4.7.4

We state the following lemma without proof, as the foundation of Proposition 4.7.4. For the details, please see [105].

Lemma C.12.1 Let $X_{(1)} \ge X_{(2)} \ge \cdots \ge X_{(n)}$ be the order statistics of a sample from the exponential distribution with mean $1/\lambda$. Then the random variables $Z_1, Z_2, \ldots, Z_n$, where $Z_i = \lambda\, i\, \big( X_{(i)} - X_{(i+1)} \big)$ with $X_{(n+1)} \overset{\text{def}}{=} 0$, are statistically independent standard exponential random variables.

Proof of Proposition 4.7.4: Denote by $\theta_{(i)}$ the $i$th order statistic of $\boldsymbol\theta^n$. It is clear from Theorem 4.3.1 that with complete information, the platform will rank the sellers in the order $\theta_{(1)} \ge \theta_{(2)} \ge \cdots \ge \theta_{(n)}$. Thus, the ex-ante expected weighted surplus from fresh demand is

$$\alpha_0 \sum_{i=1}^{n} (1 - \alpha)\, \alpha^{i-1}\, \mathbb{E}\big[ \theta_{(i)} \big] = \alpha_0 \sum_{i=1}^{n} \big( 1 - \alpha^{i} \big)\, \mathbb{E}\big[ \theta_{(i)} - \theta_{(i+1)} \big] = \alpha_0 \sum_{i=1}^{n} \frac{1 - \alpha^{i}}{i}\; i\, \mathbb{E}\big[ \theta_{(i)} - \theta_{(i+1)} \big]$$
$$= \alpha_0\, \frac{1}{2} \sum_{i=1}^{n} \frac{1 - \alpha^{i}}{i} = \alpha_0\, \frac{1}{2} \sum_{i=1}^{n} \frac{1}{i} - \alpha_0\, \frac{1}{2} \sum_{i=1}^{n} \frac{\alpha^{i}}{i} = O(\log n),$$

where the first equality follows from Abel's Lemma [115] (with $\theta_{(n+1)} \overset{\text{def}}{=} 0$), the third equality follows from Lemma C.12.1, and the final bound follows from the facts that $\sum_{i=1}^{n} 1/i = O(\log n)$ and that $\sum_{i=1}^{\infty} \alpha^{i}/i$ is a constant. It can be verified that the expected weighted surplus from the returned demand converges to zero as $n$ grows.

For the ranking $\hat\pi(\boldsymbol\theta^n)$, without any information on the private valuations, we assume without loss of generality that the ranking is by index. It can be verified that the expected objective function value is just $(1/2)\alpha_0$ in the limit. This concludes our proof.

C.13 Proof of Proposition 4.4.3

Proof of Proposition 4.4.3: Let us assume without loss of generality that the realizations of the $\theta_i$'s are such that $\rho_i(\theta_i) \ge \rho_{i'}(\theta_{i'})$ if $i < i'$. The proof is inductive. We will show that adding one more slot always weakly increases the surplus. To this end, we compare the surplus generated from $\pi^{\mathrm{SOR}}_{k+1}$ and $\pi^{\mathrm{SOR}}_k$. If under $\pi^{\mathrm{SOR}}_k$ seller $k+1$ is ranked in position $k+1$, the proof is trivial. Thus, suppose that under $\pi^{\mathrm{SOR}}_k$, seller $k+1$ is ranked in position $k' \ge k+2$.
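Now, if we view the sellers between seller $k$ and seller $k+1$ under $\pi^{\mathrm{SOR}}_k$ as a "clump," we can apply the local interchange argument in the proof of Theorem 4.3.1. To be specific, it must be that

$$\sum_{k < j < k'} \big( 1 - \alpha_{\pi^{\mathrm{SOR}}_k(j)} \big) \prod_{k < j' < j} \alpha_{\pi^{\mathrm{SOR}}_k(j')}\; \rho_{\pi^{\mathrm{SOR}}_k(j)}\big( \theta_{\pi^{\mathrm{SOR}}_k(j)} \big) = \Big( 1 - \prod_{k < j' < k'} \alpha_{\pi^{\mathrm{SOR}}_k(j')} \Big) \sum_{k < j < k'} \frac{\big( 1 - \alpha_{\pi^{\mathrm{SOR}}_k(j)} \big) \prod_{k < j' < j} \alpha_{\pi^{\mathrm{SOR}}_k(j')}}{1 - \prod_{k < j' < k'} \alpha_{\pi^{\mathrm{SOR}}_k(j')}}\; \rho_{\pi^{\mathrm{SOR}}_k(j)}\big( \theta_{\pi^{\mathrm{SOR}}_k(j)} \big).$$

Note that

$$\sum_{k < j < k'} \frac{\big( 1 - \alpha_{\pi^{\mathrm{SOR}}_k(j)} \big) \prod_{k < j' < j} \alpha_{\pi^{\mathrm{SOR}}_k(j')}}{1 - \prod_{k < j' < k'} \alpha_{\pi^{\mathrm{SOR}}_k(j')}} = 1,$$

and that $\rho_{\pi^{\mathrm{SOR}}_k(j)}\big( \theta_{\pi^{\mathrm{SOR}}_k(j)} \big) \le \rho_{k+1}(\theta_{k+1})$ if $k < j < k'$ by our assumption. As a result, we must have

$$\sum_{k < j < k'} \frac{\big( 1 - \alpha_{\pi^{\mathrm{SOR}}_k(j)} \big) \prod_{k < j' < j} \alpha_{\pi^{\mathrm{SOR}}_k(j')}}{1 - \prod_{k < j' < k'} \alpha_{\pi^{\mathrm{SOR}}_k(j')}}\; \rho_{\pi^{\mathrm{SOR}}_k(j)}\big( \theta_{\pi^{\mathrm{SOR}}_k(j)} \big) \le \rho_{k+1}(\theta_{k+1}).$$

Therefore, we may view these sellers as an "aggregate" seller with net surplus equal to the weighted average on the left-hand side of the last display and purchase probability $1 - \prod_{k < j' < k'} \alpha_{\pi^{\mathrm{SOR}}_k(j')}$. Compared with $\pi^{\mathrm{SOR}}_k$, $\pi^{\mathrm{SOR}}_{k+1}$ swaps the positions of seller $k+1$ and this aggregate seller. Thus, the argument in (C.12) applies.

C.14 Proof of Theorem 4.4.4

Proof of Theorem 4.4.4: We first show that, under our assumption, $L_\pi$ is non-negative for any ranking $\pi$ and any realization of the $\theta_i$'s. To see this, notice that

$$L_\pi(\boldsymbol\theta) = \sum_{i=1}^{n} (1 - \alpha_i) \prod_{0 \le j \le i-1} \alpha_j\; \rho_i(\theta_i) - \alpha_0 \Big( 1 - \prod_{1 \le i \le n} \alpha_i \Big) \gamma_3\, \mathbb{E}[V_0 \mid V_0 \le v^*] + \prod_{0 \le j \le n} \alpha_j\, \gamma_3\, n s.$$

The first term and the third term can be verified to be non-negative. For the second term, we notice that $\mathbb{E}[V_0 \mid V_0 \le v^*]$ has the same sign as $\mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)]$. However, $\mathbb{E}[V_0 \mathbb{1}(V_0 \le v^*)] \le \mathbb{E}[V_0] = 0$. Thus, $L_\pi(\boldsymbol\theta)$ is always non-negative.

Recall that we use $\mathcal{P}$ for the set of all permutations on $N$. For $\phi \in \mathcal{P}$, we write $\{\phi\}$ for the event $\{\pi^* = \phi\}$ for brevity. Also, denote $\beta^{\phi}_i = \prod_{1 \le j \le i} \alpha_{\phi(j)}$ and $\beta^{\phi}_0 = 1$ for all $\phi \in \mathcal{P}$.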
Note that purchase probability for sellerf(i) can be written as a 0 1a f(i) Õ 1 j<i a f( j) =a 0 b f i1 b f i : Then for anyf2P, E h L p SOR k () f i P f E L p () f P f = E h L p SOR k () f i E L p () f a 0å k i=1 b f i1 -b f i E r f(i) q f(i) f a 0 1Õ 1in a i g 3 E V 0 V 0 v +a 0Õ 0 jn a j g 2 ns a 0å n i=1 b f i1 -b f i E r f(i) q f(i) f a 0 1Õ 1in a i g 3 E V 0 V 0 v +a 0Õ 0 jn a j g 2 ns a 0å k i=1 b f i1 -b f i E r f(i) q f(i) f a 0å n i=1 b f i1 -b f i E r f(i) q f(i) f = å k i=1 b f i1 -b f i E r f(i) q f(i) f å n i=1 b f i1 -b f i E r f(i) q f(i) f ; where the first inequality follows from the linearity of expectations and the non-negativity ofr i (q i ) for all i, which follows from the assumptionw, and the second inequality follows from the well-known identity max i a i b i å i a i å i b i min i a i b i (C.13) if a i and b i are all non-negative, and that a 0 k å i=1 b f i1 -b f i E r f(i) q f(i) f a 0 n å i=1 b f i1 -b f i E r f(i) q f(i) f : Furthermore,E r f(i) q f(i) f E r f( j) q f( j) f whenever i< j. 
Therefore, å k i=1 b f i1 -b f i E r f(i) q f(i) f å n i=1 b f i1 -b f i E r f(i) q f(i) f 201 = å k i=1 b f i1 -b f i E r f(i) q f(i) f å k i=1 b f i1 -b f i E r f(i) q f(i) f +å n i=k+1 b f i1 -b f i E r f(i) q f(i) å k i=1 b f i1 -b f i E r f(k) q f(k) f å k i=1 b f i1 -b f i E r f(k) q f(k) f +å n i=k+1 b f i1 -b f i E r f(i) q f(i) å k i=1 b f i1 -b f i E r f(k) q f(k) f å n i=1 b f i1 -b f i E r f(k) q f(k) f = å k i=1 b f i1 -b f i å n i=1 b f i1 -b f i = 1b f k 1b f n 1b f k 1 Õ ik a i : For the first inequality, we notice that if we denotex =å k i=1 b f i1 -b f i E r f(i) q f(i) f E r f(k) q f(k) f , it follows that å k i=1 b f i1 -b f i E r f(i) q f(i) f å k i=1 b f i1 -b f i E r f(i) q f(i) f +å n i=k+1 b f i1 -b f i E r f(i) q f(i) = å k i=1 b f i1 -b f i E r f(k) q f(k) f +x å k i=1 b f i1 -b f i E r f(k) q f(k) f +å n i=k+1 b f i1 -b f i E r f(i) q f(i) +x : Since å k i=1 b f i1 -b f i E r f(k) q f(k) f å k i=1 b f i1 -b f i E r f(k) q f(k) f +å n i=k+1 b f i1 -b f i E r f(i) q f(i) 1= x x ; by (C.13), the first inequality holds. The second inequality follows since for any realization,r f(k) q f(k) r f(i) q f(i) if i k+ 1. Using Eq. (C.13) again, we can show that E h L p SOR k () i E[L p ()] = å f2P E h L p SOR k () f i P f å f2P E L p () f P f min f2L E h L p SOR k () f i E L p () f 1 Õ ik a i : We can conclude the proof. C.15 Proof of Theorem 4.5.1. Proof of Theorem 4.5.1: Once C i (b i ) is determined, the unique payment function can be determined by the envelope condition in Proposition 4.7.5. We first show that it must be the case that C i (b i )= 0. One 202 can check that t(0;b i ;i)= 0 if individual rationality and non-negative payment is to be satisfied. Plugging this into envelope condition in Proposition 4.7.5 in turn implies that C i (b i )= 0. 
Therefore, we can let

\[
t^{\mathrm{SOR}_k}(\mathbf{b}, i) = q\big(\sigma^{\mathrm{SOR}_k}(\mathbf{b}), i\big)\, b_i - \int_0^{b_i} q\big(\sigma^{\mathrm{SOR}_k}(\eta, \mathbf{b}_{-i}), i\big)\, d\eta .
\]

By monotonicity of $q\big(\sigma^{\mathrm{SOR}_k}(b_i, \mathbf{b}_{-i}), i\big)$ in $b_i$, we know that $q\big(\sigma^{\mathrm{SOR}_k}(\mathbf{b}), i\big)\, b_i \ge \int_0^{b_i} q\big(\sigma^{\mathrm{SOR}_k}(\eta, \mathbf{b}_{-i}), i\big)\, d\eta$, so $t^{\mathrm{SOR}_k}(\mathbf{b}, i) \ge 0$ for all $\mathbf{b}$ and $i$. Therefore, it is the unique payment function satisfying individual rationality, incentive compatibility, and non-negative payment.

The last piece to check is the zero-payment property. Suppose that with $b_i$ seller $i$ is not ranked in the top $k$ positions by SOR. One can verify that with any $b_i'$ such that $b_i' < b_i$, neither is he. This implies that the purchase probabilities are the same for any $b_i' \le b_i$. Therefore,

\[
\int_0^{b_i} q\big(\sigma^{\mathrm{SOR}_k}(\eta, \mathbf{b}_{-i}), i\big)\, d\eta
= \int_0^{b_i} q\big(\sigma^{\mathrm{SOR}_k}(b_i, \mathbf{b}_{-i}), i\big)\, d\eta
= q\big(\sigma^{\mathrm{SOR}_k}(b_i, \mathbf{b}_{-i}), i\big) \int_0^{b_i} d\eta
= q\big(\sigma^{\mathrm{SOR}_k}(b_i, \mathbf{b}_{-i}), i\big)\, b_i .
\]

Thus, the zero-payment property follows. The uniqueness follows from the envelope condition in Proposition 4.7.5.

C.16 Proof of Lemma 4.6.1.

In this section, we present the proof of Lemma 4.6.1. The same proof also justifies the following properties of the optimal search.

Proposition C.16.1 With $k$ items left on the list-page, the value function of the consumer's dynamic program can be recursively defined as

\[
v_k(V^{\max}) =
\begin{cases}
V^{\max}, & \text{if } V^{\max} > \bar{v}, \\
v_{k-1}(V^{\max}) + w_{k-1}(V^{\max}), & \text{if } V^{\max} \le \bar{v},
\end{cases}
\]

where $v_0(V^{\max}) = V^{\max}$ and

\[
w_{k-1}(V^{\max}) = \int_{\xi_{k-1}(V^{\max})}^{+\infty} \left[ \int_{V^{\max} - x}^{+\infty} \big( v_{k-1}(x+\varepsilon) - v_{k-1}(V^{\max}) \big)\, h(\varepsilon)\, d\varepsilon - s_I \right] g(x)\, dx - s_L \ge 0 .
\]

The value function $v_k(\cdot)$ is a continuous and increasing function with $v_k(-\infty) = \lim_{V^{\max} \to -\infty} v_k(V^{\max}) > -\infty$, provided that $k \ne 0$. Except on a (possibly empty) interval $(-\infty, \tilde{a}_k]$, where $\tilde{a}_k \le \bar{v}$, $v_k(\cdot)$ is strictly increasing. Also, $w_k(\cdot)$ is decreasing.

Proof of Lemma 4.6.1 and Proposition C.16.1: Assume first that $s_L, s_I > 0$. The proof is based on backward induction. We discuss all cases one by one.

Case 1 (at item-page of item $n$): We start our induction with the case when the consumer is deciding whether to enter the item-page of item $n$. Recall that $V^{\max}$ is her current best utility.
If she enters and u n p n +e n V max , she will end up with final utility u n p n +e n ; otherwise, she will obtain utilityV max when she enters. As a result, the consumer enters the item-page if and only if V max > Z ¥ V max (u n p n ) (u n p n +e)h(e)de+ H(V max (u n p n ))v max s I ; (C.14) which simplifies to s I > Z ¥ V max (u n p n ) (u n p n +eV max )h(e)de = Z ¥ V max (u n p n ) (v n (u n p n +e) v n (V max ))h(e)de; where the equality is because as shown in Proposition C.16.1 we have defined that v 0 (V max )=V max . Let us recall the definition of e and x 0 (). By a simple change of variable argument, we see that the unique solution to the equation s I = R ¥ V max x (x+eV max )h(e)de isx 0 (v max )=V max e . Notice that ifV max ¯ v , it must be thatx 0 (V max ) v e . The monotonicity ofx n (V max ) inV max clearly follows. Note that when s I = 0, e =+¥ and x 0 ()=¥. The monotonicity is trivial. This proves item (ii) and (iii) of Lemma 4.6.1 when i= n and k= 0. Case 2 (at item n on the list-page): Notice that by our argument in Case 1, if u n p n <V max e , the consumer will acceptV max immediately without going to the item-page of item n. Otherwise, she will go to the item-page and pick maxfV max ;u n p n +e n g. Hence the consumer accepts the current best and stops searching on the list-page if and only if V max >G(V max e )V max + Z ¥ V max e Z ¥ V max x (x+e)h(e)de+ H(V max x)V max s I g(x)dx s L =V max + Z ¥ V max e Z ¥ V max x (x+eV max )h(e)de s I g(x)dx s L : (C.15) This inequality further simplifies to s L > w 0 (V max ) def = R ¥ V max e R ¥ V max x (x+eV max )h(e)de s I g(x)dx. 204 Note that w 0 () can be written as w 0 (V max )= Z +¥ ¥ Z +¥ ¥ 1(xV max e ) 1(eV max x)(x+eV max ) s I h(e)g(x)dedx: (C.16) Note that the term inside the integral is decreasing inV max , so w 0 () is clearly decreasing inV max . More- over, let us fixV max andd > 0. 
For any ˆ V max 1(x ˆ V max e ) 1(e ˆ V max x)(x+e ˆ V max ) s I jxj+jej+jV max j+ s I : Therefore, by dominated convergence theorem, we can show that w 0 () is continuous. Furthermore, by Monotone Convergence Theorem, it must be w 0 (¥)=+¥ and w 0 (+¥)= 0. This is sufficient to establish that there exists a solution to s L = w 0 (V max ). We next argue it must be unique. Notice that we can verify that w 0 0 (V max )=P u n p n + minfe n ;e gV max , which is a increasing function ofV max , and w 0 0 (V max ) 0. Hence, if it is the case that v 1 < v 2 and s L = w 0 (v 1 )= w 0 (v 2 ), we must have w 0 0 (v)= 0 for all v> v 1 . As a result, it follows that s L = w 0 (v) for all v v 1 . This contradicts the fact that w 0 (+¥)= 0. Therefore, the solution is unique. Let ¯ v be the unique solution. Because of w 0 0 () 0, w 0 (V max )< s L if and only ifV max > ¯ v . Therefore, the consumer acceptsV max if and only ifV max > ¯ v . This proves item (i) of Lemma 4.6.1 when i= n. Let us turn to Proposition C.16.1. The above results further imply that the value function v 1 (v max )= V max ifV max > v and otherwise v 1 (V max )=V max + Z ¥ V max e Z ¥ V max x (x+eV max )h(e)de s I g(x)dx s L =v 0 (V max )+ Z +¥ x n (V max ) Z +¥ V max x v 0 (x+e) v 0 (V max h(e)de s I g(x)dx s L =v 0 (V max )+ w 0 (V max ): Notice that v 1 (¥) is the expected utility that the consumer obtains when there is only one item on the list-page and she has no outside option. It must be finite, i.e.,V 1 (¥)>¥. Also, v 1 () is increasing since with a higher value ofV max , the consumer can replicate the optimal strategy associated with a lower value ofV max . Continuity of v 1 () follows from the continuity of w 0 (). Lastly, w 0 () 0 is easy to verify because whenV max ¯ v , consumer does not acceptV max and therefore v 0 (V max )>V max . 205 We also argue that there exists an interval (¥; ˜ a 1 ] such that outside this interval, v i () is strictly increasing. 
By continuity and monotonicity of $v_1(\cdot)$, it is sufficient to show that if $V^{\max}_1 > V^{\max}_2$ and $v_1(V^{\max}_1) = v_1(V^{\max}_2)$, then $v_1(V^{\max}) = v_1(V^{\max}_1)$ for all $V^{\max} \le V^{\max}_1$. Denote by $v_1'(V^{\max}_1)$ the expected utility of the consumer with current best $V^{\max}_1$ if she follows the optimal strategy as if her current best utility were $V^{\max}_2$. Then $v_1'(V^{\max}_1) \le v_1(V^{\max}_1)$ because this strategy need not be optimal. Also, $v_1'(V^{\max}_1) \ge v_1(V^{\max}_2)$. Together these give $v_1'(V^{\max}_1) = v_1(V^{\max}_1) = v_1(V^{\max}_2)$. This in turn implies that if the consumer follows the optimal strategy when she is endowed with $V^{\max}_2$, she accepts $V^{\max}_2$ with probability zero. Suppose otherwise. Let $A$ be the event that the consumer accepts $V^{\max}_2$, so that $P\{A\} > 0$. Let $V_F$ be the final utility and $s_F$ be the final search cost. Then

\[
v_1(V^{\max}_2) = \mathbb{E}\big[V_F \mathbb{1}(\bar{A})\big] + P\{A\}\, V^{\max}_2 - \mathbb{E}[s_F]
< \mathbb{E}\big[V_F \mathbb{1}(\bar{A})\big] + P\{A\}\, V^{\max}_1 - \mathbb{E}[s_F] = v_1'(V^{\max}_1),
\]

which is a contradiction. As a result, with any $V^{\max} \le V^{\max}_2$, she can follow exactly the same search strategy and obtain the same expected utility as $v_1(V^{\max}_2)$. Note that $\tilde{a}_1 \le \bar{v}$ trivially follows since when $V^{\max} > \bar{v}$, $v_1(V^{\max}) = V^{\max}$.

For the rest of the proof, we assume that all conclusions presented in Lemma 4.6.1 and Proposition C.16.1 are true for item $i = k+1$, and we show that they are also true for item $i = k$.

Case 3 (at the item-page of item $k$): We will discuss two cases.

Case 3.1 ($V^{\max} > \bar{v}$): In this case, going to item $k+1$ on the list-page is dominated by accepting $V^{\max}$ because of the induction hypothesis. Thus, we only need to compare the expected utility of accepting $V^{\max}$ with that of visiting the item-page of item $k$. Knowing that she will accept $\max\{V^{\max}, u_k - p_k + \varepsilon_k\}$ if she visits the item-page of item $k$, she will do so if and only if (C.14) holds. Therefore, she visits item-page $k$ if and only if $u_k - p_k \ge V^{\max} - \varepsilon^*$.
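The item-page rule in Case 3.1 rests on the threshold introduced in Case 1, that is, the unique solution $\varepsilon^*$ of $s_I = \int_{\varepsilon^*}^{+\infty} (\varepsilon - \varepsilon^*)\, h(\varepsilon)\, d\varepsilon$. The following is a minimal numerical sketch of that rule, not part of the proof. It assumes, purely for illustration, that $h$ is the standard normal density (the text does not fix a distribution), and the function names are hypothetical.

```python
import math


def expected_gain(threshold: float) -> float:
    """E[max(eps - threshold, 0)] for eps ~ N(0, 1) (an assumed distribution).

    For the standard normal this equals phi(t) - t * (1 - Phi(t)).
    """
    density = math.exp(-0.5 * threshold * threshold) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(threshold / math.sqrt(2.0))
    return density - threshold * upper_tail


def reservation_threshold(s_item: float, lo: float = -50.0, hi: float = 50.0) -> float:
    """Solve s_item = E[max(eps - t, 0)] for t by bisection.

    The map t -> E[max(eps - t, 0)] is strictly decreasing, so the root is unique.
    """
    if not 0.0 < s_item < expected_gain(lo):
        raise ValueError("s_item must be positive and inside the bracketing range")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_gain(mid) > s_item:
            lo = mid  # gain still exceeds the cost: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def visits_item_page(net_utility: float, v_max: float, s_item: float) -> bool:
    """Case 3.1 rule: visit iff u_k - p_k >= V_max - eps_star."""
    return net_utility >= v_max - reservation_threshold(s_item)
```

A larger item-page cost `s_item` lowers the threshold, so fewer item-pages pass the test, which matches the monotonicity used in the proof.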
Case 3.2 (V max ¯ v ): In this case, we first observe that acceptingV max is dominated by going to item k+ 1 on the list-page by the induction hypothesis, so one only needs to compare the expected utility of going to item k+ 1 on the list-page and visiting item k’s item-page. Then the consumer will go to item-page of item k if and only if v nk (V max ) v nk (V max )H(V max (u k p k ))+ Z +¥ V max (u k p k ) v nk (u k p k +e)h(e)de s I ; 206 which is equivalent to s I Z +¥ V max (u k p k ) v nk (u k p k +e) v nk (V max ) h(e)de: Step 1: We will argue that there exists a unique solution to the equation s I = Z +¥ V max x v nk (x+e) v nk (V max ) h(e)de; which we define asx nk (V max ). FixV max . We start by observing that Z +¥ V max x v nk (x+e) v nk (V max ) h(e)de = Z +¥ ¥ 1(eV max x) v nk (x+e) v nk (V max ) h(e)de: It is easy to verify that this expression as a function ofx is increasing. By Monotone Convergence Theo- rem, it converges to 0 asx!¥, and it converges to+¥ asx!+¥. Next fixV max and x . We show that Z +¥ V max x v nk (x+e) v nk (V max ) h(e)de is continuous, i.e., when x 0 ! x , Z +¥ ¥ 1(eV max x 0 ) v nk (x 0 +e) v nk (V max ) h(e)de converges to Z +¥ ¥ 1(e V max x) v nk (x+e) v nk (V max ) h(e)de. For this purpose consider x 0 2(xd;x+d) for some fixedd > 0. Note that 1(e vx 0 ) v nk (x 0 +e) v nk (V max ) v nk (x 0 +e) v nk (V max ) v nk (x 0 +e) +jv nk (V max )j x 0 +e +jv nk (V max )j x 0 +jej+jv nk (V max )jjxj+jdj+jej+jv nk (V max )j: Now since v nk () is continuous, we can show by dominated convergence theorem that Z +¥ V max x v nk (x+ e) v nk (V max ) h(e)de is also continuous in x . Therefore, there always exists a solution to Z +¥ V max x v nk (x+e) v nk (V max ) h(e)de = s I : We argue that this solution must be unique. Suppose otherwise. Then there existx 1 andx 2 such thatx 2 >x 1 andx 1 andx 2 both are solutions. 
Notice that s I = Z +¥ V max x 1 (v nk (x 1 +e) v nk (V max ))h(e)de 207 = Z maxfV max x 1 ; ˜ a nk x 1 g V max x 1 (v nk (x 1 +e) v nk (V max ))h(e)de + Z +¥ maxfV max x 1 ; ˜ a nk x 1 g (v nk (x 1 +e) v nk (V max ))h(e)de = Z +¥ maxfV max x 1 ; ˜ a nk x 1 g (v nk (x 1 +e) v nk (V max ))h(e)de < Z +¥ maxfV max x 1 ; ˜ a nk x 1 g (v nk (x 2 +e) v nk (V max ))h(e)de Z +¥ V max x 2 (v nk (x 2 +e) v nk (V max ))h(e)de: The third equality follows because the first term in the second line is zero due to the definition of ˜ a nk . The first inequality follows since s I > 0 implies that R +¥ maxfV max x 1 ; ˜ a nk x 1 g h(e)de> 0. Combined with the fact that on ( ˜ a nk ;+¥) v nk () is strictly increasing, we have the result. The last inequality follows because V max x 2 <V max x 1 maxfV max x 1 ; ˜ a nk x 1 g. We have arrived at a contradiction so the solution must be unique. Step 2. We next show thatx nk (V max ) ¯ v e , and investigate the monotonicity ofx nk (V max ). First let us assume that u k p k +e > ¯ v , and consider v 0 2( ¯ v ;u k p k +e ). Therefore, v 0 > ¯ v V max . Now s I < Z +¥ v 0 (u k p k ) u k p k +e v 0 h(e)de = Z +¥ v 0 (u k p k ) v nk (u k p k +e) v nk (v 0 ) h(e)de Z +¥ V max (u k p k ) v nk (u k p k +e) v nk (V max ) h(e)de: (C.17) The first inequality follows since v 0 < u k p k +e , and v 0 e is the unique solution to s I = R +¥ v 0 x (x+ z v 0 )h(e)de and R +¥ v 0 x (x+ z v 0 )h(e)de is increasing in x . The first equation is because v 0 v , and when V max v , it follows that V nk (V max ) =V max . The second inequality is be- cause R +¥ V max (u k p k ) (v nk (u k p k + z) v nk (V max ))h(e)de is decreasing inV max , which is true because v nk () is increasing. Thus, if u k p k > v e , it cannot be the solution. Therefore,x nk (V max ) ¯ v e . Note that Z +¥ V max (u k p k ) v nk (u k p k +e)V nk (V max ) h(e)de is decreasing inV max and increas- ing in u k p k . It follows thatx nk (V max ) is increasing inV max . 
Furthermore, Z +¥ V max (u k p k ) v nk (u k p k +e) v nk (V max ) h(e)de = Z +¥ V max (u k p k ) (v nk1 (u k p k +e)(v nk1 (V max ) h(e)de 208 + Z ¯ v (u k p k ) V max (u k p k ) w nk1 (u k p k +e) w nk1 (V max ) h(e)de Z +¥ V max (u k p k ) (v nk1 (u k p k +e)(v nk1 (V max ) h(e)de where the first equation is due to the definition of v nk1 () and the inequality is because w nk1 () is decreasing. As a result, it must be thatx nk (V max )x nk1 (V max ). Case 4 (at item k on the list-page): First of all, ifV max > ¯ v , based on our discussion of Case 3.1, the consumer accepts the current best and does not look at item k on the list-page if and only if (C.15) holds. But according to our discussion in Case 2, this is clearly satisfied. Therefore, whenV max > v , the consumer will accept the current best item immediately. Second, ifV max v , clearly the consume should not accept the current best item, because even when k= n, accepting the current best item is an inferior option. This finishes the proof of Lemma 4.6.1. As a result of the above discussion, we observe that v nk+1 (V max ) =V max ifV max > ¯ v . When V max ¯ v , the expected utility would be v nk+1 (V max )=G(x nk (V max ))v nk (V max ) + Z ¥ x nk (V max ) Z ¥ V max x v nk (x+e)h(e)de+ H(V max x)v nk (V max ) s I g(x)dx s L =v nk (V max )+ Z ¥ x nk (V max ) Z ¥ V max x (v nk (x+e) v nk (V max ))h(e)de s I g(x)dx s L =v nk (V max )+ w nk (V max ): Furthermore, by the same arguments as in Case 2, we know that w nk () is continuous, v nk+1 () is increas- ing, v nk+1 (¥)>¥, continuous and except for an interval(¥; ˜ a nk ] where ˜ a nk+1 v , v nk+1 () is strictly increasing. This finishes the proof for the case s I ;s L > 0. When s L = 0, we have defined ¯ v =+¥. It is easy to see that the consumer will never stop checking the list-page since doing so only increases her utility in this case. When s I = 0, the consumer will not skip any item-page, which is consistent with our definition of x k () and e . 
In this case we revert back to the base model. The rest of the conclusions can be verified using similar arguments to the case s I ;s L > 0. With these observations, we conclude the proof. 209 C.17 Proof of Theorem 4.6.2. In this part, we first prove the first part of Theorem 4.6.2. Let us denote ¯ a 0 = P(V 0 ¯ v ). Proof of Theorem 4.6.2 part(i): For this proof, we assume thatG 0 is used by the consumer. We first observe that withG 0 , the consumer will not visit the item-page of an item inN nG . In the optimal ranking, we must rank items inNnG after items inG , since otherwise we are only increasing the search costs of the consumer and the resulting ranking must be suboptimal. Therefore, we assume from now we place items inG in slot 1 tojGj. We need to show that it is optimal to rank items inG in decreasing order of ¯ r i . Notice that regardless how we change the orders of items inG , the surplus from the returned demand is the same. Therefore, we only need to show such ranking is optimal in terms of the weighted surplus from the fresh demand of the items inG . Given a ranking of itemsp, we denote this part of surplus asL G p . We assume without loss of generalityp(0)= 0. Then following a similar line of proof to Lemma 4.7.2, we can show that L G p = jGj å j=1 Õ 0k j1 ¯ a p(k) (1 ¯ a p( j) ) g 1 q p( j) +g 2 p p( j) +g 3 E V p( j) V p( j) > ¯ v jsE V 0 V 0 ¯ v = jGj å j=1 Õ 0k j1 ¯ a p(k) (1 ¯ a p( j) ) ¯ r p( j) +g 3 jGjs Õ 0ijGj ¯ a i g 3 ¯ a 0 1 Õ 1ijGj ¯ a i ! E V 0 V 0 ¯ v : Therefore, using the arguments in the proof of Theorem 4.3.1, we can show the desired conclusion. To prepare for the proof for the second part of Theorem 4.6.2, we will first show a sequence of lemmas. We assume that s I > 0, for otherwise the problem reduces to the base model andp PSOR is optimal. 
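If policy $\Gamma_0$ is adopted by the consumer, then conditional on visiting an item on the list-page, the probability that the consumer does not purchase this item immediately is

\[
\beta \overset{\mathrm{def}}{=} G(\bar{v} - \varepsilon^*) + \int_{\bar{v} - \varepsilon^*}^{\infty} H(\bar{v} - x)\, g(x)\, dx = 1 - \int_{\bar{v} - \varepsilon^*}^{\infty} \bar{H}(\bar{v} - x)\, g(x)\, dx .
\]

This is true since the consumer not purchasing implies either $u_i - p_i < \bar{v} - \varepsilon^*$, which happens with probability $G(\bar{v} - \varepsilon^*)$, or $u_i - p_i \ge \bar{v} - \varepsilon^*$ and $u_i - p_i + \varepsilon_i \le \bar{v}$, which happens with probability $\int_{\bar{v} - \varepsilon^*}^{\infty} H(\bar{v} - x)\, g(x)\, dx$. Moreover, let

\[
g_n \overset{\mathrm{def}}{=} \sum_{i=1}^{n-1} (1-\beta)\beta^{i-1}\, i + \beta^{n-1} n = \frac{1-\beta^{n}}{1-\beta} .
\]

The following two lemmas follow.

Lemma C.17.1 Policy $\Gamma_0$ is asymptotically optimal in the sense that $v_\infty(V^{\max}) = \lim_{n\to\infty} v_n^{\Gamma_0}(V^{\max}) \ge \limsup_{n\to\infty} v_n^{\Gamma}(V^{\max})$ for any policy $\Gamma$ and any $V^{\max}$. Moreover,

\[
v_\infty(V^{\max}) = \bar{v} = \frac{1}{1-\beta} \int_{\bar{v}-\varepsilon^*}^{\infty} \int_{\bar{v}-x}^{\infty} (x+\varepsilon)\, h(\varepsilon)\, d\varepsilon\, g(x)\, dx - \frac{s_L}{1-\beta} - \frac{\bar{G}(\bar{v}-\varepsilon^*)\, s_I}{1-\beta}, \tag{C.18}
\]

if $V^{\max} \le \bar{v}$; otherwise, $v_\infty(V^{\max}) = V^{\max}$.

Lemma C.17.2 If $V^{\max} \le \bar{v}$,

\[
v_n(V^{\max}) \ge \tilde{v}_n = \frac{1-\beta^{n-1}}{1-\beta} \int_{\bar{v}-\varepsilon^*}^{\infty} \int_{\bar{v}-x}^{\infty} (x+\varepsilon)\, h(\varepsilon)\, d\varepsilon\, g(x)\, dx + \beta^{n-1} \left( \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x+\varepsilon)\, h(\varepsilon)\, g(x)\, dx\, d\varepsilon - s_I \right) - g_n s_L - \frac{1-\beta^{n-1}}{1-\beta}\, \bar{G}(\bar{v}-\varepsilon^*)\, s_I . \tag{C.19}
\]

Moreover, $\lim_{n\to\infty} v_n(V^{\max}) = v_\infty(V^{\max})$ for all $V^{\max}$ and $\lim_{n\to\infty} \xi_n(V^{\max}) = \bar{v} - \varepsilon^*$ for all $V^{\max} \le \bar{v}$. Furthermore, these convergences are uniform.

For the rest of the arguments, we define $q(n)$ as the quantile of the distribution $G(\cdot)$ such that $G(\bar{v} - \varepsilon^*) - G(q(n)) = 1/n^3$. If $G(\bar{v} - \varepsilon^*) > 0$, such a quantile exists when $n \ge N_1$ for some $N_1$ large enough. From now on, we assume that the consumer uses search policy $\Gamma^*$.

Lemma C.17.3 Assume that $G(\bar{v} - \varepsilon^*) > 0$. There exists a positive constant $C$ such that for any $n \ge N_1$ and any item $i$ with $u_i - p_i < q(n)$, if there are more than $C \log n$ items ranked after item $i$, the consumer enters item $i$'s item-page with zero probability.

For the following discussions, we say that the items in $\mathcal{N}$ are fixed if $u_i - p_i$ and $\theta_i$ are realized for all $i \in \mathcal{N}$. Assume that the items in $\mathcal{N}$ are fixed. Recall that $\mathcal{G} = \{ i \in \mathcal{N} : u_i - p_i \ge \bar{v} - \varepsilon^* \}$.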
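As a side check on the quantity defined before Lemma C.17.1, namely the expected number of list-page items viewed, $\sum_{i=1}^{n-1} (1-\beta)\beta^{i-1} i + \beta^{n-1} n = (1-\beta^n)/(1-\beta)$, the sketch below compares the direct sum with the closed form for a few values. The values of `beta` and `n` are arbitrary illustrations and the function names are hypothetical; nothing here is taken from the thesis beyond the formula itself.

```python
def expected_items_viewed(beta: float, n: int) -> float:
    """Direct evaluation: the consumer stops at item i < n with probability
    (1 - beta) * beta**(i - 1), and reaches the last item n with probability
    beta**(n - 1)."""
    stopped_early = sum((1.0 - beta) * beta ** (i - 1) * i for i in range(1, n))
    reached_last = beta ** (n - 1) * n
    return stopped_early + reached_last


def closed_form(beta: float, n: int) -> float:
    """Closed form (1 - beta**n) / (1 - beta), valid for 0 <= beta < 1."""
    return (1.0 - beta ** n) / (1.0 - beta)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        for n in (1, 2, 10, 50):
            print(beta, n, expected_items_viewed(beta, n), closed_form(beta, n))
```

As `n` grows, the closed form approaches `1 / (1 - beta)`, which is the limit used later in the proof of Lemma C.17.1.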
Denote the random total search cost with the optimal rankingp and withp PSOR byC andC PSOR , respectively. We can decompose the surplus we get from the optimal solution asS =S G +S NnG +S R g 3 c , where S G is the surplus from the items inG from the fresh demand,S NnG from the items inN nG from the fresh demand,S R the surplus from the returned demand, and c =E[C 1(no purchase)]. Note that we do not include the last search cost term in the base model since in the base model this term is not changed with ranking. Furthermore, we denote the purchase probability of item i under the optimal ranking p 211 (with inverses ) as q s ;i . Conditional on item i is purchased in the fresh demand, let us denote expected number of item-pages viewed as e(s ;i). Notice that different from our base model, it may not be true that S G +S NnG = å i2N q s ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v s (i)s L e(s ;i)s I E V 0 V 0 ¯ v ; since conditional on an item is purchased, the expected utility that the consumer obtains from the outside option may not beE V 0 V 0 ¯ v . To convince ourself about this, we only need to consider the case in which there is only one item with u 1 p 1 < ¯ v e : However, if we denote ˆ S G def = å i2G q s ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v s (i)s L e(s ;i)s I E V 0 V 0 ¯ v ; ˆ S N =G def = å i2NnG q s ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v s (i)s L e(s ;i)s I E V 0 V 0 ¯ v ; and ˆ S def = ˆ S G + ˆ S NnG +S R , then we will show ˆ S provides a good approximation toS . We can writeS PSOR =S PSOR G +S PSOR NnG +S PSOR R g 3 c PSOR1 , and define ˆ S PSOR G , ˆ S PSOR N =G , and c PSOR1 in a similar fashion. Define t def = Pf ¯ v e +e i ¯ v g=Pfe i e g= H(e ): It is easy to verify that t Pfu i p i +e i ¯ v g if ¯ v e u i p i . Also, it holds thatt< 1, which is implied by R +¥ e (ee )h(e)de= s I and s I > 0. Let us also denote c = ¯ G( ¯ v e )=2 for the following two results. 
Note that $\bar{G}(\bar{v} - \varepsilon^*) > 0$, since by definition $\bar{v}$ satisfies

\[
\int_{\bar{v}-\varepsilon^*}^{+\infty} \left[ \int_{\bar{v}-x}^{+\infty} (x + \varepsilon - \bar{v})\, h(\varepsilon)\, d\varepsilon - s_I \right] g(x)\, dx = s_L .
\]

If $\bar{G}(\bar{v} - \varepsilon^*) = 0$, the left-hand side of the equation reduces to zero, but $s_L > 0$. Therefore, $c \in (0, 1)$.

Lemma C.17.4 For any set of fixed items $\mathcal{N}$ with $|\mathcal{N}| = n$ and $|\mathcal{G}| \ge cn$, it holds that $|S^* - \hat{S}^*| \le O(n\tau^{cn})$ and $|S^{\mathrm{PSOR}} - \hat{S}^{\mathrm{PSOR}}| \le O(n\tau^{cn})$.

Combining Lemmas C.17.3 and C.17.4, we can show the following key result.

Lemma C.17.5 Assume that $G(\bar{v} - \varepsilon^*) > 0$. There exists a constant $N_4$ with $N_4 \ge N_1$ such that $S^{\mathrm{PSOR}} \ge \big(1 - O(n^2 \tau^{cn/2})\big)\, S^*$ for any fixed items $\mathcal{N}$ with $|\mathcal{N}| = n \ge N_4$ satisfying the following assumptions:
(i) $\{ i : q(n) \le u_i - p_i < \bar{v} - \varepsilon^* \} = \emptyset$;
(ii) $|\mathcal{G}| \ge cn$;
(iii) Assumptions (A1) to (A3) are satisfied.

With the preparations so far, we are ready to prove Theorem 4.6.2.

Proof of Theorem 4.6.2 part (ii): If $G(\bar{v} - \varepsilon^*) = 0$, then with probability one $\mathcal{N} = \mathcal{G}$. In this case, an item-page will always be viewed, provided that its corresponding item is visited on the list-page. Then the same proof as that of Theorem 4.3.1 can be used to show that PSOR is optimal.$^{4}$

Assume from now on that $G(\bar{v} - \varepsilon^*) > 0$. The main idea is to show that Assumptions (i) and (ii) in Lemma C.17.5 hold with high probability if we sample $n$ items from the distribution. Let us denote $A_i = \{ q(n) \le u_i - p_i < \bar{v} - \varepsilon^* \}$. Then $P\{A_i\} = 1/n^3$ by the definition of $q(n)$. Therefore, the probability that there is a sampled item in the range between $q(n)$ and $\bar{v} - \varepsilon^*$ satisfies $P\{\cup_i A_i\} \le 1/n^2$ by the union bound.

Recall that $c = \bar{G}(\bar{v} - \varepsilon^*)/2$. Denote by $B$ the event that assumption (ii) is not satisfied. Then we can show $P\{B\} \le e^{-nc/4}$. Specifically, define the random variable $X_i = \mathbb{1}(u_i - p_i \ge \bar{v} - \varepsilon^*)$, so that $\mathbb{E}[X_i] = 2c$ and $\mathbb{E}\big[\sum_{i=1}^{n} X_i\big] = 2nc$. Therefore,

\[
P\{B\} = P\left\{ \sum_{i=1}^{n} X_i < cn \right\} \le \left( \frac{e^{-1/2}}{\big(1 - \tfrac{1}{2}\big)^{1 - \frac{1}{2}}} \right)^{2cn} \le e^{-nc/4} .
\]

The first inequality uses the Chernoff bound with deviation parameter $1/2$ and mean $2cn$, and the second inequality follows from Lemma 5.26 in [146]. Therefore, the probability that either one of the two assumptions is not satisfied is bounded by $P\{(\cup_i A_i) \cup B\} \le 1/n^2 + e^{-nc/4} = O(1/n^2)$.
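The tail bound on the number of items that clear the item-page threshold can be sanity-checked numerically. The sketch below is an illustration only: it compares the exact lower tail of a Binomial($n$, $2c$) count with the multiplicative Chernoff bound (deviation $1/2$, mean $2cn$) and with the simpler bound $e^{-nc/4}$. The function names and the values of `n` and `c` are hypothetical choices.

```python
import math


def binomial_lower_tail(n: int, p: float, threshold: float) -> float:
    """Exact P(X < threshold) for X ~ Binomial(n, p)."""
    k_max = math.ceil(threshold) - 1  # largest integer strictly below threshold
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(0, k_max + 1)
    )


def chernoff_bounds(n: int, c: float) -> tuple[float, float]:
    """Return (multiplicative Chernoff bound, exp(-n*c/4)) for P(sum X_i < c*n),
    where the X_i are i.i.d. Bernoulli(2c), so the mean is mu = 2*c*n and the
    deviation parameter is delta = 1/2."""
    mu = 2.0 * c * n
    delta = 0.5
    multiplicative = (math.exp(-delta) / (1.0 - delta) ** (1.0 - delta)) ** mu
    simple = math.exp(-n * c / 4.0)
    return multiplicative, simple


if __name__ == "__main__":
    n, c = 200, 0.25  # illustrative values; c must satisfy 0 < 2c <= 1
    exact = binomial_lower_tail(n, 2.0 * c, c * n)
    multiplicative, simple = chernoff_bounds(n, c)
    print(exact, multiplicative, simple)
```

For these values the three quantities are ordered as the proof requires: the exact tail is below the multiplicative bound, which is below $e^{-nc/4}$.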
We next notice that O(n 2 t cn=2 )= e W(n) . Lastly, we comment that the large n requirement for Lemma C.17.5 can eliminated by choosing an appropriate constant in O(1=n 2 ). The proof is complete. Proof of Lemma C.17.1: We will argue that v ¥ (V max ) def = lim n!¥ v G 0 n (V max ) = V max if V max > ¯ v and v ¥ (V max ) def = lim n!¥ v G 0 n (V max )= ¯ v ifV max ¯ v . Note that by Proposition C.16.1, it must be that v n (V max )=V max ifV max > ¯ v , and v n (V max ) ¯ v ifV max ¯ v . Furthermore, by the optimality of G , for anyV max and policyG, limsup n!¥ v n (V max ) limsup n!¥ v G n (V max ). Then the desired conclusion for the first part follows. As to (C.18), it will be a byproduct of this argument. Note that v ¥ (V max )=V max ifV max > ¯ v is true because ifV max > ¯ v , the consumer will stop and acceptV max . We next show that v ¥ (V max )= ¯ v ifV max ¯ v . Let us assume that policyG 0 is adopted. 4 A technical comment: We argue that G( ¯ ve ) = 0 is possible. To see this, notice that given G(), H(), s I , s I and R +¥ ¯ v e R +¥ ¯ v x (x+e ¯ v )h(e)de s I g(x)dx= s L ; we can define ˜ g(x)= 0 if x= 0 and ˜ g(x)= g(x)= ¯ G( ¯ v e ), and ˜ s L = s L = ¯ G( ¯ v e ). Then we have a case in point. Also, in the main text, we mentioned that when s I = 0 and s L = s, our model reduced to the base model since ¯ v = v , where v is defined in (4.1), and consumer views all item-pages before leaving. In the current case, although the consumer views all items pages before leaving, we are unaware of a search cost value s such that we can reduce the model to the base model in sense ¯ v = v . Nevertheless, the key point here is the proof of Theorem 4.3.1 applies. 213 Denote the random utility that the consumer finally obtains as ˜ V . Let us fix n. We notice the following facts. 
(i) The expected number of items that the consumer views on the list-page is given by g n , since the number of items the consumer views on the list-page is equal to i with probabilityb i1 (1b) for i n. Also, it is easy to show lim n!¥ g n = 1=(1b). (ii) The expected search cost consumer pays for item i’s item-page isb i1 ¯ G(v e )s I . Therefore, the expected search cost on item-pages is(1b n ) ¯ G(v e )s I =(1b). (iii) Pf ˜ V > ¯ v g= 1b n . (iv) For any i2N ,E ˜ V ˜ V > ¯ v =E V i V i > ¯ v = 1 1b R ¥ v e R ¥ v x (x+ z) f(z)dz g(x)dx: Denote the event of purchase item i as A i . Note that E V i V i > ¯ v = E V j V j > ¯ v for any i; j2 N . E ˜ V ˜ V > ¯ v =E V i V i > ¯ v then follows sinceE ˜ V 1( ˜ V > ¯ v ) =å j2N E ˜ V 1(A i ) = å j2N E V j V j > ¯ v PfA i g=P ˜ V > ¯ v V i V i > ¯ v . In summary we have v G 0 n (V max )=E ˜ V 1( ˜ V > ¯ v ) +E ˜ V 1( ˜ V ¯ v ) g n s L 1b n 1b ¯ G(v e )s I =Pf ˜ V > ¯ v gE ˜ V ˜ V > ¯ v +E ˜ V 1( ˜ V ¯ v ) g n s L 1b n 1b ¯ G(v e )s I =(1b n ) 1 1b Z ¥ v e Z ¥ v x (x+ z) f(z)dz g(x)dx+E ˜ V 1( ˜ V ¯ v ) g n s L 1b n 1b ¯ G(v e )s I : (C.20) Moreover, V max ˜ V ¯ v so jV max j maxfj ¯ v j;jV max jg. Therefore, as n goes to infinity, E ˜ V 1( ˜ V ¯ v ) converges to zero, since as n!¥ E ˜ V 1( ˜ V ¯ v ) E j ˜ Vj1( ˜ V ¯ v ) maxfj ¯ v j;jV max jgb n ! 0: Therefore, we have v ¥ (V max )= lim n!¥ v G 0 n (V max )= 1 1b Z ¥ v e Z ¥ v x (x+ z) f(z)dz g(x)dx s L 1b ¯ G(v e )s I 1b : 214 Now we observe from the definition of ¯ v that s L = Z ¥ ¯ v e Z ¥ ¯ v x (x+e)h(e)de g(x)dx ¯ v Z ¥ ¯ v e ¯ H( ¯ v x)g(x)dx ¯ G( ¯ v e )s I : Dividing both sides by 1b = Z ¥ ¯ v e ¯ H(e x)g(x)dx and move the terms we have ¯ v = 1 1b Z ¥ ¯ v e Z ¥ ¯ v x (x+e)h(e)de g(x)dx s L 1b ¯ G( ¯ v e )s I 1b This concludes the proof. Proof of Lemma C.17.2: We first show that v n (V max ) uniformly converges to v ¥ (V max ) and (C.19). As- sume that there are n items left, andV max ¯ v . Let us consider the following policy. 
(i) For the first n 1 items, the consumer will accept an item immediately if the item’s utility is higher than ¯ v and the consumer evaluates an item’s item-page if and only if u i p i ¯ v e . (ii) If the consumer cannot find an item with utility greater than or equal to ¯ v from the first n 1 items, she will always accept the last item. Note that if this policy is adopted, as long asV max ¯ v , the consumer’s expected utility is independent ofV max . Denote this number by ˜ v n . Since this policy is suboptimal, we have ˜ v n v n (V max ). Let us denote the expected number of items that the consumer views on the list-page by g n . Then by a similar line of analysis to Lemma C.17.1, especially equation (C.20), we see that as n!¥ ˜ v n = 1b n1 1b Z ¥ v e Z ¥ v x (x+e)h(e)de g(x)dx+b n1 Z +¥ ¥ Z +¥ ¥ (x+e)h(e)g(x)dxde s I g n s L 1b n1 1b ¯ G(v e )s I ! 1 1b Z ¥ v e Z ¥ v x (x+e)h(e)de g(x)dx s L 1b ¯ G(v e )s I 1b = ¯ v : For the first line, we notice thatb n1 Z +¥ ¥ Z +¥ ¥ (x+e)h(e)g(x)dxde s I s L is the expected utility of the consumer gets conditional on visiting item n. For the last line we notice thatg n ! 1=(1b). The last identity follows from the definition of ¯ v . This shows (C.19). Moreover, it follows that ¯ v v n (V max ) ˜ v n ! ¯ v when n!¥. This shows v n (V max )! ¯ v . The uniform convergence part follows since ˜ v n is independent ofV max . As a result, we have the desired conclusion. 215 We next show that lim n!¥ x n (V max )= ¯ v e for allV max < ¯ v , and the convergence is uniform. Consider the function h(x) def = Z ¥ ¯ v x (v ¥ (x+e) ¯ v )h(e)de = Z ¥ ¯ v x (e( ¯ v x))h(e)de = Z ¥ ¯ v x ¯ H(e)de: The second equality is because if e ¯ v x , x +e ¯ v so v ¥ (x +e)=x +e. It is easy to verify the following facts ofh(x) by simple calculus. (i) h(x) is continuous. (ii) lim x!¥ h(x)= 0 and lim x!+¥ h(x)=+¥. (iii) h(x) is a strictly increasing continuous function in the range ( ¯ v z;+¥), where z = supfx : ¯ H(x)> 0g. 
The derivative ofh(x) is zero whenx ¯ v z . Given these facts, we can define a continuous inverse function ofh(),h 1 (y), with domain(0;+¥). Therefore, for anye> 0, there exists ad > 0 such that as long asjs s I j<d,jh 1 (s I )h 1 (s)j=jv e h 1 (s)j<e. Notice thath 1 (s I )= ¯ v e by the definition ofe . Furthermore, because v n (V max ) uniformly converges to v ¥ (V max ), given d, there exists a N such that as long as n> N, v ¥ (V max ) v n (V max ) v ¥ (V max )d=2 for allV max . Consider an arbitrarily fixed ˆ V max ¯ v . Denote ˆ s= R +¥ ˆ V max x n ( ˆ V max ) v ¥ (x n ( ˆ V max )+e) ¯ v h(e)de. For any n> N js I ˆ sj= Z +¥ ˆ V max x n ( ˆ V max ) v n (x n ( ˆ V max )+e) v n ( ˆ V max ) h(e)de Z +¥ ˆ V max x n ( ˆ V max ) v ¥ (x n ( ˆ V max )+e) ¯ v h(e)de Z +¥ ˆ V max x n ( ˆ V max ) v n (x n ( ˆ V max )+e) v n ( ˆ V max ) v ¥ (x n ( ˆ V max )+e) ¯ v h(e)de Z +¥ ˆ V max x n ( ˆ V max ) v n (x n ( ˆ V max )+e) v ¥ (x n ( ˆ V max )+e) + v n ( ˆ V max ) ¯ v h(e)de Z +¥ ˆ V max x n ( ˆ V max ) d 2 + d 2 h(e)ded: Furthermore, we have Z +¥ ˆ V max x n ( ˆ V max ) v ¥ (x n ( ˆ V max )+e) ¯ v h(e)de = Z +¥ ¯ v x n ( ˆ V max ) v ¥ (x n ( ˆ V max )+e) ¯ v h(e)de =h x n ( ˆ V max ) 216 because whenx n ( ˆ V max )+e< ¯ v , v ¥ (x n ( ˆ V max )+e)= ¯ v . In other words,x n ( ˆ V max )=h 1 ( ˆ s). Therefore, as long as n> N,jx n ( ˆ V max )( ¯ v e )j<e, and we can conclude the proof. Proof of Lemma C.17.3: In this proof we require that n N 1 so that q(n) is appropriately defined. We will argue that there exists C such that u i p i <x C logn (V max ) for anyV max ¯ v . By the monotonicity of x k (V max ) in k, the desired conclusion follows. Towards this goal, we first prove the following claim. Claim 1: There exits C 2 large enough such that for all m and allV max ¯ v , it holds that 2C 2 b m R v x m (V max ) e ¯ H(e)de. That is R v x m (V max ) e ¯ H(e)de = O(b m ). Assume that ˜ V max ¯ v . 
Then due to the fact that v ¥ ( ˜ V max )= ¯ v , (C.18), and (C.19), it holds that v ¥ ( ˜ V max ) v m ( ˜ V max ) j ¯ v ˜ v m j = b m1 1b Z ¥ v e Z ¥ v x (x+e)h(e)de g(x)dxb m1 Z +¥ ¥ Z +¥ ¥ (x+e)h(e)deg(x)dx s I + g m 1 1b s L b m1 1b ¯ G(v e )s I : Furthermore, one can calculate that by the definition of g m ,g m = å m1 i=1 (1b)b i1 i+b m1 m = (1 b m )=(1b) so that g m 1=(1b)=b m =(1b). Therefore, v ¥ ( ˜ V max ) v m ( ˜ V max ) C 1 b m1 for any ˜ V max and some constant C 1 . As a result, for anyV max ¯ v Z +¥ V max x m (V max ) v m (x m (V max )+e) v m (V max ) h(e)de Z +¥ V max x m (V max ) v ¥ (x m (V max )+e) ¯ v h(e)de = Z +¥ V max x m (V max ) v(x m (V max )+e) v ¥ (x m (V max )+e) v m (V max ) ¯ v h(e)de Z +¥ V max x m (V max ) v m (x m (V max )+e) v ¥ (x m (V max )+e) h(e)de + Z +¥ V max x m (V max ) v m (V max ) ¯ v h(e)de 2C 1 b m1 : Consequently, it follows that 2C 1 b m1 Z +¥ V max x m (V max ) v m (x m (V max )+e) v m (V max ) h(e)de Z +¥ V max x m (V max ) v ¥ (x m (V max )+e) ¯ v h(e)de 217 = Z +¥ e (ee )h(e)de Z +¥ v x m (v max ) x m (V max )+e ¯ v h(e)de = Z +¥ e ¯ H(e)de Z +¥ ¯ v x m (V max ) ¯ H(e)de = Z v x m (V max ) e ¯ H(e)de where the first equation follows since (1) by the definition ofx m (V max ) ande , Z +¥ V max x m (V max ) v m (x m (V max )+e) v m (V max ) h(e)de = s I = Z +¥ e (ee )h(e)de; and, (2) v ¥ (V)=V max ifV max > v and v ¥ (V max )= ¯ v ifV max v . In summary of the development so far, we can choose C 1 big enough such that 2C 1 b m1 R v x m (V max ) e ¯ H(e)de for all m. If we let C 2 = C 1 =b, then for all m and anyV max ¯ v , we have 2C 2 b m R v x m (V max ) e ¯ H(e)de. This finishes the proof of the claim. Denote M= supg(x) and letb = e t where t> 0. Furthermore, ¯ H(e )> 0 since by the definition of e , 0< s I = R +¥ e (ee )h(e)de = R +¥ e ¯ H(e)de. Let us pick some smalld > 0 such that ¯ H(e +d)> 0. There exists C 3 big enough such thatb C 3 logn = n tC 3 ( ¯ H(e +d))=(2Mn 3 C 2 ) for all n> 1. 
By the above claim, ¯ H(e +d) M 1 n 3 2C 2 b C 3 logn Z v x C 3 logn (V max ) e ¯ H(e)de: By Lemma C.17.2, x C 3 logn (V max ) uniformly converges to ¯ v e as n grows. Thus, when n is large enough, i.e., n N 2 for some N 2 with N 2 N 1 , we have v x C 3 logn (V max )e +d for allV max . As a result, ¯ H(e +d) M 1 n 3 Z ¯ v x C 3 logn (V max ) e ¯ H(e)de ¯ v e x C 3 logn (V max ) ¯ H(e +d): Equivalently, x C 3 logn (V max ) ¯ v e 1=(Mn 3 ) holds for all n N 2 . If N 2 = N 1 , we can set C= C 3 . If N 2 > N 1 , because of the monotonicity of x k () in k we can choose C> C 3 such that x C logn (V max ) ¯ v e 1=(Mn 3 ) holds for all n with N 1 < n< N 2 as well. Now x C logn (V max ) ¯ v e 1=(Mn 3 ); (C.21) 218 holds for all n> N 1 . By the definition ofq(n), 1=n 3 = G(v e ) G(q(n))= Z ¯ v e q(n) g(x)dx M(v e q(n)). Therefore, it must be that q(n) v e 1=(Mn 3 ): Thus by our assumption of Lemma C.17.3 u i p i <q(n) ¯ v e 1=(Mn 3 ): (C.22) Compare (C.21) and (C.22), and we see that x C logn (V max )> u i p i . Therefore, in this case, item i’s item-page will be not be entered by the customer. Proof of Lemma C.17.4: Assume that the optimal ranking p is adopted. Let the p.d.f. of V 0 be t(). Recall that ¯ a 0 =PfV 0 ¯ v g. DenoteA i =fitem i is purchased in fresh demandg. Furthermore, given the realization of V 0 < ¯ v , let the probability that the consumer purchases an item by p(V 0 ). Note that jS +g 3 c ˆ S j=jS G +S NnG ˆ S G ˆ S NnG j= å i2N q s ;i E V 0 V 0 ¯ v å i2N q s ;i E V 0 A i = å i2N q s ;i 1 ¯ a 0 Z ¯ v ¥ xt(x)dx Z ¯ v ¥ p(x)xt(x)dx = Z ¯ v ¥ 1 ¯ a 0 å i2N q s ;i p(x) ! xt(x)dx Z ¯ v ¥ 1 ¯ a 0 å i2N q s ;i p(x) ! x t(x)dx: The first line follows since we are only mis-counting the conditional expected value of V 0 in ˆ S . 
The second line follows since å i2N q s ;i E V 0 A i =E " å i2N V 0 1(A i ) # =E " V 0 å i2N 1(A i ) # =E[V 0 1(an item is purchased)] = Z ¯ v ¥ p(x)xt(x)dx For any realization V 0 ¯ v , the probability that the consumer does not purchase any item is at most Õ i2G Pfu i p i +e i ¯ v gt cn . Thus, 1 p(x) 1t cn for any x ¯ v . Also,(1= ¯ a 0 )å i2N q s ;i is the probability that an item is purchased conditional on consumer enters the search. Therefore, it must also be true that 1(1= ¯ a 0 )å i2N q s ;i 1t cn by a similar argument. Therefore, jS +g 3 c ˆ S j 1 ¯ a 0 Z ¯ v ¥ t cn jxjt(x)dx=t cn 1 ¯ a 0 Z ¯ v ¥ jxjt(x)dx= O(t cn ): (C.23) 219 Moreover, c nsPfNo items inG is purchasedg nst cn = O(nt cn ). Therefore, jS ˆ S jjS +g 3 c ˆ S j+g 3 c = O(nt cn ): A similar proof can used to showjS PSOR ˆ S PSOR j O(nt cn ). Proof of Lemma C.17.5: Recall we denote the random total search cost with the optimal ranking p and withp PSOR byC andC PSOR . Also letA (i) andA PSOR (i) be the event that item i is purchased in fresh demand with the optimal ranking and with p PSOR , respectively. Throughout this proof we assume n is large enough, i.e., n N 3 for some N 3 , such that N 3 N 1 and for all n N 3 , cn=2 C logn, where C is defined in Lemmas C.17.3 and c= ¯ G( ¯ v e )=2. We first argue the following claim. Claim 1: ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) ˆ S G å i2NnG E[C 1(A (i))] O(t cn=2 n 2 ). Notice that ˆ S G å i2NnG E[C 1(A (i))] = å i2G q s ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v å i2N E[C 1(A (i))]; (C.24) and ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) = å i2G q s PSOR ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v å i2N E C PSOR 1(A PSOR (i)) : (C.25) Given the optimal ranking p , let us first consider the following operation: we move all items inG to the top but keep the relative orders for items inG andN nG unchanged. Notice that for any i2G , r i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v > 0 by our assumption (A2). Denote this ranking byp B . 
Notice that ˆ S B G å i2NnG E C B 1(A B (i)) 220 = å i2G q s B ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v å i2N E C B 1(A B (i)) ; (C.26) where we have definedA B ,C B in a way similar to above. Clearly, ranking p B increases the purchase probability of each item i2G . As a result, å i2G q s ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v å i2G q s B ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v : (C.27) We next argue thatå i2N E C B 1(A (i)) å i2N E[C 1(A (i))]+ O(t cn=2 n 2 s). We let N 1 =fi2G :jf j2G : s ( j)s (i)gj cn=2g; andN 2 =fi2NnG :j j2G : s ( j)s (i)j cn=2g; andN 3 =fi2N :j j2G : s ( j)s (i)j< cn=2g: Notice thatN 1 [N 2 [N 3 =N . For any item j2 NnG and i2N 1 withs ( j)<s (i), there are more than cn=2 items ranked after j withp . Therefore, its item-page will not be visited by the consumer. As a result, using p B , the purchase probability of item i in fresh demand is not changed, and conditional on the purchase of item i in fresh demand, we save the search cost due to j. Thus, å i2N 1 E C B 1(A B (i)) å i2N 1 E[C 1(A (i))]. Since the items inN 2 will be purchased with probability at most ¯ a 0 t cn withp B ,å i2N 2 E C B 1(A B (i)) n ¯ a 0 t cn ns= O(t cn n 2 ). Also, for any i2N 3 , since there are at least cn=2 item ranked before it with p B , we have E C B 1(A (i)) ¯ a 0 t cn=2 ns. Putting things together, we have å i2N E C B 1(A B (i)) = å i2N 1 E C B 1(A B (i)) + å i2N 2 E C B 1(A B (i)) + å i2N 3 E C B 1(A B (i)) å i2N 1 E[C 1(A (i))]+ O(t cn n 2 )+ O(t cn=2 n 2 ) å i2N E[C 1(A (i))]+ O(t cn=2 n 2 ) Combined with (C.24), (C.26) and (C.27), this leads to ˆ S G å i2NnG E[C 1(A (i))] O(t cn=2 n 2 ) ˆ S B G å i2NnG E C B 1(A B (i)) . Next, we notice that givenp B we can change the order of the items inG by ordering them by the value ¯ r i , and similarly for the items inNnG to obtainp PSOR . By a similar argument to the proof of Theorem 4.3.1, we know that ˆ S B G ˆ S PSOR G . 
Furthermore, by a similar argument as above, compared withp B this 221 leads to a search cost term change of the items inN nG at most O(t cn n 2 ). Putting things together, we have the proof for the claim. We next lower bound the ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) term. Recall ¯ a 0 = P(V 0 ¯ v ) andt def =Pf ¯ v e +e i ¯ v g=Pfe i e g= H(e ). Claim 2: ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) a 0 (1t)C O(t cn n 2 ). Without loss of generality we assume that items inG aref1;2;:::;jGjg, and thatp PSOR rank them by their index. Notice that by a similar analysis to Lemma 1, we can show that ˆ S PSOR G = jGj å j=1 Õ 0k j1 ¯ a k (1 ¯ a j ) g 1 q j +g 2 p j +g 3 E V j V j ¯ v jsE V 0 V 0 ¯ v = jGj å j=1 Õ 0k j1 ¯ a k (1 ¯ a j ) g 1 q j +g 2 p j +g 3 E V j V j ¯ v s 1 ¯ a j E V 0 V 0 ¯ v +g 3 jGjs Õ 0ijGj ¯ a i ¯ a 0 (1 ¯ a 1 ) g 1 q 1 +g 2 p 1 +g 3 E V 1 V 1 ¯ v s 1 ¯ a 1 E V 0 V 0 ¯ v a 0 (1t)C; where the first inequality follows because of assumption (A1) and (A2), and the second inequality follows due to 1 ¯ a 1 1t,g 1 q 1 +g 2 p 1 C and assumption (A2). A similar argument to the proof of Claim 1 shows thatå i2NnG E C PSOR 1(A PSOR (i)) O(t cn n 2 ). This concludes the proof for the claim. We next show that for both rankings p PSOR and p , the surplus values from the items inN nG are small. Claim 3: It holds that ˆ S PSOR NnG + å i2NnG E C PSOR 1(A PSOR (i)) = O(nt cn ) and ˆ S NnG + å i2NnG E[C 1(A (i))] = O(nt cn=2 ): We first notice that ˆ S PSOR NnG + å i2NnG E C PSOR 1(A PSOR (i)) = å i2NnG q s PSOR ;i g 1 q i +g 2 p i +g 3 E V i V i ¯ v E V 0 V 0 ¯ v 222 å i2NnG q s PSOR ;i (g 1 q i +g 2 p i ) + å i2NnG g 3 q s PSOR ;i E V i V i ¯ v E V 0 V 0 ¯ v : One important observation is that with p PSOR the probability of the item-page of an item inN nG being entered by the consumer is at most ¯ a 0 t cn . Therefore, the first term can be bounded by nt cn ¯ C. 
For the second term, fixing i inN nG , we can decompose q s PSOR ;i E V i V i ¯ v E V 0 V 0 ¯ v as the product of the probability of viewing the item i’s item-page, which can be bounded by ¯ a 0 t cn , and E V i E V 0 V 0 ¯ v 1(V i ¯ v ) 0. Then we notice that E V i E V 0 V 0 ¯ v 1(V i ¯ v ) = Z 1(u i p i +e i ¯ v ) u i p i +e i E V 0 V 0 ¯ v h(e)de is an increasing function in u i p i . Therefore, E V i E V 0 V 0 ¯ v 1(V i ¯ v ) E ¯ v e +eE V 0 V 0 ¯ v 1( ¯ v e +e ¯ v ) ; which is a constant. In summary, ˆ S PSOR NnG +å i2NnG E C PSOR 1(A PSOR (i)) = O(nt cn ). For ˆ S NnG +å i2NnG E[C 1(A (i))] , the treatment is very similar. But here we notice that if an item inN nG is ranked with more than cn=2 items fromG after it, the probability of viewing its item- page is zero. Otherwise, there are more than cn=2 items fromG ranked before it. We can follow the same logic as in the proof of ˆ S PSOR NnG +å i2NnG E C PSOR 1(A PSOR (i)) = O(nt cn ). The result follows. Claim 4: It holds that S PSOR R = O(t cn n) andjS R j= O(t cn n). Consider p PSOR . Given any realization V 0 ¯ v , we notice the consumer enters the returned demand only when all items inG fail to satisfy the needs of the consumer. Therefore, the probability that the consumer enters the returned demand, which we denote by q 0 (V 0 ), is at most t cn . Given V 0 and that consumer enters the returned demand, we denote the expected search cost that the consumer incurs by `(V 0 ), and the utility of the current best option by v(V 0 ). Therefore, given V 0 the consumer surplus from the returned demand is q 0 (V 0 )(v(V 0 )V 0 `(V 0 )). We notice that v(V 0 ) ¯ v and`(V 0 ) ns. Therefore, 0 v(V 0 ) V 0 ¯ v V 0 . Denote the p.d.f of V 0 by t(). 
Then the expected consumer surplus can be bounded as Z ¯ v ¥ q 0 (x)(v(x) x`(x))t(x)dx Z ¯ v ¥ q 0 (x)jv(x) x`(x)jt(x)dx 223 Z ¯ v ¥ q 0 (x)(jv(x) xj+j`(x)j)t(x)dx Z ¯ v ¥ t cn ( ¯ v x)+ ns)t(x)dx=t cn ( ¯ a 0 ¯ v E[V 0 1(V 0 ¯ v )]+ ns) =O(nt cn ): The supply side surplus and the aggregate revenue term can be bounded by ¯ a 0 t cn ¯ C. Therefore, the con- clusion follows. The bound onS R can be proved similarly. Now we can put the claims together to show the desired conclusion. Notice that by Lemma C.17.4 and the claims above S PSOR ˆ S PSOR O(nt cn ) = ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) ! + ˆ S PSOR N + å i2NnG E C PSOR 1(A PSOR (i)) ! +S PSOR R O(nt cn ) ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) ! O(nt cn ) where the first inequality follows from Lemma C.17.4, and the second inequality follows from Claims 3 and 4. Similarly,S ˆ S + O(nt cn ) ˆ S G å i2NnG E[C 1(A (i))]+ O(nt cn=2 ): Notice that when n is sufficiently large, say larger than some constant N 4 N 3 , Claim 2 implies S PSOR ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) O(nt cn )a 0 (1t)C O(t cn n 2 )> 0: As a result,S S PSOR > 0. Therefore, S S PSOR ˆ S G å i2NnG E[C 1(A (i))]+ O(nt cn=2 ) ˆ S PSOR G å i2NnG E[C PSOR 1(A PSOR (i))] O(nt cn ) ˆ S PSOR G å i2NnG E C PSOR 1(A PSOR (i)) + O(n 2 t cn=2 ) ˆ S PSOR G å i2NnG E[C PSOR 1(A PSOR (i))] O(nt cn ) 1+ O(n 2 t cn=2 ): The second inequality is due to Claim 1. The last inequality is because of Claim 2. Therefore, S PSOR =S 1 O(n 2 t cn=2 ): 224 C.18 Proof of Proposition C.1.1 Proof of Proposition C.1.1: Note that as an optimal stopping problem with two decisions, one can use standard dynamic programming techniques to show that here exists at least an optimal deterministic policy and it can be find by recursively evaluating the Bellman equations ([114]). Without loss of generality, we consider the optimal policy that the consumer will continue the search whenever she is indifferent. This rest is a simple proof by contradiction. 
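Assume that $V^{\max}_1 < V^{\max}_2$. We show that if it is optimal for the consumer to accept $V^{\max}_1$ given $(V^{\max}_1, H_i, F)$, then she would also stop the search and accept given $(V^{\max}_2, H_i, F)$. Suppose, to the contrary, that with $(V^{\max}_2, H_i, F)$ the consumer is willing to search. Then there exists a policy $\Gamma$, optimal for $(V^{\max}_2, H_i, F)$, under which the consumer is weakly better off by searching. Denote the random event of accepting the evaluated option with utility $V^{\max}_2$ by $\mathcal{A}$ (with complement $\mathcal{A}^{c}$), the (random) final utility that the consumer accepts by $V^{\mathrm{Final}}$, and the (random) total search cost by $\mathcal{C}$. We have

$$V^{\max}_2 \le \mathbb{P}\{\mathcal{A} \mid \Gamma, H_i, F\}\, V^{\max}_2 + \mathbb{E}\big[V^{\mathrm{Final}}\, \mathbb{1}(\mathcal{A}^{c}) \mid \Gamma, H_i, F\big] - \mathbb{E}[\mathcal{C} \mid \Gamma, H_i, F],$$

or equivalently

$$\big(1 - \mathbb{P}\{\mathcal{A} \mid \Gamma, H_i, F\}\big)\, V^{\max}_2 \le \mathbb{E}\big[V^{\mathrm{Final}}\, \mathbb{1}(\mathcal{A}^{c}) \mid \Gamma, H_i, F\big] - \mathbb{E}[\mathcal{C} \mid \Gamma, H_i, F]. \qquad \text{(C.28)}$$

With $(V^{\max}_1, H_i, F)$, we denote by $\Gamma'$ the policy under which the consumer searches as if she had history $(V^{\max}_2, H_i, F)$ and searched optimally. By the construction of $\Gamma'$, her utility would be

$$\mathbb{P}\{\mathcal{A} \mid \Gamma, H_i, F\}\, V^{\max}_1 + \mathbb{E}\big[V^{\mathrm{Final}}\, \mathbb{1}(\mathcal{A}^{c}) \mid \Gamma, H_i, F\big] - \mathbb{E}[\mathcal{C} \mid \Gamma, H_i, F] \ge V^{\max}_1,$$

which is implied by (C.28) because $V^{\max}_1 < V^{\max}_2$. But this implies that with $(V^{\max}_1, H_i, F)$, the consumer should continue the search. Contradiction.

Let us denote by $\tilde v$ the infimum of the values $V^{\max}$ at which the consumer stops searching and accepts. We argue that the consumer will continue at $\tilde v$. Assume she accepts. Using the same notation as above, we have

$$\big(1 - \mathbb{P}\{\mathcal{A} \mid \Gamma, H_i, F\}\big)\, \tilde v > \mathbb{E}\big[V^{\mathrm{Final}}\, \mathbb{1}(\mathcal{A}^{c}) \mid \Gamma, H_i, F\big] - \mathbb{E}[\mathcal{C} \mid \Gamma, H_i, F].$$

But this implies that for $V^{\max} < \tilde v$ the consumer should search by exactly the same argument. We have the proof.

C.19 Proof of Proposition C.1.2

Proof of Proposition C.1.2: Suppose that the consumer is about to view item $n$ and the best utility so far is $V^{\max}$. In this case, by an argument similar to Lemma 4.2.1, we know the consumer stops if and only if

$$V^{\max} > \Psi(V^{\max} \mid H_{n-1}, F)\, V^{\max} + \int_{V^{\max}}^{+\infty} x \, d\Psi(x \mid H_{n-1}, F) - s.$$

Note that using the argument for Lemma C.7.1, we can show that this is equivalent to $V^{\max} > \tilde v_{n-1}(H_{n-1}, F)$, where $\tilde v_{n-1}(H_{n-1}, F)$ is the unique solution $\tilde v$ to $s = \int_{\tilde v}^{\infty} \big(1 - \Psi(x \mid H_{n-1}, F)\big)\, dx$.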
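The threshold characterization just stated can be illustrated numerically. The following is a minimal sketch, not part of the original proof: it finds the value $\tilde v$ solving $s = \int_{\tilde v}^{\infty} (1 - \Psi(x))\,dx$ by bisection and evaluates the one-step stopping rule. The exponential form chosen for $\Psi$, the search cost `0.2`, the truncation point `upper`, and all function names are assumptions made purely for illustration.

```python
import math


def expected_gain(v, survival, upper, steps=4000):
    """Trapezoid approximation of the integral of survival(x) = 1 - Psi(x) over [v, upper].

    `upper` truncates the infinite integral; it is assumed large enough that
    the tail beyond it is negligible.
    """
    if v >= upper:
        return 0.0
    h = (upper - v) / steps
    total = 0.5 * (survival(v) + survival(upper))
    for k in range(1, steps):
        total += survival(v + k * h)
    return total * h


def reservation_value(s, survival, lower, upper, tol=1e-8):
    """Bisection for the v at which expected_gain(v) equals the search cost s.

    Assumes expected_gain(lower) > s > expected_gain(upper); the gain is
    nonincreasing in v because survival is nonnegative.
    """
    lo, hi = lower, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_gain(mid, survival, upper) > s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stops(v_max, s, survival, upper):
    """One-step stopping rule: stop iff s exceeds the expected gain of one more draw.

    By integration by parts this is the same comparison as
    V > Psi(V) * V + int_V^inf x dPsi(x) - s.
    """
    return s > expected_gain(v_max, survival, upper)


# Hypothetical example: Psi is an exponential distribution with rate 1.
rate = 1.0
survival = lambda x: math.exp(-rate * max(x, 0.0))
search_cost = 0.2
v_tilde = reservation_value(search_cost, survival, lower=0.0, upper=50.0)
print("reservation value:", v_tilde)
print("stop at V = 1.0?", stops(1.0, search_cost, survival, 50.0))
print("stop at V = 2.0?", stops(2.0, search_cost, survival, 50.0))
```

For this particular exponential choice the equation can also be solved by hand, giving $\tilde v = -\ln(\lambda s)/\lambda$ with rate $\lambda$, which offers a direct check on the bisection.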
We now apply backward induction and assume that the conclusion holds when the consumer is deciding whether to accept the current best option or to keep evaluating item $i = k+1$. We observe that after viewing item $k-1$, the distribution of $u_k - p_k + \epsilon_k$ is given by $\Psi_{k-1}(x \mid H_{k-1}, F)$.

Assume that $V^{\max} \le \tilde v_{k-1}(H_{k-1}, F)$. By an argument similar to Lemma C.7.1,

$$V^{\max} \le \Psi(V^{\max} \mid H_{k-1}, F)\, V^{\max} + \int_{V^{\max}}^{+\infty} x \, d\Psi(x \mid H_{k-1}, F) - s$$

holds. The right-hand side is the expected utility when the consumer searches only one more item, i.e., item $k$, and then stops, which is dominated by the optimal policy. Therefore, the consumer will continue the search.

Now let us assume that $V^{\max} > \tilde v_{k-1}(H_{k-1}, F)$. We know

$$V^{\max} > \Psi(V^{\max} \mid H_{k-1}, F)\, V^{\max} + \int_{V^{\max}}^{+\infty} x \, d\Psi(x \mid H_{k-1}, F) - s, \qquad \text{(C.29)}$$

or equivalently $s > \int_{V^{\max}}^{\infty} \big(1 - \Psi(x \mid H_{k-1}, F)\big)\, dx$. Assume that the consumer keeps searching. The consumer's best utility then becomes

$$\tilde V^{\max} = \max\{u_k - p_k + \epsilon_k,\; V^{\max}\} \ge V^{\max}.$$

Also notice that $\Psi(x \mid H_{k-1}, F) \le \Psi(x \mid H_k, F)$, since for any $i$

$$\Psi(x \mid H_i, F) = G(\cdot \mid H_i, F) * H(\cdot) = \int_{-\infty}^{\infty} G(x - \epsilon \mid H_i, F)\, dH(\epsilon),$$

and $G(x \mid H_{k-1}, F) \le G(x \mid H_k, F)$. Now we notice

$$s > \int_{V^{\max}}^{\infty} \big(1 - \Psi_{k-1}(x \mid H_{k-1}, F)\big)\, dx \ge \int_{\tilde V^{\max}}^{\infty} \big(1 - \Psi_{k-1}(x \mid H_{k-1}, F)\big)\, dx \ge \int_{\tilde V^{\max}}^{\infty} \big(1 - \Psi_{k}(x \mid H_{k}, F)\big)\, dx.$$

By our induction hypothesis, this implies that the consumer will accept $\tilde V^{\max}$. Therefore, the problem reduces to a one-period problem and (C.29) implies that she will stop the search. The proof is complete.

C.20 Proof of Proposition C.2.1.

Proof of Proposition C.2.1: We first remind the readers that in general

$$S_\pi \ne \sum_{j=1}^{n} \prod_{k \le j-1} \hat\beta_{\pi(k)} \big(1 - \hat\alpha_{\pi(j)}\big)\, j s,$$

since even when the consumer does not purchase, she still incurs search cost, and the expectation of the total search cost she incurs without a purchase depends on the ranking. This is different from the base model. However, defining $W_j$ as the event that the consumer pays the search cost for the item ranked in the $j$th position,
we note that an easy way to decompose S p is to as follows S p =E " n å j=1 1(W j )s # = s n å j=1 E[1(W j )]= s n å j=1 Õ k j1 ˆ b p(k) : Therefore, (C.1) is equivalent to n å j=1 Õ k j1 ˆ b p(k) (1 ˆ a p( j) ) g 1 q p( j) +g 2 q p( j) +g 3 E V p( j) V p( j) > ˆ v E V 0 V 0 ˆ v g 3 S p = n å j=1 Õ k j1 ˆ b p(k) (1 ˆ b p( j) ) ˆ r p( j) q p( j) : Then the interchange argument used to prove Theorem 4.3.1 follows. C.21 Proof of Proposition C.4.1. To start the proof, let us denote set of optimal ranking achieving the maximums in the definition ofW i (b) andW i i (b i ), respectively, as S i (b)= argmax s ( 1 g 1 å j2N i q s; j r j ) ;andS i i (b i )= argmax s 8 < : 1 g 1 å j2N i i q s; j r j 9 = ; : 227 We first observe the following: Lemma C.21.1 There existss i (b)2S i (b) ands i i (b i )2S i i (b i ) that satisfying the following proper- ties. (i) s SOR k (b; j)< s SOR k (b;i) if and only if s i (b;d i ( j))< s i (b;d i (i)) for all j2 D i . Furthermore, q s i (b);d i (i) = q s SOR k (b);i . (ii) Virtual seller d i 0 (i) is ranked in the last position by s i i (b i ). Furthermore, q s i i (b i );d i 0 (i) = q s SOR k (0;b i );i . Proof of Lemma C.21.1: The proof of item (i) is established from some observations in several dif- ferent cases. First, consider the case seller s SOR k (b;i) k. Then for any seller j such that j2 D i and s SOR k (b; j)> s SOR k (b;i) it must be still be that r d i ( j) r i (q i ). This is true since either r d i ( j) = r j (q j )r i (q i ) orr d i ( j) =r d(b i ) b d(b i ) r i (q i ). Further, if j2 D i ands SOR k (b; j)<s SOR k (b;i), then r j (q j )r i (q i )r d(b i ) b d(b i ) so r d i ( j) =r j (q j )r i (q i ). As a result, we can pick s i (b) such that s SOR k (b; j)<s SOR k (b;i) if and only ifs i (b;d i ( j))<s i (b;d i (i)) for all j2 D i . Second, if s SOR k (b;i)> k, then seller i will be ranked in the same position with reported valuation (0;b i ) by SOR. 
Thus, D i is the same as the set of sellers above him with reported valuationb. Then any j2 D i ,r j r d(b i ) b d(b i ) r i (q i ), where the first inequality follows from the construction of the virtual society N i , and the second follows from the fact thats SOR k (b;i)> k. Now we have the first claim of item (i). The second claim follows naturally by the first. The first claim of item (ii) follows by the construction of N i i : for all j2 N i i and j6= d i 0 (i), r j r d(b i ) r i (0). Also, recall D i is the set of sellers ranked above seller i when seller i reports zero, so the second claim follows since for any seller j6= i and j2 D i ,a j =a d i ( j) , anda i =a d i 0 (i) . This lemma suggests that the purchase probabilities from fresh demand by SOR is preserved in the optimal ranking in the virtual societies. We will use this fact to prove the Proposition C.4.1. We first note that the above lemma in fact implies the zero-payment property. Assume that s SOR k (b;i)> k. By SOR, the ranking of sellers with valuationb and(0;b i ) are the same. Therefore, for all seller j2 D i and j6= i, s i (b;d i ( j))<s i (b;d i (i)). In other words, s i (b) ranks seller d i (i) in the last position. Without loss of 228 generality, we can then assume d j (i) is ranked the same by s i (b) and s i i (0;b i ) for all j6= i. Thus, W i i (b i )W i (b)= q s i (b;d i (i)) g 1 b i and t SOR k (b;i)= 1 g 1 W i i (b i )W i (b) + q s SOR k (b);i b i = 1 g 1 q s i (b;d i (i)) g 1 b i + q s SOR k (b);i b i = 0: We remark the following lemma. Lemma C.21.2 Fors i (b) ands i i (b i ) identified in Lemma C.21.1, we have the following: (i) For seller i such thats S k (b;i)> k, t SOR i (b)= 0. (ii) With true valueq i from seller i, bidsb from the sellers, the utility of seller i can be written as u(q i ;b i ;b i )= 1 g 1 0 @ å j2N i nd i (i) q s i (b); j r j + q s i (b);d i (i) r i (q i ) 1 A W i i (b i )+ q i 0 q i : Proof: We only show the proof of item (ii). 
With the preparations so far, the utility of seller i can be written as u(q i ;b i ;b i )=q s SOR k (b);i q i + q i 0 q i t SOR k (b;i) =q s SOR k (b);i q i + q i 0 q i W i i (b i )W i (b)+ q s SOR k (b);i b i = q s SOR k (b);i q i +W i (b) q s SOR k (b);i b i W i i (b i )+ q i 0 q i = 0 @ q s SOR k (b);i q i + 1 g 1 å j2N i nd i (i) q s i (b); j r j + 1 g 1 q s i (b);d i (i) r i (b i ) q s SOR k (b);i b i 1 A W i i (b i )+ q i 0 q i = 2 4 1 g 1 å j2N i nd i (i) q s i (b); j r j + q s SOR k (b);i q i + 1 g 1 q s i (b);d i (i) r i (b i ) q s SOR k (b);i b i 3 5 W i i (b i )+ q i 0 q i = 1 g 1 0 @ å j2N i nd i (i) q s i (b); j r j + q s i (b);d i (i) r i (q i ) 1 A W i i (b i )+ q i 0 q i : The last line is due to item (i). 229 The utility expression can be decomposed into three parts, the first part can be optimized if seller i reports his true value, the second part is a function not relevant to the report of seller i, and the third part is the profit from returned demand, invariant to ranking. We are now ready to prove the proposition. Proof of Proposition C.4.1: Let us first show that it is always of the seller’s interest to report his valuation truthfully. Note that å j2N i nd i (i) q s i (q i ;b i ); j r j + q s i (q i ;b i );d i (i) r i (q i ) å j2N i nd i (i) q s i (b); j r j + q s i (b);d i (i) r i (q i ) for any b i by the optimality of s i . Thus, the sellers will always report their true valu- ation by Lemma C.21.2. Hence, for the rest of the proof, we will simply write the utility as u(q i ;b i ). Then let us show individual rationality. It is sufficient to show 1 g 1 0 @ å j2N i nd i (i) q s i (q i ;b i ); j r j + q s i (q i ;b i );d i (i) r i (q i ) 1 A W i i (b i )=W i (q i ;b i )W i i (b i ) 0; But this is true since the weighted surplus is always increasing if the sellers’ valuations increase. Next, let us prove t SOR k (b;i) 0 for any i. 
First, we have t SOR k (b;i)=W i i (b i )W i (b)+ q s SOR k (b);i b i =W i i (b i ) W i (b) q s SOR k (b);i b i =W i i (b i ) W i (b) q s i (b);i b i ; where the last equality is from Lemma C.21.1. Note thatW i (b) q s SOR k (b);i b i can be viewed as the social value resulted from a sub-optimal ranking on N i i . Thus,W i i (b i )W i (b) q s SOR k (b);i b i by the optimality ofW i i (b i ). Zero-payment property has been shown by Lemma C.21.2. Uniqueness follows from Proposition 4.7.5. C.22 Proof of Proposition C.5.1. Proof of Proposition C.5.1: Here we sketch the proof. We verify incentive compatibility first. Suppose that the bids from other sellers areb i , and that seller i has valuation q i . Further suppose that he bids 230 b i 6= q i , and denote his utility by u(q i ;b i ;b i ), where the first component is his true valuation and the second item is his reported valuation. Then we have u(q i ;q i ;b i ) u(q i ;b i ;b i ) = q s (q i ;b i );i q i + 1 g 1 W(q i ;b i ) q s (q i ;b i );i q i q s (b i ;b i );i q i + 1 g 1 W(b i ;b i ) q s (b i ;b i );i b i = 1 g 1 W(q i ;b i ) 1 g 1 å j2Nnfig q s (b i ;b i ); j r j (b j )+ q s (b i ;b i );i r i (q i ) ! 0: The first equality follows since we have canceled the term (1=g i )W i (b i ) from both u(q i ;q i ;b i ) and u(q i ;b i ;b i ). The last inequality follows since the term inside of the parentheses is the social net surplus from an inferior rankings (b i ;b i ) when sellers bid(q i ;b i ). For the rest of the proof, we remind the readers thatW i (b i ) is the optimal social net surplus when sellers have valuation(0;b i ). To show individual rationality, simply observe that u(q i ;q i ;b i ) 1 g 1 (W (q i ;b i )W i (b i )) 0 where the first inequality follows because we have dropped the term q i 0 q i 0 from u(q i ;q i ;b i ), and the second term follows because the social net surplus must improve if seller i’s valuation increases from zero toq i . 
The non-negative payment property is also easy to show, as we can write

$$t^*(\theta_i, \mathbf{b}_{-i}; i) = \frac{1}{\gamma_1}\big(W_{-i}(\mathbf{b}_{-i}) - W^*(\theta_i, \mathbf{b}_{-i})\big) + q_{\sigma^*(\theta_i, \mathbf{b}_{-i}),\, i}\, \theta_i = \frac{1}{\gamma_1} W_{-i}(\mathbf{b}_{-i}) - \Big(\frac{1}{\gamma_1} W^*(\theta_i, \mathbf{b}_{-i}) - q_{\sigma^*(\theta_i, \mathbf{b}_{-i}),\, i}\, \theta_i\Big) \ge 0,$$

since $W^*(\theta_i, \mathbf{b}_{-i}) - \gamma_1\, q_{\sigma^*(\theta_i, \mathbf{b}_{-i}),\, i}\, \theta_i$ is the social net surplus of ranking $\sigma^*(\theta_i, \mathbf{b}_{-i})$ when sellers have valuation $(0, \mathbf{b}_{-i})$, so it is dominated by $W_{-i}(\mathbf{b}_{-i})$.

C.23 Proof of Proposition C.5.2.

Proof of Proposition C.5.2: For clarity, we break down the proof into several steps. Recall the following recursive formula for the bids:

$$b_i = (1 - \alpha_i)\, \frac{r_i(\theta_i)}{\gamma_1} + \alpha_i\, b_{i+1} \qquad \text{(C.30)}$$

for all $i$ in $N$, where we define $b_{n+1} = w/\gamma_1$. We claim the following.

Claim. For $i \in N$, $r_i(\theta_i)/\gamma_1 \ge b_{i+1}$ and $b_i \ge b_{i+1}$.

We will establish this claim inductively. We first show this for $i = n$. Notice that

$$\frac{r_n(\theta_n)}{\gamma_1} = \frac{w_n}{\gamma_1} + \theta_n \ge \frac{w_n}{\gamma_1} \ge \frac{w}{\gamma_1} = b_{n+1}.$$

As a result,

$$b_n = (1 - \alpha_n)\, \frac{r_n(\theta_n)}{\gamma_1} + \alpha_n\, b_{n+1} \ge (1 - \alpha_n)\, b_{n+1} + \alpha_n\, b_{n+1} = b_{n+1}.$$

Now suppose these conclusions hold for $i = k+1$. Then

$$b_{k+1} = (1 - \alpha_{k+1})\, \frac{r_{k+1}(\theta_{k+1})}{\gamma_1} + \alpha_{k+1}\, b_{k+2} \le (1 - \alpha_{k+1})\, \frac{r_{k+1}(\theta_{k+1})}{\gamma_1} + \alpha_{k+1}\, \frac{r_{k+1}(\theta_{k+1})}{\gamma_1} \le \frac{r_{k+1}(\theta_{k+1})}{\gamma_1} \le \frac{r_k(\theta_k)}{\gamma_1}.$$

Using an argument similar to the above, it must be that $b_k \ge b_{k+1}$. Therefore, we have finished the proof of the above-mentioned claim.

As a result of the above claim, all the sellers are ranked the same as in $\pi^*$. One moment of reflection shows the following.

Claim. With payment

$$\frac{q_{j,\pi(\mathbf{b})}}{q_{j,\pi(\mathbf{b})} + q^{0}_{\pi(\mathbf{b};\, j)}} \Big(b_{\pi(\mathbf{b};\, j+1)} - \frac{1}{\gamma_1}\, w_{\pi(\mathbf{b};\, j)}\Big)$$

for each purchase, the auction format in Algorithm 2 is equivalent to the auction format in which the sellers only pay $b_{\pi(\mathbf{b};\, j+1)} - (1/\gamma_1)\, w_{\pi(\mathbf{b};\, j)}$ for their fresh demand, and do not pay for the returned demand.

Therefore, we can assume without loss of generality that $q^0_i = 0$ for all $i \in N$ and focus on this new auction format. A side outcome of the above two claims is that the sellers are willing to participate in the auctions, as they get a non-negative payoff by doing so. This shows individual rationality.
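The recursion (C.30) and the first claim above can be checked on a small instance. The sketch below is only an illustration under stated assumptions: the function name `gsp_bids` and every numerical value are hypothetical, the continuation probabilities are taken to lie in $[0, 1]$, and the sellers are assumed to be indexed so that $r_1(\theta_1) \ge \dots \ge r_n(\theta_n) \ge w$, which is the ordering the inductive argument relies on.

```python
def gsp_bids(alpha, r, w, gamma1):
    """Solve b_i = (1 - alpha_i) * r_i / gamma1 + alpha_i * b_{i+1} backward.

    alpha[i-1] and r[i-1] hold alpha_i and r_i(theta_i) for i = 1..n.
    The terminal condition is b_{n+1} = w / gamma1.
    Returns the list [b_1, ..., b_n, b_{n+1}].
    """
    n = len(alpha)
    bids = [0.0] * (n + 1)
    bids[n] = w / gamma1
    for i in range(n - 1, -1, -1):
        bids[i] = (1.0 - alpha[i]) * r[i] / gamma1 + alpha[i] * bids[i + 1]
    return bids


# Hypothetical instance with r_1 >= r_2 >= r_3 >= w.
alpha = [0.5, 0.6, 0.7]
r = [9.0, 6.0, 4.0]
w = 2.0
gamma1 = 2.0

bids = gsp_bids(alpha, r, w, gamma1)

# First part of the claim: r_i / gamma1 >= b_{i+1} for every seller i.
dominates_next_bid = all(r[i] / gamma1 >= bids[i + 1] for i in range(len(r)))
# Second part of the claim: the bids are nonincreasing, b_i >= b_{i+1}.
nonincreasing = all(bids[i] >= bids[i + 1] for i in range(len(r)))

print(bids, dominates_next_bid, nonincreasing)
```

Solving backward from $b_{n+1}$ mirrors the direction of the induction in the proof, which starts at $i = n$ and moves toward the top position.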
We then claim the following.

Claim. There are no profitable downward deviations for any seller.

We again show this claim by induction. Seller $i$ will not want to deviate to position $i+1$ for the following reason. Eq. (C.30) implies

$$(1 - \alpha_{i+1})\, \frac{r_i(\theta_i)}{\gamma_1} \ge (1 - \alpha_{i+1})\, \frac{r_{i+1}(\theta_{i+1})}{\gamma_1} = b_{i+1} - \alpha_{i+1}\, b_{i+2},$$

or equivalently

$$\theta_i + \frac{w_i}{\gamma_1} - b_{i+1} \ge \alpha_{i+1} \Big(\theta_i + \frac{w_i}{\gamma_1} - b_{i+2}\Big).$$

Multiply both sides by $\prod_{j<i} \alpha_j\, (1 - \alpha_i)$. We have the desired result for the base case: there is no profitable one-slot downward deviation for seller $i$. That there is no profitable deviation to position $k > i$ for seller $i$ implies

$$\prod_{1<j<i} \alpha_j\, (1 - \alpha_i) \Big(\theta_i + \frac{w_i}{\gamma_1} - b_{i+1}\Big) \ge \prod_{1<j<i} \alpha_j\, (1 - \alpha_i) \prod_{i<j\le k} \alpha_j \Big(\theta_i + \frac{w_i}{\gamma_1} - b_{k+1}\Big). \qquad \text{(C.31)}$$

On the other hand, the absence of a profitable one-slot downward deviation for seller $k$ is equivalent to $\theta_k + w_k/\gamma_1 - b_{k+1} \ge \alpha_{k+1}\,(\theta_k + w_k/\gamma_1 - b_{k+2})$, which implies

$$(1 - \alpha_{k+1}) \Big(\theta_i + \frac{w_i}{\gamma_1}\Big) \ge (1 - \alpha_{k+1}) \Big(\theta_k + \frac{w_k}{\gamma_1}\Big) \ge b_{k+1} - \alpha_{k+1}\, b_{k+2},$$

where the first inequality follows since $r_i(\theta_i) \ge r_k(\theta_k)$. As a result, $\theta_i + w_i/\gamma_1 - b_{k+1} \ge \alpha_{k+1}\,(\theta_i + w_i/\gamma_1 - b_{k+2})$. Therefore, combined with (C.31), we have

$$\prod_{1<j<i} \alpha_j\, (1 - \alpha_i) \Big(\theta_i + \frac{w_i}{\gamma_1} - b_{i+1}\Big) \ge \prod_{1<j<i} \alpha_j\, (1 - \alpha_i) \prod_{i<j\le k+1} \alpha_j \Big(\theta_i + \frac{w_i}{\gamma_1} - b_{k+2}\Big).$$

In summary, there are no profitable downward deviations.

Claim. There are no profitable upward deviations for any seller.

This again can be shown by induction. Consider the base case first, where we show that seller $i+1$ does not want to deviate to position $i$. The desired conclusion can be written as

$$\alpha_i \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_{i+2}\Big) \ge \theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_i,$$

which can be rearranged as

$$b_i \ge (1 - \alpha_i) \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1}\Big) + \alpha_i\, b_{i+2}.$$

By Eq. (C.30), we know that

$$b_i = (1 - \alpha_i)\, \frac{r_i(\theta_i)}{\gamma_1} + \alpha_i\, b_{i+1} = (1 - \alpha_i) \Big(\theta_i + \frac{w_i}{\gamma_1}\Big) + \alpha_i\, b_{i+1} \ge (1 - \alpha_i) \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1}\Big) + \alpha_i\, b_{i+2},$$

since $\theta_i + \frac{1}{\gamma_1} w_i \ge \theta_{i+1} + \frac{1}{\gamma_1} w_{i+1}$ and $b_{i+1} \ge b_{i+2}$. This finishes the proof for the base case. Let us assume that there is no profitable deviation to position $k+1$ for seller $i+1$ with $k < i$.
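We need to show the same result for position $k$, or equivalently

$$\prod_{k \le j \le i} \alpha_j \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_{i+2}\Big) \ge \theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_k.$$

By the induction hypothesis,

$$\prod_{k \le j \le i} \alpha_j \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_{i+2}\Big) \ge \alpha_k \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_{k+1}\Big) \ge \theta_{i+1} + \frac{w_{i+1}}{\gamma_1} - b_k.$$

The second inequality follows since by (C.30),

$$b_k = (1 - \alpha_k) \Big(\theta_k + \frac{w_k}{\gamma_1}\Big) + \alpha_k\, b_{k+1} \ge (1 - \alpha_k) \Big(\theta_{i+1} + \frac{w_{i+1}}{\gamma_1}\Big) + \alpha_k\, b_{k+1}.$$

We now have verified the above claim. Putting everything together, we conclude the proof.

C.24 Proof of Proposition C.5.3

Proof of Proposition C.5.3: We first show that Eq. (C.32) holds. For $2 \le i \le n+1$,

$$b_i = \frac{1}{\gamma_1\, q_{i-1,\pi^*(\boldsymbol\theta)}} \Big(V_{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - W^*(\boldsymbol\theta)\Big) + \frac{1}{\gamma_1}\, r_{i-1}(\theta_{i-1}) = \frac{1}{\gamma_1\, q_{i-1,\pi^*(\boldsymbol\theta)}} \Big(V_{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - \sum_{k \ne i-1} q_{k,\pi^*(\boldsymbol\theta)}\, r_k(\theta_k)\Big), \qquad \text{(C.32)}$$

where $b_i$ is defined in Eq. (C.30). Notice that for $2 \le i \le n$,

$$\frac{1}{\gamma_1\, q_{i-1,\pi^*}} \Big(V_{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - \sum_{k \ne i-1} q_{k,\pi^*}\, r_k(\theta_k)\Big)$$

$$= \prod_{i \le j \le n} \alpha_j\, \frac{w}{\gamma_1} + \frac{1}{\gamma_1 \prod_{j \le i-2} \alpha_j\, (1 - \alpha_{i-1})} \Bigg[ \sum_{j \le i-2} \prod_{k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j) + \prod_{j \le i-2} \alpha_j \sum_{j \ge i} \prod_{i \le k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j)$$

$$\qquad - \sum_{j \le i-2} \prod_{k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j) - \alpha_{i-1} \prod_{j \le i-2} \alpha_j \sum_{j \ge i} \prod_{i \le k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j) \Bigg]$$

$$= \prod_{i \le j \le n} \alpha_j\, \frac{w}{\gamma_1} + \frac{1}{\gamma_1 \prod_{j \le i-2} \alpha_j\, (1 - \alpha_{i-1})} \Bigg[ \prod_{j \le i-2} \alpha_j\, (1 - \alpha_{i-1}) \Bigg( \sum_{j \ge i} \prod_{i \le k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j) \Bigg) \Bigg]$$

$$= \prod_{i \le j \le n} \alpha_j\, \frac{w}{\gamma_1} + \frac{1}{\gamma_1} \sum_{j \ge i} \prod_{i \le k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j)$$

$$= \alpha_i \Bigg( \prod_{i+1 \le j \le n} \alpha_j\, \frac{w}{\gamma_1} + \sum_{j \ge i+1} \prod_{i+1 \le k \le j-1} \alpha_k\, (1 - \alpha_j)\, \frac{r_j(\theta_j)}{\gamma_1} \Bigg) + (1 - \alpha_i)\, \frac{r_i(\theta_i)}{\gamma_1}.$$

It can be verified that

$$b_{n+1} = \frac{1}{\gamma_1\, q_{n,\pi^*(\boldsymbol\theta)}} \Big(V_{-n}(\boldsymbol\theta_{-n}) - \sum_{k \ne n} q_{k,\pi^*(\boldsymbol\theta)}\, r_k(\theta_k)\Big) = \frac{w}{\gamma_1}.$$

Therefore,

$$\frac{1}{\gamma_1\, q_{i-1,\pi^*}} \Big(V_{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - \sum_{k \ne i-1} q_{k,\pi^*}\, r_k(\theta_k)\Big) = \prod_{i \le j \le n} \alpha_j\, \frac{w}{\gamma_1} + \frac{1}{\gamma_1} \sum_{j \ge i} \prod_{i \le k \le j-1} \alpha_k\, (1 - \alpha_j)\, r_j(\theta_j)$$

is a solution to Eq. (C.30). Notice that Eq. (C.30) admits one and only one solution. As a result,

$$b_i = \frac{1}{\gamma_1\, q_{i-1,\pi^*(\boldsymbol\theta)}} \Big(V_{-(i-1)}(\boldsymbol\theta_{-(i-1)}) - \sum_{k \ne i-1} q_{k,\pi^*(\boldsymbol\theta)}\, r_k(\theta_k)\Big).$$

This substantiates (C.32). We next prove the non-negativity of entry fees. This result holds trivially for seller $n$.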
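The statement that the unrolled expression $\prod_{i \le j \le n} \alpha_j\, w/\gamma_1 + \frac{1}{\gamma_1} \sum_{j \ge i} \prod_{i \le k \le j-1} \alpha_k (1 - \alpha_j)\, r_j(\theta_j)$ solves the recursion (C.30) can be sanity-checked numerically. The sketch below compares the two on an arbitrary instance. The function names and all numbers are hypothetical, and the check concerns only the algebraic identity, not the ranking or surplus quantities that appear in (C.32).

```python
import math


def bids_by_recursion(alpha, r, w, gamma1):
    """Backward recursion b_i = (1 - alpha_i) r_i / gamma1 + alpha_i b_{i+1}, b_{n+1} = w / gamma1.

    Returns [b_1, ..., b_{n+1}]; alpha[i-1] and r[i-1] hold alpha_i and r_i(theta_i).
    """
    n = len(alpha)
    bids = [0.0] * (n + 1)
    bids[n] = w / gamma1
    for i in range(n - 1, -1, -1):
        bids[i] = (1.0 - alpha[i]) * r[i] / gamma1 + alpha[i] * bids[i + 1]
    return bids


def bid_closed_form(i, alpha, r, w, gamma1):
    """Unrolled expression for b_i, with i a 1-based position in 1..n+1.

    Empty products equal 1 and empty sums equal 0, so i = n + 1 returns w / gamma1.
    """
    n = len(alpha)
    # prod_{j=i}^{n} alpha_j * w / gamma1
    value = math.prod(alpha[i - 1:n]) * w / gamma1
    for j in range(i, n + 1):
        # prod_{k=i}^{j-1} alpha_k * (1 - alpha_j) * r_j / gamma1
        reach_j = math.prod(alpha[i - 1:j - 1])
        value += reach_j * (1.0 - alpha[j - 1]) * r[j - 1] / gamma1
    return value


# Hypothetical instance.
alpha = [0.4, 0.8, 0.55, 0.3]
r = [7.0, 5.5, 5.0, 3.2]
w = 1.5
gamma1 = 1.25

recursive = bids_by_recursion(alpha, r, w, gamma1)
unrolled = [bid_closed_form(i, alpha, r, w, gamma1) for i in range(1, len(alpha) + 2)]
max_gap = max(abs(a - b) for a, b in zip(recursive, unrolled))
print("largest gap between recursion and closed form:", max_gap)
```

Because (C.30) pins down each $b_i$ from $b_{i+1}$ and the terminal value $b_{n+1}$, agreement at every position is what uniqueness of the solution predicts.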
For seller i6= n, we notice that h i = 1 g 1 W i ( i )V i ( i ) 235 = 1 g 1 W i ( i ) 1 g 1 0 @ max s:Nnfig7!Nnfig å k2Nnfigq s; j r j (q j )+ Õ 0 ji1 a j Õ i+1 jn a j (1a i )w 1 A 1 g 1 W i ( i ) 1 g 1 0 @ max s:Nnfig7!Nnfig å k2Nnfigq s; j r j (q j )+ Õ 0 ji1 a j Õ i+1 jn a j (1a i )w i 1 A : Similar to the proof of Proposition C.5.1, we notice thatW i ( i ) is the optimal social net surplus(0; i ). The second term in the last line is the surplus value produced by an inferior ranking applied on (0; i ). Thus, the above term must be non-negative. Besides, when g 2 =g 3 = 0, the inequality binds and the last line equals to zero. Notice that if we can show the total payment of a seller coincide with that from VCG, we can be sure that the sellers will participate by the individual rationality of the VCG mechanism. One can see that the payment of seller n is zero in our modified GSP, the same as in VCG. For other sellers, we notice that the total payment of seller i is q i;p () + q i 0 q i;p () q i;p () + q i 0 b i+1 w i g 1 +h i = q i;p () b i+1 w i g 1 +h i =q i;p () 1 g 1 q i;p () V i ( i )W () + 1 g 1 r i (q i ) w i g 1 + 1 g 1 W i (q i )V i (q i ) = 1 g 1 (W i (q i )W ())+ q i;p r i (q i ) g 1 w i g 1 = 1 g 1 (W i (q i )W ())+ q i;p () q i = t (;i): With this, we conclude the proof. C.25 Proof of Lemma C.6.1 Proof of Lemma C.6.1: Notice that E " n å i=1 t(;i) # =E " å i2M t(;i) # = Z M å i2M t(;i) Õ i2M f i (q i )d M = Z M å i2M q s();i q i Z q i 0 q s(h; i );i dh Õ i2M f i (q i )d M = Z M å i2M q s();i q iÕ i2M f i (q i )d M å i2M Z Mnfig Z ¯ q i 0 Z q i 0 q s(h; i );i dh f i (q i )dq i ! Õ i2Mnfig f i (q i )d Mnfig = Z M å i2M q s();i q iÕ i2M f i (q i )d M å i2M Z Mnfig Z ¯ q i 0 Z ¯ q i h f i (q i )dq i q s(h; i );i dh ! Õ i2Mnfig f i (q i )d Mnfig 236 = Z M å i2M q s();i q iÕ i2M f i (q i )d M å i2M Z Mnfig Z ¯ q i 0 ¯ F i (h) f i (h) q s(h; i );i f i (h)dh ! 
Õ i2Mnfig f i (q i )d Mnfig = Z M å i2M q s();i q i ¯ F i (q i ) f i (q i ) Õ i2M f i (q i )d M =E " n å i=1 q s();i q i ¯ F i (q i ) f i (q i ) 1(i= 2M) # : The second line follows by Proposition 4.7.5. In the application of Proposition 4.7.5, we observe that (IR) is satisfied as long as C i ( i ) 0, so to maximize the revenue, we take C i ( i )= 0. The fourth line follows from exchanging the order of integral. Note that with Proposition 4.7.5, (IC) is automatically satisfied. Also, t(;i) 08i2N nM is implied by Proposition 4.7.5 when C i ( i )= 0. With this, we have the proof. C.26 Proof of Proposition C.6.3 Proof of Proposition C.6.3: We will show the sketch of the proof. We first consider the first part. We notice that q s ROR k ();i is increasing in q i with i fixed, which we argue as follows. Note that we have assumed thatq i ¯ F i (q i ) f i (q i ) in increasing inq i . By the definition of ROR, ifq i q 0 i and the ranking of seller i is changed withq i , it must be that seller i is selected as one of the top k sellers. In this case the sellers ranked above seller i withq i is a subset of sellers ranked above seller i withq 0 i . If the ranking of seller i is the same forq andq 0 , then the sellers above him is the same. The monotonicity follows. For the zero-payment property, let us assume that forq i q 0 i = 0 and for both valuations, seller i is not picked for the top k slots. By the definition of POR, the set of sellers ranked above seller i is then not changed. Therefore, q s ROR k (h; i );i is a constant in the interval[0;q i ] and t ROR k (;i)= q s ROR k ();i q i R q i 0 q s ROR k (h; i );i dh implies zero-payment probability. This finishes the proof of the first part. 
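The payment rule $t(\boldsymbol\theta; i) = q\,\theta_i - \int_0^{\theta_i} q\, d\eta$ used above, the zero-payment property, and the exchange of the order of integration in the proof of Lemma C.6.1 can be illustrated on a one-seller toy case. The sketch below is an assumption-laden illustration only: a single seller, a valuation uniform on $[0, 1]$, a hypothetical step-shaped purchase probability `q`, and midpoint-rule integration are all choices made here, not taken from the text.

```python
def payment_schedule(q, upper, cells=2000):
    """Values of t(theta) = q(theta) * theta - integral_0^theta q(eta) d eta on cell midpoints.

    Returns a list of (theta, payment) pairs; the inner integral is accumulated
    cell by cell with the midpoint rule.
    """
    h = upper / cells
    running = 0.0  # integral of q over [0, left edge of the current cell]
    schedule = []
    for m in range(cells):
        theta = (m + 0.5) * h
        inner = running + q(theta) * h / 2.0  # integral of q over [0, theta]
        schedule.append((theta, q(theta) * theta - inner))
        running += q(theta) * h
    return schedule


def expected_payment(q, pdf, upper, cells=2000):
    """Midpoint approximation of E[t(theta)]."""
    h = upper / cells
    return sum(t * pdf(theta) * h for theta, t in payment_schedule(q, upper, cells))


def expected_virtual_surplus(q, pdf, cdf, upper, cells=2000):
    """Midpoint approximation of E[q(theta) * (theta - (1 - F(theta)) / f(theta))]."""
    h = upper / cells
    total = 0.0
    for m in range(cells):
        theta = (m + 0.5) * h
        virtual_value = theta - (1.0 - cdf(theta)) / pdf(theta)
        total += q(theta) * virtual_value * pdf(theta) * h
    return total


# Hypothetical primitives: uniform valuation on [0, 1] and a nondecreasing
# purchase probability that is flat below 0.5 (seller outside the top slots)
# and higher above it.
q = lambda theta: 0.1 if theta < 0.5 else 0.6
pdf = lambda theta: 1.0
cdf = lambda theta: theta

lhs = expected_payment(q, pdf, 1.0)
rhs = expected_virtual_surplus(q, pdf, cdf, 1.0)
low_type_payments = [t for theta, t in payment_schedule(q, 1.0) if theta < 0.5]
print(lhs, rhs, max(abs(t) for t in low_type_payments))
```

In this toy case the purchase probability is constant on $[0, \theta]$ for every $\theta < 0.5$, so the payment vanishes there, which is the mechanism behind the zero-payment property; the two expectations agree, as the integration-by-parts step suggests.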
For the second part, we notice that

\[
\mathbb{E}\left[ \sum_{i=1}^{n} \Big( \gamma_4\, q_{\sigma^{ROR}(\boldsymbol{\theta}), i}\, \hat{p}_i + \gamma_5\, t(\boldsymbol{\theta}; i) \Big) \right]
= \mathbb{E}\left[ \sum_{i=1}^{n} q_{\sigma^{ROR}(\boldsymbol{\theta}), i} \left( \gamma_4\, \hat{p}_i + \gamma_5 \left( \theta_i - \frac{\bar{F}_i(\theta_i)}{f_i(\theta_i)} \right) \mathbb{1}(i \in M) \right) \right]
\]

and

\[
\mathbb{E}\left[ \sum_{i=1}^{n} \Big( \gamma_4\, q_{\hat{\sigma}(\boldsymbol{\theta}), i}\, \hat{p}_i + \gamma_5\, t(\boldsymbol{\theta}; i) \Big) \right]
= \mathbb{E}\left[ \sum_{i=1}^{n} q_{\hat{\sigma}(\boldsymbol{\theta}), i} \left( \gamma_4\, \hat{p}_i + \gamma_5 \left( \theta_i - \frac{\bar{F}_i(\theta_i)}{f_i(\theta_i)} \right) \mathbb{1}(i \in M) \right) \right].
\]

The result then follows by exactly the same argument as in the proof of Theorem 4.4.4.
Abstract
Tremendous opportunities and many challenges arise in revenue management (RM) as retailing evolves from brick-and-mortar stores to online marketplaces. Amid this paradigm shift, choice models, which use statistical methods to capture consumer substitution and demand, play an increasingly important role in modern RM by providing it with the “mathematical language” to describe consumer behavior in ever greater detail. My dissertation focuses on advancing the theory of choice modeling and its modern applications in RM.

The current paradigm shift in retailing has resulted in explosive growth in data availability and has fundamentally changed our ability to experiment with different models. These advances make it possible to fit a very complex choice model for an application. Therefore, it is critical to understand how to operationalize these complex models for decision making. In Chapter 2, we consider unconstrained and constrained multi-product pricing problems when customers choose according to an arbitrary generalized extreme value (GEV) model and the products have the same price sensitivity parameter. The GEV family is a large class that subsumes infinitely many choice models, including well-known ones such as the multinomial logit and nested logit models. In the unconstrained problem, there is a unit cost associated with the sale of each product. The goal is to choose the prices for the products to maximize the expected profit obtained from each customer. We show that the optimal prices of the different products have a constant markup over their unit costs. We provide an explicit formula for the optimal markup in terms of the Lambert-W function. In the constrained problem, motivated by applications with inventory considerations, the expected sales of the products are constrained to lie in a convex set. The goal is to choose the prices for the products to maximize the expected revenue obtained from each customer, while making sure that the constraints on the expected sales are satisfied. If we formulate the constrained problem by using the prices of the products as the decision variables, then we end up with a non-convex program. We give an equivalent market-share-based formulation, where the purchase probabilities of the products are the decision variables. We show that the market-share-based formulation is a convex program, that the gradient of its objective function can be computed efficiently, and that we can recover the optimal prices for the products from the optimal purchase probabilities of the market-share-based formulation. Our results for both the unconstrained and constrained problems hold for an arbitrary GEV model.

In Chapter 3, along a similar line, we consider uncapacitated and capacitated assortment problems under the paired combinatorial logit model, a fairly complex choice model that has nevertheless proven successful in transportation applications. Here our goal is to find a set of products to maximize the expected revenue obtained from a customer. In the uncapacitated setting, we can offer any set of products, whereas in the capacitated setting, there is an upper bound on the number of products that we can offer. We establish that even the uncapacitated assortment problem is strongly NP-hard. To develop an approximation framework for our assortment problems, we transform the assortment problem into an equivalent problem of finding the fixed point of a function, but computing the value of this function at any point requires solving a nonlinear integer program. Using a suitable linear programming relaxation of the nonlinear integer program and randomized rounding, we obtain a 0.6-approximation algorithm for the uncapacitated assortment problem. Using randomized rounding on a semidefinite programming relaxation, we obtain an improved 0.79-approximation algorithm, but the semidefinite programming relaxation can become difficult to solve in practice for large problem instances. Finally, using iterative rounding, we obtain a 0.25-approximation algorithm for the capacitated assortment problem. Our computational experiments on randomly generated problem instances demonstrate that our approximation algorithms, on average, yield expected revenues that are within 0.9% of an efficiently computable upper bound on the optimal expected revenue.

Furthermore, this paradigm shift brings many novel revenue management applications for choice models. In Chapter 4, we focus on such applications in online markets. Online e-commerce platforms such as Amazon and Taobao connect thousands of sellers and consumers every day. We present a choice model that considers consumers' search costs on such platforms and the externalities sellers impose on each other, and we study how platforms should rank the products displayed to consumers and utilize the top and most salient slot. This model allows us to study a multi-objective optimization problem, whose objective includes consumer and seller surplus as well as the sales revenue, and to derive the optimal ranking decision. In addition, we propose a surplus-ordered ranking (SOR) mechanism for selling some of the top slots. This mechanism is motivated in part by Amazon's sponsored search program. We show that the Vickrey–Clarke–Groves (VCG) mechanism would not be applicable to our setting and propose a new mechanism, which is near-optimal and performs significantly better than mechanisms that do not incentivize sellers to reveal their private information. Moreover, we generalize our model to settings where platforms can provide partial information about the products and facilitate consumer search, and we show the robustness of our findings.
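The constant-markup result described for Chapter 2 can be illustrated in the simplest member of the GEV family. The sketch below is not the dissertation's general formula: it assumes a multinomial logit model with utilities `alpha[j] - beta * price[j]` and a no-purchase option of utility zero, a parameterization chosen here only for illustration. Under that assumption the first-order condition gives the markup (1 + W(A / e)) / beta, where A is the sum of exp(alpha[j] - beta * cost[j]) and W is the Lambert-W function. All function and variable names are hypothetical.

```python
import math


def lambert_w(x, tol=1e-12, max_iter=100):
    """Principal branch of Lambert-W for x >= 0, via Newton's method on w*exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starts at or above the root, so Newton decreases monotonically
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def mnl_optimal_markup(alpha, cost, beta):
    """Common markup over unit cost under the assumed MNL parameterization."""
    a_total = sum(math.exp(a - beta * c) for a, c in zip(alpha, cost))
    return (1.0 + lambert_w(a_total / math.e)) / beta


def mnl_profit(alpha, cost, beta, prices):
    """Expected profit per customer; the no-purchase option has weight 1."""
    weights = [math.exp(a - beta * p) for a, p in zip(alpha, prices)]
    margin = sum((p - c) * w for p, c, w in zip(prices, cost, weights))
    return margin / (1.0 + sum(weights))


if __name__ == "__main__":
    alpha, cost, beta = [1.0, 0.5, 2.0], [1.0, 0.8, 1.5], 0.7
    m = mnl_optimal_markup(alpha, cost, beta)
    prices = [c + m for c in cost]
    print("markup:", m)
    print("profit:", mnl_profit(alpha, cost, beta, prices))
```

Every product receives the same markup `m` regardless of its own `alpha[j]` or `cost[j]`, which is the structural property the abstract states for the whole GEV family; the closed form above is specific to the assumed logit case.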
Asset Metadata
Creator: Zhang, Heng (author)
Core Title: Essays on revenue management with choice modeling
School: Marshall School of Business
Degree: Doctor of Philosophy
Degree Program: Business Administration
Publication Date: 06/17/2019
Defense Date: 03/22/2019
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: assortment optimization, choice modeling, combinatorial optimization, convex optimization, e-commerce platform ranking, mechanism design, multi-product pricing, OAI-PMH Harvest, optimal stopping, revenue management
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Zhu, Leon (committee chair), Nazerzadeh, Hamid (committee member), Rusmevichientong, Paat (committee member), Yang, Sha (committee member)
Creator Email: hengz@usc.edu, hengzhang24@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-174974
Unique identifier: UC11662614
Identifier: etd-ZhangHeng-7493.pdf (filename), usctheses-c89-174974 (legacy record id)
Legacy Identifier: etd-ZhangHeng-7493.pdf
Dmrecord: 174974
Document Type: Dissertation
Rights: Zhang, Heng
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA