Do Humans Play Dice: Choice Making with Randomization

by

Ruixin Qiang

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)

May 2019

Copyright 2019 Ruixin Qiang

Acknowledgements

First and foremost, I would like to thank my Ph.D. advisors, Shaddin Dughmi and David Kempe. They introduced a lot of interesting problems in algorithmic game theory to me and guided me in such an exciting research field. During my Ph.D., they gave me solid training for being a researcher, from how to think to how to write to how to present. I am grateful to them for all they have done to make me a better researcher.

I would like to thank the rest of my thesis committee, Odilon Câmara and Ming-Deh Huang, as well as my qualification committee member, Shang-Hua Teng, for providing helpful suggestions about my research. I would like to thank my undergraduate research advisors, Pinyan Lu and Ning Chen, who brought me into the research of theoretical computer science. I would like to thank Michael Shindler; while serving as a teaching assistant for CSCI 170, I learned a lot about teaching from him.

I would like to thank everyone in the theoroom: Brendan Avent, Joseph Bebel, Ho Yee Cheung, Yu Cheng, Ehsan Emamjomeh-Zadeh, Li Han, Lian Liu, Alana Shine, and Haifeng Xu. It was a joy to discuss research and homework problems with them.

I thank my friends at USC: Xinran He, Zeng Huang, and Chaoran Lu. They provided a lot of help to me and shared a lot of joyful time with me. I thank my ICPC teammates: Jun Chen, Zhiyi Xu, Gaoyuan Chen, and Xinpei Yu. They spent lots of time with me discussing and solving algorithm questions. I thank two of my cats, Primal and Dual, who helped me type a lot of characters in this thesis whenever I was away from my keyboard.

Finally, I would like to thank my family. I am grateful to my parents, Jianhui Qiang and Changfeng Li, for their support and encouragement, and for creating an environment that allowed me to dive into computer science. I thank my fiancée, Yuan Jin, for being with me and loving me. My journey would have been impossible without their love.

Table of Contents

Acknowledgements
Abstract
Chapter 1: Introduction
  1.1 Motivation
    1.1.1 Constrained Information Structure Design for Bilateral Trade
    1.1.2 Dice-based Winner-Selection Rules
    1.1.3 Bayesian Persuasion
  1.2 Results and Thesis Organization
  1.3 Related Literature
    1.3.1 On Bayesian Persuasion
    1.3.2 On Winner-Selection Environments
    1.3.3 On Impossibility Results of Bayesian Persuasion
Chapter 2: Background and Notation
  2.1 Submodular Functions
  2.2 Signaling Schemes and Bayesian Persuasion
    2.2.1 Matroids
Chapter 3: Signaling in Bilateral Trade
  3.1 Preliminaries
    3.1.1 Signaling Schemes
    3.1.2 Welfare and Sanitized Welfare
    3.1.3 Sanitized Welfare Maximization Given Price Points
  3.2 Summary of Results
  3.3 Welfare with Limited Communication
  3.4 Sanitized Welfare Maximization with a Greedy Algorithm
  3.5 Submodularity of Sanitized Welfare
  3.6 Revenue in Bilateral Trade
Chapter 4: Dice-based Winner Selection Rules
  4.1 Preliminaries
    4.1.1 Winner Selection
    4.1.2 Interim Rules
    4.1.3 Border's Theorem and Implications for Single-Winner Environments
    4.1.4 Border's Theorem for Matroid Environments
    4.1.5 Winner-Selecting Dice
  4.2 Summary of Results
  4.3 Existence of Dice-based Implementations for Matroids
    4.3.1 Continuous Winner-Selecting Dice
    4.3.2 Winner-Selecting Dice with Polynomially Many Faces
  4.4 Efficient Construction of Dice for Single-Winner Environments
    4.4.1 Description of the Algorithm
    4.4.2 Proof of Lemma 4.18 (Correctness)
    4.4.3 Proof of Lemma 4.19 (Number of Recursive Calls)
    4.4.4 Proof of Lemma 4.20 (Runtime per Call)
  4.5 Symmetric Dice for I.I.D. Candidates in the Single-Winner Setting
  4.6 From Winner-Selecting Dice to Interim Rules for Single-Winner Environments
Chapter 5: When Dice Do Not Work in Bayesian Persuasion
  5.1 Summary of Results
  5.2 Hyperedge Guessing Game
  5.3 No Winner-Selecting Dice for Persuasion
Chapter 6: Conclusion and Future Work
Reference List
Appendix A: Proofs in Chapter 3
  A.1 Proof of Theorem 3.10
  A.2 Proof of Lemma 3.12
Appendix B: A Non-Matroid Example Implementable by Dice

Abstract

Making choices with a die sounds unhelpful, and may be adopted only by students who do not know the correct answer during exams. However, there are cases for which rolling a die is the best way to solve the problem. Allocating limited resources fairly is a common scenario where randomness is adopted; for example, USCIS uses the H-1B lottery to select who will get a visa. But even without considering fairness, making a choice randomly may be the only way to benefit oneself. In a repeated rock-paper-scissors game, a deterministic player can never win, since his opponent can observe and play the winning action against his unchanged strategy. In this thesis, we examine several scenarios where randomization does and does not work.

We first study information structure design, also called "persuasion" or "signaling", in the presence of a constraint on the amount of communication. We focus on the fundamental setting of bilateral trade, which in its simplest form involves a seller with a single item to price, a buyer whose value for the item is drawn from a common prior distribution over $n$ different possible values, and a take-it-or-leave-it offer protocol. A mediator with access to the buyer's type may partially reveal such information to the seller in order to further some objective such as the social welfare or the seller's revenue.
A simple example can show that revealing the information deterministically is not optimal for the social welfare. We study how randomization can help in the communication-constrained setting. In the setting of maximizing welfare under bilateral trade, as our main result, we exhibit an efficient algorithm for computing a $\frac{M-1}{M}(1-1/e)$-approximation to the welfare-maximizing scheme with at most $M$ signals. Without randomization, it is unclear how to design an (approximately) optimal scheme. For the revenue objective, in contrast, we show that a deterministic scheme suffices.

We next study the existence of dice-based winner-selection rules for given interim rules. In a winner-selection environment, multiple winners are selected from a candidate set, subject to certain feasibility constraints. The interim rule summarizes the probability of each candidate being selected. We show that when the feasibility constraint is a matroid constraint, any feasible interim rule admits a dice-based implementation. A dice-based implementation associates a die with each candidate. To choose the winners, the rule rolls all dice independently and picks the subset that maximizes the sum of rolled values, subject to the feasibility constraint. Dice-based rules generalize both Myerson's auction [48] and order sampling [2], both of which assign (random) values to candidates and choose the set of candidates maximizing the sum of values.

While dice can implement all matroid winner-selection rules, we also show two cases where they fail. Both of these cases fall in the Bayesian Persuasion model of Kamenica and Gentzkow. For one setting, we show that our positive algorithmic results for bilateral trade do not extend to communication-constrained signaling in the Bayesian Persuasion model. Specifically, we show that it is NP-hard to approximate the optimal sender's utility to within any constant factor in the presence of communication constraints. For the other, we treat Bayesian persuasion as a winner-selection environment and show an instance that does not admit a dice-based implementation.

Chapter 1: Introduction

The question of whether God plays dice has been discussed by physicists for a long time. In contrast, we humans do play dice a lot. At least for computer scientists, randomization has been shown to be a powerful tool for solving problems. In this thesis, we study how randomization can be applied to choice making under certain constraints, or simply, how to design dice and use them to make decisions.

There are environments in which making random choices is particularly preferable because it can often lead to better outcomes, especially when there are strategic interactions between different parties. Making a deterministic decision can put oneself at a disadvantage against others. In a Nash equilibrium that characterizes the behavior of rational players in a game, players often choose actions strategically and randomly. Randomness continues to be a useful tool when uncertainty and information asymmetry are introduced. Auctions and markets on the Internet feature sellers with privileged information regarding their products, and buyers with private information regarding their willingness to pay. Instead of revealing the information directly to other parties, a player's behavior can be governed by an information structure that randomly maps what he knows to what he will say or do. On the other hand, there are certain objectives that can only be achieved by rolling a die.
The most common such objectives involve allocating limited resources fairly. In various online game award item allocations, every team member rolls a die, and the winner is the one who gets the highest face. However, what if a fair allocation is not desired? Can the game players still use dice to choose the winner when someone who contributes more should win with a specified higher probability, or when there is more than one item to be allocated under complicated restrictions on who can get an item? Here is such an example: you need to allocate 6 gold coins to a team of 3 knights, 4 wizards, and 6 archers. Each of your team members can get at most one coin. Your allocation should not be too biased; namely, you can give at most 2 coins to knights, 3 coins to wizards, and 4 coins to archers. For each member, his contribution level is randomly drawn from a known distribution. Your team has an agreement on what the probability of getting a coin should be for each member and each contribution level. You need to find a way to allocate the coins that satisfies all of these requirements.

In this thesis, we will discuss these two sides of randomized choice making, in particular, information structure design for bilateral trade and winner-selection rule design with dice. Interestingly, both problems are closely related to Bayesian persuasion, so we also examine the possibility of generalizing from both to Bayesian persuasion.

1.1 Motivation

1.1.1 Constrained Information Structure Design for Bilateral Trade

Bilateral trade between two parties [42] is one of the most fundamental economic interactions governed by the presence or absence of information. In (a simplified form of) bilateral trade, one side can choose whether to participate in the trade, and by doing so would generate a social surplus which is private information to him. The other side can propose to take a fixed amount of the social surplus. Two particularly natural instantiations of bilateral trade are the following:

1. Trade of an item between a seller (who has no value for the item) and a buyer via a posted price. The seller's posted price is the amount of surplus she proposes to take, while the buyer's valuation for the item, drawn from a commonly known distribution, is the amount of social surplus the trade would generate. The buyer chooses whether to accept the seller's posted price. We call the resulting game the pricing game.

2. Trade between an employer and an agent (in the literature, this falls into the class of principal-agent models; however, in this thesis, we use the word "principal" for a different role, so we use this non-standard nomenclature to avoid misunderstandings): an employer would like to hire an agent to complete a project, and has (known) utility $u$ for its completion. The agent has a private cost $c$ (drawn from a known distribution) for completing the project, and the social surplus generated is the difference $u - c$. Without knowing the cost of the agent, the employer posts a proposed payment $p$, which is equivalent to posting the share $u - p$ of the social welfare which she proposes to keep. We call this game the employment game.

The buyer/agent is assumed to be rational with quasilinear utility, while the seller/employer aims to maximize her own utility. The buyer/agent accepts an offer if he would derive non-negative utility from it, and rejects it otherwise; this corresponds to the valuation exceeding the price in the pricing game, and the payment exceeding the cost in the employment game. For concreteness, we will state all of our results in the language of the pricing game; however, all results carry over verbatim to the employment game, and we will occasionally remark on the interpretation of results in this context.
The seller's chosen offer price to the buyer depends on what she knows about the buyer's value. If the seller has no information other than the distribution $\lambda$ of values in the population of potential buyers, she chooses the price $P^* = P^*(\lambda)$ maximizing her revenue $\mathrm{Rev}(P, \lambda) = P \cdot \Pr_V[V \geq P]$. This leads to a revenue of $\mathrm{Rev}(\lambda) = \mathrm{Rev}(P^*(\lambda), \lambda) = P^* \cdot \Pr_V[V \geq P^*]$ for the seller, and a social welfare of $\mathrm{Welfare}(\lambda) = \Pr[V \geq P^*] \cdot \mathbb{E}_V[V \mid V \geq P^*]$ for both players combined. At the other extreme is the case when the seller is fully informed about $V$. She can now set a price $P = V$; trade will always occur, leading to a maximum social welfare of $\mathbb{E}_V[V]$, which is fully extracted as revenue by the seller.

Notice the difference caused by different amounts of information being communicated: with no information, the social welfare can be arbitrarily smaller than with full information. In order to fully inform the seller, very fine-grained information has to be communicated. In reality, for practical and logistical reasons, the information received by the seller about the buyer is typically limited. For example, in the employment game, the information may be provided by a university or certification agency, which may initially only be able to communicate a coarse-grained rating of the agent via a GPA or the performance on a certification test. When the "agent" provides a product (e.g., a piece of clothing or medication), the location or label it is sold under (boutique/brand-name vs. discount/generic) sends a coarse signal to the potential employer/buyer about the distribution of qualities she is to expect. Indeed, the study of communication constraints and their impact on the outcomes of games dates back at least to the work of Blumrosen and Feldman [9] and Blumrosen et al. [10] on communication constraints in auctions.

With such restrictions, can randomization help us to achieve better social welfare? The question is governed by a more general one: how to optimally inform the seller with limited communication, which is the first subject of this thesis. (An alternate interpretation of this goal is in terms of market segmentation: how would a market designer optimally partition the market into a limited number of segments?) This question actually comprises two separate thrusts:

(1) What is the inherent price of limited communication? In other words, how much social welfare is lost because the principal can only communicate limited information to the seller? A particularly stark version of this question is the following: in the presence of a self-interested seller, how much social welfare can a principal salvage by sending a single bit of information, which is essentially equivalent to merely being able to exclude some buyers from the market?

(2) What are the algorithmic consequences of limited communication? How well can the principal optimize the welfare with limited communication, if the computation of the signaling scheme has to be efficient as well?
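To make the two benchmarks above concrete, the following small Python sketch (ours, not part of the thesis) computes the uninformed seller's monopoly price $P^*(\lambda)$ together with the resulting revenue and served welfare, and compares them against the full-information welfare $\mathbb{E}_V[V]$; the function name and the toy distribution are purely illustrative.

```python
# Illustrative sketch (not from the thesis): the uninformed seller's problem.
def monopoly_price(values, probs):
    """Return (price, revenue, served_welfare) of the revenue-maximizing posted price.

    values: buyer valuations v_1 < ... < v_n; probs: their prior probabilities.
    """
    best = (None, 0.0, 0.0)
    for k, price in enumerate(values):
        sale_prob = sum(probs[k:])                     # buyer accepts iff V >= price
        revenue = price * sale_prob
        welfare = sum(v * p for v, p in zip(values[k:], probs[k:]))
        if revenue > best[1]:
            best = (price, revenue, welfare)
    return best

# Toy example: the uninformed seller's optimal price excludes the lowest-value buyers.
values, probs = [1, 2, 4, 8], [0.4, 0.3, 0.2, 0.1]
price, revenue, welfare = monopoly_price(values, probs)
full_info_welfare = sum(v * p for v, p in zip(values, probs))
print(price, revenue, welfare, full_info_welfare)
```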
In order to formalize the notion of limited communication and randomization in information structure design, we first define information structures. In the setting of bilateral trade, an information structure for the seller is a (possibly randomized) map $\varphi$ from the realized value $V \in \mathbb{R}$ of the buyer to a signal $\sigma \in \Sigma$ presented to the seller. This in effect partitions the probability histogram of $\lambda$, or equivalently the population of buyers, into different segments, each corresponding to a signal. Specifically, we write $\varphi(v, \sigma) = \Pr[\varphi(v) = \sigma \mid V = v]$, where the randomness is over the internal coins of the signaling scheme $\varphi$. Receiving signal $\sigma$ induces, via Bayes' rule, a posterior distribution $\lambda_\sigma$ for the seller, characterized by

$\Pr_{\lambda_\sigma}[V = v] = \frac{\Pr_\lambda[V = v] \cdot \varphi(v, \sigma)}{\Pr[\varphi(V) = \sigma]}$;

here, the randomness in the denominator is over both $\lambda$ and the internal coins of $\varphi$. Upon receiving $\sigma$, the seller's optimal price is $P^*(\lambda_\sigma)$, maximizing $\mathrm{Rev}(P, \lambda_\sigma)$. This price induces a revenue of $\mathrm{Rev}(\lambda_\sigma)$ for the seller and a social welfare of $\mathrm{Welfare}(\lambda_\sigma)$. The expected revenue and social welfare over all draws of the buyer's value and randomness in the scheme $\varphi$ are then given by $\mathrm{Rev}(\varphi, \lambda) = \sum_{\sigma \in \Sigma} \Pr[\varphi(V) = \sigma] \cdot \mathrm{Rev}(\lambda_\sigma)$ and $\mathrm{Welfare}(\varphi, \lambda) = \sum_{\sigma \in \Sigma} \Pr[\varphi(V) = \sigma] \cdot \mathrm{Welfare}(\lambda_\sigma)$, respectively.

Using the terminology of dice, an information structure associates with each possible value of the buyer a biased die whose sides are labeled with signals. When a buyer arrives, the associated die is rolled and the result is presented to the seller. When all the dice used are single-sided, we call the information structure deterministic, and otherwise randomized.

We now revisit our motivating examples. Suppose that the distribution $\lambda$ has support $\{v_1, v_2, \ldots, v_n\}$. Then, $V$ can be precisely communicated to the seller by choosing a signal set $\Sigma = \{1, \ldots, n\}$ and setting $\varphi(v_i, i) = 1$ (and $\varphi(v_i, j) = 0$ for $i \neq j$), i.e., assigning a die with a single side $i$ to $v_i$. By way of contrast, the signaling scheme communicating no information to the seller is implemented with $\Sigma = \{1\}$ and $\varphi(v_i, 1) = 1$ for all $i$, i.e., assigning an identical single-sided die to all $v_i$. Notice that the latter uses much lower "communication complexity," as measured by $|\Sigma|$. Also, notice that both these extreme information structures are deterministic and inherently algorithmically efficient: given an explicit representation of $\lambda$ and a value $V$, it is trivial to compute $\varphi(V)$. For the design of signaling schemes, we use the constraint that $|\Sigma| = M$, for a given bound $M$, as the "low communication complexity" constraint. In Section 3.1.1, we will show an example for which the optimal signaling scheme under a communication constraint can only be a randomized one; namely, the assigned dice need to be multi-faced.

We would like to understand the impact of the communication complexity on the social welfare that can be (in principle) achieved, and to analyze the algorithmic question of computing optimal signaling schemes $\varphi$ with limited communication complexity. The "gold standard" for social welfare is $\mathbb{E}_V[V]$, which we will call the full-information welfare. Formally, we are interested in the following questions. Given an explicit representation of the distribution $\lambda$ and a bound $M$ on the number of available signals: (1) What fraction of the full-information welfare can be obtained by a signaling scheme with $M$ signals? (2) Are there (approximately) optimal and computationally efficient signaling schemes for maximizing social welfare?
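As a companion sketch (again ours, with illustrative names), the snippet below evaluates an arbitrary signaling scheme given by its conditional probabilities $\varphi(v_i, \sigma)$: for each signal it forms the unnormalized posterior, lets the seller best-respond with a posted price, and aggregates $\mathrm{Rev}(\varphi, \lambda)$ and $\mathrm{Welfare}(\varphi, \lambda)$. The two schemes evaluated at the end are the full-revelation and no-information extremes just described.

```python
def evaluate_scheme(values, probs, phi):
    """values: v_1 < ... < v_n; probs: prior; phi[i][s] = Pr[signal s | V = v_i]."""
    total_revenue, total_welfare = 0.0, 0.0
    for s in range(len(phi[0])):
        # Unnormalized posterior mass on each value, given that signal s is sent.
        mass = [probs[i] * phi[i][s] for i in range(len(values))]
        if sum(mass) == 0:
            continue
        best_rev, best_welf = 0.0, 0.0
        for k, price in enumerate(values):           # seller's best response
            rev = price * sum(mass[k:])
            if rev > best_rev:
                best_rev = rev
                best_welf = sum(values[i] * mass[i] for i in range(k, len(values)))
        total_revenue += best_rev                     # already weighted by Pr[signal s]
        total_welfare += best_welf
    return total_revenue, total_welfare

values, probs = [1, 2, 3, 4], [2/12, 1/12, 2/12, 7/12]
full_info = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
no_info = [[1.0] for _ in range(4)]
print(evaluate_scheme(values, probs, full_info))   # full revelation: welfare E[V]
print(evaluate_scheme(values, probs, no_info))     # single signal: uninformed seller
```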
1.1.2 Dice-based Winner-Selection Rules

For the second part of the thesis, we focus on one of the simplest and most natural classes of decision-making scenarios. In a winner-selection environment, there is a set of candidates $\mathcal{C}$, each equipped with a random attribute known as its type, drawn from a known distribution $f$. A winner-selection rule is a (randomized) function (or algorithm) which maps each profile of types, one per candidate, to a choice of winning candidates, subject to the requirement that the set of winners must belong to a specified family $\mathcal{I} \subseteq 2^{\mathcal{C}}$ of feasible sets. The winner-selection rule is also referred to as an ex-post rule, since it specifies the winning probabilities conditioned on the profile of all realized types.

A winner-selection rule can be extremely complicated, since the rule specifies outputs for exponentially many possible inputs. In this thesis, we mainly study a subset of simple winner-selection rules which we call dice-based. A dice-based winner-selection rule assigns each type a die and selects the feasible set of winners maximizing the sum of the candidates' independent dice rolls. Such rules are used regularly in our daily life and in online/board games.

A common example of a winner-selection environment is an auction. In auctions, candidates correspond to bidders, and a winner-selection rule is an allocation rule of the auction. Oftentimes, the goal of the auction allocation rule is to maximize the revenue of the auctioneer. Is there any revenue-maximizing allocation rule that is dice-based? The answer is given by Myerson's [48] famous and elegant characterization of revenue-optimal auctions in the special case when there is only a single item to allocate and bidder type distributions are independent. In that setting, Myerson showed that the optimal single-item auction features a particularly structured winner-selection rule: each type is associated with a (randomized) virtual value. Given a profile of reported types, the rule selects the bidder with the highest (non-negative) virtual value as the winner.

An important framing device of winner selection is the notion of an interim rule (also called a reduced form) of a winner-selection rule, summarizing the probability that each candidate is selected. We also say that a winner-selection rule implements its corresponding interim rule. We distinguish two classes of interim rules: first-order and second-order. A first-order interim rule specifies, for each candidate $i$ and type $t$ of candidate $i$, the conditional probability of $i$ winning given that his type is $t$. When combined with a payment rule, first-order interim rules suffice for evaluating the welfare, revenue, and incentive-compatibility of a single-item auction. A second-order interim rule specifies more information than a first-order rule: for each pair of candidates $i, j$ and type $t$ of candidate $j$, it specifies the conditional probability of $i$ winning given that $j$ has type $t$. It subsumes a first-order rule with $j = i$. It is needed in Bayesian persuasion to evaluate the incentive constraints of a persuasion scheme. In Chapter 4 of the thesis, we will only talk about first-order interim rules, so "first-order" is often omitted.

For a given winner-selection rule, its interim rule is given by the expected winning probabilities of all types of all candidates, so calculating the interim rule only requires straightforward weighted summation. However, it is much more tricky to get a winner-selection rule from a given interim rule. Two questions can be raised:

1. Is there any winner-selection rule that implements the given interim rule?
2. Can we find such a winner-selection rule (efficiently)?

Both questions are resolved by previous research in some settings.
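The following toy sketch (our illustration; the candidates, dice, and probabilities are made up) shows a dice-based rule in a single-winner environment and estimates by Monte Carlo the first-order interim rule it implements, i.e., the probability that a candidate wins conditioned on its realized type.

```python
import random
from collections import defaultdict

# Two candidates, each with two possible types; each (candidate, type) pair
# gets a die given as a list of (face_value, face_probability) pairs.
type_dist = {0: {"hi": 0.5, "lo": 0.5},
             1: {"hi": 0.3, "lo": 0.7}}
dice = {(0, "hi"): [(3, 1.0)],
        (0, "lo"): [(1, 0.5), (0, 0.5)],
        (1, "hi"): [(4, 0.5), (2, 0.5)],
        (1, "lo"): [(1, 1.0)]}

def roll(die):
    faces, weights = zip(*die)
    return random.choices(faces, weights=weights)[0]

def winner(types):
    """Single-winner dice-based rule: the highest independent roll wins."""
    rolls = {i: roll(dice[(i, t)]) for i, t in types.items()}
    return max(rolls, key=lambda i: (rolls[i], -i))   # break ties toward candidate 0

def estimate_interim(samples=200_000):
    wins, counts = defaultdict(int), defaultdict(int)
    for _ in range(samples):
        types = {i: random.choices(list(d), weights=list(d.values()))[0]
                 for i, d in type_dist.items()}
        w = winner(types)
        for i, t in types.items():
            counts[(i, t)] += 1
        wins[(w, types[w])] += 1
    return {key: round(wins[key] / counts[key], 3) for key in sorted(counts)}

print(estimate_interim())   # estimated Pr[i wins | i has type t] for every (i, t)
```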
The second part of this thesis will answer two similar questions about dice-based rules:

1. Is there any dice-based rule implementing the given interim rule?
2. Can we find such a rule for a given interim rule efficiently?

Order sampling [60] may be the first attempt to answer the second question in a non-trivial setting. When $k$ winners need to be selected from $n$ candidates and there is no uncertainty about the candidates' types, order sampling assigns each candidate a random score variable (a die). The candidates with the $k$ highest score variables will be the winners. Unlike Myerson's auction discussed before, there is no notion of "revenue" or "incentive constraints" in order sampling. The task is merely to find dice that will induce a prescribed first-order interim rule.

1.1.3 Bayesian Persuasion

A setting that is more general than signaling in bilateral trade has been termed Bayesian Persuasion by Kamenica and Gentzkow [36]: a sender observes a random variable capturing the "state of the world," and can send a signal to a receiver. The receiver, based on the received signal, chooses an action. The utilities of both the sender and the receiver depend on the state of the world and the chosen action, and are not necessarily aligned. Thus, the sender's goal is to design the information structure such that the receiver will choose actions which are in expectation beneficial to him. (To avoid ambiguities, we always use male pronouns for the sender and female ones for the receiver.)

Notice that signaling in bilateral trade fits in this framework. In the pricing game, the seller is the receiver, the buyer's valuation is the state of the world, and the sender is a market designer with the goal of maximizing the social welfare or the seller's revenue. In the employment game, the employer is the receiver, the employee's cost is the state of the world, and the sender is an educational institution or crowdsourcing website aiming to generate welfare for its participants. Thus, one natural question to ask is whether we can efficiently (approximately) optimize the sender's utility subject to a limited number of signals in the general setting.

Bayesian persuasion itself can also be viewed as a special case of winner selection; namely, a single signal needs to be selected from the signal set. Second-order interim rules are needed for evaluating the incentive constraints of the receiver. Do dice-based rules still exist in these settings?

1.2 Results and Thesis Organization

Our results can be divided into three groups.

Signaling in Bilateral Trade: In Chapter 3, we study information structure design for bilateral trade subject to a limited number of signals. For the optimization of social welfare, we first show the power of a single signal. Specifically, we prove that a single segment of a market can always yield an $\Omega(\frac{1}{\log n})$ fraction of the full-information welfare. Such a characterization leads to a Quasi-PTAS for the optimization of social welfare. Next, we propose a second algorithm which runs in polynomial time and approximates the optimal solution within a factor of $\frac{M-1}{M}(1 - 1/e)$, where $M$ is the limit on the number of signals. For revenue optimization, we have a much more straightforward characterization of the optimal signaling scheme: it only segments the market into intervals. Therefore, an optimal solution can be found efficiently by dynamic programming.

Dice-based Winner-Selection Rule: In Chapter 4, we focus on winner-selection environments whose feasible winner sets form a matroid over the candidates. In this special setting, we prove that for any winner-selection rule, there exists a dice-based rule that induces the same interim rule.
In the simplest case of single-item allocation, we give an algorithm that constructs the dice that implement a given feasible interim rule.

Impossibility Results in Bayesian Persuasion: In Chapter 5, we show two examples of Bayesian persuasion. The first one is called the hyperedge guessing game. We prove that when there are communication constraints, there is no algorithm which can get a constant approximation for the sender's utility in polynomial time, unless $P = NP$. In the second example, we show a second-order interim rule that is optimal for a Bayesian persuasion instance but cannot be implemented by any dice-based rule. This rules out persuasion with dice in general.

1.3 Related Literature

As a tool of algorithm design, randomization has been adopted by computer scientists for a long time, at least since Rabin [56] proposed a randomized algorithm for primality testing, which was later refined into the famous randomized Miller-Rabin algorithm [57]. Numerous examples can be found in [47]. The proofs of the existence of equilibria for zero-sum [46] and general games [49] were milestones of research on game theory. Since then, randomization has always been a key component in describing players' behaviors in games, which guarantees the existence of equilibria under quite general conditions and has expanded the research on game theory. In recent research on mechanism design, randomized mechanisms often yield better solutions than deterministic ones [52, 17, 7, 62].

1.3.1 On Bayesian Persuasion

This thesis is in part concerned with Bayesian Persuasion, formalized by Kamenica and Gentzkow [36], generalizing an earlier model by Brocas and Carrillo [14]. Instantiations, variants, and generalizations of the Bayesian Persuasion problem have seen a flurry of interest in recent years. For example, persuasion has been examined in the context of voting [4], security [65, 58], multi-armed bandits [39, 41], medical research [37], and financial regulation [26, 27]. Dughmi and Xu [21] also consider persuasion algorithmically. When multiple receivers are involved, public and private persuasion are discussed in [22, 5]. Most of these results focus on finding or characterizing the optimal signaling scheme, without any exogenously defined constraints on the signaling scheme. The main difference of our results from these is that we enforce a limit on the number of signals or require the scheme to be based on independent dice.

Chapter 3 focuses on the classical model of bilateral trade (see [42, Chapter 23]). Our choice of protocol, namely the take-it-or-leave-it offer, is arguably the simplest mechanism for bilateral trade, and in the case of the pricing game corresponds to the revenue-optimal mechanism by the classical result of Myerson [48]. The study of the impact of auxiliary information on trade, also known as third-degree price discrimination, has a long history, starting at least as early as [55], which studied the effect of price discrimination on welfare. Unlike our work on optimal signaling, the study of third-degree price discrimination in economics mostly focuses on the following question: under what conditions will price discrimination increase or decrease the welfare? Aguirre et al. [1] provided a unified answer for this question, under the assumption that the seller's revenue is a concave function of the price.

The work most directly related to our work in Chapter 3 is that of Bergemann et al. [8], who examine the effects of information in the same pricing game.
Their main result is a remarkable characterization of the buyer and seller expected utilities that are attainable by varying the information structure of the seller, i.e., by segmenting the market and allowing the seller to price discriminate between segments. They characterize the space of realizable pairs $(r, u)$ for which there exists an information structure $\varphi$ such that the seller's expected revenue is $r$ and the buyer's expected utility is $u$: $(r, u)$ is realizable if and only if $r \geq \mathrm{Rev}(\lambda)$ (i.e., the seller at least matches her "uninformed" revenue), $u \geq 0$, and $r + u \leq \mathbb{E}_V[V]$. Again, this is achieved by randomization over signals.

Implicit in [8] is a family of algorithms, parametrized by the distribution $\lambda$ and a realizable pair of utilities $(r, u)$, which implement a signaling scheme $\varphi$ realizing the pair of utilities $(r, u)$. When $\lambda$ is an explicitly-described distribution with support size $n$, the signaling schemes implicit in [8] are efficient; their runtime is a low-order polynomial in $n$. (A particularly beautiful example of the schemes implicit in [8] is the greedy algorithm achieving $r = \mathrm{Rev}(\lambda)$ and $u = \mathbb{E}_V[V] - \mathrm{Rev}(\lambda)$.) However, the most interesting signaling schemes implied in [8], in particular those with the largest and smallest $u$, use as many signals as the support size of the buyer distribution $\lambda$. This realization motivates our examination of schemes with limited communication.

Roesler and Szentes [59] also study the impact of information revelation on bilateral trade. In their model, the buyer can observe a signal of his value for the item, and will pay for it if the conditional expected value is weakly larger than the price. The seller will choose the optimal monopoly price according to the buyer's information structure.

More generally, Bayesian Persuasion is a special case of optimal information structure design in games. Recent work in computer science has examined this question algorithmically. Babaioff et al. [6] examine a model of selling information. In this model, two parties, a buyer and a seller, have private signals about the state of nature. The buyer needs to choose an action from a set of candidates whose payoff depends on the state of nature. In order to achieve a better payoff, the buyer can pay the seller for (partially) revealing his private signal with a signaling scheme. The authors provide efficient mechanisms that optimally extract the buyer's payoff as the seller's revenue by designing menus of signaling schemes.

Information structure design is also studied in the context of auctions [25, 13, 30, 23, 24]. In all these works, the uncertainty (i.e., state of nature) concerns the item being sold, rather than the type of the buyer as in our model. The first part of the thesis is related to [24, 23] in that they also examine communication-limited signaling schemes. Dughmi et al. [24] study single-item auctions where the item has exponentially many features and the bidders' values for the item are additive functions of the features. They show how to maximize the revenue when only a small subset of the features can be revealed to the bidders. A more relaxed setting of single-item auctions is studied in Dughmi et al. [23]; in this setting, the signaling scheme is only communication-bounded and the buyers' valuations can be arbitrary. The authors give a $(1 - 1/e)$-approximation for welfare maximization in this setting. However, these results on auctions cannot be generalized to our bilateral trade setting.
There is a major difference between these two games: in auctions, the seller sends signals to multiple buyers; in bilateral trade, the mediator sends signals to a single seller. The work of Dughmi [19] examines the complexity of signaling in abstract two-player normal-form games, while the work of Cheng et al. [18] presents an algorithmic framework for tackling a number of (unconstrained) signaling problems. Dughmi [20] provides a more detailed survey on this topic.

An analogy can be drawn between our first set of results and some of the work on auction design subject to communication constraints. Blumrosen et al. [10] study single-item auctions in which bidders can only communicate a limited number of bits to the auctioneer. They show that even severe bounds on communication only lead to mild losses in welfare and revenue. Moreover, they show that bidders simply report an interval in which their value for the item lies when faced with an optimal auction. Blumrosen and Feldman [9] study communication-constrained mechanism design in single-parameter problems more generally. They show that when the social welfare function is multilinear in the agents' parameters, there is a mechanism that maps the parameters to $k$ signals and only induces an $O(\frac{1}{k^2})$ loss of social welfare compared to the optimal (unconstrained) mechanism.

1.3.2 On Winner-Selection Environments

The study of winner-selection environments and interim rules in the second part of this thesis has two origins: (1) Myerson's characterization [48] of revenue-optimal single-item auctions; (2) order sampling [60], which designs dice for drawing samples with prescribed probabilities. These two works show the existence of dice-based rules in two different special settings. Our result generalizes both of them by proving the existence of dice-based rules in more general settings.

Myerson's characterization extends to single-parameter mechanism design settings more generally (see, e.g., [33]). The (first-order) interim rule of an auction, also known as its reduced form, was first studied by Maskin and Riley [43] and Matthews [44]. The inequalities characterizing the space of feasible interim rules were described by Border [11, 12]. Border's analytically tractable characterization of feasible interim rules has served as a fruitful framework for mechanism design, since an optimal auction can be viewed as the solution of a linear program over the space of interim rules. Moreover, this characterization has enabled the design of efficient algorithms for recognizing interim rules and optimizing over them, by Cai et al. [15] and Alaei et al. [3]. This line of work has served as a foundation for much of the recent literature on Bayesian algorithmic mechanism design in multi-parameter settings.

As another motivation for our study of dice-based rules, order sampling studies how to sample $k$ winners from $n$ candidates with given inclusion probabilities (an interim rule), by assigning a random score variable (a die) to each candidate. Rosén [60] showed that parameterized Pareto distributions can be used to implement a given interim rule asymptotically. Aires et al. [2] proved the existence of an order sampling scheme that exactly implements any feasible interim rule. Our existential proof is a generalization of the proof of [2] to settings with multiple types and matroid constraints. To achieve the similar objective of sampling with prescribed probabilities, Sampford [61] and Hájek [31] propose two different approaches, but neither of them is dice-based.
It is important to contrast our dice-based rule introduced in Section 1.1.2 with the characterization of Cai et al. [15]. In particular, the results of Cai et al. [15] imply that every first-order interim rule can be efficiently implemented as a distribution over virtual value maximizers. In our language, this implies the existence of an efficiently computable dice-based implementation in which the dice may be arbitrarily correlated. Our result, in contrast, efficiently computes a family of independent dice implementing any given first-order interim rule in single-winner settings, and shows the existence of a dice-based rule in matroid settings. This is consistent with Myerson's 1981 characterization, in which virtual values are drawn independently.

Alaei et al. [3] also studied winner-selection environments with at most one winner, under the name "service based environments". For such settings, they proposed a mechanism called stochastic sequential allocation (SSA). The mechanism also implements any feasible first-order interim rule by creating a token of winning and transferring the token sequentially from one candidate to another, with probabilities defined by an efficiently computed transition table. Compared to the SSA mechanism, a dice-based implementation can mimic the mechanism without computing the transition table explicitly. However, it is unclear how to convert a transition table to a dice-based implementation.

1.3.3 On Impossibility Results of Bayesian Persuasion

Of particular relevance to our work on Bayesian persuasion is the negative result of Dughmi and Xu [21] for Bayesian persuasion with independent non-identical actions: it is #P-hard to compute the interim rule (first- or second-order) of the optimal scheme, or more simply even the sender's optimal utility. Most notable about this result is what it does not rule out: an algorithm implementing the optimal persuasion scheme "on the fly," in the sense that it efficiently samples the optimal scheme's (randomized) recommendation when given as input the profile of action types. Stated differently, the negative result of Dughmi and Xu [21] merely rules out the Borderian approach for this problem, leaving other approaches, such as the Myersonian one, viable as a means of obtaining an efficient "on the fly" implementation. This would not be unprecedented: Gopalan et al. [28] exhibit a simple single-parameter auction setting for which the optimal interim rule is #P-hard to compute, yet Myerson's virtual values can be sampled efficiently and used to efficiently implement the optimal auction. Our negative result in Section 5.3 rules out such good fortune for Bayesian persuasion with independent non-identical actions: there does not exist a (Myersonian) dice-based implementation of the optimal persuasion scheme in general.

Chapter 2: Background and Notation

Vectors are denoted by boldface. When we write $x \leq y$ for vectors $x, y$, we mean that $x_i \leq y_i$ for all $i$. We will frequently want to reason about the sums of entries of a vector over a given set of indices; we then write $x_I = \sum_{i \in I} x_i$. We also apply this notation to elements of a matrix $X = (x_{i,j})_{i,j}$, writing $x_{I,J} = \sum_{i \in I} \sum_{j \in J} x_{i,j}$. We will particularly use this notation when $I, J$ are (closed or half-open) intervals of integers. For an integer $n$, $[n]$ denotes the set $\{1, 2, \ldots, n\}$.

For a maximization problem, we say an algorithm $\mathcal{A}$ is $\alpha$-approximate if for every instance of the problem, the output value of $\mathcal{A}$ is at least an $\alpha$ fraction of the optimal value.
An algorithm is called efficient if its run-time is polynomial in the size of the input. $\alpha$ is called the approximation ratio.

2.1 Submodular Functions

$f$ is a set function over $E$ if it maps subsets of $E$ to real numbers. $f: 2^E \to \mathbb{R}$ is submodular if for all $S, T \subseteq E$:

$f(S) + f(T) \geq f(S \cup T) + f(S \cap T)$,

or equivalently, for all $S \subseteq T \subseteq E$ and $e \in E$:

$f(S \cup \{e\}) - f(S) \geq f(T \cup \{e\}) - f(T)$.

A function is called monotone if $f(S) \leq f(T)$ for all $S \subseteq T \subseteq E$.

Definition 2.1 The polymatroid associated with a non-negative monotone submodular function $f: 2^E \to \mathbb{R}_+$ is the polytope

$P = \{x \in \mathbb{R}^E_+ \mid \sum_{e \in S} x(e) \leq f(S), \ \forall S \subseteq E\}$.

In this thesis, the optimization of a submodular function is often used as a subroutine of our algorithms, and we will utilize the following results on submodular optimization.

Proposition 2.2 (Korte and Vygen [38]) Assume $f$ is a submodular function and $f(S)$ can be evaluated efficiently for any given $S$. For the optimization of $f$:
- There is an efficient algorithm that finds an $S$ minimizing $f(S)$ [35].
- When $f$ is monotone, there is an efficient algorithm that finds an $S$ approximately maximizing $f(S)$ subject to $|S| \leq k$, with approximation ratio $(1 - 1/e)$ [50].

2.2 Signaling Schemes and Bayesian Persuasion

In the introduction, we briefly introduced signaling schemes for the specific setting of bilateral trade. Formally, a signaling scheme $\varphi$ randomly maps the state of nature $\omega \in \Omega$ to a signal $\sigma \in \Sigma$. $\varphi$ is defined by specifying the conditional probability $\varphi(\omega, \sigma) = \Pr[\varphi(\omega) = \sigma \mid \omega]$. An agent who only knows that the state of nature is drawn from a distribution $\lambda$, but does not know the realized state, may get more information about the state of nature when she receives a signal from $\varphi$: using Bayes' rule, the agent can derive a posterior distribution of the state of nature when $\sigma$ is signaled, as follows:

$\Pr[\omega \mid \sigma] = \frac{\Pr_\lambda[\omega] \cdot \varphi(\omega, \sigma)}{\sum_{\omega' \in \Omega} \Pr_\lambda[\omega'] \cdot \varphi(\omega', \sigma)}$.

In Bayesian persuasion, there are two players, called the sender and the receiver. The receiver has a set of actions that she can take. Each action $i$ has a type $t_i$, drawn from the set $T_i$ according to a commonly known distribution $f_i$. Each type $t_i$ has associated payoffs $s(i, t_i)$ and $r(i, t_i)$ for the sender and receiver, respectively. The sender (or principal) has access to the actual draws $t = (t_1, \ldots, t_n)$ of the types, and would like to use this leverage to persuade the receiver to take an action favorable to him. Thereto, the sender can commit to a signaling scheme to reveal some of this information to the receiver. It was shown by Kamenica and Gentzkow [36] that the sender can restrict attention, without loss, to direct schemes: randomized functions $\mathcal{A}$ mapping type profiles to recommended actions. Namely, the signaling scheme takes the type profile of actions $t$ as the state of nature, and the actions as signals. Upon receiving a recommendation $i$, the receiver's utility of playing action $j$ is $\sum_{t_j \in T_j} r(j, t_j) \Pr[t_j \mid \sigma = i]$. When the types of actions are always correlated as a single parameter, i.e., $t_1 = t_2 = \cdots = t_n$, we omit the types of actions and just consider this single parameter as the state of nature.

Naturally, the signaling scheme must be persuasive: if action $i$ is recommended, the receiver's posterior expected utility from action $i$ must be no less than her posterior expected utility from any other action $j$:

$\sum_{t_i \in T_i} r(i, t_i) \Pr[t_i \mid \sigma = i] \geq \sum_{t_j \in T_j} r(j, t_j) \Pr[t_j \mid \sigma = i], \quad \forall i, j.$  (2.1)

In this sense, direct schemes can be viewed as winner-selection rules in which the actions are the candidates, and persuasiveness constraints must be obeyed.
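As an illustration of constraint (2.1), the sketch below (ours; the data layout is an assumption, not notation from the thesis) checks whether an explicitly given direct scheme is persuasive in a small instance with independent action types: for every recommended action $i$ it compares the receiver's posterior expected payoff from following the recommendation with that of every deviation $j$.

```python
from itertools import product

def is_persuasive(types, f, r, scheme, tol=1e-9):
    """types: action -> list of its types; f: action -> {type: prior prob};
    r: action -> {type: receiver payoff}; scheme: type-profile tuple -> {action: prob}."""
    actions = list(types)
    profiles = list(product(*(types[a] for a in actions)))
    for i in actions:
        # weight(profile) = Pr[profile] * Pr[recommend i | profile]
        weights = {}
        for prof in profiles:
            prior = 1.0
            for a, t in zip(actions, prof):
                prior *= f[a][t]
            weights[prof] = prior * scheme[prof].get(i, 0.0)
        total = sum(weights.values())
        if total == 0:
            continue                       # action i is never recommended
        def posterior_utility(j):
            idx = actions.index(j)
            return sum(w * r[j][prof[idx]] for prof, w in weights.items()) / total
        if any(posterior_utility(j) > posterior_utility(i) + tol
               for j in actions if j != i):
            return False                   # the receiver would deviate from i to j
    return True
```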
2.2.1 Matroids

In Chapter 4, we will focus on settings in which the feasible sets $\mathcal{I}$ are the independent sets of a matroid.

Definition 2.3 A matroid $\mathcal{M}$ is a pair $(E, \mathcal{I})$, where $E$ is called the ground set and $\mathcal{I} \subseteq 2^E$ is a family of so-called independent sets, satisfying the following three matroid axioms:
- $\emptyset$ is independent.
- If $A \subseteq E$ is independent, any $A' \subseteq A$ is also independent.
- If $A, B$ are independent and $|A| > |B|$, then there exists $x \in A$ such that $B \cup \{x\}$ is independent.

We will use the following definitions and properties of matroids.

Definition 2.4 The rank function $r_{\mathcal{M}}$ of a matroid $\mathcal{M} = (E, \mathcal{I})$ is a function mapping subsets $S \subseteq E$ to the maximum cardinality of an independent subset of $S$:

$r_{\mathcal{M}}(S) = \max_{T \subseteq S,\ T \in \mathcal{I}} |T|$.

Definition 2.5 $C \subseteq E$ is called a circuit of a matroid $\mathcal{M} = (E, \mathcal{I})$ if $C$ is a minimal dependent set of $\mathcal{M}$.

Definition 2.6 $\mathcal{M}'$ is a restriction of $\mathcal{M} = (E, \mathcal{I})$ to $S \subseteq E$ if $\mathcal{M}' = (E, \{A \in \mathcal{I} \mid A \subseteq S\})$, denoted by $\mathcal{M}|S$. (Note that we deviate slightly from the standard definition in that we do not restrict the ground set.)

For more details on matroids, we refer the reader to Oxley [53].

A matroid $\mathcal{M} = (E, \mathcal{I})$ is separable if it is a direct sum of two matroids $\mathcal{M}_1 = (E_1, \mathcal{I}_1)$ and $\mathcal{M}_2 = (E_2, \mathcal{I}_2)$; namely, $E = E_1 \uplus E_2$ and $\mathcal{I} = \{A \cup B \mid A \in \mathcal{I}_1, B \in \mathcal{I}_2\}$. Note that if $\mathcal{M}$ is non-separable, then $r_{\mathcal{M}}(E) < |E|$; otherwise $\mathcal{M}$ is the direct sum of singleton matroids. We use the following theorem.

Theorem 2.7 (Whitney [64]) (1) When $\mathcal{M} = (E, \mathcal{I})$ is a non-separable matroid, for every $a, b \in E$, there is a circuit containing both $a$ and $b$. (2) Any separable matroid $\mathcal{M}$ is a direct sum of two or more non-separable matroids, called the components of $\mathcal{M}$.

Here we provide some examples of matroids $(E, \mathcal{I})$:
- $k$-uniform matroid: $\mathcal{I} = \{S \subseteq E \mid |S| \leq k\}$.
- Partition matroid: let $k_1, \ldots, k_n \in \mathbb{N}$ and $E = \biguplus_{i=1}^n E_i$. Then $\mathcal{I} = \{S \subseteq E \mid \forall i, |S \cap E_i| \leq k_i\}$.
- Truncated partition matroid: similar to the partition matroid, but additionally, we have a cardinality upper bound $k$ on independent sets: $\mathcal{I} = \{S \subseteq E \mid |S| \leq k, \ \forall i, |S \cap E_i| \leq k_i\}$. Notice that our example of allocating gold coins has a truncated partition matroid constraint.
- Graphic matroid: $E$ is the edge set of an undirected graph $G$; $\mathcal{I}$ is the set of all forests of $G$.
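Tying the gold-coin example to these definitions, the sketch below (ours; names and scores are illustrative) encodes its truncated partition matroid and uses the standard matroid greedy algorithm to pick a maximum-score independent set, which is exactly how a dice-based rule would select winners from independently rolled values.

```python
import random

def independent(chosen, groups, group_caps, total_cap):
    """Truncated partition matroid: per-group caps plus an overall cardinality cap."""
    if len(chosen) > total_cap:
        return False
    for g, cap in group_caps.items():
        if sum(1 for m in chosen if groups[m] == g) > cap:
            return False
    return True

def max_score_independent_set(scores, groups, group_caps, total_cap):
    """Matroid greedy: scan members by decreasing score, keep each if still independent."""
    chosen = set()
    for m in sorted(scores, key=scores.get, reverse=True):
        if scores[m] > 0 and independent(chosen | {m}, groups, group_caps, total_cap):
            chosen.add(m)
    return chosen

# 3 knights, 4 wizards, 6 archers; at most 2/3/4 coins per class and 6 coins overall.
groups = {f"knight{i}": "knight" for i in range(3)}
groups.update({f"wizard{i}": "wizard" for i in range(4)})
groups.update({f"archer{i}": "archer" for i in range(6)})
group_caps = {"knight": 2, "wizard": 3, "archer": 4}

scores = {m: random.random() for m in groups}       # stand-ins for dice rolls
print(max_score_independent_set(scores, groups, group_caps, total_cap=6))
```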
Chapter 3: Signaling in Bilateral Trade

3.1 Preliminaries

3.1.1 Signaling Schemes

When constructing a signaling scheme, we assume that the distribution $\lambda$ of buyer valuations is given explicitly as input. In particular, this means that it must have finite support of size $n$. We assume that it is given by the valuations $v_1 < v_2 < \cdots < v_n$ and their associated probabilities $p_1, p_2, \ldots, p_n$, satisfying $\sum_i p_i = 1$. We write $v$ and $p$ for the vectors of all these values and probabilities, respectively.

In the introduction, for ease of exposition, we described a signaling scheme $\varphi$ in terms of its conditional probabilities $\varphi(v, \sigma) = \Pr[\varphi(v) = \sigma \mid V = v]$. For this chapter, we use notation differing in two ways: (1) since all values are of the form $v_i$, we can index the buyer types by $i$ instead of $v$, and (2) it is much more convenient to use unnormalized probabilities instead of conditional probabilities: $x_{i,\sigma} = \varphi(v_i, \sigma) \cdot \Pr_\lambda[V = v_i]$ is the probability that the buyer's valuation is $v_i$ and the signal $\sigma$ is sent. The signaling scheme is then fully described by the matrix $X \in [0, 1]^{n \times M} = (x_{i,\sigma})_{i \in \{1,\ldots,n\},\, \sigma \in \{1,\ldots,M\}}$, satisfying $\sum_\sigma x_{i,\sigma} = p_i$. From now on, we will therefore simply refer to the signaling scheme as $X$ instead of $\varphi$.

We sometimes describe a signal $\sigma$ in isolation by a nonnegative type-indexed vector $0 \leq x^\sigma \leq p$. We call such a vector a segment of the market $p$. Thus, a signaling scheme can be thought of as a family of segments, one per signal, whose sum is the entire market $p$.

As discussed in the introduction, upon receiving the signal $\sigma$, the seller will choose a price $P^*(\lambda_\sigma)$ maximizing $P \cdot \Pr_{V \sim \lambda_\sigma}[V \geq P]$. This price will always be one of the possible buyer valuations $v_i$, as any other price could be raised slightly without losing any buyers. Furthermore, by merging signals with the same price into one signal, without loss of generality, there are no two signals for which the seller chooses the same price [36]. Hence, any signaling scheme $X$ induces indices $k_1, k_2, \ldots, k_M$ such that upon receiving signal $\sigma$, the seller chooses price $v_{k_\sigma}$. Without loss of generality, we can rearrange the signals so that $k_1 > k_2 > \cdots > k_M$. For any signal $\sigma$, we call the expected welfare resulting from $\sigma$ under $X$ the served social welfare, defined as $w_X(\sigma) = \sum_{i \geq k_\sigma} v_i x_{i,\sigma}$; the social welfare is then $W(X) = \sum_\sigma w_X(\sigma)$.

Here we show an example that illustrates the necessity of randomization for optimal welfare maximization. Consider a buyer distribution supported on types $(1, 2, 3, 4)$ with probabilities $(2/12, 1/12, 2/12, 7/12)$, respectively. When $M = 2$ signals are allowed, the unique optimal signaling scheme obtains full welfare by sending signals with posterior unnormalized probabilities of $(2/12, 1/12, 0, 1/12)$ and $(0, 0, 2/12, 6/12)$.

3.1.2 Welfare and Sanitized Welfare

While our goal for most of this chapter is to maximize the social welfare, it turns out that a slightly modified objective function is significantly more amenable to analysis, both in terms of positive results for bilateral trade and the hardness result for more general persuasion. Specifically, there is a designated garbage signal $\perp$, and any welfare accrued when $\perp$ is sent is discounted: we thus define the sanitized welfare of $X$ to be $\widetilde{W}(X) = \sum_{\sigma \neq \perp} w_X(\sigma)$. By designating the signal minimizing $w_X(\sigma)$ as the garbage signal, we observe:

Proposition 3.1 For all signaling schemes $X$, we have that $\frac{M-1}{M} W(X) \leq \widetilde{W}(X) \leq W(X)$. In particular, any signaling scheme $X$ maximizing sanitized welfare to within a factor $\alpha$ also maximizes social welfare to within a factor $\frac{M-1}{M} \alpha$.

Since we will always focus on sanitized welfare in the context of welfare maximization, to avoid having to write $M - 1$ for the number of signals everywhere, we will explicitly assume that our signaling schemes can use $M$ signals in addition to the garbage signal $\perp$.

3.1.3 Sanitized Welfare Maximization Given Price Points

Conceptually, the task of designing a good signaling scheme can be divided into two steps: (1) choose the seller's price points $k_1 > k_2 > \cdots > k_M$ for the non-garbage signals; (2) design a signaling scheme $X$ maximizing $\widetilde{W}(X)$ such that the seller's best response to each signal $\sigma$ is in fact $k_\sigma$. Given the chosen price point indices $k_1, \ldots, k_M$, the goal of finding the welfare-maximizing signaling scheme is characterized by the following linear program:

Maximize $\sum_{\sigma=1}^{M} \sum_{i=k_\sigma}^{n} v_i x_{i,\sigma}$
subject to $\sum_{\sigma=1}^{M} x_{i,\sigma} \leq p_i$ for all $i$ (probability)
$v_{k_\sigma} \sum_{i \geq k_\sigma} x_{i,\sigma} \geq v_k \sum_{i \geq k} x_{i,\sigma}$ for all $\sigma, k$ (revenue)
$x_{i,\sigma} \geq 0$ for all $i, \sigma$. (3.1)

The probability constraints capture that we do indeed have a valid signaling scheme (with the garbage signal being assigned all residual probabilities $p_i - \sum_{\sigma=1}^{M} x_{i,\sigma}$). The revenue constraints capture that it is a best response for the seller to set the price point $k_\sigma$ when receiving the signal $\sigma$. Notice that for step (2), the LP (3.1) actually achieves the optimal solution for the given price points. The approximation is necessary because for step (1), choosing the optimal price points appears more difficult.
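For concreteness, here is one way LP (3.1) could be set up with an off-the-shelf solver (a sketch under the assumption that scipy is available; the thesis itself does not prescribe a solver). Indices are 0-based, and `price_idx` lists the chosen price-point indices $k_\sigma$ for the $M$ non-garbage signals.

```python
import numpy as np
from scipy.optimize import linprog

def optimal_scheme_for_prices(v, p, price_idx):
    """Solve LP (3.1): sanitized-welfare-maximizing scheme for fixed price points."""
    n, M = len(v), len(price_idx)
    idx = lambda i, s: i * M + s                     # flatten x_{i,sigma}

    c = np.zeros(n * M)                              # objective (linprog minimizes)
    for s, k in enumerate(price_idx):
        for i in range(k, n):
            c[idx(i, s)] = -v[i]

    A_ub, b_ub = [], []
    for i in range(n):                               # probability constraints
        row = np.zeros(n * M)
        for s in range(M):
            row[idx(i, s)] = 1.0
        A_ub.append(row)
        b_ub.append(p[i])
    for s, ks in enumerate(price_idx):               # revenue (best-response) constraints
        for k in range(n):
            row = np.zeros(n * M)
            for i in range(k, n):
                row[idx(i, s)] += v[k]
            for i in range(ks, n):
                row[idx(i, s)] -= v[ks]
            A_ub.append(row)
            b_ub.append(0.0)

    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, None))
    return res.x.reshape(n, M), -res.fun             # the scheme X and its sanitized welfare
```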
3.2 Summary of Results

Our first main result (proved in Section 3.3) shows that signaling schemes even with an extremely limited number of signals are surprisingly powerful in extracting welfare.

Theorem 3.2 For any distribution with support size $n$, there is a single segment of the market (i.e., a non-negative vector indexed by buyer types that is pointwise upper-bounded by the prior over buyer types) with social welfare at least an $\Omega(1/\log n)$ fraction of the full-information welfare $\mathbb{E}_V[V]$. The $\Omega(1/\log n)$ bound is tight.

Notice that this result is quite surprising. There are value distributions (such as equal-revenue distributions) under which the presence of a self-interested seller results in only a fraction $1/n$ of the full-information welfare being realized. The theorem states that merely by excluding some buyers, the principal can improve this bound to $\Omega(1/\log n)$. Perhaps a "natural" conjecture would have been that using $M$ segments, at most a fraction $O(M/n)$ of welfare could be attained in the worst case. The theorem and the subsequent corollary show that this conjecture is false. However, as we will see shortly, it is in fact true for the objective of maximizing the seller's revenue.

Applying Theorem 3.2 repeatedly yields Corollary 3.3, which shows that with a relatively small number of signals, the principal can get arbitrarily close to the full-information welfare.

Corollary 3.3 For any $\epsilon > 0$ and any distribution with support size $n$, there is a signaling scheme with $O(\log n \cdot \log(1/\epsilon))$ signals which obtains a $(1 - \epsilon)$ fraction of the full-information welfare.

In terms of the number of bits of communication required, Corollary 3.3 implies that communicating $O(\log\log n + \log\log(1/\epsilon))$ bits extracts a $(1 - \epsilon)$ fraction of the social welfare that could be extracted using $\log n$ bits.

Corollary 3.3 also has the following algorithmic implication: by exhaustively searching over all signaling schemes with $O(\log n \cdot \log(1/\epsilon))$ signals, one obtains a QPTAS (quasi-polynomial time approximation scheme). Obtaining a truly polynomial-time algorithm appears quite a bit more challenging, although we currently do not have any hardness results, even for exact optimization. Our main technical result is Theorem 3.4, which shows that one can obtain a constant-factor approximation to the social welfare in polynomial time.

Theorem 3.4 For any $M > 1$, there is a polynomial-time $\frac{M-1}{M}(1 - 1/e)$-approximation algorithm for the problem of implementing a welfare-maximizing signaling scheme with at most $M$ signals, given an explicit representation of $\lambda$.

The proof of this theorem is quite involved. At the heart of it is a proof that the social welfare achieved from a set of signals is submodular. More precisely, the proof focuses on the sanitized social welfare accrued from all signals except the garbage signal. The key insight that completes our two-step approach of designing a signaling scheme is that the optimum social welfare with price set $S$ (and one garbage signal) is a monotone and submodular function of $S$. This fact is proved as Theorem 3.11 in Section 3.5. The proof relies heavily on a characterization of the optimal signaling scheme inducing the set $S$ of prices. Although we already showed that given the chosen price points $S = \{k_1, k_2, \ldots, k_M\}$, the optimal signaling scheme can be computed by LP (3.1), we require a better characterization, and thereto show (as Theorem 3.10 in Section 3.4) that it is also the output of a greedy algorithm.
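To sketch the two-step algorithm behind Theorem 3.4 (our illustration, reusing the hypothetical `optimal_scheme_for_prices` helper from the LP sketch above): since the sanitized welfare of the best scheme with price set $S$ is monotone and submodular in $S$ (Theorem 3.11), the standard greedy algorithm that repeatedly adds the price point with the largest marginal gain yields a $(1 - 1/e)$-approximate choice of $M$ price points.

```python
def greedy_price_points(v, p, M):
    """Greedy selection of price points; relies on the LP helper sketched above."""
    def sanitized_welfare(S):
        if not S:
            return 0.0
        _, val = optimal_scheme_for_prices(v, p, sorted(S, reverse=True))
        return val

    S = set()
    for _ in range(M):
        gains = {k: sanitized_welfare(S | {k}) - sanitized_welfare(S)
                 for k in range(len(v)) if k not in S}
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        S.add(best)
    return S, sanitized_welfare(S)
```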
Although we already showed that given the chosen price points S = fk 1 ;k 2 ;:::;k M g, the optimal signaling scheme can be computed by LP (3.1), we require a better characterization, and thereto show (as Theorem 3.10 in Section 3.4) that it is also the output of a greedy algorithm. Optimizing Seller Revenue The reader may have noticed our focus on social welfare. Almost equally frequently studied is the objective of maximizing seller revenue. It turns out that results for seller revenue are much more straightforward, both technically and in terms of their implications. First, the communication constraint can severely curtail the seller's revenue: when the buyer's value is drawn from an equal-revenue distribution 25 supported on a geometric progression of lengthn, the revenue-maximizing signaling scheme with M signals recovers only an O(M=n) fraction of the full-information revenue. On the other hand, computing the optimal signaling scheme for revenue maximization is straightforward (see Section 3.6): Theorem 3.5 The optimal signaling scheme for maximizing seller revenue groups buyers into M contiguous segments by valuations, i.e., if the same signal is sent for buyers with valuations v<v 0 , then is also sent for all buyers with valuations v 00 2 [v;v 0 ]. As a result, there exists a polynomial-time dynamic programming algorithm which, given an explicit representation of and a boundM on the number of signals, computes a signaling scheme maximizing the seller's expected revenue. For the employment game in Section 1.1.1, Theorem 3.5 conrms (for the purpose of employer utility maximization) the generally agreed-upon form of grading or performance evaluation, wherein the highest performers are grouped together in one category (`A'), followed by the next highest category (`B'), etc. Such signaling schemes are generally not optimal if the goal is to maximize social welfare as we showed before. 3.3 Welfare with Limited Communication In this section, we provide a proof of Theorem 3.2. We then discuss some of its implications, including Corollary 3.3 and a QPTAS for the problem of maximizing social welfare subject to limited communication. We begin by showing a special case of the theorem when eachv i is a power of 2. We then reduce the general case to this special case at a loss of a constant factor. Lemma 3.6 If eachv i is a power of 2, then the ratio of the full-information welfare to the maximum served social welfare of a single segment of the market is at most O(logn). Proof. The lemma is trivial for n = 1, so assume that n 2. Let SW = P i p i v i be the full-information welfare. First, we remove (i.e., exclude from the segment 26 we construct) all types i with p i v i < SW n 2 . Because there are at most n such types, this can decrease the full-information welfare by at most a factor of 1 1=n. Now, we group all (remaining) types i into O(logn) bins according to p i v i . Specically, bin B j ;j 0 contains all types i with p i v i 2 ( SW 2 j+1 ; SW 2 j ]. Since p i v i SW n 2 , there are at most 2 logn bins. Thus, there is at least one bin B j such that P i2B j p i v i n1 n 1 2 logn SW. Fix such a j for the rest of the proof, and dene u = SW 2 j+1 . Let i = minB j be the type in B j corresponding to the smallest value. We dene a market segmentq as follows: Let q i = u v i , let q i = u 2v i for all i2B j with i6=i , and let q i = 0 for i62B j . By denition of the bins, for all i2 B j we have u < p i v i 2u, and therefore p i =4q i p i . 
We utilize Lemma 3.6 to prove the upper-bound portion of Theorem 3.2. The idea is to round down all valuations to the nearest power of 2 (losing at most a factor of 2 in the welfare), then utilize Lemma 3.6 to construct a segment for the new valuation distribution achieving at least a factor $\Omega(1/\log n)$ of the full-information welfare, and finally reconstruct a segment/signal for the original distribution. For the matching lower bound, we use an "equal-welfare distribution."

Proof of Theorem 3.2. Let $v'_j$, $1 \le j \le m$, be the valuations of the original distribution, with associated probabilities $p'_j$. The new valuations are $v_i = 2^i$, with $p_i = \sum_{j : 2^i \le v'_j < 2^{i+1}} p'_j$. We allow the index $i$ to be negative for notational convenience. Let $n$ denote the support size of $p$, and note that $n \le m$. Because valuations were reduced by at most a factor of 2, the full-information welfare of the new distribution is at least half that of the original one, i.e., $\sum_i p_i v_i \ge \frac{1}{2} \sum_j p'_j v'_j$.

Let $q$ be a single segment with served social welfare at least a $1/O(\log n)$ fraction of the full-information welfare of $p$. By Lemma 3.6, such a $q$ exists. Let $i^*$ be the price point chosen by the seller under $q$, and $R$ the seller's revenue. By optimality of $i^*$ for the seller, we get that
$$R = v_{i^*}\Big(q_{i^*} + \sum_{k=i^*+1}^{n} q_k\Big) \ge v_{i^*+1} \sum_{k=i^*+1}^{n} q_k = 2 v_{i^*} \sum_{k=i^*+1}^{n} q_k. \qquad (3.2)$$
Hence, $q_{i^*} \ge \sum_{k=i^*+1}^{n} q_k$, meaning that at least half of the probability mass, and thus half of the seller's revenue $R$ from $q$, comes from type $i^*$.

Now define a segment $q'$ of the original type distribution $p'$ as follows. For type $i^*$, take a total probability mass of $q_{i^*}$ from types $j$ with $2^{i^*} \le v'_j < 2^{i^*+1}$. For all types $i > i^*$, take a total probability mass of $q_i/8$ from types $j$ with $2^i \le v'_j < 2^{i+1}$, and for types $j$ with $v'_j < 2^{i^*}$, set $q'_j = 0$. Observe that the full-information welfare of $q'$ is at least a $1/8$ fraction of the served social welfare of $q$. It remains to show that a constant fraction of that social welfare is above the seller's revenue-maximizing offer price for $q'$, which we do next.

First, note that the price $2^{i^*}$ gives the seller a revenue of at least $R/2$, even just from all buyers of types $j$ with $2^{i^*} \le v'_j < 2^{i^*+1}$. Next, consider a price of $v'_j \ge 2^{i^*+1}$, say $2^i \le v'_j < 2^{i+1}$ with $i \ge i^*+1$. The revenue of such a price is at most
$$v'_j \sum_{k \ge j} q'_k \le 2^{i+1} \sum_{k \ge i} q_k/8 = \frac{1}{4}\, v_i \sum_{k \ge i} q_k \le R/4. \qquad (3.3)$$
Hence, no price $v'_j \ge 2^{i^*+1}$ can be revenue-maximizing for $q'$, and all types in the segment $q'$ with value at least $2^{i^*+1}$ are served. It remains to show that a constant factor of the remaining social welfare (associated with values between $2^{i^*}$ and $2^{i^*+1}$) is also served. Because the price $2^{i^*}$ dominates all prices $v'_j \ge 2^{i^*+1}$, the seller will choose some price $v'_j$ with $2^{i^*} \le v'_j < 2^{i^*+1}$, and the chosen price must give the seller revenue at least $R/2$.
The calculation in (3.3) implies that the revenue extracted from types $j$ with $v'_j \ge 2^{i^*+1}$ is at most $R/4$. Hence, at least a revenue (and thus also social welfare) of $R/4$ must come from types $j$ with $2^{i^*} \le v'_j < 2^{i^*+1}$. By construction, the full-information welfare associated with those types is at most $2^{i^*+1} q_{i^*} = 2 v_{i^*} q_{i^*}$, which is at most $2R$ by (3.2). Consequently, at least a $1/8$ fraction of the total social welfare associated with those types is served, as needed.

In summary, $q'$ serves a constant fraction of the full-information welfare of $q$, which in turn is a $1/O(\log n) \ge 1/O(\log m)$ fraction of the full-information welfare of the distribution $p'$. Hence, for any distribution supported on $m$ types, a single segment is enough to serve a $1/O(\log m)$ fraction of the social welfare.

To show that this bound is tight, we construct an "equal-welfare distribution" with the property that no single segment extracts more than an $O(1/\log n)$ fraction of the full-information social welfare. Let $v_i = 2^i$ for $0 \le i \le n$, and $p_i = \frac{1}{2^i (n-i)}$ for $0 \le i < n$, with $p_n = \frac{1}{2^{n-1}} = p_{n-1}$. Notice that these "probabilities" do not sum to 1; we omit the normalization constants for legibility, since they will cancel out in the subsequent calculations. The full-information welfare of $p$ is $\sum_{i=0}^{n} p_i v_i = 2 + \sum_{i=1}^{n} \frac{1}{i} = \Theta(\log n)$.

Now consider any segment $q$ of $p$, and let $v_k = 2^k$ be the seller's revenue-maximizing price for $q$. If $k = n$, the served social welfare is $v_n q_n \le v_n p_n = 2 = O(1)$, as needed. Assume now that $k < n$. For each $i$ with $k < i \le n$, the optimality of the price $v_k$ implies that $v_i q_{[i,n]} \le v_k q_{[k,n]}$, or equivalently $q_{[i,n]} \le \frac{v_k}{v_i} q_{[k,n]}$. By choosing $i = k+1$, we get that $q_{[k+1,n]} \le \frac{v_k}{v_{k+1}} q_{[k,n]} = \frac{q_{[k,n]}}{2}$, and consequently $q_{[k,n]} \le 2 q_k$. We can now bound the served social welfare of $q$ by a constant:
$$\sum_{i=k}^{n} v_i q_i = v_k q_{[k,n]} + \sum_{i=k+1}^{n} (v_i - v_{i-1})\, q_{[i,n]} \le 2 v_k q_k + 2 \sum_{i=k+1}^{n} (v_i - v_{i-1}) \frac{v_k}{v_i}\, q_k = 2 v_k q_k \left(1 + \sum_{i=k+1}^{n} \frac{v_i - v_{i-1}}{v_i}\right) = v_k q_k (n + 2 - k).$$
Because $q_k \le p_k$, we can bound $v_k q_k (n + 2 - k) \le v_k p_k (n + 2 - k) = \frac{n + 2 - k}{n - k} \le 3$. In summary, any single segment can obtain at most constant welfare, and thus at most an $O(1/\log n)$ fraction of the total social welfare for this instance.

To derive Corollary 3.3, simply pick segments greedily, always choosing the next segment to have the largest possible served social welfare. By Theorem 3.2, each subsequent segment obtains at least a $\frac{1}{c \log n}$ fraction of the residual welfare at that point, meaning that after adding $c \log n \cdot \log(1/\epsilon)$ signals, the fraction of the total welfare obtained by the signaling scheme is at least $1 - \left(1 - \frac{1}{c \log n}\right)^{c \log n \cdot \log(1/\epsilon)} \ge 1 - \epsilon$.

Remark 3.7 One can obtain a positive result very directly when the ratio $\rho = \frac{\max_i v_i}{\min_i v_i}$ is bounded. Group all buyers according to their values into $\log \rho$ bins: for $u = \min_i v_i$, buyers with $2^j u \le v_i < 2^{j+1} u$ are put into bin $j$. By considering each bin as a segment, we can see that the revenue (and also the served social welfare) of each segment is at least half of its full-information welfare. Choosing the best $M$ bins leads to a bound of $\Omega(\min(1, M/\log \rho))$. However, our result is of more interest when the ratio between the largest and smallest valuations can be very large.

To get a quasi-polynomial time approximation scheme (QPTAS) for the problem of finding a welfare-maximizing signaling scheme with at most $M$ signals, consider a desired approximation parameter $\epsilon$. We distinguish two cases.
If $M \le c \log n \cdot \log(1/\epsilon)$ (again, $c$ is the constant in the $O(\log n)$ bound of Theorem 3.2), enumerate all possible choices of price points, and find the truly optimal one (i.e., there is no approximation factor lost in this case). Notice that there are at most $\binom{n}{M} = O(n^M)$ such combinations to consider, and for each of them, either the LP (3.1) or the greedy algorithm (Theorem 3.10) allows us to evaluate the maximum welfare attainable with that set of prices. Thus, the running time is quasi-polynomial for fixed $\epsilon$. When $M > c \log n \cdot \log(1/\epsilon)$, by Corollary 3.3, choosing the $M$ price points greedily obtains a $(1-\epsilon)$ fraction of the full-information welfare. Thus, it certainly obtains the same fraction of the maximum that could be achieved with $M$ signals.

3.4 Sanitized Welfare Maximization with a Greedy Algorithm

In this section, we show that, given the set of price points, an optimal solution to the LP (3.1) (i.e., the problem of maximizing sanitized welfare) can be computed by a greedy algorithm constructing the signals one by one. Besides the faster running time, the main value of the greedy algorithm is as an analysis tool for the problem of choosing the (near-)optimal price points; the analysis for that problem is carried out in Section 3.5.

We begin with an algorithm for constructing just one signal with a given price point $k$. Let $p'$ be a vector of residual probabilities for buyer types, i.e., the probability of each type that has not been allocated to any signal previously. Hence, $0 \le p' \le p$. The goal is to construct the probabilities $y$ for a single signal with price point $k$; that is, we require that $0 \le y_i \le p'_i$ for all $i$, and the revenue constraint $v_k \sum_{j \ge k} y_j \ge v_i \sum_{j \ge i} y_j$ must be satisfied for all $i$. The revenue constraint can be written as
$$y_{[i,n]} \le \frac{v_k}{v_i}\, y_{[k,n]} \quad \text{for all } i. \qquad (3.4)$$

In an optimal (single) signal, for each $i \ge k$, either Inequality (3.4) must be tight, or $y_i = p'_i$; otherwise, $y_i$ could be increased, raising social welfare. For the same reason, $y_k = p'_k$. Because no buyer of type $i < k$ will ever purchase at price $v_k$, such buyers contribute 0 to revenue and (social or buyer) welfare. Hence, without loss of generality, the optimum signal has $y_i = 0$ for all $i < k$. These observations suggest the following algorithm: gradually raise $y_k$ from 0 to $p'_k$. Simultaneously raise the $y_i$ for $i > k$ in such a way that at all times, for each $i$, either $y_i = p'_i$ or Inequality (3.4) is tight. This is accomplished by raising the tail sums $y_{[i,n]}$ at a rate of $\frac{v_k}{v_i}$ while $y_{[k,n]}$ is raised at rate 1. Solving for the necessary rates of increase of the individual $y_i$ gives rise to the algorithm Construct-One-Signal.

ALGORITHM 1: Construct-One-Signal($k$, $v$, $p'$)
1: $y \leftarrow 0$
2: $I \leftarrow \{i \ge k \mid p'_i > 0\}$  // Throughout, $I$ contains all types for which $y_i$ can still be raised.
3: while $k \in I$ do
4:    $m \leftarrow \max_{i \in I} i$
5:    $\delta_m \leftarrow \frac{1}{v_m}$  // Rate of increase that keeps Inequality (3.4) tight for $m$.
6:    forall $i \in I \setminus \{m\}$ do
7:       $j \leftarrow \min\{i' > i \mid i' \in I\}$
8:       $\delta_i \leftarrow \frac{1}{v_i} - \frac{1}{v_j}$  // Rate of increase that keeps Inequality (3.4) tight for $i$.
9:    $i^* \leftarrow \arg\min_{i \in I} (p'_i - y_i)/\delta_i$  // Index for which $y_i = p'_i$ will become tight first.
10:   $\Delta \leftarrow (p'_{i^*} - y_{i^*})/\delta_{i^*}$  // Amount of increase until $y_{i^*} = p'_{i^*}$.
11:   forall $i \in I$ do
12:      $y_i \leftarrow y_i + \Delta \cdot \delta_i$
13:   $I \leftarrow \{i \ge k \mid y_i < p'_i\}$  // Update the set of indices that can still be raised.
14: return $y$
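The water-filling procedure above translates directly into code. The following Python sketch of Construct-One-Signal, together with the outer loop of Construct-Signaling-Scheme below, is only illustrative: the function names, the list-based representation, and the floating-point tolerance are our own choices, and distinct valuations sorted in increasing order are assumed.

def construct_one_signal(k, v, p_res):
    # Single-signal construction with price index k into v (sorted increasingly,
    # distinct values). p_res: residual probabilities. Returns y with
    # 0 <= y <= p_res satisfying the revenue constraint (3.4).
    n = len(v)
    y = [0.0] * n
    active = [i for i in range(k, n) if p_res[i] > 0]
    while k in active:
        # Rate at which each active y_i is raised; keeps (3.4) tight for every
        # active index (the largest active index gets rate 1 / v_m).
        rate = {}
        for pos, i in enumerate(active):
            if pos + 1 < len(active):
                rate[i] = 1.0 / v[i] - 1.0 / v[active[pos + 1]]
            else:
                rate[i] = 1.0 / v[i]
        # Advance until the first active index hits its residual probability.
        step = min((p_res[i] - y[i]) / rate[i] for i in active)
        for i in active:
            y[i] += step * rate[i]
        active = [i for i in active if y[i] < p_res[i] - 1e-12]
    return y

def construct_signaling_scheme(v, p, price_points):
    # price_points: indices into v in decreasing order of price (as in Algorithm 2).
    residual = list(p)
    scheme = []
    for k in price_points:
        y = construct_one_signal(k, v, residual)
        scheme.append(y)
        residual = [r - yi for r, yi in zip(residual, y)]
    return scheme  # one column per signal; leftover residual goes to the garbage signal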
Our first lemma simply restates the revenue constraint, and captures the fact that if $y_i < p'_i$ at any time during the algorithm, then the revenue constraint (3.4) must be tight for $i$. The proof is straightforward by induction on iterations.

Lemma 3.8 For all indices $i$ and every step of the algorithm, $y_{[i,n]} \le \frac{v_k\, y_{[k,i)}}{v_i - v_k}$. Furthermore, if $i \in I$ at some point of the algorithm (including at termination), then at that point in time, $v_i y_{[i,n]} = v_k y_{[k,n]}$. The latter condition is equivalent to saying that $y_{[i,n]} = \frac{v_k\, y_{[k,i)}}{v_i - v_k}$.

The key property of the algorithm Construct-One-Signal is that it maximizes all tail sums $y_{[i,n]}$:

Lemma 3.9 Let $y' \le p'$ be any signal. If $k$ is the price chosen by the seller under $y'$, i.e., $v_k y'_{[k,n]} \ge v_i y'_{[i,n]}$ for all $i$, then $y_{[i,n]} \ge y'_{[i,n]}$ for all $i \ge k$.

Proof. For contradiction, assume that $y_{[i,n]} < y'_{[i,n]}$ for some $i$, and fix the minimum such $i$. Let $j \ge i$ be the minimum index with $y_j < p'_j$; note that such an index must exist, as otherwise $y_{[i,n]} = \sum_{\ell \ge i} y_\ell = \sum_{\ell \ge i} p'_\ell \ge \sum_{\ell \ge i} y'_\ell = y'_{[i,n]}$. As a result, the index $j \in I$ at the termination of the algorithm, implying by Lemma 3.8 that $v_k y_{[k,n]} = v_j y_{[j,n]}$. Now, we obtain the following contradiction:
$$y_{[i,n]} = y_{[j,n]} + \sum_{i'=i}^{j-1} y_{i'} = \frac{v_k\, y_{[k,n]}}{v_j} + \sum_{i'=i}^{j-1} p'_{i'} \ge y'_{[j,n]} + \sum_{i'=i}^{j-1} y'_{i'} = y'_{[i,n]}.$$

To construct a complete signaling scheme, we invoke the algorithm Construct-One-Signal repeatedly, constructing the signals one at a time. The important part here is that the signals must be constructed in decreasing order of the target price. The intuitive reason is that the inclusion of high-value buyers in a signal makes a higher price more attractive to the seller, thus posing additional constraints on the required probability mass of lower-valued buyers that must be included. Thus, it is always better to include as many high-valued buyers in the high-priced signals as possible, and this is accomplished by constructing those signals first. Hence, we assume that the signals are sorted in descending order of their prices.

ALGORITHM 2: Construct-Signaling-Scheme($v$, $p$, $S = \{k_1 > k_2 > \cdots > k_M\}$)
1: $p^{(1)} \leftarrow p$
2: for $\sigma \leftarrow 1$ to $M$ do
3:    $x_\sigma \leftarrow$ Construct-One-Signal($k_\sigma$, $v$, $p^{(\sigma)}$)  /* $x_\sigma$ is column $\sigma$ of the signaling scheme $X$'s matrix. */
4:    $p^{(\sigma+1)} \leftarrow p^{(\sigma)} - x_\sigma$
5: return $X$

Theorem 3.10 The algorithm Construct-Signaling-Scheme solves the linear program (3.1) optimally.

The proof of this theorem proceeds by showing that an optimal solution $X^*$ for the linear program (3.1) can be (gradually) transformed into the solution $X$ constructed by the algorithm Construct-Signaling-Scheme without decreasing its solution quality. It is technically fairly involved, and given in Appendix A.1.

3.5 Submodularity of Sanitized Welfare

We prove that the sanitized welfare objective function $\widetilde{W}(S)$ is a submodular function of the set $S$ of chosen price points. (There is also a garbage signal $\bot \notin S$.) Let $S = \{k_1 > k_2 > \cdots > k_m\}$ be the given price points; $\widetilde{W}(S)$ is defined as the optimum solution value of the LP (3.1) with the given price points. We show the following:

Theorem 3.11 If $S \subseteq T$ and $k \notin T$, then $\widetilde{W}(T \cup \{k\}) - \widetilde{W}(T) \le \widetilde{W}(S \cup \{k\}) - \widetilde{W}(S)$.

Theorem 3.10 states that the greedy algorithm solves the problem for $S \cup \{k\}$ optimally, but the effects of adding $k$ are subtle. It is fairly easy to analyze what happens in the iteration when $k$ itself is added: that the welfare increase is larger for $S$ than for $T$ is easily seen by a simple monotonicity argument, captured by Lemma 3.13. However, the addition of $k$ has "downstream" effects. The construction of subsequent signals with price points $k' < k$ will now face different residual probabilities, and the resulting reductions in those signals need to be carefully balanced against the gains from the signal with price point $k$.
Part of the complexity arises from the rather complex construction of the signal for price point k itself. It is captured by the algorithm Construct- One-Signal, which itself runs through iterations in which dierent sets I of indices have their probabilities increased. In order to eliminate this source of complexity, we will think of adding the signal with price point k \gradually." Specically, we consider the execution of Construct-Signaling-Scheme in which the execution of Construct-One-Signal for the signal with price point k may be terminated prematurely. An upper bound B on the tail probability is 34 specied, and Construct-One-Signal is stopped when P ik y i = B. After signalk is constructed in this modied way, subsequent signals will be constructed normally by Construct-Signaling-Scheme. A modication of the proof of Theorem 3.10 shows that this modied algorithm optimally solves the LP (3.1) with the added constraint that P ik x i; kB, where k denotes the signal whose price point is k. We write f W (k;B) (S) for the sanitized welfare achieved by the optimum solution with a set of signal price points S[fkg, and the constraint that the probability mass for signal k is at most B. Our main lemma is: Lemma 3.12 If S T , then for any k;B;: f W (k;B+) (T ) f W (k;B) (T ) f W (k;B+) (S) f W (k;B) (S). Lemma 3.12 implies submodularity quite directly, as follows. Proof of Theorem 3.11. Let X and b X be the optimal signaling schemes with price point sets S[fkg and T [fkg, respectively. By Lemma 3.13, when constructing k and ! k , the residual probability for k is more than that of ! k ; therefore,b x [k;n];! kx [k;n]; k by Lemma 3.9. Consider gradually increasing B from 0 tob x [k;n];! k in increments of (varying) , as outlined above. Subsequently, continue increasing B for X only. By adding up the inequality from Lemma 3.12 for each such step, and noting that the subsequent increases of B for X can only further increase the welfare of X, we obtain that f W (S[fkg) f W (S) f W (T[fkg) f W (T ). Because the objective function is submodular (and monotone by Lemma 3.13), the greedy algorithm is known [51] to give a (11=e)-approximation for the problem of maximizing f W (S). Proposition 3.1 then implies that the same greedy algorithm gives a M1 M (1 1=e) approximation for the objective of maximizing the social welfare W (S), proving Theorem 3.4. The proof of Lemma 3.12 is technically quite involved, and given in Appendix A.2. The idea is to rst prove it for suciently small , which allows us to couple the executions tightly. 35 In particular, by comparing the solutions to the linear program (3.1), we can ensure that any constraint that becomes tight in the solution for set T with bound B +, but is not tight with bound B (and similarly for S) would not have become tight for any 0 < . This will localize the changes, and maintain the revenue indierence for the seller. By summing over all such iterations (there will only be nitely many, because is chosen so that at least one more constraint becomes tight), we eventually prove the lemma for all . In the analysis, we are interested in four dierent signaling schemes, constructed by Construct-Signaling-Scheme when run with dierent sets of price points and upper bounds B. We will assume here that k2ST . Specically we dene: X = (x i; ) ;i : the probability mass of type i assigned to signal when the algorithm is run with price point set S and an upper bound of B. X + = (x + i; ) ;i : probability mass for price point set S and an upper bound of B +. 
b X = (b x i; ) ;i : probability mass for price point set T and an upper bound of B. b X + = (b x + i; ) ;i : probability mass for price point set T and an upper bound of B +. A rst step of the proof, which is both illustrative of the types of arguments made repeatedly and implies monotonicity of the sanitized welfare f W (S) is the following lemma. It shows that if more probability mass can be allocated for one signal, then the eect can never be a decrease in the total allocated probability mass for any type of buyer. Lemma 3.13 (Monotonicity) LetX,X + be optimal signaling schemes for price point set S, with signal set Q. Then, for any k2S;B;, and any buyer type i, we have that x i;Q x + i;Q . As a corollary, for any sets of price points S T , and with Q and b Q as the respective sets of signals, for any B and any buyer type i, we have x i;Q b x i; b Q . 36 Proof. Let k be the signal with price pointk. LetQ 0 Q be an initial segment of Q, i.e., 0 < for every 0 2Q 0 ;2QnQ 0 . We prove by induction onjQ 0 j that x i;Q 0x + i;Q 0 . The base case Q 0 =; is trivial. For the induction step, let 0 be the largest signal inQ 0 (i.e., with smallest price point k 0 ), and distinguish three cases. If 0 < k , i.e., before k is constructed, the execution of Construct-Signaling-Scheme is the same for the bounds B and B +, so x i; 0 =x + i; 0 , and the induction follows directly. When 0 = k , x i; k is the result of Construct-One-Signal with an upper bound ofB, andx + i; k is that with upper bound ofB +. Until the tail sum reaches B, the execution is the same, and subsequently, values can only be raised further for the execution with the bound B +. Thus, x i; kx + i; k , and again, the induction step follows. For 0 > k , assume for contradiction that there exists an i such that x i;Q 0 > x + i;Q 0 ; x the smallest such i. Since p i x i;Q 0 > x + i;Q 0 , we get that the index i is in I for the execution with upper bound B + all the way until 0 is constructed, implying by Lemma 3.8 (and the revenue constraint for the upper bound ofB) that for all 0 , x [i;n]; v k v i v k x [k;i); ; x + [i;n]; = v k v i v k x + [k;i); : By summing over all 2Q 0 , we also obtain that x [i;n];Q 0 X 2Q 0 v k v i v k x [k;i); ; x + [i;n];Q 0 = X 2Q 0 v k v i v k x + [k;i); : By minimality of i, we have that x i 0 ;Q 0 x + i 0 ;Q 0 for all i 0 < i; summing these inequalities gives that x [k;i);Q 0 x + [k;i);Q 0 . Similarly, for any initial segment Q 00 ( Q 0 , the strong induction hypothesis implies that x i 0 ;Q 00 x + i 0 ;Q 00 for all i 0 (in particular,i 0 <i); summing those inequalities, and combining with the one just derived proves that x [k;i);Q 00x + [k;i);Q 00 for all initial segments Q 00 Q 0 (including Q 00 =Q 0 ). 37 Because v k v i v k is monotone non-increasing in (recall that k is decreasing), we can apply Lemma A.5 in the middle step of the following derivation: x + [i;n];Q 0 = X 2Q 0 v k v i v k x + [k;i); : Lemma A.5 X 2Q 0 v k v i v k x [k;i); ; x [i;n];Q 0: (3.5) Because x i;Q 0 >x + i;Q 0 , butx [i;n];Q 0x + [i;n];Q 0 , there must be some j >i such that x j;Q 0 <x + j;Q 0 ; x a minimal suchj. This time, sincep j x + j;Q 0 >x j;Q 0, we can apply Lemma 3.8 (and the revenue constraint for the process with boundB +) to obtain that for all 0 , v j x [j;n]; =v k x [k;n]; v i x [i;n]; ; v j x + [j;n]; v k x + [k;n]; = v i x + [i;n]; : Solving for x [j;n]; and x + [j;n]; , we obtain that x [j;n]; v j v j v i x [i;j); and x + [j;n]; v j v j v i x + [i;j1]; . 
Summing over all $\sigma \in Q'$ now gives us that
$$x_{[j,n],Q'} \ge \frac{v_i}{v_j - v_i}\, x_{[i,j),Q'}, \qquad x^+_{[j,n],Q'} \le \frac{v_i}{v_j - v_i}\, x^+_{[i,j),Q'}. \qquad (3.6)$$
By the definition of $i$ and $j$, we have that $x_{[i,j),Q'} > x^+_{[i,j),Q'}$; substituting this inequality into (3.6) and canceling common terms implies that $x_{[j,n],Q'} > x^+_{[j,n],Q'}$. Now, we derive a contradiction as follows: by Inequality (3.5), we have
$$x_{[j,n],Q'} = x_{[i,n],Q'} - x_{[i,j),Q'} < x^+_{[i,n],Q'} - x^+_{[i,j),Q'} = x^+_{[j,n],Q'} < x_{[j,n],Q'}.$$

3.6 Revenue in Bilateral Trade

In this section, we prove Theorem 3.5, giving a straightforward dynamic program to compute a signaling scheme maximizing the seller's revenue. Before doing so, we exhibit an equal-revenue distribution for which any signaling scheme with $M$ signals only recovers an $O(M/n)$ fraction of the full-information welfare as revenue. Notice again the contrast to the case of welfare maximization, where even one segment is enough to attain an $\Omega(1/\log n)$ fraction of the full-information welfare.

We define an equal-revenue distribution as follows. Let the valuations be $v_i = 2^i$ for $0 \le i \le n$, and the probabilities $p_i = \frac{1}{2^{i+1}}$ for $0 \le i < n$, and $p_n = \frac{1}{2^n}$. The full-information welfare is $\sum_{i=0}^{n} p_i v_i = \frac{n}{2} + 1$. However, for every segment $q \le p$, no matter what price $v_i$ the seller chooses, the revenue cannot be more than $v_i \sum_{j \ge i} p_j = 1$. Therefore, in the worst case, with $M$ signals, the seller can at best get an $O(M/n)$ fraction of the maximum social welfare as his revenue, whereas a fully informed seller would be able to extract the entire full-information welfare, as discussed in the introduction.

Next, we turn our attention to the problem of computing the optimum signaling scheme for the seller's revenue. The key insight enabling a dynamic program is that the seller-optimal signaling scheme partitions the buyer types into disjoint intervals, and allocates all probability mass for a given interval to one signal.

Lemma 3.14 (Interval Structure of Seller-Optimal Signaling Scheme) W.l.o.g., the seller-optimal signaling scheme $X$ has the following form: there are disjoint intervals $I_1, I_2, \ldots, I_M$ of buyer types such that $\bigcup_\sigma I_\sigma = \{1, \ldots, n\}$, and for each signal $\sigma$, $x_{i,\sigma} = p_i$ for all $i \in I_\sigma$ (and $x_{i,\sigma} = 0$ for all $i \notin I_\sigma$).

Proof. Let $k_1 > k_2 > \cdots > k_M$ be the price points of the signals under $X$. We will show how to transform $X$ to the claimed form without decreasing the seller's revenue. First, if $x_{i,\sigma} > 0$ for some $\sigma < M$, $i < k_\sigma$, then the buyers of type $i$ will not buy when signal $\sigma$ is sent, contributing nothing to the seller's revenue. Therefore, setting $x_{i,\sigma} = 0$ instead does not lower the seller's revenue, and increasing $x_{i,M}$ by the same amount again cannot decrease the seller's revenue. Hence, we may assume that for all signals $\sigma < M$, we have $x_{i,\sigma} > 0$ only for $i \ge k_\sigma$.

Next, if $x_{i,\sigma} > 0$, then $x_{i,\sigma} = p_i$. We distinguish two cases: if there is unallocated probability mass of type $i$, then $x_{i,\sigma}$ can simply be raised. If $x_{i,\sigma'} > 0$ for $\sigma' > \sigma$, we can lower $x_{i,\sigma'}$ to 0 while raising $x_{i,\sigma}$ by the same amount. Because $\sigma < \sigma' \le M$, we have that $i \ge k_\sigma$, so the seller's revenue increases by $x_{i,\sigma'} (v_{k_\sigma} - v_{k_{\sigma'}}) \ge 0$.

So far, we have shown that the signals partition the buyer types into sets such that for each buyer type, all of its probability mass goes to its unique designated signal. It remains to show that the partitions are intervals. If not, then there would be two signals $\sigma' > \sigma$ and types $i < i'$ such that $x_{i,\sigma} = p_i$ and $x_{i',\sigma'} = p_{i'}$. Then, reallocating the probability mass $x_{i',\sigma'}$ to signal $\sigma$ instead increases the seller's revenue by at least $x_{i',\sigma'} (v_{k_\sigma} - v_{k_{\sigma'}}) \ge 0$.

The dynamic program for segmentation into intervals is now standard. Let $R(i,m)$ denote the optimal revenue a seller can obtain from buyer types $\{i, i+1, \ldots, n\}$ with $m$ signals, when the lowest price is $v_i$. $R(i,m)$ satisfies the recurrence $R(i,0) = 0$ and $R(i,m) = \max_{i < i' \le n+1} \left( R(i', m-1) + v_i \sum_{j=i}^{i'-1} p_j \right)$, with the convention $R(n+1, m) = 0$. The maximum attainable revenue can be found by exhaustive search of $R(i,M)$ over all $i$.
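A short sketch of this dynamic program, following the interval structure of Lemma 3.14, might look as follows in Python; the function name and the convention that leftover low-value types contribute no revenue are our own choices.

from functools import lru_cache

def optimal_revenue_segmentation(v, p, M):
    # v[0..n-1]: valuations in increasing order; p[0..n-1]: their prior probabilities.
    # Returns the maximum expected seller revenue achievable with at most M signals.
    n = len(v)
    prefix = [0.0] * (n + 1)
    for j in range(n):
        prefix[j + 1] = prefix[j] + p[j]

    @lru_cache(maxsize=None)
    def R(i, m):
        # Optimal revenue from types i, ..., n-1 using at most m more signals,
        # when the next (lowest) price offered is v[i].
        if i == n or m == 0:
            return 0.0
        best = 0.0
        for i_next in range(i + 1, n + 1):
            # Types i .. i_next-1 form one segment offered price v[i]; all of them
            # buy, contributing v[i] * (p[i] + ... + p[i_next-1]).
            best = max(best, v[i] * (prefix[i_next] - prefix[i]) + R(i_next, m - 1))
        return best

    # Types below the lowest price point contribute no revenue, so search over
    # all possible lowest price points v[i].
    return max(R(i, M) for i in range(n))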
Chapter 4
Dice-based Winner Selection Rules

4.1 Preliminaries

4.1.1 Winner Selection

Consider choosing a set of winners from among $n$ candidates. For example, the candidates may be bidders in an auction setting, or actions in a Bayesian persuasion setting; here, we study winner selection more generally. Each candidate could have a type, e.g., a bidder's value for the item in a single-item auction. The type $t_i \in T_i$ of candidate $i$ is drawn independently from a known distribution $f_i$. We assume without loss of generality that the candidates' type sets are disjoint, and use $T = \biguplus_{i=1}^n T_i$ to denote the set of all types. Recall that our example of allocating gold coins in the introduction is a winner-selection environment. The team members are the candidates and their contribution levels are their types. The feasible winner sets are defined by a truncated partition matroid.

A winner-selection rule $\mathcal{A}$ maps each type profile $t = (t_1, \ldots, t_n)$, possibly randomly, to one of a prescribed family of feasible sets $\mathcal{I} \subseteq 2^{[n]}$. When $i \in \mathcal{A}(t)$, we refer to $i$ as a winning candidate, and to $t_i$ as his winning type. Writing $f = f_1 \times \cdots \times f_n$ for the (independent) joint type distribution, we also refer to $(f, \mathcal{I})$ as the winner-selection environment. When $\mathcal{I}$ is the family of singletons, as in the setting of the single-item auction, we call $(f, \mathcal{I})$ a single-winner environment. This general setup captures the allocation rules of general auctions with independent unit-demand buyers, albeit without specifying payment rules or imposing incentive constraints. Moreover, it captures Bayesian persuasion with independent action payoffs, albeit without enforcing the persuasiveness constraints defined by Inequality (2.1). In most of the remainder of the chapter, we focus on winner-selection environments $(f, \mathcal{I})$ where $\mathcal{I}$ is the family of independent sets of a matroid $\mathcal{M}$. We therefore also use $(f, \mathcal{M})$ to denote the environment.

4.1.2 Interim Rules

A (first-order) interim rule $x$ specifies the winning probability $x_i(t) \in [0,1]$ for all $i \in [n]$, $t \in T_i$ in an environment $(f, \mathcal{I})$. If we are given a mechanism, then $x_i(t)$ can be considered a summary of its behavior, specifying for each type of each agent the probability that this candidate will win when having his type. $x_i(t)$ can also be interpreted prescriptively: we would like to design a mechanism such that agent $i$, having type $t$, will win with probability $x_i(t)$. More precisely, we say that a winner-selection rule $\mathcal{A}$ implements the interim rule $x$ for a prior $f$ if it satisfies the following: if the type profile $t = (t_1, \ldots, t_n)$ is drawn from the prior distribution $f = f_1 \times \cdots \times f_n$, then $\Pr[i \in \mathcal{A}(t) \mid t_i = t] = x_i(t)$. An interim rule is feasible (or implementable) within an environment $(f, \mathcal{I})$ if there is such a winner-selection rule implementing it that always outputs an independent set of $\mathcal{I}$.

4.1.3 Border's Theorem and Implications for Single-Winner Environments

The following theorem characterizes the space of feasible interim rules for single-winner environments.

Theorem 4.1 (Border [11, 12]) An interim rule $x$ is feasible for a single-winner environment if and only if for all possible type subsets $S_1 \subseteq T_1, S_2 \subseteq T_2, \ldots, S_n \subseteq T_n$,
$$\sum_{i=1}^{n} \sum_{t \in S_i} f_i(t)\, x_i(t) \le 1 - \prod_{i=1}^{n} \Big(1 - \sum_{t \in S_i} f_i(t)\Big). \qquad (4.1)$$

The following result leverages Theorem 4.1 to show that efficient algorithms exist for checking the feasibility of an interim rule, and for implementing a feasible interim rule.

Theorem 4.2 ([15, 3]) Given explicitly represented priors $f_1, \ldots, f_n$ and an interim rule $x$ in a single-winner setting, the feasibility of $x$ can be checked in time polynomial in the number of candidates and types. Moreover, given a feasible interim rule $x$, an algorithm can find a winner-selection rule implementing $x$ in time polynomial in the number of candidates and types.

In our efficient construction for single-winner settings, we utilize a structural result which shows that checking only a subset of Border's constraints suffices [11, 45, 15]. This subset of constraints can be identified efficiently.

Theorem 4.3 (Theorem 4 of [15]) An interim rule $x$ is feasible for a single-winner setting if and only if for all possible $\alpha \in [0,1]$, the sets $S_i(\alpha) = \{t \in T_i \mid x_i(t) > \alpha\}$ satisfy the following Border constraint:
$$\sum_{i=1}^{n} \sum_{t \in S_i(\alpha)} f_i(t)\, x_i(t) \le 1 - \prod_{i=1}^{n} \Big(1 - \sum_{t \in S_i(\alpha)} f_i(t)\Big).$$
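As an illustration, Theorem 4.3 suggests the following feasibility check: only thresholds $\alpha$ equal to the distinct values of $x$ (together with 0) need to be examined, since the sets $S_i(\alpha)$ change only at those values. The Python sketch below is ours (the dictionary-based representation, helper name, and numerical tolerance are assumptions), not the algorithm of [15].

def is_feasible_single_winner(priors, x):
    # priors[i]: dict type -> f_i(type); x[i]: dict type -> interim winning probability.
    # Checks the Border constraints of Theorem 4.3 at all relevant thresholds.
    thresholds = sorted({x[i][t] for i in range(len(x)) for t in x[i]} | {0.0})
    for alpha in thresholds:
        lhs, prod = 0.0, 1.0
        for i, f in enumerate(priors):
            mass = sum(f[t] for t in f if x[i][t] > alpha)
            lhs += sum(f[t] * x[i][t] for t in f if x[i][t] > alpha)
            prod *= 1.0 - mass
        if lhs > 1.0 - prod + 1e-9:
            return False
    return True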
4.1.4 Border's Theorem for Matroid Environments

For general settings with matroid constraints, Alaei et al. [3] established the following generalized "Border's Theorem."

Theorem 4.4 (Theorem 7 of [3]) Let $\tau$ map each type $t$ to the (unique) candidate $i$ with $t \in T_i$. An interim rule $x$ is feasible within an environment $(f, \mathcal{M})$ if and only if for all possible type subsets $S_1 \subseteq T_1, S_2 \subseteq T_2, \ldots, S_n \subseteq T_n$,
$$\sum_{i=1}^{n} \sum_{t \in S_i} f_i(t)\, x_i(t) \le \mathbb{E}_{t \sim f}\big[r_{\mathcal{M}}(\tau(t \cap S))\big],$$
where $S = \bigcup_{i=1}^{n} S_i$.

In later sections, we omit the function $\tau$, and for any type set $S$ just write $r_{\mathcal{M}}(S)$ instead of $r_{\mathcal{M}}(\tau(S))$. For the Border constraints above, the left-hand side is just the expected number of winners for type set $S$, and the right-hand side is the expected maximum number of candidates which can be selected from $S$. It is easy to see that the constraints are necessary for the feasibility of the given interim rule: the number of winners cannot be more than the rank of the set of candidates that show up. However, it is non-trivial to show that they are sufficient.

4.1.5 Winner-Selecting Dice

We study winner-selection rules based on dice, as a generalization of order sampling to multiple types and general constraints. A dice-based rule fixes, for each type $t \in T_i$, a distribution $D_{i,t}$ over real numbers, which we call a die. Given as input the type profile $t = (t_1, \ldots, t_n)$, the rule independently draws a score $v_i \sim D_{i,t_i}$ for each candidate $i$ by "rolling his die;" it then selects the feasible set of candidates maximizing the sum of scores as the winner set, breaking ties with a predefined rule. We stress that the independence of the draws is the key contribution of our approach; when arbitrary dependencies are allowed, any ex-post rule can be considered "dice-based."

In our definition of dice-based rules, a feasible set that maximizes the sum of rolled values is chosen as the winner set. Such a way of choosing the winner sets comes from Myerson's optimal auction. In Myerson's nomenclature, each bidder has a random type, namely, the utility for being selected as the winner. For each bidder, Myerson's auction defines a virtual value function that maps his type to a real number called a virtual value. It is shown that the interim allocation rule is enough to determine the revenue the auctioneer can get, and the revenue-maximizing allocation rule should always choose the feasible bidder set that maximizes the sum of virtual values [32]. For example, in a $k$-unit auction, given a profile of bidders' utilities, the $k$ bidders with the highest non-negative virtual values will be chosen as winners. Our dice-based rule generalizes Myerson's auction by mapping the types of candidates to dice, instead of virtual values.

In this chapter, we will mainly discuss matroid feasibility constraints, for which a feasible set maximizing the sum of scores can be found by a simple greedy algorithm: candidates are added to the winner set in decreasing order of their scores, breaking ties uniformly at random, as long as the new winner set is still an independent set of the matroid and their scores are positive. Because the types and their scores are drawn at random, this random process results in a distribution over winning sets, which in turn induces winning probabilities of the types, i.e., an interim rule. In other words, as defined in Section 4.1.2, the dice-based rule implements this interim rule. Let $T$ be the set of all types of all candidates and $\mathcal{D} = (D_t)_{t \in T}$ be a vector of dice, one per type. Given an interim rule $x$ and a winner-selection environment $(f, \mathcal{I})$, we say that $\mathcal{D}$ implements $x$, or $\mathcal{D}$ describes winner-selecting dice for $x$ in $(f, \mathcal{I})$, if the dice-based rule given by $\mathcal{D}$ implements $x$ within the environment $(f, \mathcal{I})$.
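To make a single evaluation of such a dice-based rule concrete, here is a minimal Python sketch, with dice represented as sampler functions and the matroid given by an independence oracle; these representations and the function name are our own choices, not part of the formal definition above.

import random

def dice_based_winners(profile, dice, is_independent):
    # profile: tuple of realized types, one per candidate.
    # dice[t]: a zero-argument function returning one rolled face for type t.
    # is_independent: oracle deciding whether a set of candidates is independent
    # in the matroid. Greedily selects winners in decreasing order of scores.
    scores = [(dice[t](), i) for i, t in enumerate(profile)]
    random.shuffle(scores)                       # random tie-breaking among equal scores
    scores.sort(key=lambda s: s[0], reverse=True)
    winners = set()
    for score, i in scores:
        if score > 0 and is_independent(winners | {i}):
            winners.add(i)
    return winners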
4.2 Summary of Results

As mentioned previously, all our results are restricted to settings in which the candidate type distributions are independent. It follows from Myerson's characterization for single-item auctions that every first-order interim rule corresponding to some optimal auction admits a dice-based implementation. However, there are also interim rules that are not optimal for any auction. According to [48, 11], the linear program which solves for the revenue-maximizing interim rule is equivalent to the following LP: maximize $\sum_{i=1}^{n} \sum_{t \in T_i} f_i(t)\, x_i(t)\, v_i(t)$ subject to $x$ being a feasible interim rule. Here $v_i$ is the (ironed) virtual value function. When no type has virtual value 0, every optimal solution is at the boundary of the polytope of feasible interim rules. When some types have virtual value 0, and we employ the convention of never awarding those types an item, again we get an interim rule at the boundary. Therefore, any interior point of this polytope cannot arise as the interim rule of any optimal auction.

Our main result is the following theorem, showing that every feasible first-order interim rule with respect to a matroid constraint admits a dice-based rule.

Theorem 4.5 Let $(f, \mathcal{M})$ be a matroid winner-selection environment with a total of $m$ types, and let $x$ be an interim rule that is feasible within $(f, \mathcal{M})$. There exist winner-selecting dice $\mathcal{D}$, each of which has at most $m+1$ faces, which implement $x$.

This illustrates that the structure revealed by Myerson's characterization for revenue-optimal interim rules is more general, and applies to other settings in which only first-order interim information is relevant. For example, single-item auctions with (public or private) budgets are such a setting (see, e.g., [54]).

This result is also a generalization of order sampling [2, 60]. Recall that for order sampling, we need to sample a subset of size $k$ from a population of $n$ candidates. Here the candidates have no type, or equivalently, deterministic types.
For every candidate, we are given the probability that it should be sampled. The sampling process is almost identical to our dice-based rule with a k-uniform matroid constraint: each candidate (since there is no type) rolls a die independently, and the k candidates with highest rolled values are chosen. Aires et al. [2] show 46 that for every feasible interim rule in this k-uniform, no type setting, there are always continuous dice that implement it. Our result generalizes this to the general matroid, multi-type setting. The key property used by the proof of Aires et al. [2] is the negative externalities between the winning probabilities of dierent candidates. Matroids are exactly the natural generalization of such negative externalities. Our proof of Theorem 4.5 is inherently non-constructive. We use monotonicity properties to show that a certain highly nonlinear system of equations has a solution, which gives us the parameters for dice with innite support. We then show, using ideas from linear programming, that the support can be reduced to m + 1 for each die. An interesting question is whether the dice can be eciently computed. While we cannot answer the question in full generality, we answer it positively for single-winner environments in Section 4.4: an algorithm can construct the dice-based rule eciently. When the types are identically distributed, we also constructively show (in Section 4.5) that every rst-order interim rule which is symmetric across candidates admits a symmetric dice implementation; i.e., dierent candidates have the same die for the same type. This is consistent with Myerson's symmetric characterization of optimal single-item auctions with i.i.d. bidders, and generalizes it to any other rst-order single-winner selection setting in which candidates are identical. Single-item auctions with identically distributed budgeted bidders are such a setting, and a symmetric dice-based implementation of the optimal allocation rule was already known from [54]. For single-winner cases, we also show the converse direction: how to eciently compute the rst-order interim rule of a given dice-based winner selection rule. In eect, these results show that collections of dice are a computationally equivalent description of single-winner rst-order interim information. This implies a kind of equivalence between the two dominant approaches for mechanism design: the Myersonian approach based on virtual values (i.e., dice), and the Borderian approach based on optimization over interim rules. 47 4.3 Existence of Dice-based Implementations for Matroids In this section, we prove Theorem 4.5. Recall that, in our nomenclature, the main result of [2] was to show the existence of dice-based rules for a k-uniform matroid, single-type setting with continous dice. In the rst part of our proof, we generalize this result in two ways: from k-uniform matroids to general matroids and from a single type to multiple types drawn from known distributions. As outlined above, the dice resulting from this generalized construction have innite (continuous) support. In the second part of the proof we convert the continuous dice to dice with nitely many faces, while keeping the interim probabilities unchanged. 4.3.1 Continuous Winner-Selecting Dice Theorem 4.6 LetM be a matroid, andx a feasible interim rule within the winner- selection environment (f;M). There exist winner-selecting diceD over R that implementx in (f;M). 
Recall that the type sets of candidates are disjoint, so we use $f(t)$ and $x(t)$ as shorthand for $f_i(t)$ and $x_i(t)$, where $i = \tau(t)$ is the candidate for whom $t \in T_i$. Moreover, given a set of types $S \subseteq T$, we write $S_i = S \cap T_i$. Recall the Border constraints
$$\sum_{i=1}^{n} \sum_{t \in S_i} f_i(t)\, x_i(t) \le R(S), \qquad (4.2)$$
where $R(S) = \mathbb{E}_{t \sim f}[r_{\mathcal{M}}(t \cap S)]$ is the expected rank of the types in $S$ which show up, a submodular function over the type set $T$. The Border constraints can therefore be interpreted as follows: an interim rule $x : T \to [0,1]$ is feasible for $f$ and $\mathcal{M}$ if and only if $\tilde{x}$ is in the polymatroid given by $R(S)$, where $\tilde{x}(t) := f(t)\, x(t)$. Equivalently, $x$ is feasible if and only if the submodular slack function $\phi(S) = R(S) - \sum_{t \in S} f(t)\, x(t)$ is non-negative everywhere.

When $x$ is feasible, we call a set $S \subseteq T$ tight for $(f, x, \mathcal{M})$ if the Border constraint (4.2) corresponding to $S$ is tight at $x$, i.e., $\sum_{i=1}^{n} \sum_{t \in S_i} f(t)\, x(t) = R(S)$. By definition, $S = \emptyset$ is always tight. The family of tight sets, being the family of minimizers of the submodular slack function, forms a lattice: the intersection and the union of two tight sets is a tight set.

Proposition 4.7 $\phi(S)$ is a submodular function.

Proof. Consider $A \subseteq B \subseteq T$ and a type $t$. Then
$$\phi(A \cup \{t\}) - \phi(A) = R(A \cup \{t\}) - R(A) - f(t)\, x(t) \ge R(B \cup \{t\}) - R(B) - f(t)\, x(t) = \phi(B \cup \{t\}) - \phi(B).$$

Proposition 4.8 The tight sets for feasible $(f, x, \mathcal{M})$ form a lattice.

Proof. Consider tight sets $A$ and $B$, i.e., $\phi(A) = \phi(B) = 0$. By submodularity,
$$\phi(A \cap B) \le \phi(A) + \phi(B) - \phi(A \cup B) \le 0.$$
Since the slack function is non-negative, we have $\phi(A \cap B) = 0$, namely, $A \cap B$ is also tight. Similarly, we have that $A \cup B$ is tight. Therefore, all tight sets form a lattice.

Remark 4.9 The tightness of a set $S$ means that the expected number of winners from $S$ equals the expected rank of the types in $S$ which show up. In other words, $S$ is tight if and only if a maximum independent subset of $t \cap S$ is always selected as winners.

Because tight sets form a lattice, there are minimal tight sets. By Remark 4.9, whenever some type in such a minimal tight set shows up, a maximum independent subset of it has to win. Therefore, these tight sets need to be treated preferentially, i.e., assigned higher faces on their dice, compared to types outside them. Because they play such an important role, we define them as barrier sets. Formally, we define the set of active types $T^+ = \{t \in T : f(t)\, x(t) > 0\}$ to be the types who win
For each type t, we assign a parameter t > 0. Also, we choose a global parameter . To sample from D t , we draw a primitive roll v t D and output t v t . Without loss of generality, assume that the barrier set S containsK types, and number them by 1; 2;:::;K. Proof. Whenj(S)j = 1 and S is tight, all types in S belong to the same candidate, so there is no competition between candidates. Because of the tightness of S, we can simply assign a single-sided die with face value 1 to each type in S. Thus in the proof, we assume that either there is more than one candidate with a type in S or that S is not tight. 2 In both cases, this implies that x(t) < 1 for all 1 For example, one can use an exponential distribution. 2 Recall thatS could be the (possibly non-tight) barrier set of all active types, if no non-empty set is tight. 50 t2S. For ifx(t) = 1, the singletonftg would be a proper tight subset ofS, which contradicts the assumption that S is a barrier set. We deneg t (; 1 ;:::; K ) to be the interim probability of typet winning in the environment (f;MjS), when a scaled-shifted die D t with parameters t and is assigned to each t2S and the corresponding dice-based rule is used to select the winner set. Each g t is continuous in all of its parameters. Using S = ( t ) t2S for short, consider the following system of equations with variables S : X t2S x(t)g t (; S ) = X t2S f(t)x(t) (4.3) g t (; S ) =x(t); for all t2 [K]: (4.4) The objective of the proof is to show that the system of equations (4.3){(4.4) admits a solution. This solution will directly induce a dice system that implementsx for types in S. Notice that the system is redundant: Equation (4.3) is implied by the Equations (4.4). Throughout the proof, we use i to denote the prex (; 1 ;:::; i ) and i to denote the sux of parameters ( i ; i+1 ;:::; n ). In particular, 0 = . We prove the following inductively for k = 1;:::;K: For any positive sux k > 0, there is a prex k1 so that ( k1 ( k ); k ) satises the rst k equations in the system (4.3){(4.4). Applying the claim with k =K, for every positive K , there is a solution that satises the rstK equations. Because of the redundancy of the system, satisfying the rst K equations guarantees that the last equation is also satised. For the base case k = 1, consider an arbitrary positive 1 = S . We need to prove the existence of a such that P t2S f(t)g t (; S ) = P t2S f(t)x(t). When = 0, all types get non-negative die rolls. Therefore, a maximum independent subset of t\ S is always selected. Thus, S is tight at (g t (0; S )) t2S , i.e., P t2S f(t)g t (0; S ) = R(S) P t2S f(t)x(t). On the other hand, as increases, the probability that all die rolls are negative goes to 1, meaning that in the limit, no agent wins. Thus, lim !1 g t (; S ) = 0. Furthermore, each g t (; S ) 51 strictly and continuously decreases with . Because 0 < P t2S f(t)x(t) R(S), by the Intermediate Value Theorem, for every S , there is a unique such that P t2S f(t)g t (; S ) = P t2S f(t)x(t). We denote this unique by h 0 ( S ); notice that h 0 ( S ) is a continuous function of S . When S is a tight barrier set, i.e., R(S) = P t f(t)x(t), the equation is satised for = 0, so h 0 ( S ) = 0. This establishes the base case k = 1 of the induction hypothesis. For the inductive step, x an arbitrary k 1, and let k+1 > 0 be arbitrary. For any xed k > 0, by induction hypothesis, there is a unique k1 ( k ; k+1 ) such that ( k1 ( k ; k+1 ); k ; k+1 ) satises the rst k equations of (4.3){(4.4). 
Lemma 4.11 below shows that there is a unique k = h k ( k+1 ) such that ( k (h k ( k+1 ); k+1 );h k ( k+1 ); k+1 ) satises the rst k + 1 equations of (4.3){(4.4). The inductive claim now follows by dening k+1 ( k+1 ) = ( k (h k ( k+1 ); k+1 );h k ( k+1 )): Lemma 4.11 Fix any feasible x. For every k and sux k+1 , there is a unique k 2 (0;1) such that g k ( k1 ( k ; k+1 ); k ; k+1 ) =x(k). The proof of Lemma 4.11 is based in large part on the following monotonicity properties: Lemma 4.12 Assume that S contains types of more than one candidate or is not tight. For any t k, consider g t ( k1 ( k ; k+1 ); k ; k+1 ) as a function of i , for ik. The function g t satises the following properties: 1. It is weakly decreasing in i , for all i6=t;ik. 2. It is strictly increasing in t . Lemma 4.13 If (t)6=(i), then g t ( S ) is strictly decreasing in i . If in addition ik and tk, then g t ( k1 ( k ); k ) is also strictly decreasing in i . For notational convenience, we dene the shorthand g t ( k ; k+1 ) := g t ( k1 ( k ; k+1 ); k ; k+1 ). Furthermore, for xed k+1 (which will be clear from the context), we dene h t ( k ) to be the t th component of k1 ( k ; k+1 ) t , for any t<k. 52 Proof of Lemma 4.11. As we did in the proof of Lemma 4.10 for the base case of the induction claim, we will examine the limits of g k ( k ; k+1 ) as k ! 0 and k ! 1. By doing so, we will establish that x(k) lies between the two limits. Then, using the continuity and strict monotonicity (by Lemma 4.12) of g k ( k ; k+1 ), the Intermediate Value Theorem implies that there is a unique k satisfying g k ( k ; k+1 ) =x(k). To compute lim k !1 g k ( k ; k+1 ), consider the set of types A =ft < k j lim k !1 h t ( k ) =1g. As k !1, with probability approaching 1, A[fkg will dominate all other types. We distinguish two cases, based on whether lim k !1 h 0 ( S ) is nite or not. { If lim k !1 h 0 ( k ) 6= 1, the parameter will be nite in the limit. Therefore, a maximum independent subset of A[fkg will be selected as winners, implying that A[fkg is tight in the limit. Because k1 ( k ; k+1 ) ensures that the rst k equations of (4.3){ (4.4) are satised, each type t 2 A wins with probability f(t)x(t). Because A[fkg is tight, lim k !1 f(k)g k ( k ; k+1 ) = R(A[fkg) P t2A f(t)x(t). From the Border constraint for the set A[fkg, f(k)x(k)+ P t2A f(t)x(t)<R(A[fkg); otherwiseA[fkg(S would be a tight set, contradicting that S is a barrier set. Rearranging, we have shown that lim k !1 f(k)g k ( k ; k+1 )>f(k)x(k). { If lim k !1 h 0 ( k ) =1, then all types t > k will get negative die rolls with probability approaching 1, so lim k !1 g t ( k ; k+1 ) = 0 for all types t>k. Thus, lim k !1 f(k)g k ( k ; k+1 ) + P t<k f(t)x(t) = P t2S f(t)x(t), i.e., lim k !1 f(k)g k ( k ; k+1 ) = P tk f(t)x(t) > f(k)x(k), because all types in a barrier set are active by denition. To compute lim k !0 g k ( k ; k+1 ), dene the type set A = f1 t < k j lim k !0 h t ( k )> 0g[fk+1;:::;Kg. As k ! 0, with probability approaching 1,A will dominate all other types because for t = 2A,h t ( k ) goes to 0. Again, we distinguish two cases, based on whether lim k !1 h 0 ( k ) = 0 or not. 53 { If lim k !1 h 0 ( k ) = 0, then in the limit, a maximum independent subset of A must always be chosen as winners, so A approaches tightness. Again, k1 ( k ; k+1 ) ensures that g t ( k ; k+1 ) = x(t) for t < k, so we have lim k !0 P t>k f(t)g t ( k ; k+1 ) = R(A) P t2A;t<k f(t)x(t). 
Combined with P tk f(t)g t ( k ; k+1 ) = P tk f(t)x(t), which is also ensured by k1 ( k ; k ), we obtain that lim k !0 f(k)g k ( k ; k+1 ) = X tk f(t)x(t) R(A) X t2A;t<k f(t)x(t) ! : Rearranging the Border constraint corresponding to A Snfkg gives us that 0> P t>k f(t)x(t) R(A) P t2A;t<k f(t)x(t) becauseA(S is not tight. Adding f(k)x(k) on both sides, we get f(k)x(k) > P tk f(t)x(t) R(A) P t2A;t<k f(t)x(t) . Finally, canceling outf(k), we obtain that x(k)> lim k !0 g k ( k ; k+1 ). { If lim k !0 h 0 ( k ) > 0, as k ! 0, type k will get a negative roll with probability approaching 1, so lim k !0 g k ( k ; k+1 ) = 0<x(k). In summary, we have shown that lim k !0 g k ( k ; k+1 ) < x(k) < lim k !1 g k ( k ; k+1 ). Because g k ( k ) is continuous and monotone in k , by the Intermediate Value Theorem, there exists a unique k such that g k ( k ) = x(k). In the proofs for Lemmas 4.12 and 4.13, we will often want to analyze the eect of keeping i for all i6= t unchanged, while changing t to 0 t . Thereto, we always consider the following coupling of dice rolls in the two scenarios, which we call primitive-roll coupling for t: all dice will obtain the same primitive rolls v under the two scenarios, but scale them dierently. More precisely, we consider the rolls ( i v i ) i2S and ( 0 i v i 0 ) i2S , for any given primitive rolls v, where 0 i = i for i6=t;ik. Recall that all i for i<k are functions of k . 54 Proof of Part 1 of Lemma 4.12. To show that g t ( k ; k+1 ) is weakly decreas- ing in i for allik;i6=t, we again use induction onk. The induction hypothesis is the following: 1. Each entry of k ( k+1 ) is weakly increasing in i for i>k. 2. g t ( k+1 ) is weakly decreasing in i for i6=t;i>k. We begin with the base case k = 0. The rst part of the base case | that 0 ( S ) is weakly increasing in i | has been shown in the proof of Lemma 4.10. To prove the second part of the base case, we use primitive-roll coupling for i with i < 0 i . For every scenario with primitive rolls v in which t is not a winner with S , we show that t cannot become a winner with 0 S . Increasing i to 0 i can only (weakly) increase the threshold = 0 ( S ), by the rst part of the base case. Let A be the set of types with rolls higher than the roll of type t when the rolls are derived fromv using S . Since t is not winning, we have 0 = r M (A[ftg)r M (A) r M (A[ft;ig)r M (A[fig): The inequality is due to the submodularity of the matroid rank function, and shows formally that (potentially) addingi to the set of types with higher rolls than t cannot help t become a winner. For the induction step, consider some k 1 and x a k+1 . By the induction hypothesis, each entry of k1 ( k ; k+1 ) is weakly increasing in i for i k, and g t ( k ; k+1 ) is weakly decreasing in i for i6=t;ik. We rst show that each entry of k ( k+1 ) is weakly increasing in i for i > k. The key is component k of k ( k+1 ). By the second part of the induction hypothesis, applied with t = k, g k ( k1 ( k ; k+1 ); k ; k+1 ) is weakly decreasing in i . Since g k ( k1 ( k ; k+1 ); k ; k+1 ) is dened 55 as the winning probability of k with the given parameters, when i is raised to 0 i > i , in order to keep g k ( k1 ( 0 k ; 0 k+1 ); 0 k ; 0 k+1 ) = x(k) = g k ( k1 ( k ; k+1 ); k ; k+1 ), we require 0 k k . By applying the induction hypothesis twice, once for i and once for k , all entries of k1 ( k ; k+1 ) weakly increase. Therefore, all entries of k ( k+1 ) are weakly increasing in all i for i>k. 
To prove the second part of the inductive step, recall that g t ( k+1 ) = g t ( k ( k+1 ); k+1 ) = g t (h k ( k+1 ); k+1 ). For t k, g t ( k+1 ) = x(t) is a constant, and in particular weakly decreasing in i . Consider some t>k;i>k;t6=i. First, h k ( k+1 ) is weakly increasing in i by the rst part of the induction hypothesis. By the second part of the induction hypothesis, g t ( k ; k+1 ) is weakly decreasing in all of its variables except t , so substituting k =h k ( k+1 ) shows that g t (h k ( k+1 ); k+1 ) is weakly decreasing in i . Proof of Part 2 of Lemma 4.12. Recall that t k. To show that g t ( k ) is strictly increasing in t , we will show that at least one typeik hasg i ( k ) strictly decreasing in t . By Part 1 of the lemma, eachg i ( k ) forik is weakly decreasing in t . By denition, k1 ( k ) ensures that the rst k equations of the system (4.3){(4.4) are satised; this implies that P jk f(j)g j ( k ) = P jk f(j)x(j). Thus, if at least one of the g i ( k ) is strictly decreasing in t , to keep the summation P jk f(j)g j ( k ) unchanged, g t ( k ) must increase strictly in t . We consider two possible cases for S, as permitted by the assumption of the lemma: S is not tight. In this case, we rst show that h 0 ( S ) is a strictly increasing function of t . Consider the primitive-roll coupling for 0 t > t . For any primitive rolls v, the number of winners will not decrease when t increases to 0 t . Thus, we can bound the summation P i2S f(i)g i (; S ) P i2S f(i)g i (; 0 S ). Because S is not tight, with non-zero probability, v is such that v i i < 0 for all i. And because v is drawn from a continuous distribution, with positive probability, v t 0 t > 0 as well. In that case, t is the only candidate with a positive die roll. This implies 56 that P i2S f(i)g i (; S ) < P i2S f(i)g i (; 0 S ). Because h 0 ( 0 S ) is dened as the unique satisfying the equation P i2S f(i)g i (; 0 S ) = P i2S f(i)x(i), we obtain that h 0 ( 0 S )>h 0 ( S ). Now consider any primitive rollsv under which a type i6= t wins, andv is such that v i i h 0 ( S ) > 0 but v i i h 0 ( 0 S ) < 0. Such rollsv must occur with positive probability, because all the dice are fully supported over (0;1). In that case, i is no longer a winning type under 0 S . Thus g i ( S ) is strictly decreasing in t . There is more than one candidate, i.e., there is a type i with (i)6=(t). { If there is such a type i k with (i)6= (t), then g i ( k ) is strictly decreasing in t by the second part of Lemma 4.13. { Otherwise, all types t 0 k have (t 0 ) = (t). Dene 0 k = k for all entries except type t, where it equals 0 t > t . Consider a type i < k with (i)6= (t). By denition of k1 (), we get that g i ( 0 k ) = x(i) =g i ( k ). And by the rst part of Lemma 4.13, g i ( k1 ( k ); k )> g i ( k1 ( k ); 0 k ). So i wins with strictly higher probability under the parameters ( k1 ( 0 k ); 0 k ) than under the parameters ( k1 ( k ); 0 k ). As shown in Part 1 of Lemma 4.12, g i ( S ) is weakly decreasing in all of its variable except i . Thus, the only way that the winning probability of agenti can increase is for thei th component of k1 ( 0 k ) to be strictly greater than the i th component of k1 ( k ). Finally, consider some type t 0 k;t 0 6= t with (t 0 ) = (t) 6= (i). Because all components of k1 ( k ) weakly increase going from k to 0 k , and the i th component strictly increases, we get that g t 0( k ) < g t 0( 0 k ). Thus, we have shown that there is a t 0 6= t such that g t 0( k ) is strictly decreasing in t . Proof of Lemma 4.13. 
We prove both statements using primitive-roll coupling. For the second part of the lemma, notice that when i is increased to 0 i forik, all 57 components of k1 ( k ) weakly increase. Thus, all components of ( k1 ( k ); k ) are weakly larger than those of ( k1 ( 0 k ); 0 k ), and 0 i > i , while 0 t = t . Thus, we can apply primitive-roll coupling in both cases. Since the matroid is non-separable, there is a circuit C that contains (t) and (i). Under the parameter vector S , with non-zero probability (over primitive rolls v), the candidates in C get the highest rolls, (t) (barely) wins with the second- lowest roll, and (i) gets the lowest roll among all candidates in C and does not win. When i increases to 0 i , with non-zero probability, the scaled roll for type t becomes the lowest so that t ceases to be a winner. Combining this with Part 1 of Lemma 4.12 (which states that g t ( S ) is weakly decreasing in i ), we have that g t ( S ) is strictly decreasing in i . To generalize Lemma 4.10 to arbitrary sets of types, we will need a construction that allows us to \scale" the faces of some dice such that they will always be above/below the faces of another set of newly introduced dice; such a construction will allow us to give dice for types in barrier sets higher faces than other dice. For the types of full-support distributions over [0;1) we have been using so far, this would be impossible. There is a simple mapping that guarantees our desired properties: we map faces from (0;1) to the set (1; 2) by mapping all positive s7! 2 1 1+s , and mapping all negative s to1. Notice that the new dice implement the same interim rule as the old ones: in matroid environments, the set maximizing the sum of die rolls is determined by the greedy algorithm, and hence, only the relative order between die faces matters. With the help of this mapping, we prove the following lemma, similar to Lemma 4.10. Lemma 4.14 Let x > 0 be a feasible interim rule within a winner-selection environment (f;M), whereM is a non-separable matroid. Fix a tight set S, and let b S be a minimal tight set that includes S as a proper subset, if such a set exists; otherwise let b S =T . Given diceD = (D t ) t2S that implementx S in (f;MjS), there are diceD 0 = (D 0 t ) t2 b S which implementx b S in (f;Mj b S). 58 Proof. First, we dene the diceD 0 t for typest2S: they are obtained by applying the transformation described previously to the corresponding dice D t . Thus, the range of positive faces is (1; 2) for these D 0 t . For types t2 b SnS, we construct new dice D 0 t in a similar way to the proof of Lemma 4.10. The construction is essentially identical, but all positive die faces s are mapped to 1 1 1+s . Notice that this mapping ensures that all die faces are strictly less than 1, and hence, these types t will always lose to types t 0 2 S. The main other change in the proof is to change the denition of g t (; b SnS ): it is now the interim winning probability of the type t2S 0 when winners are selected with the die rollsD 0 , assuming all types outside b S always get negative die rolls. None of the monotonicity properties of the functionsg t () will be aected under this new denition; thus, the same proof will go through, with only a slight change in the limits of f(k)g k ( k ; k+1 ): lim k !1 f(k)g k ( k ; k+1 )2f X tk f(t)x(t);R(A[fkg) X t2A f(t)x(t)g lim k !0 f(k)g k ( k ; k+1 )2f0; X tk f(t)x(t) 0 @ R(A) X t2AnB f(t)x(t) 1 A g: Proof of Theorem 4.6. First consider the case whenM is non-separable. 
Let T be the type set with m types. We dene the dice system as follows: First, for all types t with x(t) = 1, assign them a point distribution (single-sided die) at 2, so they always win. Next, for all types t with x(t) = 0, assign them a point distribution at1, so they never win. Next, we create dice for barrier sets S according to Lemma 4.10. Then, starting from the barrier sets, we repeatedly apply Lemma 4.14 to construct dice for larger tight sets b S)S (or all ofT ) implementing x b S . If M is separable, let M 1 ;:::;M k be the components of M. Using the construction from the previous paragraph for eachM j , letD j be the dice set constructed for M j . Since there is no circuit containing two candidates from dierent components, the winner set of one component has no eect on the winner 59 set of any other component. Thus, the union S k j=1 D j of dice implements the desired interim rule. 4.3.2 Winner-Selecting Dice with polynomially many faces The proof is based on the following generalization of the fundamental theorem of linear programming to uncountable dimensions. Theorem 4.15 (Theorem B.11 from [40]) Let f 1 ;:::;f m : X ! R be Borel measurable on a measurable spaceX, and let be a probability measure onX such that f i is integrable with respect to for each i = 1;:::;m. Then, there exists a probability measure ' onX with nite support, such that Z X f i d' = Z X f i d for all i. Moreover, there is such a ' whose support consists of at most m + 1 points. Proof of Theorem 4.5. Theorem 4.6 establishes that there is a vector of probability measuresD = (D t ) t2T over V = (1; 2)[f1g, satisfying the following for all i and t i 2T i : Z V X t2T i Y j6=i f(t j ) Z Z V n1 w t i (s 1 ;:::;s n )dD t 1 (s 1 )dD tn (s n )dD t i (s i ) =x(t i ): (4.5) where w t i (s 1 ;:::;s n ) equals 1 if type t i is a winning type with dice rolls s 1 ;:::;s n , and equals 0 otherwise. For a xed typet 2T i and an equation corresponding to (j;t 0 j ), we can change the order of integration to makedD t the outermost integral, and isolate the terms that do not involve dD t . Specically, we dene q j;t 0 j (s i ) = X t2T t j =t 0 j ;t i =t Y k6=j f(t k ) Z Z V n1 w t j (s 1 ;:::;s n )dD t 1 (s 1 )dD tn (s n ) 60 to be the inner integral over distributions of t6=t , as a function of s i , and c j;t 0 j = 8 > > > > > < > > > > > : 0; for t 0 j =t Z V X t2T j ;t i 6=t Y k6=j f(t k ) Z Z V n1 w t j (s 1 ;:::;s n ) dD t 1 (s 1 )dD tn (s n )dD t j (s j ); for t 0 j 6=t as the component of the integral that does not have dD t involved. Thus, Equation (4.5) can be rewritten as Z V q j;t 0 j (s i )dD t (s i ) +c j;t 0 j =x(t 0 j ): By Theorem 4.15,D t can be changed to a measureD 0 t with support of sizem + 1, such that for all j and t j 2T j , Z V q j;t j (s i )dD t (s i ) = Z V q j;t j (s i )dD 0 t (s i ): In other words, we can replace D t with D 0 t , which has at most m + 1 faces. Applying the same procedure to each type t in turn, all dice can be replaced by dice with at most m + 1 faces. 4.4 Ecient Construction of Dice for Single- Winner Environments In the preceding section, we proved the existence of winner-selecting dice by a \construction." However, the construction involves repeated appeals to the Intermediate Value Theorem, and is thus inherently non-computational. It is certainly not clear how to implement it eciently. In this section, we show that when the matroidM is 1-uniform, i.e., at most one winner can be selected, winner- selecting dice can be computed eciently. We prove the following. 
61 Theorem 4.16 Consider a winner-selection environment withn candidates, where each candidate i's type is drawn independently from a prior f i supported on T i , and at most one winner can be selected. If x is a feasible interim rule for f =f 1 f n , an explicit representation of the associated dice can be computed in time polynomial in n and m = P i jT i j. We use the same notation f(t) and x(t) as in Section 4.3. In the single-winner setting, the functionR(S) =E tf [r M (t\S)] can be treated as a natural extension off to subsets ofT : givenST , we letR(S) =f(S) = 1 Q n i=1 (1 P t2S i f(t)), which is the probability that at least one type in S shows up. Border's constraints can then be written as follows: P t2S f(t)x(t) f(S) for all S T: The slack function becomes f;x (S) = f(S) P t2S f(t)x(t) and is nonnegative everywhere whenx is feasible. Tight sets and barrier sets are dened as the special case of the denition for general matroids in Section 4.3. The algorithm Find Barrier Set, given as Algorithm 3, simply implements the denition of barrier sets as minimal non-empty tight sets, and therefore correctly computes a barrier set. ALGORITHM 3: Find Barrier Set(f;x) 1 T + ft2T :f(t)x(t)> 0g. 2 Dene g(S) =f(S) P t2S f(t)x(t) for ST + . 3 if minfg(S) :;(ST + g6= 0 then 4 return T + . 5 else 6 T T + . 7 while there is a type t2T such that minfg(S) :;(ST nftgg = 0 do T T nftg. 8 return T . The following lemma characterizes the key useful structure of barrier sets for the single-winner setting. Lemma 4.17 Let f and x be such that x is feasible for f. If there are multiple barrier sets for (f;x), then there is a candidate i such that each barrier set is a singletonftg with t2T i . 62 In other words, either there is a unique barrier set, or all barrier sets are singletons of types from a single candidate. Proof. Let A;B be any two barrier sets. Because A and B are both tight, the lattice property of tight sets implies that A\B =;. We rst show that there is a candidate i with AT i and BT i . Suppose not for contradiction; then, there exist i6=j and typest i 2A\T i andt j 2B\T j . With non-zero probability, the typest i andt j show up at the same time. However, according to the denition of a tight set, when a type in A shows up, the winner must be a candidate with type in A, and the same must hold for B. Then, the winner's type would have to be inA\B with non-zero probability. This contradicts the disjointness of A and B. It remains to show that all barrier sets are singletons. Since A is tight and all types in A belong to the same candidate, P t2A f(t)x(t) = f(A) = P t2A f(t). Hence,x(t) = 1 for allt2A, and because barrier sets are minimally tight, A must be a singleton. 4.4.1 Description of the Algorithm Given a prior f and a feasible interim rule x, the recursive procedure Contruct Dice, shown in Algorithm 4, returns a family of diceD implementing x forf. It operates as follows. There are two simple base cases: when no candidate ever wins, and when a single type of a single candidate always wins. In the recursive case, the algorithm carefully selects a typet and awards its die the highest-valued face M 0 . It assigns this new face a probability as large as possible, subject to still permitting implementation ofx. We chooset as a member of a barrier set; this is important in order to guarantee that the algorithm makes signicant progress. The subroutineDecrement, shown as Algorithm 5, essentially conditions both f and x on the face M 0 not winning. 
Specically, Decrement computes the conditional type distribution f 0 , and an interim rulex 0 , such that if there were a dice implementation ofx 0 for f 0 , then adding M 0 to the die of t would yield a set of dice implementingx for f. 63 ALGORITHM 4: Contruct Dice (f;x) Input : PDFs f 1 ;:::;f n supported on disjoint type sets T 1 ;:::;T n . Input : Interim rulex feasible for f. Output : Vector of dice (D t ) t2 U n i=1 T i . 1 Let T = U i T i . 2 Let T + i =ft2T i :f i (t)x i (t)> 0g, and let T + = U n i=1 T + i . 3 if T + =; then 4 for all types t2T , let D t be a single-sided die with a1 face. 5 else if there is a type t 2T + with f(t )x(t ) = 1 then 6 Let D t be a single-sided die with a +1 face. 7 for all other types t2Tnft g, let D t be a single-sided die with a1 face. 8 else 9 Let T =Find Barrier Set(f;x). 10 Let t 2T be a type chosen arbitrarily. 11 Let (f 0 ;x 0 ) =Decrement(f;x;t ;q ), for the largest value of q 2 [0;f(t )x(t )] such thatx 0 is feasible for f 0 . /* Note that f(t )x(t )< 1. */ 12 Let (D 0 t ) t2T Contruct Dice(f 0 ;x 0 ). 13 Let M be the maximum possible face of any die D 0 t , and M 0 := max(M; 0) + 1. 14 Let D t =D 0 t for all types t6=t . 15 Let D t be the die which rolls M 0 with probability q f(t ) , and D 0 t with probability 1 q f(t ) . 16 return (D t ) t2T . ALGORITHM 5: Decrement(f;x;t ;q) /* q 0 is the probability allocated to the highest face. Because it is a contribution to the unconditional winning probability f(t )x(t ) of type t , and we separated out the case that a single type has unconditional winning probability 1, q satisfies qf(t )x(t )< 1. */ 1 if q =f(t ), then let f 0 (t ) 0 and x 0 (t ) 0 2 else let f 0 (t ) f(t )q 1q and x 0 (t ) f(t )x(t )q f(t )q . 3 Let i be such that t 2T i . 4 for all t2T i ;t6=t , let f 0 (t) f(t) 1q and x 0 (t) x(t). 5 for all t2TnT i , let f 0 (t) f(t) and x 0 (t) x(t) 1q . 6 return (f 0 ;x 0 ): We now provide a formal analysis of our algorithm. Theorem 4.16 follows from Lemmas 4.18{4.20. Lemma 4.18 If Contruct Dice terminates, it outputs dice implementingx for f. 64 Lemma 4.19 Contruct Dice terminates after at most m 2 recursive calls. (Recall that m = P i jT i j.) Lemma 4.20 Excluding the recursive call, each invocation of Contruct Dice can be implemented in time polynomial in n and m. 4.4.2 Proof of Lemma 4.18 (Correctness) We prove the lemma by induction over the algorithm's calls. Correctness is obvious for the two base cases: when T + =; (no type should win), and when there exists a type t with f(t )x(t ) = 1 (t always shows up and should always win). For the inductive step, suppose that the recursive call in step 12 returns diceD 0 = (D 0 t ) t2T , correctly implementingx 0 for f 0 , and letD = (D t ) t2T be the new dice dened in steps 14 and 8. We analyze the interim winning probability of each type when using the dice- based winner selection rule given byD. For each type t, let v t D t be a roll of the die for type t, and for each i, let t i f i be a draw of a type; all v t and t i are mutually independent. In other words, we may assume that the die of every type is rolled (including types that do not show up), then the type prole is drawn independently. The winning type is then the type t i with largest positive v t i ; if all v t i are negative, then no type wins. Let t 2 T i be as dened in step 10. LetE be the event that i has type t and that v t = M 0 , and letE be its complement. By independence of the random choices, the probability ofE isf(t ) q f(t ) =q . Typet always wins under the eventE. 
Conditioned onE, each v t (including v t ) is distributed as a draw from D 0 t , the type vector t is distributed as a draw from f 0 1 f 0 n , and the v t 's and t are mutually independent. By the inductive hypothesis, conditioned on E, each type t wins with probability f 0 (t)x 0 (t). Using the denition of f 0 and x 0 from the Decrement subroutine, the total winning probability for t is q 1 + (1q )f 0 (t )x 0 (t ) =q + (1q ) f(t )x(t )q 1q =f(t )x(t ). Fort6=t , the total winning probability is q 0 + (1q )f 0 (t)x 0 (t) = (1q ) f(t)x(t) 1q =f(t)x(t). 65 Therefore, the interim winning probability for each type t is x(t), and the diceD implementx for f. 4.4.3 Proof of Lemma 4.19 (Number of Recursive Calls) The following lemma is essential in that it shows that invoking Decrement maintains feasibility and tightness of sets. Lemma 4.21 Let f, x, t and q be valid inputs for Decrement, and f 0 , x 0 the output of the call to Decrement(f;x;t ;q). Let S be any set of types with t 2S. Then, 1. The Border constraint forS is satised for (f 0 ;x 0 ) if and only if it is satised for (f;x). 2. The Border constraint for S is tight for (f 0 ;x 0 ) if and only if it is tight for (f;x). Proof. We will show that the slack for every S3 t satises f 0 ;x 0(S) = f;x (S) 1q , which implies both claims. Let i be such that t 2T i . Using the denitions of f 0 andx 0 , f 0 ;x 0(S) =f 0 (S) X t2S f 0 (t)x 0 (t) = 1 1 f(S i )q 1q Y i6=i (1f(S i )) ( P t2S f(t)x(t))q 1q = 1 1q 1 (1f(S i )) Y i6=i (1f(S i )) X t2S f(t)x(t) ! = f;x (S) 1q : We are now ready to prove Lemma 4.19. We will show that with each recursive invocation of Contruct Dice (step 12), at least one of the following happens: (1) The number of active typesjT + j decreases; (2) The size of the barrier setjT j decreases. 66 Notice that the number of active types or the size of a barrier set never increase. Because the size of the barrier set can only decrease at most m times, the number of active types must decrease at least every m recursive invocations. It, too, can decrease at most m times, implying the claim of the lemma. Let T , t , and (q ;f 0 ;x 0 ) be as chosen in steps 9, 10, and 11, respectively. Let candidate i be such that t 2 T i . If q = f(t )x(t ), then the type t will be inactive in (f 0 ;x 0 ), and there will be one fewer active type in the subsequent invocation of Contruct Dice (step 12). We distinguish the casesjT j = 1 and jT j> 1. IfjT j = 1, then T =ft g. By denition of a barrier set, T is tight, implying (for a singleton set) that x(t ) = 1. We claim that q is set to f(t ) = f(t )x(t ) in step 11, implying that the number of active types decreases. To prove that q = f(t ), we will show that this choice of q is feasible in the invocation of Decrement(f;x;t ;f(t )). Consider the b f;b x resulting from such an invocation ofDecrement. Lemma 4.21 implies that the feasibility of each Border constraint corresponding to a set S3t is preserved for ( b f;b x). For type sets S excluding t , b f;b x (S) = b f(S) X t2S b f(t)b x(t) = 1 1 f(S i ) 1f(t ) Y i6=i (1f(S i )) P t2S f(t)x(t) 1f(t ) = 1 1f(t ) 1 (1f(S i ) +f(t )) Y i6=i (1f(S i )) X t2S f(t)x(t) +f(t )x(t ) !! = f;x (S[ft g) 1f(t ) ; which is nonnegative becausex is feasible forf. Therefore, step 11 indeed chooses q =f(t ). 67 Next, we consider the casejT j > 1, and assume that q < f(t )x(t ) (since otherwise, we are done). Then, the set of active types T + is the same for both (f;x) and (f 0 ;x 0 ). 
If the instance (f 0 ;x 0 ) for the recursive call has multiple barrier sets, then by Lemma 4.17, they are all singletons, and indeed the size of the barrier set in the next recursive call (which is 1) is strictly smaller thanjT j. So we assume that (f 0 ;x 0 ) has a unique barrier set T 0 . BecausejT j> 1, Lemma 4.17 implies thatT is the unique barrier set for (f;x). Therefore, by the denition of barrier sets, T (and hence also t ) is contained in every tight set of active types for (f;x) (if any). Because t is contained in all tight sets, Lemma 4.21 implies that for every q 2 [0;f(t )x(t )], the result of Decrement(f;x;t ;q) does not violate any constraints which are already tight for (f;x), and in fact preserves their tightness. Because all other constraints have slack, the optimal q is strictly positive. By assumption, we also have that q < f(t )x(t ); therefore, a non-empty set S 0 which was not tight for (f;x) must have become tight for (f 0 ;x 0 ) = Decrement(f;x;t ;q ). By Lemma 4.21, this setS 0 does not includet . Because discarding inactive types preserves tightness, we may assume without loss of generality that S 0 T + . We have shown the existence of a non-empty tight setS 0 T + nft g for (f 0 ;x 0 ). By denition, the barrier set T 0 is the (unique, in our case) minimal tight set for (f 0 ;x 0 ), so T 0 S 0 . We distinguish two cases, based on the possible denitions of barrier sets: If T =T + , then because S 0 (T + =T , the barrier set T 0 is strictly smaller than T . If T is tight for (f;x), then it is also tight for (f 0 ;x 0 ) by Lemma 4.21; since both S 0 and T are tight for (f 0 ;x 0 ), we get that T 0 T \S 0 T nft g is strictly smaller than T . 68 4.4.4 Proof of Lemma 4.20 (Runtime per Call) There are only two steps for which polynomial runtime is not immediate: the computation of q in step (11) of Contruct Dice, and nding a non-empty minimizer of a submodular function in steps (3) and (7) of Find Barrier Set. We prove polynomial-time implementability of both steps in the following lemmas. Lemma 4.22 In step 11 of Contruct Dice, q can be computed in poly(m) time. Proof. Let f, x, and t be as in step 11, and let the candidate i be such that t 2T i . For each q2 [0;f(t )x(t )] [0; 1), let (f q ;x q ) =Decrement(f;x;t ;q) be the result of running Decrement(f;x;t ;q) with parameter q. Lemma 4.21 implies that all Border constraints for S3t remain feasible for (f q ;x q ). For type sets S63 t , we can write the slack in the corresponding Border constraint as a function of q as follows: fq;xq (S) =f q (S) X t2S f q (t)x q (t) = 1 1 f(S i ) 1q Y i6=i (1f(S i )) P t2S f(t)x(t) 1q = 1 1q 1 (1f(S i )q) Y i6=i (1f(S i )) X t2S f(t)x(t)q ! = 1 1q ( f;x (S)qf(SnS i )): The preceding expression is nonnegative if and only if q h(S) := f;x (S) f(SnS i ) . Therefore, q is the minimum of f(t )x(t ) and min STnft g h(S). The function h(S) does not appear to be submodular, and hence ecient minimization is not immediate. We utilize Theorem 4.3 to reduce the search space and compute q eciently. For an interim rulex :T! [0; 1], we call ST a level set ofx if there exists an 2 [0; 1] such that S =ft2T :x(t)>g. If q = min S63t h(S), then at least 69 one Border constraint just becomes tight at q = q . Theorem 4.3 implies that at least one level set ofx q corresponds to one of these newly tightened constraints, and h is minimized by such a level set. 
It follows that, in order to compute q , it suces to minimize h over all those sets S63t which could possibly arise as level sets of somex q for q2 [0;f(t )x(t )]. Lett 1 ;:::;t K be the types inT i nft g, ordered by non-increasingx(t); similarly, let t 0 1 ;:::;t 0 L be the types in TnT i , ordered by non-increasing x(t). The relative order of types inT i nft g is the same underx q (t) as underx(t), becausex q (t) =x(t); similarly, the relative order of types inTnT i is the same underx q (t) as underx(t), because x q (t) = x(t) 1q . Therefore, the familyfft 1 ;:::;t k ;t 0 1 ;:::;t 0 ` g :kK;`Lg includes all level sets of everyx q excluding t . There are at most m 2 type sets in this family, and those sets can be enumerated eciently to minimize h. Lemma 4.23 There is an algorithm for computing a non-empty minimizer of a submodular function in the value oracle model, with runtime polynomial in the size of the ground set. Proof. For a submodular function g : 2 T !R, let g t (S) =g(S[ftg), for t2T . g t is also submodular, and can be minimized in time polynomial injTj [29]. LetS t be a minimizer ofg t andt 2 arg min t2T g t (S t ). S t [ft g is a non-empty minimizer of g. 4.5 Symmetric Dice For I.I.D. Candidates in Single-Winner Setting When the candidates' type distributions are i.i.d., i.e., T i and f i are the same for all candidates i, it is typically sucient to restrict attention to symmetric interim rules. For such rules, x i (t) is equal for all candidates i. In the i.i.d. setting, we therefore notationally omit the dependence on the candidate and let T refer to the common type set of all candidates,f to the candidates' (common) type distribution, andx(t) to the probability that a particular candidate wins conditioned on having 70 type t. However, even when the candidates are identical and the interim rule is symmetric, Algorithm 4 typically produces dierent dice for dierent candidates. In this section, we design an algorithm specically for the case ofn 2 i.i.d. candidates that guarantees thatD i;t is the same for all candidatesi. We call such a dice-based rule symmetric. Theorem 4.24 Consider a winner-selection environment withn candidates, where each candidate has types T , each candidate's type is drawn independently from the prior f on T , and at most one candidate can be selected as the winner. If x : T ! [0; 1] is a feasible symmetric interim rule, then it admits a symmetric dice-based implementation. An explicit representation of the associated dice can be computed in time polynomial in n and m =jTj. Algorithm 6, which is similar to Algorithm 4, recursively constructs a dice-based implementation for the symmetric interim rulex. ALGORITHM 6: Contruct Symmetric Dice(n;f;x) Input : Number of candidates n 2. Input : PDF f supported on T . Input : Symmetric interim rulex :T! [0; 1] feasible for f n . Output : Set of dice (D t ) t2T . 1 Let T + ftjf(t)x(t)> 0g be the set of active types. 2 ifjT + j 1 then 3 for all t2TnT + , let D t be a single-sided die with a1 face. 4 for all t2T + (if any), let D t be a die with two sidesf1; 1g and probability 1 n p 1nf(t)x(t) f(t) of coming up 1. 5 else 6 Let T =Find Barrier Set IID(n;f;x) . 7 Let t 2T be arbitrary. 8 Let (f 0 ;x 0 ) Decrement Symmetric Dice(n;f;x;t ;q ), for the largest value q 2 [0;f(t )] such thatx 0 is feasible for f 0 . 9 (D 0 t ) t2T Contruct Symmetric Dice(n;f 0 ;x 0 ). 10 Let M be the maximum possible face of any D 0 t , and M 0 := max(M; 0) + 1. 11 Let D t =D 0 t for t6=t . 
12 Let D t be the die which rolls M 0 with probability q f(t ) , and rolls D 0 t with probability 1 q f(t ) . 13 return (D t ) t2T . 71 ALGORITHM 7: Decrement Symmetric Dice(n;f;x;t ;q) /* As in the algorithm Decrement, q is the probability assigned to the highest face. Unlike in the algorithm Decrement, due to symmetry, for any one candidate, the probability of winning thanks to this highest face is only (1 (1q) n )=n. */ 1 if f(t ) =q, then let f 0 (t ) 0;x 0 (t ) 0. 2 else let f 0 (t ) f(t )q 1q and x 0 (t ) f(t )x(t )(1(1q) n )=n (f(t )q)(1q) n1 : /* q< 1 because jT + j 2. */ 3 for all t6=t do let f 0 (t) f(t) 1q and x 0 (t) x(t) (1q) n1 . 4 return (f 0 ;x 0 ). In the i.i.d. setting, only the symmetric constraints from Theorem 4.1 suce to characterize feasibility [11]; namely, n X t2S f(t)x(t) 1 1 X t2S f(t) ! n ; (4.6) for all S T . Theorem 4.3 then implies that it suces to check Inequality (4.6) for sets of the form S() =ft2T :x(t)>g with 2 [0; 1]. We let n;f;x (S) denote the slack in the symmetric Border constraint forST . Similarly to Section 4.4, when x is feasible for (n;f), we call a set S T tight for (n;f;x) if n;f;x (S) = 0. Also similar to Section 4.4, we let T + = ft2T :f(t)x(t)> 0g be the set of active types, and we dene the barrier sets to be the minimal non-empty tight sets of active types if any non-empty tight sets exist; otherwise, T + is the unique barrier set. The algorithm Find Barrier Set IID is essentially identical to Find Barrier Set from Section 4.4; hence, we omit it here. The following lemma shows that the barrier set is unique in the i.i.d. setting, and that therefore, every tight set contains it. Lemma 4.25 In the i.i.d. setting, there is a unique barrier set. Proof. The uniqueness of the barrier set can be shown by a proof very similar to the proof of Lemma 4.17. Assume for contradiction that there are two barrier sets A and B. Minimality implies that A and B are disjoint. Since there are at least 72 2 candidates, with positive probability, at least one candidate has a type in A and at least one candidate has a type in B. In this event, tightness implies that the winner's type must lie in both A and B, contradicting their disjointness. We now sketch the proof of Theorem 4.24, which is very similar to that of Theorem 4.16. Proof of Theorem 4.24. First, we show correctness. Assuming that the algorithm terminates, we show that it outputs a set of dice implementing the symmetric rulex for f n . In the base casejT + j 1, our choice of dice for types t = 2T + ensures that they are never chosen as winner. If there is a (unique) typet2T + , then the probability of its die rolling 1 is 1 n p 1nf(t)x(t) f(t) . The probability that a particular candidate has type t and rolls 1 is therefore 1 n p 1nf(t)x(t), and the probability that at least one candidate has type t and rolls 1 is nf(t)x(t). Thus, the probability that some candidate has type t and wins is nf(t)x(t); by symmetry and the uniform tie breaking rule, the probability that a specic candidate has type t and wins is f(t)x(t). For the inductive step, consider a recursive call toContruct Symmetric Dice. Assume by induction hypothesis that the dice collection (D 0 t ) t2T implementsx 0 for (f 0 ) n , and consider the winner selection rule implemented by the dice (D t ) t2T for the joint type distribution f n . The probability of at least one candidate rolling the highest face M 0 is 1 (1 q ) n . 
Conditioned on no candidate rolling the highest face M 0 , the conditional type distribution of each candidate isf 0 , the conditional distribution of typet's die isD 0 t , and all these distributions are independent. The probability of a specic candidate winning with type t6=t is (1q ) n f 0 (t)x 0 (t) =f(t)x(t). Next consider the winning probability for typet . There are two ways in which a candidate with typet could win: by rolling the highest faceM 0 and winning the uniformly random tie breaking if necessary, or by having the highest face when no candidate rolls the faceM 0 . Similar to the argument in the base case, the probability that some candidate wins by rollingM 0 is (1 (1q ) n ), so the probability that a 73 specic candidate wins by rollingM 0 is 1 n (1(1q ) n ). The probability of winning without rolling the highest face is (1q ) n f 0 (t )x 0 (t ). In total, the probability of a specic candidate winning with typet is 1 n (1 (1q ) n ) + (1q ) n f 0 (t )x 0 (t ) = f(t )x(t ), so for all types t, the interim winning probability of each type t is x(t), as claimed. Next, we bound the number of iterations by m 2 . As before, we show that with each recursive call, either the size of the unique barrier set decreases, or the number of active types decreases. The following equation captures the key part of the analogue of Lemma 4.21, showing that invoking Decrement Symmetric Dice(n;f;x;t ;q) preserves slackness and tightness of every set S3t . n;f 0 ;x 0(S) = 1 1 X t2S f 0 (t) ! n n X t2S f 0 (t)x 0 (t) = 1 1 P t2S f(t)q 1q n n P t2S f(t)x(t) (1 (1q) n )=n (1q) n = 1 (1q) n (1q) n 1 X t2S f(t) ! n n X t2S f(t)x(t) + (1 (1q n )) ! = n;f;x (S) (1q) n : Now consider a call to Contruct Symmetric Dice. There are again two cases, based on the cardinality of the barrier setT . IfjT j> 1, then eitherq =f(t ), and the number of active types obviously decreases, orq <f(t ), and a new tight set is created in step (8). In the latter case, the size of the barrier set decreases in the next recursive call; as in the proof of Lemma 4.19, this follows from the uniqueness of the barrier set (Lemma 4.25), the fact that Decrement Symmetric Dice preserves tightness and slackness of sets containing the type t (including, in particular, the tight set T ), and the fact that tight sets are closed under intersection. 74 If T = ft g is a singleton, then tightness implies that nf(t )x(t ) = 1 (1 f(t )) n . The following calculation then shows that invoking Decrement Symmetric Dice(n;f;x;t ;q) with q = f(t ) does not violate the feasibility of any set S63t . n;f 0 ;x 0(S) = 1 1 X t2S f 0 (t) ! n n X t2S f 0 (t)x 0 (t) = 1 1 P t2S f(t) 1q n n X t2S f(t)x(t) (1q) n = 1 (1q) n (1q) n 1q X t2S f(t) ! n n X t2S f(t)x(t) ! = 1 (1q) n 1 1f(t ) X t2S f(t) ! n n X t2S f(t)x(t) (1 (1f(t )) n ) ! = n;f;x (S[ft g) (1q) n 0: Because feasibility is also preserved for sets S 3 t (as mentioned above), this implies that q =f(t ); as a result, the number of active types decreases. Finally, it remains to show that each recursive call can be implemented in time polynomial in n and m. For step (6), the proof is essentially identical to the analogous claim in Section 4.4 and is therefore omitted. For computing q in Step (8), the proof is similar to the analogous claim in Section 4.4. Let (f q ;x q ) = Decrement Symmetric Dice(n;f;x;t ;q); as shown above, this operation preserves the feasibility of every constraint corresponding to S3t . 
For sets S63t , there exists a threshold h(S) for the maximum value of qf(t )x(t ) such that n;fq;xq (S) 0, and this threshold can be computed numerically. As in the proof of Lemma 4.20, Theorem 4.3 implies that it suces to restrict attention to level sets of the form S() =ft6=t :x q (t)>g for some q2 [0;f(t )] and 2 [0; 1]. 75 In the i.i.d. setting, there are at most m 1 such level sets, since the relative order of types t6=t by x q (t) does not depend on q (and hence, can be computed fromx). As in the proof of Lemma 4.20, computingq then reduces to minimizing h(S) over these m 1 level sets. 4.6 From Winner-Selecting Dice to Interim Rules for Single-Winner Environments Having shown how to compute winner-selecting dice from a given interim rule, we next show the easier converse direction: how to compute the interim rule given winner-selecting dicefD i;t g in single-winner environments. As before, we denote the type set of candidate i by T i , and assume without loss of generality that the type sets of dierent candidates are disjoint. For simplicity of exposition, we assume that each die has a given nite support 3 , and we write U i := S t2T i supp(D i;t ) for the combined support of candidate i's dice. We also assume that we can evaluate the probability Pr[D i;t =u] of the face labeled withu, for all candidatesi, typest, and faces u2U i . Theorem 4.26 Consider a single-winner selection environment with n candidates and independent priors f 1 ;:::;f n , where f i is supported on T i . Given dicefD i;t j i2 [n];t2 T i g represented explicitly, the interim rule of the corresponding dice- based winner selection rule can be computed in time polynomial in n, m = P i jT i j, and the total support sizej S i U i j of all the dice. Proof. First, we can compute the probability mass function of each candidatei's (random) score v i . Given u2U i , we have Pr[v i =u] = P t2T i f i (t) Pr[D i;t =u]: From this probability mass function, we easily compute Pr[v i u] for each u2 U i by the appropriate summation. When all dice faces are distinct, this is all 3 Our approach extends easily to the case of continuously supported dice, so long as we can perform integration with respect to the distributions of the various dice. 76 we need; since Pr[v i 0 < u] = Pr[v i 0 u] for i 0 6= i and u2 U i , the interim rule is given by the following simple equation: x i (t) = X u2supp(D i;t ) Pr[D i;t =u] Y i 0 6=i Pr[v i 0u]: When the dice's faces are not distinct, recall that we break ties uniformly at random. To account for the contribution of this tie-breaking rule, we need the distribution of the number of other candidates that tie candidatei's score ofu; this is a Poisson Binomial distribution. More, precisely, we need the Poisson Binomial distribution with the n 1 parameters Pr[v i 0=u] Pr[v i 0u] i 0 6=i ; we denote its probability mass function by B i;u . It is well known, and easy to verify, that a simple dynamic program computes the probability mass function of a Poisson Binomial distribution in time polynomial in its number of parameters. Therefore, we can computeB i;u (k) for each i2 [n];u2 U i and k2f1;:::;n 1g. The interim rule is then given by the following equation: x i (t) = X u2supp(D i;t ) Pr[D i;t =u] Y i 0 6=i Pr[v i 0u] ! n1 X k=0 B i;u (k) k + 1 : It is easy to verify that all the above computations satisfy the claimed runtime. 
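To make the computation in the preceding proof concrete, the following is a minimal illustrative sketch in Python (the language choice and all identifiers are ours, not part of the formal development). It computes the interim rule induced by explicitly given finite dice in a single-winner environment, using the dynamic program for the Poisson Binomial distribution described above.

# Illustrative sketch (Python); assumes priors and dice are given explicitly with finite supports.
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli trials."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)      # trial fails
            nxt[k + 1] += mass * p          # trial succeeds
        pmf = nxt
    return pmf

def interim_rule_from_dice(f, dice):
    """
    f[i][t]    : prior probability that candidate i has type t.
    dice[i][t] : dict face -> probability, the die D_{i,t}.
    Returns x[i][t], the interim winning probability of type t of candidate i
    under the dice-based rule with uniform tie breaking; only positive faces
    can win, and nobody wins when every roll is non-positive.
    """
    n = len(f)
    # Pr[v_i = u]: mix candidate i's dice according to the prior f_i.
    score_pmf = []
    for i in range(n):
        pmf = {}
        for t, ft in f[i].items():
            for u, pu in dice[i][t].items():
                pmf[u] = pmf.get(u, 0.0) + ft * pu
        score_pmf.append(pmf)

    def pr_at_most(i, u):                   # Pr[v_i <= u]
        return sum(p for face, p in score_pmf[i].items() if face <= u)

    x = [{t: 0.0 for t in f[i]} for i in range(n)]
    for i in range(n):
        for t in f[i]:
            for u, pu in dice[i][t].items():
                if u <= 0:
                    continue                # non-positive faces never win
                prod_at_most = 1.0          # Pr[every other score is <= u]
                tie_probs = []              # Pr[v_j = u | v_j <= u] for j != i
                for j in range(n):
                    if j == i:
                        continue
                    q = pr_at_most(j, u)
                    prod_at_most *= q
                    tie_probs.append(score_pmf[j].get(u, 0.0) / q if q > 0 else 0.0)
                B = poisson_binomial_pmf(tie_probs)
                tie_share = sum(B[k] / (k + 1) for k in range(len(B)))
                x[i][t] += pu * prod_at_most * tie_share
    return x

The dictionary-of-dictionaries representation simply mirrors the explicit representation of priors and dice assumed in the theorem; when all faces across all dice are distinct, tie_share equals 1 and the expression collapses to the simpler of the two displayed formulas above.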
77 Chapter 5 When Dice do not Work in Bayesian Persuasion 5.1 Summary of Results For signaling scheme design for bilateral trade with limited signals, we have found ecient (approximation) algorithms for both revenue and social welfare maximization. The approximation of social welfare is achieved by exploiting submodularity of the objective value. While it may be natural to conjecture that the submodularity property carries over from bilateral trade to more general persuasion games, this is not the case: we establish a strong hardness result for maximizing the sender's utility in general. Theorem 5.1 For any constant c > 0, it is NP-hard to construct a signaling scheme approximating the maximum expected sender utility to within a factor c, given an explicit representation of a Bayesian Persuasion game and a bound M on the number of signals. Next, we consider if there are simple representations of Bayesian persuasion. Choosing a signal to send can be considered as a winner-selection environment, so we would like to investigate whether all Bayesian persuasion instances admit dice-based implementations. If all Bayesian persuasion games could be described with rst-order interim rules, then our results of Chapter 4 would imply that there is an optimal dice- based scheme. Unfortunately, a rst-order interim rule is not enough to express 78 the persuasiveness constraints (2.1). From a rst-order interim rule, one can only infer, for each candidate i, the conditional type distribution of i in the event that i is chosen as the winner. Therefore, we dene second-order interim rules, 1 which convey strictly more information as needed for describing the incentive constraints of Bayesian persuasion. Such a rule species, for each pair of candidates i and i 0 (where i 0 may or may not be equal to i), the conditional type distribution of i 0 in the event that i is chosen as the winner. Formally, a second-order interim rule X species x i;i 0 ;t 2 [0; 1] for each pair of candidates i;i 0 2 [n], and type t2T i 0. We say that a winner-selection ruleA implementsX for a priorf if it satises the following: if the type prolet = (t 1 ;:::;t n ) is drawn from the prior distribution f =f 1 f n , then Pr[i2A(t)j t i 0 = t] = x i;i 0 ;t . A second-order interim rule is feasible if there is a winner selection rule implementing it. When a second-order interim rule is given, one can get the conditional type distribution of any action j when action i is selected as the winner, as follows: Pr[t j =tji is winner] = f j (t j )x i;j;t P t j 2T j f j (t j )x i;j;t j : Notice that the above conditional probability can be used to check the persuasiveness constraints (2.1). Thus a second-order interim rule gives a complete description for a solution of a Bayesian persuasion problem. When the candidate type distributions are non-identical, we show an impossibility result. We construct an instance of Bayesian persuasion with actions with independently but non-identically distributed types, and show that no optimal persuasion scheme for this instance can be implemented by dice. Since second-order interim rules are sucient for evaluating the objective and constraints of Bayesian persuasion, this implies that there exist second-order interim rules which are not dice-implementable. This rules out the Myersonian approach for characterizing and 1 Our notion of second-order interim rules is dierent from the notion dened in [16]. Because Cai et al. 
[16] consider correlation in types, their notion of second-order interim rules is aimed at capturing the allocation dependencies arising through such type correlation, rather than solely through the mechanism's choice. 79 computing optimal schemes for Bayesian persuasion with independent non-identical actions, complementing the negative result of [21] which rules out the Borderian approach for the same problem. Our impossibility result disappears when actions are i.i.d., since second-order interim rules collapse to rst-order interim rules in symmetric settings. In particular, our results for rst-order interim rules, combined with those of [21], imply that Bayesian persuasion with i.i.d. actions admits an optimal dice-based scheme, which can be computed eciently. 5.2 Hyperedge Guessing Game In this section, we present the proof of Theorem 5.1. More accurately, the following theorem shows the hardness of maximizing sanitized sender utility (dened in Section 3.1.2) to within any constant. Theorem 5.2 Unless P = NP, for every constant c > 0, there is no polynomial- time algorithm for the following problem. Given a Bayesian persuasion game ( ;p;A;u S ;u R ) and cardinality constraintM on the number of signals, construct a signaling scheme X using at most M signals such that the sanitized sender utility e U(X) under X is at least c e U(X ), where X is the signaling scheme maximizing e U(X). Because the sender utility and sanitized sender utility are within a factor of M1 M of each other, this implies the same hardness result for the sender utility, proving Theorem 5.1. We prove Theorem 5.2 by establishing hardness for a game we call the Hypergraph Edge Guessing Game (HEGG). There is a hypergraph H = (V;E) which is commonly known to the sender and receiver. The state of nature is a hyperedge e 2E, drawn from the uniform distribution. The receiver has two types of actions available: trying to guess the hyperedge, or \hedging her bets" by guessing a vertex v2 V . If she guesses an edge e, then 80 she gets 1 if her guess was correct (e =e ), and 0 otherwise. If she guesses a vertex v, she gets 1=d v (the degree of v) if v is incident on e , and 0 otherwise. The sender's utility is determined by the receiver's guess. If the receiver guesses an edge, the sender gets utility 0, regardless of whether the guess is correct. If the receiver guesses a vertex v, the sender has utility 1=d v (the same as the receiver) if v is incident on e , and 0 otherwise. Since the sender has access to e , it is his goal to design a signaling scheme that narrows down the possible states of nature for the receiver enough that she can get an incident vertex, but not so much as to induce her to guess a hyperedge. This is accomplished by making the posterior distribution conditioned on any signal uniform across edges incident on a particular vertex. Ideally, we would like this to be the case for all signals, but this may simply be impossible. However, we can achieve it for all but one signal. Denition 5.3 A signaling scheme X is vertex-centric if for all signals except at most one, there exists a node v = v() such that x v;e = x v;e 0 for all hyperedges e;e 0 3v, and x v;e = 0 for all hyperedges e63v. That is, in a vertex-centric signaling scheme, all but one signal induce a uniform posterior distribution over edges incident on one vertex. Lemma 5.4 For any signaling scheme X, there is a vertex-centric signaling schemeX 0 with e U(X 0 ) e U(X), and which can be constructed fromX in polynomial time. Proof. 
Consider any signaling scheme X, characterized by the probabilities x e; that the state of the world is e and the sender sends the signal . (Recall that these are not conditional probabilities.) For each signal , there is a unique (after tiebreaking) action that the receiver takes, either a hyperedge e or a vertex v. If the receiver chooses a hyperedge e, the sender's utility is 0; for a vertex v, it is u() = 1 dv P e incident on v x e; . Let? be the designated garbage signal; without loss of generality (by renaming), it minimizesu(?). First, we may assume w.l.o.g. that the receiver does not choose 81 a hyperedge for any signal6=?. Otherwise, since the sender's utilityu() = 0, we could reallocate all probability mass from to? without changing the sanitized sender utility; under the new signal (which is never sent), the receiver w.l.o.g. plays a vertex. Consider any signal 6=?, and let v be the vertex the receiver chooses in response to. First, if there is any hyperedgee not incident onv withx e; > 0, we can safely lowerx e; to 0 (reassigning the probability mass to?), without changing the receiver's action (because e was not incident on v, this change cannot make v less attractive), and without aecting u(). Let d = d v be the degree of v, and e 1 ;e 2 ;:::;e d the hyperedges incident on v, sorted such that x e 1 ; x e 2 ; ::: x e d ; . If x e 1 ; > x e d ; , then the receiver's expected utility from choosing e 1 is x e 1 ; , whereas her utility from choosing v is 1 d P d i=1 x e i ; <x e 1 ; . This would contradict the receiver's playing v. Notice that the changes do not aect the utility under any signal except the garbage signal, so the sanitized sender utility stays the same. Now consider the following optimization problem: nd a vertex-centric signaling scheme X with a dedicated garbage signal? that maximizes the sanitized sender utility e U(X). By denition of a vertex-centric signaling scheme, x e;v =x e 0 ;v for all hyperedgese;e 0 incident onv; we denote this quantity byy v . Then, the probability of sending the signal inducing the receiver to choose v is P e3v x e;v = d v y v , and the resulting sender utility conditioned on sending it is 1=d v . A vertex-centric signaling scheme is entirely determined by the M 1 vertices and their associated probabilities y v ; hence, the optimization problem can be expressed as follows. Maximize kyk 1 subject to P v2e y v 1 jEj for all e2E; kyk 0 M 1; y 0: 82 The rst constraint captures that the total probability of all signals sent when the state of the world is e can be at most the probability that the state of the world ise, which is 1=jEj. Rescaling ally v values by a factorjEj and removing that constant factor from the objective gives us the following equivalent characterization. Maximize kyk 1 subject to P v2e y v 1 for all e2E; kyk 0 M 1; y 0: (5.1) Notice that Program (5.1) would exactly be an Independent Set characterization if the y v were restricted to be integral. The following lemma shows that the upper bound on the support ofy is enough to ensure that the optimal solution cannot be approximated to within any constant (when the hyperedges are large enough). Lemma 5.5 For any constant r 1, unless P = NP, the optimum solution of Program (5.1) cannot be approximated to within a factor better than 1=r. Proof. We give a reduction from the gap version of Independent Set. 
Given a graph G = (V;E) and any constant > 0, and the promise that the largest independent set of G has size either less than n or more than n 1 , it is NP-hard to answer \No" in the former case and \Yes" in the latter [34]. For our reduction, we specically choose = 1 r+2 . Given G, we create a hypergraph H = (V;E 0 ) on the same node set, whose hyperedges are exactly the cliques of size r + 1 in G, i.e., E 0 = fS V j S is a clique of size r + 1 in Gg. The constraint on the support size of y is M 0 = n 1 . Notice that the reduction is computed in time O(n r+1 ), which is polynomial in n for constant r. We will show that if G has an independent set of size n 1 , then the objective value of Program (5.1) is M 0 , whereas if G has no independent set of size n , then the objective value is less than M 0 =r. First, suppose thatG has an independent setS of sizeM 0 . Consider the solution y to Program (5.1) which setsy v = 1 for allv2S, andy v = 0 for all others. Because 83 S is independent inG, it contains at most one vertex from each (r+1)-clique; hence, the proposed solution is valid, and it achieves an objective value of M 0 . Conversely, lety be a solution to Program (5.1), and assume that its objective value is at least 1 r M 0 . Let S be the set of all indices v such that y v > 1 r+1 . Then, by the assumed lower bound on the objective value,jSj M 0 r 2 . Consider the subgraph G[S] induced by S in G. By the constraint for each hyperedge, G[S] contains no (r + 1)-clique; otherwise, the corresponding y v would add up to more than 1. Now, Ramsey's Theorem implies that G[S] contains an independent set of size (n 1=r ), as follows. Recall that the Ramsey Number R(r + 1;b) is the minimum sizes such that each graph of sizes contains a clique of sizer+1 or an independent set of sizeb. BecauseR(r+1;b) r+b1 r 2O(b r ) [63], and G[S] is a graph of size at least M 0 r 2 not containing any (r + 1)-clique, it must contain an independent set of size at least ((M 0 =r 2 ) 1=r ) = (n (1)=r ) = !(n ). This completes the proof. Proof of Theorem 5.2. The proof is now straightforward. Given an instance of the Independent Set problem, we construct the instance of the HEGG according to the proof of Lemma 5.5, setting the allowed number of signals to M = 1 +M 0 . Feasible solutions to Program (5.1) exactly capture vertex-centered signaling schemes, and the objective value is the sanitized sender utility (scaled by jE 0 j). 5.3 No Winner-Selecting Dice for Persuasion In this section, we investigate the existence of winner-selecting dice for instances of Bayesian persuasion. Our main result (Theorem 5.6) is to exhibit an instance with independent non-identical actions for which there is no optimal signaling scheme that can be implemented using winner-selecting dice. This result is contrasted with Theorem 5.7, which shows that when the actions' types are not just independent, but identically distributed as well, a dice-based implementation always does exist. 84 Theorem 5.6 There is an instance of Bayesian persuasion (given in Table 5.1) with independent actions which does not admit a dice-based implementation of any optimal signaling scheme. Consequently, there exists a second-order interim rule which does not admit a dice-based implementation. Theorem 5.7 Every Bayesian persuasion instance with i.i.d. actions admits an optimal dice-based signaling scheme. Moreover, when the prior type distribution is given explicitly, the corresponding dice can be computed in time polynomial in the number of actions and types. 
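Both theorems are statements about the second-order interim rules that dice-based rules can induce. As a computational aid only (it plays no role in the proofs below), the following brute-force sketch in Python computes the second-order interim rule induced by an explicitly given family of finite dice by enumerating type profiles and face profiles; all identifiers are ours, and the enumeration is exponential, so it is intended only for small instances such as the one in Table 5.1.

from itertools import product

def second_order_interim_rule(f, dice):
    """
    f[i][t]    : prior probability that action i has type t (independent across actions).
    dice[i][t] : dict face -> probability for the die D_{i,t} (finite support).
    Returns X with X[i][j][t] = Pr[i is selected | t_j = t] under the dice-based
    rule with uniform tie breaking; no action is selected when every roll is non-positive.
    """
    n = len(f)
    joint = [[dict.fromkeys(f[j], 0.0) for j in range(n)] for i in range(n)]
    for types in product(*(list(f[i]) for i in range(n))):
        p_types = 1.0
        for i in range(n):
            p_types *= f[i][types[i]]
        face_lists = [list(dice[i][types[i]].items()) for i in range(n)]
        for faces in product(*face_lists):
            p_faces = 1.0
            for _, pf in faces:
                p_faces *= pf
            rolls = [face for face, _ in faces]
            best = max(rolls)
            if best <= 0:
                continue                                  # nobody is selected
            winners = [i for i in range(n) if rolls[i] == best]
            share = p_types * p_faces / len(winners)      # uniform tie breaking
            for i in winners:
                for j in range(n):
                    joint[i][j][types[j]] += share        # Pr[i selected and t_j = t]
    return [[{t: joint[i][j][t] / f[j][t] for t in f[j] if f[j][t] > 0}
             for j in range(n)] for i in range(n)]

From X, the unconditional probability that action i is recommended is the sum over t of f_i(t) * X[i][i][t], and the conditional type distributions needed for the persuasiveness constraints (2.1) follow from X exactly as in the displayed formula of Section 5.1.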
The negative result of Theorem 5.6 has interesting implications. Since second- order interim rules summarize all the attributes of a winner selection rule relevant to persuasion, second-order interim rules, unlike their rst-order brethren, can in general not be implemented by dice. Most importantly, this result draws a sharp contrast between persuasion and single-item auctions, despite their supercial similarity: it rules out a Myerson-like virtual-value characterization of optimal persuasion schemes, and it joins the #P-hardness result of [21] as evidence of the intractability of optimal persuasion. Proof of Theorem 5.6. The persuasion instance, shown in Table 5.1, features three actionsfA;B;Cg, each of which has two typesf1; 2g. The types of the dierent actions are distributed independently. In the instance, the sender's utility from any particular action is a constant, independent of the action's type. action type 1 2 A 0:5 (100; 2) 0:5 (100;1) B 0:99 (1; 3) 0:01 (1;1) C 0:5 (0; 0) 0:5 (0; 6) Table 5.1: A Persuasion instance with no dice-based implementation. The notation p (s;r) denotes that the type (s;r) (in which the sender and receiver payos are s and r, respectively) has probability p. 85 One (optimal, as we will show implicitly) signaling scheme is the following. (In writing a type vector, here and below, we use to denote that the type of an action is irrelevant.) If the type vector is (1;; 1), then recommend action A. If the type vector is (1;; 2), then recommend each of A;C with equal probability 1 2 . If the type vector is (2; 1;), then recommend action B. If the type vector is (2; 2;), then recommend action C. While this is not the unique optimal scheme, we next prove that none of the optimal persuasion schemes admit a dice-based implementation. The given signaling scheme recommends action A with probability 3=8 overall, actionB with probability 99 200 overall, and actionC with the remaining probability. No persuasive signaling scheme can recommend A with probability strictly more than 3=8, because conditioned on receiving the recommendation A, action C must be at least twice as likely to be of type 1 as of type 2, in addition to actionA being of type 1 with probability 1. Similarly, no persuasive scheme can recommend B with probability strictly more than 99 200 , because actionC must be at least as likely to be of type 1 as of type 2 whenC is recommended, in addition to actionB being of type 1 with probability 1. Hence, any optimal signaling scheme must recommend A with probability 3=8 andB with probability 99 200 , and the given scheme is in fact optimal. Suppose for a contradiction that there exist dice (D i;j ) i2fA;B;Cg;j2f1;2g implementing an optimal signaling scheme. We gradually derive properties of these optimal signaling schemes, eventually leading to a contradiction. 1. Since action A can never be recommended when it has type 2 (the receiver would never follow the recommendation), it must be recommended with probability 3 4 conditioned on having type 1. 2. In particular, whenever the type prole is (1;; 1), action A must be recommended, regardless of the type of action B. This is because action 86 C must be at least twice as likely of type 1 as of type 2 for a recommendation of A to be persuasive. 3. Therefore, all faces onD A;1 must be larger than all faces onD B;1 and onD B;2 . 4. Because of this, actionB can never be recommended when the type prole is (1;; 2). 5. 
Thus, when the type prole is (1;; 2), the signaling scheme has to recommend each of A and C with probability 1 2 . (The recommendation could of course follow dierent distributions based on the type of B; such a correlation is immaterial for our argument.) 6. Given that action B cannot be recommended when action A has type 1, or when action B has type 2, it must always be recommended for type vectors (2; 1;). 7. This implies that all faces of D B;1 must be larger than all faces on D C;1 and on D C;2 . 8. This is a contradiction to Step 5, which states that with positive probability, D C;2 beats D B;1 . Thus, we have proved that there is no dice-based implementation of any optimal signaling scheme for the given instance. Proof of Theorem 5.7. When the actions' (or more generally: candidates') type distributions are i.i.d., i.e.,T i andf i are the same for all candidatesi, Dughmi and Xu [21] have shown that there is an optimal symmetric signaling scheme, or more generally a symmetric second-order interim rule X. We show that any symmetric second-order interim rule X is uniquely determined by its rst-order component, a fact implicit in [21]. For symmetric rules, x i;i 0 ;t depends only on whether i = i 0 or i6= i 0 , but not on the identities of the candidates i and i 0 . Therefore, X can be equivalently described by two type-indexed vectorsy andz, where y t =x i;i;t for all candidates i, andz t =x i;i 0 ;t for all candidatesi andi 0 withi6=i 0 . The vectory is a rst-order 87 interim rule, and we refer to it as the rst-order component ofX. IfX is feasible and implemented byA, then y t = x i;i;t = Pr[A(t) = ij t i = t] for all candidates i, soy is the rst-order interim rule implemented byA. For every candidate i and type t, we have 1 = n X i 0 =1 Pr[A(t) =i 0 jt i =t] = (n 1)z t +y t : Therefore,z = 1y n1 , and the rst-order component of a symmetric second-order interim rule suces to fully describe it. The second-order rule is also, by the preceding argument, eciently computable from its rst-order component, and is feasible if and only if its rst-order component is a feasible symmetric interim rule. Moreover, by [15], feasibility of symmetric second-order interim rules can be checked in time polynomial in the number of types and candidates, and given a feasible symmetric second-order interim rule X, a winner selection rule implementing X can be evaluated in time polynomial in the number of types and candidates. 88 Chapter 6 Conclusion and Future Work In this thesis, we examined how randomization can help us with making choices. Specically, we studied signaling scheme design for bilateral trade and dice-based rules in winner-selection environments. Randomization plays dierent roles in these two scenarios: in bilateral trade, randomization is necessary for optimizing the social welfare when there is a communication constraint; for winner-selection environments, randomization generalizes the simple and intuitive Myerson's auction from revenue optimization: dice-based rules exist for all feasible interim rules with matroid constraints. Randomization is used with constraints for both of the two problems: the communication constraint in Bayesian persuasion can be viewed as constraint on the dice (distributions) that the sender can use to implement the signaling scheme: the dice can only use a small number of faces; the requirement of using independent dice in dice-based rules is essential for making it a valid generalization of Myerson's auction and order sampling. 
In this thesis, we have examined a landscape of Bayesian persuasion. Starting with simple revenue/welfare maximization in bilateral trade under communication constraints, we gave efficient (approximation) algorithms for these problems. We then moved on to general Bayesian persuasion with a single-dimensional state of nature under communication constraints and proved that it is hard to approximate within any constant factor. Finally, for the most general form of Bayesian persuasion, we proved that the simple yet general dice-based rules cannot always implement optimal signaling schemes. These three landmarks provide some guidance for potential future work on Bayesian persuasion.

The other contribution of this thesis is a comparison between auction design, which is mechanism design with money, and Bayesian persuasion, which can be considered "mechanism design with information". Our discussion of dice-based rules gives a clear separation between these two kinds of mechanism design: one admits simple dice-based rules and the other does not.

This thesis only partially addressed what we can and cannot achieve under constraints on randomization. There are many interesting open questions on the same topic, several of which are listed below.

Efficiently optimize the social welfare or the buyer's value in bilateral trade with a communication constraint. In Chapter 3, we only gave a Quasi-PTAS and a constant-factor approximation for maximizing the social welfare. Is it possible to find the optimal social welfare in polynomial time, or can the problem be proved NP-hard? Bergemann et al. [8] also consider maximizing the buyer's utility. It is not hard to see that, given the price points for each signal, the buyer's utility can be maximized by a linear program very similar to (3.1). However, it is not clear whether the overall objective function is still submodular, or whether a greedy algorithm similar to the one from Section 3.4 optimally solves the corresponding LP. For the persuasion problem with limited communication, we established that no approximation of social welfare to within any constant is possible. Can this result be strengthened to logarithmic or polynomial hardness?

Explore the effect of other constraints on Bayesian persuasion. In this thesis, we only examined the effect of communication constraints on Bayesian persuasion. Will there be interesting results for other natural constraints on signaling schemes? Here we list two of them.

- The signals are sent by sampling; that is, the sender does not reveal the signaling scheme to the receiver directly. Instead, he provides some
We have shown that dice-based winner selection rules can implement all rst- order interim rules with matroid constraints, but not all second-order interim rules; in particular, there are instances of Bayesian persuasion in which no optimal signaling scheme can be implemented using dice. While our existence proof uses matroid properties, matroid constraints are not the limit of implementability by dice: in Appendix B, we show an example in which the feasible sets do not form a matroid, yet every feasible interim rule within the environment is implementable with dice. This rules out a characterization of the form \a feasibility constraintI has all feasible (x;f) implementable by dice if and only ifI is a matroid." In fact, we do not know of any feasibility constraint and corresponding rst-order feasible interim rule for which a dice-based implementation can be ruled out, though we strongly suspect that such examples exist. A diculty in verifying our conjecture is that we are not aware of a useful general technique for proving the non- existence of a dice-based implementation for a given interim rule. Notice that our proof in Section 4.3.2 implies that any dice-based rule can be reduced to a 91 rule only using dice with nitely many faces. Even with this assumption, it is hard to use a computer to enumerate all dice-based rules and check whether they implement an interim rule. On the other hand, even if the verication can be done eciently, the space of interim rules and set systems is relatively large. Find an ecient algorithm that constructs dice-based rules from given interim rules in matroid settings. To derive an ecient algorithm from our existential proof, the functions g t andh t would have to be evaluated eciently. However, even forg 1 , we do not have an ecient way to evaluate the interim winning probabilities when the dice are continuous. h t is dened using the Intermediate Value Theorem, so it seems a binary search may help. But notice that the function g t is dened recursively; in order to evaluate h t , we need to trace back to g 1 , but this will induce exponentially many binary searches. Furthermore, even if a set of continuous dice is given, our proof that the number of faces can be made nite is nonconstructive. It is unclear how to convert continuous dice to dice with nitely many faces in polynomial time. 92 Reference List [1] Inaki Aguirre, Simon Cowan, and John Vickers. Monopoly price discrimination and demand curvature. American Economic Review, 100(4):1601{15, 2010. [2] Nibia Aires, Johan Jonasson, and Olle Nerman. Order sampling design with prescribed inclusion probabilities. Scandinavian Journal of Statistics, 29(1): 183{187, 2002. [3] Saeed Alaei, Hu Fu, Nima Haghpanah, Jason D. Hartline, and Azarakhsh Malekian. Bayesian optimal auctions via multi-to single-agent reduction. In Proc. 13th ACM Conf. on Electronic Commerce, page 17, 2012. [4] Ricardo Alonso and Odilon C^ amara. Persuading voters. American Economic Review, 106(11):3590{3605, 2016. [5] Itai Arieli and Yakov Babichenko. Private bayesian persuasion. 2016. [6] Moshe Babaio, Robert Kleinberg, and Renato Paes Leme. Optimal mechanisms for selling information. In Proceedings of the 13th ACM Conference on Electronic Commerce, pages 92{109. ACM, 2012. [7] Xiaohui Bei, Ning Chen, Nick Gravin, and Pinyan Lu. Budget feasible mechanism design: from prior-free to bayesian. In Proceedings of the forty- fourth annual ACM symposium on Theory of computing, pages 449{458. ACM, 2012. 
[8] Dirk Bergemann, Benjamin Brooks, and Stephen Morris. The limits of price discrimination. Amer. Econ. Rev., 105(3):921–957, 2015.
[9] Liad Blumrosen and Michal Feldman. Implementation with a bounded action space. In Proc. 7th ACM Conf. on Electronic Commerce, pages 62–71, 2006.
[10] Liad Blumrosen, Noam Nisan, and Ilya Segal. Auctions with severely bounded communication. Journal of Artificial Intelligence Research, 28:233–266, 2007.
[11] Kim C. Border. Implementation of reduced form auctions: A geometric approach. Econometrica, 59(4):1175–1187, 1991.
[12] Kim C. Border. Reduced form auctions revisited. Economic Theory, 31(1):167–181, 2007.
[13] Peter Bro Miltersen and Or Sheffet. Send mixed signals: Earn more, work less. In Proc. 13th ACM Conf. on Electronic Commerce, pages 234–247, 2012.
[14] Isabelle Brocas and Juan D. Carrillo. Influence through ignorance. The RAND Journal of Economics, 38(4):931–947, 2007.
[15] Yang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. An algorithmic characterization of multi-dimensional mechanisms. In Proc. 44th ACM Symp. on Theory of Computing, pages 459–478, 2012.
[16] Yang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. Optimal multi-dimensional mechanism design: Reducing revenue to welfare maximization. In Proc. 53rd IEEE Symp. on Foundations of Computer Science, pages 130–139, 2012.
[17] Ning Chen, Nick Gravin, and Pinyan Lu. On the approximability of budget feasible mechanisms. In Proc. 22nd ACM-SIAM Symp. on Discrete Algorithms, pages 685–699, 2011.
[18] Yu Cheng, Ho Yee Cheung, Shaddin Dughmi, Ehsan Emamjomeh-Zadeh, Li Han, and Shang-Hua Teng. Mixture selection, mechanism design, and signaling. In Proc. 56th IEEE Symp. on Foundations of Computer Science, pages 1426–1445, 2015.
[19] Shaddin Dughmi. On the hardness of signaling. In Proc. 55th IEEE Symp. on Foundations of Computer Science, pages 354–363, 2014.
[20] Shaddin Dughmi. Algorithmic information structure design: A survey. ACM SIGecom Exchanges, 15(2):2–24, 2017.
[21] Shaddin Dughmi and Haifeng Xu. Algorithmic Bayesian persuasion. In Proc. 48th ACM Symp. on Theory of Computing, 2016.
[22] Shaddin Dughmi and Haifeng Xu. Algorithmic persuasion with no externalities. In Proc. 18th ACM Conf. on Economics and Computation, pages 351–368, 2017.
[23] Shaddin Dughmi, Nicole Immorlica, and Aaron Roth. Constrained signaling in auction design. In Proc. 25th ACM-SIAM Symp. on Discrete Algorithms, pages 1341–1357, 2014.
[24] Shaddin Dughmi, Nicole Immorlica, Ryan O'Donnell, and Li-Yang Tan. Algorithmic signaling of features in auction design. In Proc. 8th International Symposium on Algorithmic Game Theory (SAGT), Saarbrücken, Germany, pages 150–162, 2015.
[25] Yuval Emek, Michal Feldman, Iftah Gamzu, Renato Paes Leme, and Moshe Tennenholtz. Signaling schemes for revenue maximization. In Proc. 13th ACM Conf. on Electronic Commerce, pages 514–531, 2012.
[26] Wolfgang Gick and Thilo Pausch. Persuasion by stress testing: Optimal disclosure of supervisory information in the banking sector. Discussion Paper 32/2012, Deutsche Bundesbank, 2012.
[27] Itay Goldstein and Yaron Leitner. Stress tests and information disclosure. Journal of Economic Theory, 2018.
[28] Parikshit Gopalan, Noam Nisan, and Tim Roughgarden. Public projects, boolean functions, and the borders of Border's theorem. In Proc. 16th ACM Conf. on Economics and Computation, page 395, 2015.
[29] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2. Springer Science & Business Media, 2012.
[30] Mingyu Guo and Argyrios Deligkas. Revenue maximization via hiding item attributes. In Proc. 23rd Intl. Joint Conf. on Artificial Intelligence, 2013.
[31] Jaroslav Hájek. Asymptotic theory of rejective sampling with varying probabilities from a finite population. The Annals of Mathematical Statistics, pages 1491–1523, 1964.
[32] Jason D. Hartline. Mechanism design and approximation. Book draft, October 2013.
[33] Jason D. Hartline. Mechanism Design and Approximation. Now Publishers, 2013.
[34] Johan Håstad. Clique is hard to approximate within n^(1-epsilon). Acta Mathematica, 182:105–142, 1999.
[35] Satoru Iwata, Lisa Fleischer, and Satoru Fujishige. A combinatorial strongly polynomial algorithm for minimizing submodular functions. Journal of the ACM, 48(4):761–777, 2001.
[36] Emir Kamenica and Matthew Gentzkow. Bayesian persuasion. Amer. Econ. Rev., 101(6):2590–2615, 2011.
[37] Anton Kolotilin. Experimental design to persuade. Games and Econ. Behav., 90:215–226, 2015.
[38] Bernhard Korte and Jens Vygen. Combinatorial Optimization, volume 2. Springer, 2012.
[39] Ilan Kremer, Yishay Mansour, and Motty Perry. Implementing the "wisdom of the crowd". Journal of Political Economy, 122(5):988–1012, 2014.
[40] Jean-Bernard Lasserre. Moments, Positive Polynomials and Their Applications, volume 1. World Scientific, 2010.
[41] Yishay Mansour, Aleksandrs Slivkins, and Vasilis Syrgkanis. Bayesian incentive-compatible bandit exploration. In Proc. 16th ACM Conf. on Economics and Computation, pages 565–582, 2015.
[42] Andreu Mas-Colell, Michael D. Whinston, and Jerry R. Green. Microeconomic Theory. Oxford University Press, 1995.
[43] Eric Maskin and John Riley. Optimal auctions with risk averse buyers. Econometrica, 52(6):1473–1518, 1984.
[44] Steven A. Matthews. On the implementability of reduced form auctions. Econometrica, 52(6):1519–1522, 1984.
[45] Konrad Mierendorff. Asymmetric reduced form auctions. Economics Letters, 110(1):41–44, 2011.
[46] Oskar Morgenstern and John von Neumann. Theory of Games and Economic Behavior. 1944.
[47] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Chapman & Hall/CRC, 2010.
[48] Roger B. Myerson. Optimal auction design. Math. Oper. Res., 6(1):58–73, 1981.
[49] John F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):48–49, 1950.
[50] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approximations for maximizing submodular set functions - I. Mathematical Programming, 14(1):265–294, 1978.
[51] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978.
[52] Noam Nisan and Amir Ronen. Algorithmic mechanism design. Games and Economic Behavior, 35(1-2):166–196, 2001.
[53] James G. Oxley. Matroid Theory, volume 3. Oxford University Press, USA, 2006.
[54] Mallesh M. Pai and Rakesh Vohra. Optimal auctions with financially constrained buyers. J. Econ. Theory, 150:383–425, 2014.
[55] Arthur C. Pigou. The Economics of Welfare. Macmillan, 1920.
[56] Michael O. Rabin. Probabilistic algorithms. In Algorithms and Complexity: New Directions and Recent Trends (J. F. Traub, ed.), 1976.
[57] Michael O. Rabin. Probabilistic algorithm for testing primality.
Journal of number theory, 12(1):128{138, 1980. [58] Zinovi Rabinovich, Albert Xin Jiang, Manish Jain, and Haifeng Xu. Information disclosure as a means to security. In Proc. 14th Intl. Conf. on Autonomous Agents and Multiagent Systems, pages 645{653, 2015. [59] Anne-Katrin Roesler and Bal azs Szentes. Buyer-optimal learning and monopoly pricing. Technical report, Mimeo, London School of Economics, 2016. [60] Bengt Ros en. Asymptotic theory for order sampling. Journal of Statistical Planning and Inference, 62(2):135{158, 1997. [61] MR Sampford. On sampling without replacement with unequal probabilities of selection. Biometrika, 54(3-4):499{513, 1967. [62] Yaron Singer. Budget feasible mechanisms. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on, pages 765{774. IEEE, 2010. [63] Jacobus Hendricus van Lint and Richard Michael Wilson. A Course in Combinatorics. Cambridge University Press, 2001. [64] Hassler Whitney. On the abstract properties of linear dependence. American Journal of Mathematics, 57(3):509{533, 1935. [65] Haifeng Xu, Zinovi Rabinovich, Shaddin Dughmi, and Milind Tambe. Exploring information asymmetry in two-stage security games. In Proc. 29th AAAI Conf. on Articial Intelligence, pages 1057{1063, 2015. 97 Appendix A Proofs in Chapter 3 A.1 Proof of Theorem 3.10 In this section, we prove Theorem 3.10. Theorem 3.10 The algorithm Construct-Signaling-Scheme solves the linear program (3.1) optimally. Proof. Let X be an optimal solution for the linear program (3.1), and X the signaling scheme constructed by the algorithm Construct-Signaling-Scheme. We will show that X can be (gradually) transformed into X without decreasing its solution quality, proving optimality of X. First, we may assume without loss of generality that x i; = 0 for all i < k , since setting them to 0 aects neither the objective value nor the constraints. Let x ,x denote column of X, X , i.e., the vector of probabilities that constitute signal . Assume thatX6=X . Let be the smallest index (i.e., with largest price) such thatx 6=x . Let i be minimal such that x i; 6=x i; . For notational convenience, since we will mostly focus on the signal , we write y = x and y = x . Let p 0 =p P 0 < x 0 =p P 0 < x 0 be the vector of residual probabilities at the time that signal was greedily constructed. We now distinguish two cases: 98 Case 1: y i <y i : By Lemma 3.9, y [i;n] y [i;n] , and because y i <y i , we get that y [i+1;n] >y [i+1;n] . In particular, there must be an index i 0 > i such that y i 0 > y i 0; let i 0 be the smallest such index. Let = min(y i 0y i 0;y i y i )> 0. We next show that under X , all signals combined must use all probability mass of type i 0 , i.e., P M 0 =1 x i 0 ; 0 = p i 0. If this were not the case, then dene = min(;p i 0 P M 0 =1 x i 0 ; 0)> 0, and consider modifyingX by updatingy 0 i 0 =y i 0+ and y 0 i = y i (and leaving y 0 j = y j for all j6= i;i 0 ). By choice of , this new solutionX 0 does not violate the non-negativity or total probability constraints, and we claim that (1) it satises the revenue constraint (3.4) for allj, and (2) its welfare is strictly higher than that of X . To check the revenue constraints, notice rst that the seller's revenue under X 0 for indices ji and j >i 0 is unchanged, so (3.4) still holds for such j. It remains to consider j2fi + 1;:::;i 0 g. Fix one such j. 
By denition of i and i 0 , we have that y j 0y 0 j 0 for k j 0 <i 0 ; in particular, we can infer for our j that y 0 [k;j) = j1 X j 0 =k y 0 j 0 j1 X j 0 =k y j 0 = y [k;j) : (A.1) Becausey [i;n] y [i;n] andy i y i , and the denition ofi 0 , we get that for all j2fi + 1;:::;i 0 g, y [j;n] y [j;n] + = y 0 [j;n] : (A.2) Combining these two inequalities with the fact that X satises all revenue constraints, in particular v j y [j;n] v k y [k;n] , we get that v k y 0 [k;j) (A.1) v k y [k;j) (v j v k )y [j;n] (A.2) (v j v k )y 0 [j;n] : 99 Addingv k y 0 [j;n] to both sides now shows thatX 0 satises the revenue inequality for j. However, notice that the objective value has increased by (v i 0 v i ), contradicting the optimality of X . Hence, we have shown that P M 0 =1 x i 0 ; 0 = p i 0. We will show that there is another signal 0 > such that we can redistribute probability mass between signals ; 0 without aecting either the objective or the constraints, while making X and X more similar. Because y i 0 >y i 0 and P M 0 =1 x i 0 ; 0 =p i 0, there must be a signal 0 > such that x i 0 ; 0 > x i 0 ; 0. Fix 0 to be the smallest such index, and dene = min(x i 0 ; 0;). Consider the modied signaling scheme X 0 with y 0 i =y i ; y 0 i 0 =y i 0 +; x 0 i; 0 =x i; 0 +; x 0 i 0 ; 0 =x i 0 ; 0; x 0 j; 00 =x j; 00 for all other j; 00 : Because this assignment only redistributes probability mass, the probability mass constraints cannot be violated, and the social welfare stays the same. Non- negativity follows from the choice of . That y 0 satises the revenue constraints follows exactly the same proof as in the previous denition of y 0 , because the adjustment is the same. Finally, for all j 2 fi + 1;:::;i 0 g, the tail sums of probabilities have decreased under 0 , implying that so has the revenue for those prices. This means that all revenue constraints are also satised for 0 . This modication makes y i and y i more similar, and repeating this procedure with another signal 0 as needed, they will eventually become the same. Case 2: y i >y i : Analogous to the rst case, we will begin by showing that P M 0 =1 x i; 0 = p i . For contradiction, assume that P M 0 =1 x i; 0 <p i , and let = min(y i y i ;p i P M 0 =1 x i; 0). 100 Dene an improved signaling scheme by setting y 0 i = y i +, and y 0 j = y j for all j6=i. The probability mass constraints are still satised, and nothing has changed for indices j >i, so those revenue constraints are still satised. Next, consider an index j2fk + 1;:::;ig. Because i is the rst index with y i 6= y i , we get that y [k;j) =y [k;j) . By Lemma 3.9, y [i+1;n] y [i+1;n] , and because y i y i + , we obtain that y [j;n] y [j;n] + = y 0 [j;n] . Combining this and the previous inequality with the revenue constraint fory at indexj (namely, thatv k y [k;n] v j y [j;n] ), we conclude | analogously to the previous case | that v k y 0 [k;j) =v k y [k;j) (v j v k )y [j;n] (v j v k )y 0 [j;n] : Adding v k y 0 [j;n] to both sides now establishes that the revenue constraint is satised at index j. Because the objective value strictly increased, we obtain a contradiction to the optimality of X . Hence, from now on, we assume that P M 0 =1 x i; 0 =p i . As in Case 1, there must be some signal 0 > withx i; 0 > 0. Our goal is again to increase y i to make it closer to y i , and do so by reassigning probability mass from signal 0 . However, in this case, doing so involves a more careful reallocation of probability mass to ensure all revenue constraints are satised. 
In fact, we will use the algorithmConstruct-One-Signal to construct a vectord describing the probability reallocation, and then set y 0 =y +d; x 0 0 =x 0d: The \residual probability vector"r in this case is dened asr i = min(x i; 0;y i y i ) and r j =x j; 0 for j6=i. It captures the fact that we can at most reallocate all of the probability mass of x 0, but also must ensure that the new signal satises y 0 i y i . Since the modied version of signal must satisfy the revenue constraints 101 with target price v k , we make this price the target of the construction of d, by settingu i =v k ,u j =v j forj >i, andu j = 0 forj <i. Then, the change vector is dened asd =Construct-One-Signal(r;u;i). We now want to show that the signalsy 0 ;x 0 0 dened above satisfy all constraints. First, because probability mass only gets moved around between signals, the welfare stays the same, and the probability and non-negativity constraints cannot get violated becauseConstruct-One-Signal at most usesr j units of probability in coordinate j. We therefore focus on verifying the revenue constraints. We begin with x 0 0, which only saw its probabilities decrease. The seller's revenue at the target indexk 0 decreased byv k 0 d [i;n] . Consider any indexj >k 0. The seller's revenue at index j decreased by v j d [j;n] . If j = 2 I at the termination of the algorithm, then d j = r j , and x 0 j; 0 = 0, meaning that price j is no more attractive than j + 1 or j 1 to the seller. Otherwise, Lemma 3.8 implies that v j d [j;n] =u j d [j;n] Lemma 3.8 = u i d [i;n] = v k d [i;n] v k 0 d [i;n] : In particular, j cannot have become more attractive to the seller than k 0. We next verify that the revenue constraints are also satised for the signaly 0 . Here, the seller's revenue for price point j increased by v j d [j;n] . For all j >i, the algorithmConstruct-One-Signal ensures thatv j d [j;n] =u j d [j;n] u i d [i;n] = v k d [i;n] , so the increase in revenue is no larger for price point j than for k . Thus, we have obtained the revenue constraint v k y 0 [k;n] v j y 0 [j;n] . By rewriting y 0 [k;n] =y 0 [j;n] + (y 0 [k;n] y 0 [j;n] ) and rearranging, we obtain the useful form y 0 [j;n] v k y 0 [k;j) v j v k : (A.3) The slightly tricky part is the indicesj2fk +1;:::;ig. While the probabilities y 0 j for j < i do not increase, the tail probabilities y 0 [j;n] do by virtue of increases in y 0 j for j i; hence, we need to also consider these price points. Fix such a j2fk + 1;:::;ig. We will rst show that y 0 [j;n] y [j;n] . 102 Leti 0 >i be a smallest index withy i 0 <p i 0 P 00 < x i 0 ; 00 . (If no suchi 0 exists, then leti 0 =n + 1.) We will rst show thaty 0 [i 0 ;n] y [i 0 ;n] . This holds trivially when i 0 = n + 1. Otherwise, we apply Lemma 3.8 to the construction of y, and obtain that that y [i 0 ;n] = v k y [k;i 0 ) v i 0v k . Now, notice that for all j 0 2fk ;:::;i 0 1g, we have that y 0 j 0y j 0, for dierent reasons. 1. For j 0 > i, this follows because the denition of i 0 implies that y j 0 = p j 0 P 00 < x j 0 ; 00 is as large as it can possibly be. 2. For j 0 =i, it follows because y 0 i =y i +y i . 3. For j 0 <i, it follows because y 0 j 0 =y j 0 =y j 0. This implies that y 0 [k;i 0 ) y [k;i 0 ) , and hence | using Inequality (A.3) | that y 0 [i 0 ;n] y [i 0 ;n] . 
Finally, the previous three cases show that for the xed j, y 0 [j;n] =y 0 [i 0 ;n] + i 0 1 X j 0 =j y 0 j 0 y [i 0 ;n] + i 0 1 X j 0 =j y j 0 = y [j;n] : Having shown thaty 0 [j;n] y [j;n] , we next apply Lemma 3.8 toy at price pointj to obtain thaty [j;n] v k y [k;j) v j v k . Combining this with the fact thaty [k;j) =y 0 [k;j) for j <i by the third case of the above case distinction, we obtain thaty 0 [j;n] v k y 0 [k;j) v j v k , which is an equivalent way of rewriting the revenue constraint. Again, this modication makes y i and y i more similar, and repeating this procedure with additional signals 0 as needed, they will eventually become the same. A.2 Proof of Lemma 3.12 In this section, we provide a proof of Lemma 3.12, restated here for convenience. Lemma 3.12 If S T , then for any k;B;: f W (k;B+) (T ) f W (k;B) (T ) f W (k;B+) (S) f W (k;B) (S). 103 We will prove this lemma for suciently small , which allows us to couple the executions tightly; the inequalities can then be added to imply the lemma for arbitrary . By comparing the solutions to the linear program (3.1), we can ensure that any constraint that becomes tight in the solution for set T with bound B +, but is not tight with bound B (and similarly for S) would not have become tight for any 0 < . This will localize the changes, and use revenue indierence for the seller. By summing over all such iterations (there will only be nitely many, because is chosen so that at least one more constraint becomes tight), we eventually prove the lemma. In the analysis, we are interested in four dierent signaling schemes, constructed by Construct-Signaling-Scheme when run with dierent sets of price points and upper bounds B. We will assume here that k2ST . Specically we dene: X = (x i; ) ;i : the probability mass of type i assigned to signal when the algorithm is run with price point set S and an upper bound of B. X + = (x + i; ) ;i : probability mass for price point set S and an upper bound of B +. b X = (b x i; ) ;i : probability mass for price point set T and an upper bound of B. b X + = (b x + i; ) ;i : probability mass for price point set T and an upper bound of B +. To avoid notational confusion, we will use to denote signals under the signaling scheme for S[fkg andk to denote their price points; signals under the signaling scheme for T[fkg are denoted by !, and their price points by b k ! . Since we are interested in the change in the signaling scheme as we increase the bounds from B to B +, we dene i =x i; x + i; and b ! i =b x i;! b x + i;! . In order to understand the i better, consider the eect of changing the total probability constraint for the signal with price point k fromB toB +. When the 104 signal with price point k is constructed, we may now add some more probability mass for types ik. Subsequently, by Lemma 3.13, the algorithm continues with less (or equal) residual probability mass for all types. This might in turn mean that for later signals with price points k 0 < k, the probability mass for some type i might get used up earlier. In turn, this will speed up the addition of probability mass for typesi 0 2fk 0 +1;:::;i1g. In a sense, what happens is that the additional probability mass for the signal with price point k \displaces" some of that mass from other signals, increasing dierent probability masses. Consider some signal 2Q. Let Q =f 0 2Qj 0 g be the set of signals constructed up to . 
We are interested in which price points may change their overall allocation of probability in the signals constructed up to as a result of the increase from B to B +. Notice that the candidates are only those that did not have their probability mass already used up when the construction reached with an upper bound of B. To capture this, we dene k < e 1 < e 2 < ::: < e m to be the indices which still had probability mass available after signal was constructed, i.e., such that x e j ;Q <p e j . For notational convenience, we dene e m +1 =n + 1 and v n+1 =1. Similarly, dene b k ! <b e ! 1 <b e ! 2 < ::: <b e ! b m! to be the indices with b x b e ! j ; b Q ! <p b e ! j . We are particularly interested in indices whose overall total probability mass (at the end of Construct-Signaling-Scheme) can increase. For ease of notation, we therefore dene b S = maxf j k 2 Sg;b ! T = maxf! j b k ! 2 Tg, and e j = e b S j ;b e j = b e b ! T j ; note that this implies x e j ;Q < p i for all j, and similarly for b e j . One type i<k could also see an increase, namely, if the displaced probability by an increase for a type i 0 > k which eventually became saturated causes an increase in probability mass for some later signal with targetk 0 <k. The only such target types would bee 0 = maxfikjx i;Q <p i g andb e 0 = maxfikjb x i; b Q <p i g, respectively. Notice that by making small enough, we can ensure that x + e j ;Q < p e j and b x + b e ! j ; b Q ! <p b e ! j for all indices e j ,b e ! j and signals ;!. 105 Dene to be the supremum of all such values. This choice of ensures that at least one of the above inequalities becomes tight, but for any 0 < , we have x + e j ;Q <p e j andb x + b e ! j ; b Q ! <p b e ! j . With the chosen, for the index (or indices) where the inequality becomes tight, we still have a tight revenue constraint, because in the execution of Construct-One-Signal, the index was removed from I at the same time as k , in the last round of the iteration. By choosing the proper, at least one ofe j andb e ! j is removed from the index set for the next larger valueB 0 =B +. Because there are only nitely many candidate indices, nitely many updates will reach the B such thatb x k;! k =p k when running with a bound ofB. Beyond that value ofB, the signal! k cannot be further raised. As we argued above, the increase in probability mass for the signal with price index k will reduce the probability mass available for other signals constructed subsequently in the algorithm Construct-Signaling-Scheme. Let be such a signal, with price point k <k. A lack of available probability mass for low-value buyer types when is constructed may make its target price v k less attractive to the seller; to compensate, the signal must reduce the amount of probability mass for high-value buyers it uses. The following lemma captures the necessary reduction. Lemma A.1 Fix some signal , and let i<i 0 ;j <j 0 be indices such that each of i;i 0 ;j;j 0 is either equal to k 0 or to one of the e j 00. Then, [j;j 0 ) 1 v j 1 v j 0 = [i;i 0 ) 1 v i 1 v i 0 : (A.4) An analogous characterization holds for theb e ! j 00 and b ! [j;j 0 ) . Proof. Let k 0 = k . Because x e j ;Q < p e j , we can apply Lemma 3.8 to the iteration in which was constructed, and infer that for all jm , v k 0x [k 0 ;n]; =v e j x [e j ;n]; ; v k 0x + [k 0 ;n]; =v e j x + [e j ;n]; ; whence v k 0 [k 0 ;n] =v e j [e j ;n] follows. 
106 [e j ;e j 0 ) = [e j ;n] [e j 0 ;n] = v k 0 v e j [k 0 ;n] v k 0 v e j 0 [k 0 ;n] = 1 v e j 1 v e j 0 v k 0 [k 0 ;n] Hence, [e j ;e j 0 ) 1 v e j 1 v e j 0 =v k 0 [k 0 ;n+1) for allj <j 0 m + 1, and it is easy to see that this calculation applies for j =k 0 as well. Lemma A.2 The e j ;b e j ;e j ;b e ! j satisfy the following subsequence properties. 1. Each indexb e j also appears as an element in the sequence of e j . 2. Let k 0 be a price point and ;! signals such that has price point k 0 under X and ! has price point k 0 under b X. Then, each indexb e ! j also appears as an element in the sequence of e j . 3. If> 0 , then each indexe j k 0 also appears as an element in the sequence of e 0 j . Similarly forb e ! j andb e ! 0 j . Proof. Let Q S , Q T denote the sets of signals corresponding to the price point sets S;T , respectively. 1. By Lemma 3.13, applied to S T , we get that x i;Q S b x i;Q T , so whenever b x i;Q T <p , we also have x i;Q S <p . 2. Analogous to the rst part after applying Lemma 3.13 with Q = Q S and b Q = Q T ! : The corresponding price point sets arefk 2 Sj k k 0 g f b k ! 2Tj b k ! k 0 g, respectively. 3. Analogous to the rst part after applying Lemma 3.13 with Q = Q S 0 and Q 0 =Q S . Lemma A.3 Let 0 > k be a signal constructed after the one with price point k. Then, the total probability mass under signal 0 for the \initial saturated segment" [k 0;e 0 1 ) cannot increase when B is raised to B +. That is, 0 [k 0;e 0 1 ) 0. If 107 furthermore, e 0 1 k, then 0 [k 0;e 0 1 ) = 0. An analogous statement holds for b ! 0 [ b k ! 0;b e ! 0 1 ) in place of 0 [k 0;e 0 1 ) . Proof. By denition of e 0 1 , all the probability mass for indices i2 [k 0;e 0 1 ) is used up by signals 1;:::; 0 under both X and X + . Hence, X 0 x [k 0;e 0 1 ); =p [k 0;e 0 1 ) = X 0 x + [k 0;e 0 1 ); : Taking the dierence and solving gives us that 0 [k 0;e 0 1 ) = P < 0 [k 0;e 0 1 ) 0 by monotonicity (Lemma 3.13). We prove the second part of the lemma by induction over 0 > k . For the base case, let 0 be the least such that e 1 k. Consider a signal < 0 . If k e 0 1 , then part 3 of Lemma A.2 would imply that e 0 1 also appears as a e j for some j; in particular, it would imply that e 1 e 0 1 k, contradicting the minimality of 0 . Hence, k > e 0 1 , meaning that no probability mass is allocated to signals < 0 for types [k ;e 0 1 ). Because by denition, all probability mass for such types is used up by signals up to and including signal 0 , we conclude that x [k 0;e 0 1 ); 0 =p [k 0;e 0 1 ) =x + [k 0;e 0 1 ); 0 , and therefore 0 [k 0;e 0 1 ) = 0. For the induction step, we use our result from the rst part that 0 [k 0;e 0 1 ) = P < 0 [k 0;e 0 1 ) . We will show that [k 0;e 0 1 ) = 0 for all < 0 . First, when k > e 0 1 , we get that x [k 0;e 0 1 ); = x + [k 0;e 0 1 ); = 0, implying that [k 0;e 0 1 ) = 0. Otherwise, when k e 0 1 , we rst observe that x [k 0;k ); = x + [k 0;k ); = 0, so [k 0;e 0 1 ) = [k;e 0 1 ) . Lemma A.2 implies that e 1 e 0 1 , so we can apply Lemma A.1 and the induction hypothesis to show that [k;e 0 1 ) Lemma A.1 = [k;e 1 ) ( 1 v k 1 v e 0 1 )=( 1 v k 1 v e 1 ) I.H. = 0; completing the proof. We next get to the key lemma for submodularity. It compares the eects of the increase of under S and T on the same signal . At a very high level, it says that the eects on in terms of the allocated probability mass of low-value types is 108 more severe forT than forS. The precise form is a bit subtle. 
To avoid notational confusion, we will use to denote signals under the signaling scheme for S[fkg andk to denote their price points; signals under the signaling scheme for T[fkg are denoted by !, and their price points by b k ! . Recall that signals are sorted by decreasing price points, so that k +1 <k ; b k !+1 < b k ! . We frequently want to nd, for a given signal !, the signal with closest greater (or equal) price point to b k ! . Hence, for any signal ! under T[fkg, we deneb!c = maxfjk b k ! g. Lemma A.4 Let k be the signal with price point k for S[fkg, and ! k the signal with price point k for T[fkg. Let ! 0 ! k be any signal, with price point k 0 = b k ! 0. Then, b! 0 c X = k [k;b e ! 0 j ) ! 0 X !=! k b ! [ b k!;b e ! 0 j ) : Proof. We prove the statement by induction on! 0 . The base case! 0 =! k is true because Lemma A.1, applied with i = j = k, i 0 =b e ! k j and j 0 = n + 1 implies that k [k;b e ! k j ) = b ! k [k;b e ! k j ) =v k ( 1 v k 1 v b e ! k j ). We now focus on the induction step, and distinguish three cases. 1. If b k ! 0 = 2 S, then P b! 0 c = k [k;b e ! 0 j ) = P b! 0 1c = k [k;b e ! 0 j ) . By induction hypothesis, the latter is at most P ! 0 1 !=! k b ! [ b k!;b e ! 0 j ) . By Lemmas A.1 and A.3, b ! 0 [ b k ! 0;b e ! 0 j ) 0, and adding this inequality proves the induction step. 2. If b k ! 02 S and e b! 0 c 1 k, then write 0 =b! 0 c. By Lemma A.2, b e ! 0 j = e 0 j 0 for some j 0 , so we can apply Lemma A.1 and Lemma A.3 to conclude that 0 [k 0;b e ! 0 j ) = 0 [k 0;e 0 1 ) ( 1 v k 0 1 v b e ! 0 j )=( 1 v k 0 1 v e 0 1 ) = 0. As in the previous case, we get that b ! 0 [ b k ! 0;b e ! 0 j ) 0, and adding both terms to the inequality obtained from the induction hypothesis now completes the inductive step. 109 3. Otherwise, we are in the case that b k ! 0 2 S and e b! 0 c 1 > k; we again write 0 =b! 0 c. First, we are going to show that the lemma holds for 0 with j = 1. By Lemma 3.13, we obtain that 0 X = k [k;b e ! 0 1 ) = 0 X = k x [k;b e ! 0 1 ); x + [k;b e ! 0 1 ); 0; whereas by denition ofb e ! 0 1 , ! 0 X !=! k b ! [ b k!;b e ! 0 1 ) = ! 0 X !=! k b x [ b k!;b e ! 0 1 );! b x + [ b k!;b e ! 0 1 );! = p [ b k!;b e ! 0 1 ) p [ b k!;b e ! 0 1 ) = 0: Thus, we have shown that 0 X = k [k;b e ! 0 1 ) ! 0 X !=! k b ! [ b k!;b e ! 0 1 ) = 0 X = k X !:k b k!<k 1 ; b k!k b ! [ b k!;b e ! 0 1 ) : The induction hypothesis implies the same inequality with 00 < 0 in place of 0 , so that we have for all 00 0 : 00 X = k [k;b e ! 0 1 ) X !:k 00 b k!k b ! [ b k!;b e ! 0 1 ) = 00 X = k X !:k b k!<k 1 ; b k!k b ! [ b k!;b e ! 0 1 ) : (A.5) To extend the result to j > 1, consider any signal 2f k + 1;:::; 0 g. By Part (2) of Lemma A.2,b e ! 0 1 is equal to e 0 j for some j. Because e 0 1 >kk , 110 by Part (3) of Lemma A.2, bothe 0 1 andb e 0 1 occur ase j ,e j 0 for somej;j 0 . We are therefore allowed to apply Lemma A.1, and we can write 0 X = k [k;b e ! 0 j ) = 0 X = k [k;b e ! 0 1 ) ( 1 v k 1 v b e ! 0 j )=( 1 v k 1 v b e ! 0 1 ); ! 0 X !=! k b ! [ b k!;b e ! 0 j ) = 0 X = k X !:k b k!<k 1 ; b k!k b ! [ b k!;b e ! 0 1 ) ( 1 v b k! 1 v b e ! 0 j )=( 1 v b k! 1 v b e ! 0 1 ) 0 X = k ( 1 v k 1 v b e ! 0 j )=( 1 v k 1 v b e ! 0 1 ) X !:k b k!<k 1 ; b k!k b ! [ b k!;b e ! 0 1 ) : Because b e ! 0 j b e ! 0 1 , the function x 7! ( 1 x 1 v b e ! 0 j )=( 1 x 1 v b e ! 0 1 ) is increasing in x for x < v b e ! 0 1 . Furthermore, by Inequality (A.5), we have domination of all prex sums, so we can apply Lemma A.5 with a = [k;b e ! 0 1 ) , b = P !:k b k!<k 1 ; b k!k b ! [ b k!;b e ! 
0 1 ) , andc = ( 1 v k 1 v b e ! 0 j )=( 1 v k 1 v b e ! 0 1 ) to conclude that P 0 = k [k;b e ! 0 j ) P ! 0 !=! k b ! [ b k!;b e ! 0 j ) , completing the inductive step. Lemma A.5 Let a 1 ;:::;a n and b 1 ;:::;b n be any numbers such that for all indices i n, the prexes satisfy that P i j=1 a j P i j=1 b j . Then, for any coecients c 1 c 2 :::c n 0, we have that P n j=1 c j a j P n j=1 c j b j . Proof. By deningc n+1 = 0, we can writec j = P n i=j (c i c i+1 ). Now, we get that n X j=1 c j a j = n X j=1 n X i=j a j (c i c i+1 ) = n X i=1 (c i c i+1 ) i X j=1 a j n X i=1 (c i c i+1 ) i X j=1 b j = n X j=1 c j b j : The inequality followed by the assumption on the prexes and the non-negativity of all the c i c i+1 terms. Proof of Lemma 3.12. Let k be the signal with price point k to which probability mass was added. This addition can only lead to an increase in social 111 welfare via an increase in buyer types that were previously not allocated to any signal, meaning that we are interested only in typesi with P x i; <p i . In addition to the indices e j dened earlier for j 0, dene e 0 >e 1 >e 2 >::: to be all of the indices i<e 0 with x Q; <p i . We will rst show that those types i < e 0 do not actually lead to any welfare changes under X. Thereto, consider some type e j with j < 0. Under X, buyers of this type can only be allocated to a signal with k e j . For all such signals , by denition, we have that e 1 e j < e 0 k, so Lemma A.3 implies that [k;e 1 ) = 0. By Part (3) of Lemma A.2,e j ande j+1 also occur ase j 0,e j 00, so we can apply Lemma A.1 to conclude that [e j ;e j+1 ) = 0, and by summing obtain that for all i< 0, X e j () = X [e j ;e j+1 ) = X :ke j [e j ;e j+1 ) = 0 In the step labeled (*), we are using the fact that e i is the only index in the range [e i ;e i+1 ) at which the probability mass can increase, and the probability mass for no other type will decrease. By Part 1 of Lemma A.2, theb e j form a subsequence of the e j . To compare the eects of changes in the welfare, we partition the e j into the segments formed by theb e j ; formally, we denebe j c = maxfb e j 0jb e j 0 e j g. Let j be the index j such that e j =b e 1 . 112 We can now express the change in social welfare when addingk toS as follows: f W (X) f W (X + ) = X j v e j X k e j = X jj v e j X k [e j ;e j+1 ) + j 1 X j=0 v e i X k e j () = X jj v e j X k [k;e 1 ) ( 1 v e j 1 v e j+1 )=( 1 v k 1 v e 1 ) + j 1 X j=0 v e i X k e j = X jj v e j ( 1 v e j 1 v e j+1 ) X k [k;e 1 ) =( 1 v k 1 v e 1 ) + j 1 X j=0 v e i X k e j : (A.6) In the step labeled (*), we applied Lemma A.1. We were allowed to do so, because e j+1 e j b e 1 e 1 kk , allowing us to apply Part (3) of Lemma A.2. We rst analyze the second term P j 1 j=0 v e i P k e j . Ifb e 0 is dened, then j 1 X j=0 v e j X k e j v b e 0 X k j 1 X j=1 e j = v b e 0 X k [k;b e 1 ) Lemma A.4 v b e 0 X !! k b ! [ b k!;b e 1 ) : Otherwise (b e 0 is not dened), we will simply use the bound that P k [k;e 1 ) 0 by Lemma 3.13. For notational convenience, in this case, we will write v b e 0 = 0. We next consider the rst factor of the rst term of Equation (A.6). Because e j be j c, we obtain that X jj v e j ( 1 v e j 1 v e j+1 ) X jj v be j c ( 1 v e j 1 v e j+1 ) = X j 0 1 v b e j 0 X j:be j c=b e j 0 ( 1 v e j 1 v e j+1 ) = X j 0 1 v b e j 0 ( 1 v b e j 0 1 v b e j 0 +1 ): 113 To analyze the second factor of the rst term in Equation (A.6), we rst apply Lemma A.1 to rewrite P k [k;e 1 ) =( 1 v k 1 ve 1 ) = P k [k;b e 1 ) =( 1 v k 1 v b e 1 ). We also write X !! k b ! 
[ b k!;b e 1 ) =( 1 v b k! 1 v b e 1 ) = X k X !:k b k!<k 1 ; b k!k b ! [ b k!;b e 1 ) =( 1 v b k! 1 v b e 1 ) X k 1=( 1 v k 1 v b e 1 ) X !:k b k!<k 1 ; b k!k b ! [ b k!;b e 1 ) : By Lemma A.4, we have that P b! 0 c = k [k;b e ! 0 j ) P ! 0 !=! k b ! [ b k!;b e ! 0 j ) for all ! 0 ! k . We can therefore apply Lemma A.5 witha = [k;b e ! 0 1 ) ,b = P !:k b k!<k 1 ; b k!k b ! [ b k!;b e 1 ) , and c = 1=( 1 v k 1 v b e 1 ), and conclude that X k [k;e 1 ) =( 1 v k 1 v e 1 ) = X k [k;b e 1 ) =( 1 v k 1 v b e 1 ) X k 1=( 1 v k 1 v b e 1 ) X !:k b k!<k 1 ; b k!k b ! [ b k!;b e 1 ) X !! k b ! [ b k!;b e 1 ) =( 1 v b k! 1 v b e 1 ): 114 Recalling that P k [k;e 1 ) =( 1 v k 1 ve 1 ) 0 by Lemma 3.13, we now put all of these inequalities together, yielding that f W (X) f W (X + ) X j 0 1 v b e j 0 ( 1 v b e j 0 1 v b e j 0 +1 ) X !! k b ! [ b k!;b e 1 ) =( 1 v b k! 1 v b e 1 ) +v b e 0 X !! k b ! [ b k!;b e 1 ) = X j 0 1 v b e j 0 X !! k b ! [ b k!;b e 1 ) ( 1 v b e j 0 1 v b e j 0 +1 )=( 1 v b k! 1 v b e 1 ) +v b e 0 X !! k b ! [ b k!;b e 1 ) Lemma A.1 = X j 0 1 v b e j 0 X !! k b ! [b e j 0;b e j 0 +1 ) +v b e 0 X !! k b ! [ b k!;b e 1 ) = X j 0 0 v b e j 0 X !! k b ! b e j 0 = f W ( b X) f W ( b X + ): Finally, by noticing that the respective increases in sanitized welfare are the negatives of the terms here (i.e., f W (X + ) f W (X) and f W ( b X + ) f W ( b X), respectively), we complete the proof of Lemma 3.12. 115 Appendix B A Non-Matroid Example Implementable by Dice In this section, we show a non-matroid example for which all the feasible rst-order interim rules are implementable by dice. Notice that nding an interim rule that is implementable by dice within an environment is easy: one can calculate the interim winning probability of a collection of trivial winner-selecting dice. What we want is to show that dice exist for all feasible interim rules of a non-matroid environment. Consider a winner-selection environment with 4 candidates, numbered 1; 2; 3; 4. The feasible winner sets aref1; 2g;f3; 4g and their subsets, i.e., all singletons and the empty set. All candidates have a single type, so the type distribution is trivial. We denote the unique type of candidate i by i. Consider any ex-post rule, i.e., a distribution over the feasible setsf1; 2g,f3; 4g, f1g,f2g,f3g,f4g,;. Let the probability of selecting winner set S bey S . We have x(1) =y f1;2g +y f1g x(2) =y f1;2g +y f2g x(3) =y f3;4g +y f3g x(4) =y f3;4g +y f4g (B.1) Notice that any feasible interim rule can be implemented by a distributiony in this way. 116 Without loss of generality, assume thatx(1)x(2) andx(3)x(4). Whenever x is feasible, we dene the following set of dice. Each die has1 on one side; the other side's value and probability are given by Table B.1. Type t positive face probability of positive face 1 100 x(1) 2 2 x(2)x(1) 1x(1)x(3) 3 10 x(3) 1x(1) 4 1 x(4)x(3) 1x(2)x(3) Table B.1: Dice forx One can easily verify that these dice implement the desired interim rule, if the probabilities of positive face fall in [0; 1]. Substituting (B.1) into the probabilities from Table B.1, we will have Type t probability of positive face 1 y f1;2g +y f1g 2 y f2g y f1g 1y f1;2g y f1g y f3;4g y f3g 3 y f3;4g +y f3g 1y f1;2g y f1g 4 y f4g y f3g 1y f1;2g y f2g y f3;4g y f3g Since P S22 [4] y S = 1 and x(1) x(2);x(3) x(4), all the probabilities fall in [0; 1] Thus for any feasible interim rulex in this winner-selection environment, there is a collection of dice implementx. 117
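To make the dice-based rule in this example concrete, the following is a small Monte-Carlo sketch of the Appendix B environment. The feasible sets and the positive face values (100, 2, 10, and 1 for candidates 1 through 4) follow Table B.1; the losing face is modeled here as a value so low that a candidate showing it is never selected, the helper names are ours, and the face probabilities are left as parameters, so the sketch can be used to check a candidate dice construction against a target interim rule x numerically rather than asserting the formulas above.

```python
import random

# Feasible winner sets of the Appendix B environment: {1,2}, {3,4}, and their subsets.
FEASIBLE = [frozenset(), frozenset({1}), frozenset({2}), frozenset({3}), frozenset({4}),
            frozenset({1, 2}), frozenset({3, 4})]

# Positive face values from Table B.1; the losing face value is an assumption.
POSITIVE_FACE = {1: 100, 2: 2, 3: 10, 4: 1}
LOSING_FACE = float("-inf")

def roll_and_select(q, rng=random):
    """Roll each candidate's two-sided die (positive face with probability q[i]),
    then return the feasible set maximizing the sum of rolled values."""
    rolls = {i: POSITIVE_FACE[i] if rng.random() < q[i] else LOSING_FACE
             for i in POSITIVE_FACE}
    return max(FEASIBLE, key=lambda S: sum(rolls[i] for i in S) if S else 0.0)

def estimate_interim_rule(q, trials=200_000):
    """Monte-Carlo estimate of the interim winning probabilities x(i) induced by q."""
    wins = {i: 0 for i in POSITIVE_FACE}
    for _ in range(trials):
        for winner in roll_and_select(q):
            wins[winner] += 1
    return {i: wins[i] / trials for i in POSITIVE_FACE}
```

For instance, plugging in the face probabilities computed from Table B.1 for a particular feasible interim rule x and comparing the output of estimate_interim_rule with x gives a quick numerical sanity check of the construction.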
Abstract
Making choices with a die sounds unhelpful, and may be adopted only by students who do not know the correct answer during an exam. However, there are cases in which rolling a die is the best way to solve a problem. Allocating limited resources fairly is a common scenario where randomness is adopted; one example is the H-1B lottery used by USCIS to decide who will get a visa. But even without considering fairness, making a choice randomly may be the only way to benefit oneself. In a repeated rock-paper-scissors game, a deterministic player can never win, since his opponent can observe and play the winning action against his unchanged strategy. In this thesis, we examine several scenarios where randomization does and does not work.

We first study information structure design, also called "persuasion" or "signaling", in the presence of a constraint on the amount of communication. We focus on the fundamental setting of bilateral trade, which in its simplest form involves a seller with a single item to price, a buyer whose value for the item is drawn from a common prior distribution over n different possible values, and a take-it-or-leave-it offer protocol. A mediator with access to the buyer's type may partially reveal such information to the seller in order to further some objective such as the social welfare or the seller's revenue. A simple example shows that revealing the information deterministically is not optimal for the social welfare. We study how randomization can help in the communication-constrained setting.

In the setting of maximizing welfare under bilateral trade, as our main result, we exhibit an efficient algorithm for computing a ((M-1)/M)(1-1/e)-approximation to the welfare-maximizing scheme with at most M signals. Without randomization, it is unclear how to design an (approximately) optimal scheme. For the revenue objective, in contrast, we show that a deterministic scheme suffices.

We next study the existence of dice-based winner-selection rules for given interim rules. In a winner-selection environment, multiple winners are selected from a candidate set, subject to certain feasibility constraints. The interim rule summarizes the probability of each candidate being selected. We show that when the feasibility constraint is a matroid constraint, any feasible interim rule admits a dice-based implementation. A dice-based implementation associates a die with each candidate. To choose the winners, the rule rolls all dice independently and picks the subset that maximizes the sum of rolled values, subject to the feasibility constraint. Dice-based rules generalize both Myerson's auction and order sampling, both of which assign (random) values to candidates and choose the set of candidates maximizing the sum of values.

While dice can implement all matroid winner-selection rules, we also show two cases where they fail. Both of these cases fall within the Bayesian persuasion model of Kamenica and Gentzkow. For one setting, we show that our positive algorithmic results for bilateral trade do not extend to communication-constrained signaling in the Bayesian persuasion model. Specifically, we show that it is NP-hard to approximate the optimal sender's utility to within any constant factor in the presence of communication constraints. For the other, we treat Bayesian persuasion as a winner-selection environment and show an instance that does not admit a dice-based implementation.