Essays on Fair Scheduling, Blockchain Technology and Information Design

by

Justin A. Mulvany

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BUSINESS ADMINISTRATION)

May 2024

Copyright 2024 Justin A. Mulvany

Dedicated to my family

Acknowledgements

I thank my advisors Kimon Drakopoulos and Ramandeep Randhawa for their constant guidance, support and patience over the years. It is because of their knowledge and expertise that I was able to grow as a scholar. They introduced me to new ideas and perspectives that have led to fruitful research. They guided me through the times I doubted myself. Simply put, I could not have asked for better advisors.

I would like to thank Phebe Vayanos for being a part of my committee. I am grateful for her knowledge and encouragement. I took her linear optimization course during my first semester at USC. It was an amazing course, and I used the tools she taught me repeatedly in my research.

In addition, I want to thank professors Vishal Gupta, Paat Rusmevichientong, Greys Šošić, Mika Sumida, Andrew Daw, Song-Hee Kim, Amy Ward, Amber Puha, Ruth Williams, Irene Lo and Hamid Nazerzadeh for the knowledge they imparted to me through courses, conversations and collaboration. I am grateful to all the faculty that I have met during the PhD program, as well as the DSO PhD students, whom I thank for all the fun times that we shared. I am incredibly appreciative to have gotten to know such friendly and helpful people.

Finally, I thank my family. Without their constant support, I could not have finished this degree. I especially want to thank my wife, Trang, for her love.

Table of Contents

Dedication
Acknowledgements
List of Figures
Abstract
Chapter 1: Fair Scheduling of Heterogeneous Customer Populations
  1.1 Introduction
  1.2 Literature Review
  1.3 Model
    1.3.1 Definition of Fairness
    1.3.2 The Fair FIFO-policy is Inefficient
    1.3.3 The Efficient cµ-Policy is Unfair
    1.3.4 Fairness Constrained Delay Minimization Problem
    1.3.5 Methodology
  1.4 Population-Unaware Policy Design
    1.4.1 Simplified Optimization Problem
    1.4.2 Fair, Population-unaware Policies May Not Perform Better Than FIFO
    1.4.3 Existence of Non-FIFO, Fair, Population-Unaware Policies
    1.4.4 Cost Comparison for Different Policies
  1.5 Population-Aware Policy Design
    1.5.1 Two Populations and Two Types
      1.5.1.1 Cases with No Cost For Fairness
      1.5.1.2 Cases with Cost For Fairness
      1.5.1.3 High Utilization Implies No Cost For Fairness
    1.5.2 Generalizing Insights
  1.6 Discussion
    1.6.1 Extension to Non-Uniform Holding Costs
      1.6.1.1 Population Fair Policies and Their Existence with Non-uniform Holding Costs
      1.6.1.2 Fair, Population-Unaware Policies with Non-uniform Holding Costs
      1.6.1.3 Fair, Population-Aware Policies with Non-uniform Holding Costs
    1.6.2 Choosing Among Fair, cµ-Preserving Policies
  1.7 Conclusion
Chapter 2: Blockchain Mediated Persuasion
  2.1 Introduction
  2.2 Implementing Mediated Persuasion on the Blockchain
  2.3 Literature Review
  2.4 Model
    2.4.1 Blockchain Mediated Persuasion
    2.4.2 Strategies and Solution Concept
    2.4.3 Simplifying the Optimization Problem
  2.5 The Value of Blockchain Mediated Persuasion
    2.5.1 Primal Approach
    2.5.2 Dual Approach
  2.6 Blockchain Mediated Information Selling
    2.6.1 Selling Information
    2.6.2 Improvement due to Blockchain Mediated Persuasion
    2.6.3 The value of cost differentiation
  2.7 Optimal BMP Mechanisms
  2.8 Conclusion
Chapter 3: Learning Networks via Persuasion
  3.1 Introduction
  3.2 Literature Review
  3.3 Model
  3.4 Main Results
    3.4.1 Learning Is Possible
    3.4.2 The Sequentially Truthful (ST) Policy
    3.4.3 The Sequentially Correlated Persuasion (SCP) Policy
  3.5 Conclusion
References
Appendices
  A Proofs for Chapter 1
  B Proofs for Chapter 2
    B.1 Reduction of the Signal Space
    B.2 Constructing an Optimal Mechanism for the Information Selling Problem
    B.3 Remaining Proofs
    B.4 Calculations for Table 2.1
  C Proofs for Chapter 3
    C.1 Proof of Theorem 3.4.1
    C.2 Proof of Theorem 3.4.2
    C.3 Remaining Proofs

List of Figures

1.1 Inefficiency of the FIFO-policy for two populations A and B.
1.2 Unfairness of the cµ-policy across populations A and B.
1.3 Similar population compositions: Population-aware matches cµ performance and population-unaware performs well.
1.4 Dissimilar population compositions: Population-aware performs well, but population-unaware does not perform that well.
1.5 Structure of fair cµ-preserving policies. Type-1 customers are prioritized over type-2 customers.
1.6 Different fair cµ-preserving policies characterized by different values of parameter $w$. Illustration of Proposition 1.5.1 with $\lambda_A = \lambda_B = 1$, $\mu_1 = 3$, $\mu_2 = 2$, $\alpha_{A,1} = 0.53$, and $\alpha_{B,1} = 0.5$.
1.7 Structure of optimal, fair policy when no fair cµ-preserving policy exists. Some type-2 customers are prioritized over type-1 customers.
1.8 The cost of optimal, fair policies for varying utilization with service parameters $\mu_1 = 4$, $\mu_2 = 2$, and population-B composition fixed at $\alpha_{B,1} = 0.1$.
1.9 The waiting time cost disparity of the FIFO and cµ policies for varying utilization with holding cost parameters $c_1 = 10$, $c_2 = 5$, and population compositions fixed at $\alpha_{A,1} = 0.9$ and $\alpha_{B,1} = 0.2$.
1.10 Selecting between different cµ-preserving fair, population-aware policies. Parameters $\lambda_A = \lambda_B = 1$, $\mu_1 = 3$, $\mu_2 = 2$, $\alpha_{A,1} = 0.53$, $\alpha_{B,1} = 0.5$.
2.1 Illustration of the implementation of a signaling mechanism using blockchain technology.
2.2 The set $\operatorname{co}(G_P)$ for a piecewise-constant payoff function. The prior is $\mu_0 = 0.2$. At the optimal solution, the signal mixes between the vertices of the orange plane, which correspond to the beliefs $\mu_1 = 0$, $\mu_2 = 0.4$ and $\mu_3 = 1$.
2.3 Characterization of achievable frontier with commitment (left) and through a mediator (right).
2.4 The value of the Consultant as a function of the Firm's belief.
2.5 The achieved Sender's value with Blockchain Mediated Persuasion, when the mechanism only charges for one signal.
2.6 The achieved Sender's value as a function of the high cost $\kappa_h$, when $\kappa_\ell = 0$.
2.7 The achieved Sender's value as a function of the gas fees $g$, where $\kappa_\ell = g$ and $\kappa_h = 2g$.
2.8 The achieved Sender's value with the optimal BMP mechanism (red line) vs. the optimal committed persuasion mechanism (green line).
3.1 Illustration of the event $E_1$. We learn customer 1's outgoing neighborhood is $N^{\text{out}}_1(G) = \{4,5\}$ by observing customers 4 and 5 deviate from their recommended "buy" action.
3.2 Illustration of the correlation structure employed by the SCP policy.
3.3 Comparison of revenue generation across SCP, CBP and ST policies.
Abstract

This work explores optimizing service systems with fairness considerations, harnessing blockchain technology for trustworthy mediation, and leveraging information design for network learning. Broadly speaking, this dissertation proposes innovative solutions to problems arising in modern operations. Below, we briefly outline the contributions of each chapter.

Fair Scheduling of Heterogeneous Customer Populations

The prioritization of access to scarce resources is ubiquitous throughout modern society, particularly in the context of service management. For instance, retailers prioritize customers paying with credit cards by providing dedicated lanes for electronic payment, call centers prioritize incoming calls based on request category such as account cancellation or tech support, and government services like TSA PreCheck effectively prioritize pre-screened flyers. Such prioritization ensures system capacity is first given to shorter jobs, so that overall waiting in the system is reduced.

While optimizing system performance is clearly an important criterion, we must also recognize that customers are not siloed individuals but also belong to population groups within society that may differ on attributes such as gender, race, age, nationality, income level, place of residence, etc. Providing equitable access to individuals across different population groups is often an important consideration within service systems. For instance, consider a situation with two population groups where one population, say A, has a higher proportion of fast jobs than the other population, say B. Operational efficiency implies prioritizing customers with shorter service requirements, leading to much longer waiting times for members of population B. Thus, optimal scheduling without equity considerations can produce inequitable outcomes.

In Chapter 1, we study a fairness vs. efficiency trade-off that arises in service systems catering to customers differentiated in their service requirements and population membership. For ease of exposition, we primarily focus on settings where customer service types are distinguished by mean processing time. Each customer belongs to exactly one service type and one population group. Each population is comprised of a fixed, exogenously determined, proportion of customers belonging to each service type.
We solve a system-level average delay minimization problem with an imposed equity constraint that requires all populations to have the same expected waiting time. We first limit attention to population-unaware policies that schedule customers only on the basis of their service type and not their population membership. This class of policies is applicable to settings where population membership is either unknown or protected. We prove that, for many model specifications, there exist unaware policies that outperform First-In-First-Out (FIFO). Numerically, we find that these policies can capture a large fraction of the overall benefits of prioritization when the populations are not too disparate in their service type composition.

We then expand our policy set to include population-aware policies that can schedule customers on the basis of both their service type and their population membership. We formally characterize these fair, population-aware policies, and we further prove that, under certain parameter conditions, there exist population-aware policies that achieve the same optimal wait times as those obtained by the cµ priority policy that ignores fairness considerations. Specifically, we show that fairness need not come at the cost of efficiency when system utilization is sufficiently high.

Our work illustrates the additional complexities that must be addressed when considering fairness in service systems. Ignoring the fact that customers belong to different population groups, and that these groups may be disproportionately impacted by optimal priority policies, can lead to societal inequities in our key operational metrics. We show that these inequities can be effectively combated in a straightforward manner and potentially without sacrificing any operational efficiency.

Blockchain Mediated Persuasion¹

¹ This portion of the abstract is adapted from the author's preliminary work appearing in the Proceedings of the 24th ACM Conference on Economics and Computation. See Drakopoulos, Lo, et al. (2023) for reference.

In the classic Bayesian Persuasion model studied by Kamenica and Gentzkow (2011), there are two players: the first, called Sender, wishes to persuade the second, called Receiver, to take a desired action. Provided that Sender is ex-post better informed about the underlying state of the world, Sender can leverage their informational advantage by communicating with Receiver via a signal mechanism. By careful design of the mechanism, Sender is able to manipulate Receiver's posterior beliefs to obtain higher expected payoffs. However, Sender's ability to effectively manage Receiver's beliefs largely hinges on the assumption that Sender can credibly commit to a signal mechanism. In most cases, it is not ex-post optimal for Sender to follow the mechanism, but instead to deviate and send the message that generates the highest payoff. Consequently, Receiver may not have faith in Sender's ability to commit. In this case, all bets are off: persuasion devolves into cheap talk.

An alternative approach that relaxes this commitment assumption is mediated persuasion, as studied by Salamanca (2021), where a credible mediator commits to a signal mechanism on behalf of Sender. Rather than sending a message directly to Receiver, Sender instead provides a reported state of the world to the mediator, and the mediator sends a message to Receiver according to the mechanism. Yet, for the mediator to be credible in the eyes of both Sender and Receiver, it is natural to require that the mediator be transparent, unchanging, and reliable. Otherwise, Sender or Receiver may not have faith that the mediator is capable of reliably committing to the signal mechanism.
We argue that modern blockchain technology allows for the implementation of such a mediator as a smart contract. Simply put, smart contracts are pieces of code that are run on a blockchain, e.g. Ethereum. Once a smart contract is deployed, it is completely public, cannot be modified, and reliably executes on demand. However, there is a caveat. Running smart contracts is costly, which means Sender must pay a (possibly message-dependent) cost to communicate with Receiver in this way.

In Chapter 2, we present a costly mediated persuasion model where Sender sends a reported state of the world to a Blockchain Mediated Persuasion (BMP) mechanism. The BMP mechanism then generates a message for Receiver and charges Sender a corresponding message fee. Receiver then updates their posterior based on the message and takes an optimal action. Surprisingly, we find that this costly blockchain mediation performs better than costless mediation in certain settings.

We prove that, without loss of generality, one can consider straightforward and direct BMP mechanisms where Receiver follows a recommended action distribution and Sender truthfully reports the state of the world. We develop a geometric characterization of the optimal BMP value by constructing a certain convex hull that corresponds to distributions over the players' incentive compatibility constraints and Sender's payoff as a function of Receiver's posterior belief and corresponding action distribution. In addition, we fully characterize the optimal payoff frontier as a function of the prior belief. Our results allow us to distinguish between two potentially valuable properties of BMP mechanisms: their ability to mediate and guarantee mappings between reports and messages, and their ability to price differentiate between signal realizations. Price differentiation relaxes Sender's incentive compatibility constraint, effectively making it easier for Sender to report the true state of the world.

To illustrate our results, we consider an information selling problem in which a firm (Receiver) chooses between two possible investments with positive payoff only when the correct investment is chosen. The firm can guarantee a correct choice by hiring a consultant (Sender) who learns which project will be successful before the firm. The consultant wishes to persuade the firm to hire them at a cost. For this example, we demonstrate that costless mediation provides no benefit to the consultant. However, an application of our geometric characterizations shows that costly blockchain mediation can benefit Sender, and thus has arbitrarily more value than costless mediation. In fact, we show that a BMP mechanism with just two costs is optimal.

Learning Networks via Persuasion

There has been recent interest in extending the classic Bayesian persuasion framework to settings with many agents. In these models, an (ex-post) better informed Sender wishes to persuade a group of agents, called receivers, to take actions that maximize Sender's payoff. Sender designs a signaling mechanism that functions as a randomized mapping from the state of the world to a tuple of messages, one for each receiver. Until recently, most multi-receiver models ignored possible communication between receivers.
That is, it was assumed that either every receiver gets the same message, often referred to as public signaling, or receivers get private messages that they keep to themselves. In reality, receivers may participate in social networks, in which they communicate the messages they receive to others in their neighborhood. Recently, researchers have begun to consider such information "spillovers", and have found that traditional persuasion mechanisms perform poorly in their presence. If the network is known, Sender can design and commit to mechanisms that perform better by taking spillovers into account. However, what if Sender does not have knowledge of the network? In this case, there is no hope of designing an optimal signaling mechanism.

The goal of Chapter 3 of this dissertation is to design signaling mechanisms that are able to learn the network structure in a repeated setting. We focus on an application where a firm sells a product of uncertain quality to a market of customers over an infinite time horizon. The firm is able to commit to a signaling mechanism at the start of each time period, forming a signaling policy over time. At every time, the firm observes the messages that are sent and each customer's action. We develop two signaling policies that learn the network in expected time linear in the number of customers. The key idea behind our approach is to send messages that cause a single customer to be fully informed of the state of the world. The informed customer's message has the potential to change every other customer's action, should they learn it. This naturally reveals the neighborhood of the informed customer. By repeating this for each customer, Sender learns the entire network.

Chapter 1

Fair Scheduling of Heterogeneous Customer Populations

1.1 Introduction

Managers at service firms frequently face the problem of efficiently serving customers when resources are scarce. In such systems, key performance metrics tend to be the volume of customers served, average waiting times or queue lengths, etc. Optimizing these metrics is clearly important to both firms and customers. At the same time, we must realize that customers are not siloed individuals but rather belong to population groups within society that may differ on attributes such as gender, race, age, nationality, income level, place of residence, etc. As we will formalize in this chapter, though standard priority policies optimize system performance, they generate inequity across different population groups within society. The premise of this work is to acknowledge that providing equitable access to individuals across different population groups is often an important consideration within service systems and to study the impact of doing so when designing policies.

To give a concrete example, consider pretrial detention, which occurs when a defendant, unable to make bail, is held in detention until their case is resolved either by trial or plea bargain. This waiting period is costly both for the criminal justice system and the accused defendant. The justice system must detain the accused during the entirety of the pretrial waiting period, which comes at a great cost for taxpayers. On the other hand, the detained defendant incurs both economic and psychological costs, as well as an increase in the chance of future conviction and recidivism. It follows that courthouses have a strong incentive to process cases as efficiently as possible so as to reduce the length of such pretrial waiting times.
However, court systems are often highly congested, with defendants waiting months and even years to have their day in court (Leslie and Pope, 2017; Lewis, 2021). Moreover, the type of crime that an individual is charged with can impact how long they wait for justice. For example, Leslie and Pope (2017) show that pretrial detention for misdemeanor cases is significantly shorter than for felony cases in New York City (NYC). A clear explanation for this discrepancy is that misdemeanor cases are faster to process than felony cases. That is, misdemeanors have a shorter average service requirement than felonies, and NYC mandates that pretrial detention for misdemeanors must be shorter than that of felonies, which indirectly sanctions a priority scheme. One may argue operationally that it makes perfect sense to prioritize misdemeanors before felonies to decrease system waiting times.

However, a defendant's demographic characteristics are also correlated with time spent in a state of pretrial detention. According to Lewis (2021), most of the defendants waiting for justice belong to either the Black or Latino communities. In their study, Leslie and Pope (2017) note that minority groups are more likely to be charged with felonies and subsequently sentenced to prison. Per Piquero (2008), two main hypotheses have been posited for this: "differential involvement", i.e., that the nature and rate of crime differ across communities, and the "differential selection and processing" hypothesis, which comprises the notion of "differential selection", in terms of differences in police presence, patrolling, and profiling in minority and non-minority neighborhoods, and that of "differential processing" by the courts, which leads to more minorities being arrested, convicted, and incarcerated. Piquero (2008) urges a move beyond debating which hypothesis is more relevant toward addressing these inequities. In this chapter, as one step in that direction, we argue that equitable access to justice across demographic groups is of particular salience when considering case scheduling in judicial systems.

We study a stylized queueing model that captures the nuances of this fairness-efficiency trade-off. Specifically, we consider a single server system where individual customers are differentiated in their service requirements and waiting (or holding) costs. Additionally, the customers belong to one of several populations. In the criminal justice application, a defendant's criminal charge corresponds to service type and their demographic group corresponds to population membership. The service requirement for an arriving customer is drawn from one of several service distributions. Service types may differ in their mean processing times, with some service types being faster to process than others, as well as their waiting costs. We assume customers from each population arrive independently according to a Poisson process, and we place no distributional assumptions on the service time distributions. Each population is comprised of a fixed, exogenously determined, proportion of each service type. We focus on solving a system-level average delay cost minimization problem with an imposed equity constraint that requires all populations to have the same average waiting costs. The majority of the chapter tackles the case of service time differentiation alone, keeping waiting costs constant.
This allows us to better compare fairness with efficiency by focusing on the stochastics around service times and ignoring impatience attributes. In this case, the First-In-First-Out (FIFO) policy is unarguably the fairest policy. FIFO ensures not only that population groups have identical average waiting times, but also that all customers arriving to the system have identical average waiting times. At the same time, to achieve operational efficiency in such systems, we need to prioritize customers with shorter service requirements (on average) over those with longer service requirements (on average). However, such scheduling decisions may lead to wait time inequity across populations. The fact that FIFO policies can be very inefficient and priority policies that are optimized for delay minimization can be extremely unfair further highlights the underlying trade-off between fairness and efficiency, and it motivates the need to look beyond these simple policies to find balance between these two competing goals.

We begin our analysis by first optimizing over policies that are population-unaware in the sense that, like classic priority policies, they only schedule arriving customers by their service types. We acknowledge that in many applications, it may not be feasible to prioritize based on a customer's population membership, particularly if customers are differentiated according to some protected characteristics, like race or gender. We prove that in many settings, population-unaware policies strictly improve over the FIFO-policy. In fact, when the populations are similar in their service type composition, the system manager can substantially decrease system waiting times, relative to those under FIFO, by selecting an appropriate population-unaware policy. However, we do find that these improvements can be small if the populations are too dissimilar with respect to their service type composition.

We next consider population-aware policies that differentiate between customers on the basis of both their service type and their population membership. While we acknowledge that it may not be feasible to implement these policies in many applications, we believe this larger class of policies allows us to understand the value of utilizing customer population information when scheduling and provides us with a benchmark with which we can compare all other fair policies. We formally characterize these population-aware policies, and surprisingly we prove that, under weak parameter conditions, there exist fair, population-aware policies that achieve the same optimal wait times as those obtained by using the optimal priority (cµ) policy that ignores fairness considerations altogether.

Our work illustrates the additional complexities that must be addressed when considering fairness in service systems. Ignoring the fact that customers belong to different population groups, and that these groups may be disproportionately impacted by priority policies, can lead to societal inequities in our key operational metrics. We show that these inequities can be combated in a straightforward manner and potentially without sacrificing much operational efficiency.

We organize the chapter as follows. A review of relevant literature is presented in Section 1.2. In Section 1.3, we formally introduce our model and provide the notation and background that will be utilized throughout the rest of the chapter.
Also in Section 1.3, we state our definition of population fairness and define the fairness constrained delay minimization problem. In Section 1.4, we find necessary and sufficient conditions for the existence of non-FIFO, population-unaware fair policies that differentiate between customers only by service type. In Section 1.5, we solve the corresponding fair optimization problem assuming that the policy can differentiate customers by service type and population membership, i.e., is population-aware. We solve the problem explicitly in the case of two populations and two service types, and we provide insights for the solution of the general problem. In Section 1.6, we extend the discussion and many of the results from Section 1.4 and Section 1.5 to settings where holding costs are not necessarily uniform across service types.

1.2 Literature Review

Our work contributes to the growing literature regarding fairness in operations management, as well as the growing literature in algorithmic fairness. Specifically, our model contributes to the literature by studying fairness and social justice within service systems. Below we give a brief overview of the literature that is closest to our work.

Our work is most closely related to the works of Armony and Ward (2010) and (2013), which study fair dynamic routing in large scale heterogeneous queueing networks. Their focus is on studying server-side fairness, and specifically in terms of identifying policies to route customers fairly to the heterogeneous (in ability) servers so that the idle time across different server groups is equal. Conceptually, our approach to fairness is similar, but we focus on the customer side and consider scheduling our service capacity to ensure fairness in terms of equal waiting times for different customer groups. Another point of distinction is that we focus on exact steady-state analysis, whereas those works focus on asymptotic analysis.

Our focus on customer (or job) based fairness within a queueing system makes our work related to the literature that has studied unfairness among jobs under certain queueing disciplines. Bansal and Harchol-Balter (2001) study unfairness within the realm of the shortest remaining processing time discipline. This work compares shortest remaining processing time to the processor sharing discipline by using response time and slowdown as the primary unfairness metrics. Another work, Wierman and Harchol-Balter (2005), categorizes policies in terms of fairness. Avi-Itzhak and Levy (2004) provide an axiomatic framework for fairness functions and show that, within their framework, the first-in-first-out policy is the most fair. For a survey of fairness in scheduling for single server queues, see Wierman (2011). Our work differs from the above by considering fairness from a group-level perspective.

Notions of social justice have been raised in the service literature; see for instance Larson (1987), which provides an anecdotal perspective. We believe our work contributes by formalizing what a notion of social justice from a queueing theoretic perspective could look like, and how one may incorporate this into decision-making.

The solution to our problem involves carefully crafting mixed priority policies utilizing the achievable region approach pioneered by Coffman and Mitrani (1980). There has been much research pertaining to optimal scheduling of multi-class queues.
In many cases, a strict priority policy, or index policy, that prioritizes among the service classes is optimal or asymptotically optimal. Cox and Smith (1961) prove that the cµ-policy is an optimal scheduling policy for single server queues without customer abandonment. Priority policies like the cµ-policy have also proven to be asymptotically optimal in more complex settings: see for example Mandelbaum and Stolyar (2004). Atar et al. (2010) prove that the analogous cµ/θ-rule is asymptotically optimal for many-server queues with customer abandonment. We are not the first to modify such priority rules to gain better performance for a particular application. Hu et al. (2021) modify the cµ/θ-rule to better serve moderate and severely ill patients in a healthcare setting. However, to the best of our knowledge, our work is the first to adapt priority rules to ensure wait time equity across groups within society. We direct the reader to the recent tutorial Puha and Ward (2019) for additional background on the topic of scheduling multi-class many-server queueing systems.

Our work is also related to the stream of literature regarding fair resource allocation. Some pertinent examples include the works of Bertsimas, Farias, et al. (2011) and (2012), which study the trade-off between fairness and efficiency within resource allocation problems broadly, and their subsequent paper that focuses on the application of organ allocation for kidney transplants (Bertsimas, Farias, et al., 2013). In a similar vein, Ata et al. (2020) study the kidney transplant wait list via a fluid model, and focus on finding an allocation that balances equity and efficiency. Also within the healthcare operations literature, McLay and Mayorga (2012) study policies that balance equity and efficiency in emergency medical service dispatching to high and low priority customers. Lien et al. (2014) and more recently Manshadi et al. (2023) consider fair resource allocation in dynamic (online) allocation settings.

The notion of group-level fairness, which is central to this chapter, has been studied before in different settings. In particular, Nanda et al. (2020), Cayci et al. (2020) and Ma et al. (2022) focus on fairness across groups in the context of resource allocation. Group-level fairness has also been discussed by Artiga et al. (2020) in the context of providing equitable medical care for underserved demographic groups. Finally, group-level fairness has been a focal point of study within the algorithmic fairness literature; most of that work has been geared towards applications in machine learning, where corrections for unintentional biases and discrimination created inadvertently during the training process are sought. Corbett-Davies et al. (2024) provide a general survey of the recent advances in the fair machine learning literature.

1.3 Model

We consider a system that comprises P populations of customers arriving to a service system where they receive service from a single server.¹ Unless stated otherwise, we assume that the server processes work at a normalized rate $r = 1$. Further, we assume that any given customer entering the system can be one of T service types. We think of each population as representing a group of people within society, e.g., each population represents a group of people with a common characteristic such as demographic information. We let $\mathcal{P}$ be the set of populations with typical element $p \in \mathcal{P}$, and we let $\mathcal{T}$ be the set of service types with typical element $t \in \mathcal{T}$.

¹ For simplicity, we refer to arrivals as customers, although this title may not be fitting for all applications.
We let $\lambda_p > 0$ denote the arrival rate of population $p$, for $p \in \mathcal{P}$. We let $1/\mu_t$ be the mean service time and $c_t$ be the holding cost for service type $t \in \mathcal{T}$. Without loss of generality, we take the numbering convention that $c_1\mu_1 \ge c_2\mu_2 \ge \cdots \ge c_T\mu_T$. We assume that the arrival processes are Poisson. We assume that service times of each service type are independent and identically distributed. We assume that service times are generally distributed, and we use $m_t$ to denote the second moment of the service time distribution for service type $t \in \mathcal{T}$. For ease of exposition, we first focus on the case of normalized holding costs $c_t = 1$. However, we provide extensions to settings with general holding costs in Section 1.6.1.

In our model, the populations differ in their composition of service types. We model this difference by introducing, for each population $p \in \mathcal{P}$ and service type $t \in \mathcal{T}$, the parameter $0 < \alpha_{p,t} < 1$ that denotes the proportion of population $p$ that belongs to service type $t$. We assume that the $\alpha_{p,t}$ are fixed and known quantities. Intuitively, this assumption can be thought of as the firm having access to enough high quality data (e.g. demographic/census data or customer data) so that it is possible to accurately estimate the proportion of each population belonging to each service type. Using $\alpha_{p,t}$, we can write the arrival rates for each population-service type pair $(p,t)$ as $\lambda_{p,t} = \alpha_{p,t}\lambda_p$. Correspondingly, the total utilization $\rho$ of the system can be written as $\rho = \sum_{p\in\mathcal{P}}\sum_{t\in\mathcal{T}} \rho_{p,t}$, where $\rho_{p,t} = \lambda_{p,t}/\mu_t$. For stability, we require that the overall utilization is less than 100%, i.e., $\rho < 1$.

Generally speaking, we consider policies as mappings $\pi(\cdot)$ that determine, at each point in time, the customer to be served at that time. We restrict attention to the class of non-preemptive work conserving policies, i.e. non-preemptive policies that are non-idling and non-forward looking. We denote this admissible set of policies by $\Pi$. Work conserving policies are quite common in the literature (see for example the assumptions of Kleinrock (1965) and Coffman and Mitrani (1980) as well as Definition 1 of Afèche (2013)). We note that, by utilizing strategic delay as in Afèche (2013), one can expand the set of admissible policies to include non-work conserving policies. However, we focus on work conserving policies for the remainder of the chapter.

Given a policy $\pi \in \Pi$, a population $p \in \mathcal{P}$ and a service type $t \in \mathcal{T}$, we let $W^\pi_{p,t}$ denote the expected steady state wait time in queue for population $p$, type $t$ customers under the policy $\pi$. We let $W^\pi_p$ be the expected steady state wait time in queue for population $p$ (where the expectation is also taken over service types). It follows from a straightforward application of Little's law that the expected steady state wait time for population $p \in \mathcal{P}$ is given by

$$W^\pi_p = \sum_{t\in\mathcal{T}} \alpha_{p,t} W^\pi_{p,t}. \qquad (1.1)$$

We measure the efficiency of a policy $\pi \in \Pi$ by the total expected steady state delay cost in system under $\pi$: $\sum_{p\in\mathcal{P}}\sum_{t\in\mathcal{T}} \lambda_{p,t} c_t W^\pi_{p,t}$.

1.3.1 Definition of Fairness

Our goal is to understand the relationship between optimal scheduling and fairness across populations within society. In this subsection, we formally define our notion of fairness, which entails providing identical average steady-state waiting times to each population.

Definition 1.3.1 A policy $\pi$ is population fair if $W^\pi_p = W^\pi_{\tilde{p}}$ for all populations $p, \tilde{p} \in \mathcal{P}$.
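To fix ideas, the following minimal Python sketch carries out this bookkeeping for a hypothetical two-population, two-type instance (all parameter values are illustrative assumptions, not taken from the chapter): it computes the pair-level arrival rates $\lambda_{p,t}$ and utilizations $\rho_{p,t}$, aggregates population waits via (1.1), and checks population fairness in the sense of Definition 1.3.1.

```python
# Hypothetical instance: two populations and two service types.
pops, types = ["A", "B"], [1, 2]
lam = {"A": 1.0, "B": 1.0}                      # population arrival rates
alpha = {("A", 1): 0.9, ("A", 2): 0.1,           # service-type compositions
         ("B", 1): 0.2, ("B", 2): 0.8}
mu = {1: 4.0, 2: 2.0}                            # service rates (1 / mean time)

# Pair-level arrival rates lambda_{p,t} = alpha_{p,t} * lambda_p and utilizations.
lam_pt = {(p, t): alpha[p, t] * lam[p] for p in pops for t in types}
rho_pt = {(p, t): lam_pt[p, t] / mu[t] for p in pops for t in types}
rho = sum(rho_pt.values())
assert rho < 1, "stability requires rho < 1"

def population_wait(W, p):
    """Population-level wait via (1.1): W_p = sum_t alpha_{p,t} * W_{p,t}."""
    return sum(alpha[p, t] * W[p, t] for t in types)

def is_population_fair(W, tol=1e-9):
    """Definition 1.3.1: every population has the same expected wait."""
    waits = [population_wait(W, p) for p in pops]
    return max(waits) - min(waits) < tol

# A FIFO-like vector (all pair waits equal) is trivially population fair.
W_fifo = {(p, t): 1.0 for p in pops for t in types}
print(round(rho, 3), is_population_fair(W_fifo))   # 0.725 True
```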
Fairness as stated in Definition 1.3.1 deals strictly with the notion of fairness across the populations. By adhering to the notion of population fairness given by Definition 1.3.1, we are effectively saying that we are concerned with finding policies that treat each population of customers exactly the same with respect to long run average waiting time. This notion of fairness places the population or group at the forefront rather than the experience of the individual customer. We argue that such wait time parity across populations is natural in our setting because population type is an abstraction of customer attributes, and equitable treatment regardless of the group a customer may belong to is a reasonable social goal. Of course, we acknowledge that our notion of population fairness is not the only notion of fairness that is relevant in service systems, but how wait time parity across populations affects policy cost is a novel and important perspective that needs to be better understood, particularly since social justice and equity considerations have become increasingly important to firms in recent years.

Although our definition of fairness is specific to populations, for convenience, we will refer to policies satisfying Definition 1.3.1 as fair policies, and likewise we will refer to the equations $W^\pi_p = W^\pi_{\tilde{p}}$ for all populations $p, \tilde{p} \in \mathcal{P}$ as fairness constraints.

1.3.2 The Fair FIFO-policy is Inefficient

One of the simplest fair policies is the FIFO-policy. One special property of the FIFO-policy is that it operates without knowledge of the population or service type of incoming customers. That is, the FIFO-policy does not attempt to prioritize, or rearrange service, based on population or service type. Thus, every customer, regardless of their characteristics, can expect to wait the same amount of time upon entering the service system. Therefore, the FIFO-policy provides a benchmark for fairness, as no matter what properties a customer has, they can expect to wait the same as any other customer. In many settings, particularly settings in which the queue is visible to arriving customers, people often expect to see service conducted in a FIFO manner, e.g. bank lines, deli lines, checkout lines, etc. Thus in these settings, it may not be feasible to prioritize according to different service types or populations, and FIFO may be the only feasible policy. However, while the FIFO-policy is arguably the "fairest" policy available at our disposal, and even though it is fitting for many applications, FIFO leads to extremely inefficient results. This reality has led to the development of alternative policies that prioritize based on customer characteristics.

1.3.3 The Efficient cµ-Policy is Unfair

A classic prioritization rule is the cµ-policy that prioritizes based on the customer characteristics of service distribution and holding cost. Like the FIFO-policy, the cµ-policy does not take into consideration the population membership of customers. However, unlike the FIFO-policy, it prioritizes according to service type characteristics. It follows that there exists a common wait time $W^{c\mu}_t$ for each service type $t \in \mathcal{T}$, i.e., for each population $p \in \mathcal{P}$ and service type $t \in \mathcal{T}$, $W^{c\mu}_{p,t} = W^{c\mu}_t$. This equivalently means that within each service type, customers are served in a FIFO manner. However, as we will demonstrate below, this leads to unfair outcomes across populations.
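Before turning to the fairness comparison, the following minimal sketch quantifies the efficiency gap that motivates prioritization in the first place (cf. Figure 1.1). It assumes exponential service times for concreteness (so $m_t = 2/\mu_t^2$) with hypothetical parameters, using the Pollaczek-Khinchine mean wait $W_0/(1-\rho)$ for FIFO and Cobham's non-preemptive M/G/1 priority formula for the cµ ordering.

```python
import numpy as np

# Hypothetical two-type instance; exponential service so E[S^2] = 2 / mu^2.
lam = np.array([1.1, 0.45])      # aggregate arrival rate of each service type
mu = np.array([4.0, 2.0])        # with uniform holding costs, type 1 ranks first
rho_t = lam / mu
rho = rho_t.sum()
W0 = 0.5 * (lam * 2.0 / mu**2).sum()

# FIFO: every arrival waits W0 / (1 - rho) (Pollaczek-Khinchine mean wait).
cost_fifo = lam.sum() * W0 / (1.0 - rho)

# c-mu priority: Cobham's formula for non-preemptive M/G/1 priority queues.
sigma = np.concatenate(([0.0], np.cumsum(rho_t)))
W_prio = W0 / ((1.0 - sigma[:-1]) * (1.0 - sigma[1:]))
cost_cmu = (lam * W_prio).sum()

print(f"rho={rho:.2f}  FIFO cost={cost_fifo:.3f}  c-mu cost={cost_cmu:.3f}")
# The gap between the two costs widens as rho approaches 1 (cf. Figure 1.1).
```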
To illustrate, consider the case when there are two populations $\mathcal{P} = \{A,B\}$ and two service types $\mathcal{T} = \{1,2\}$. We assume $\mu_1 \ge \mu_2$, so that the cµ-policy prioritizes type-1 customers over type-2 arrivals. Furthermore, assume that the proportion of A arrivals that belong to service type-1 is larger than the proportion of B arrivals that belong to service type-1, i.e., $\alpha_{A,1} \ge \alpha_{B,1}$. Essentially, the cµ-policy leverages differences between different classes of customers to process more costly type-1 customers before less costly type-2 customers, in order to save on total cost.

[Figure 1.1: Inefficiency of the FIFO-policy for two populations A and B.]

[Figure 1.2: Unfairness of the cµ-policy across populations A and B.]

In Figure 1.1, we observe that the FIFO-policy performs far worse than the cµ-policy, especially when utilization is high. This demonstrates that prioritization is key when designing policies that perform well from a cost perspective. Applying the wait time formulas for the cµ-policy, we obtain the following result that compares the wait times of population-A and population-B under the cµ-policy.

Proposition 1.3.1 For the cµ-policy in the setting of two populations and two service types, the population expected waiting times are equal if and only if $\alpha_{A,1} = \alpha_{B,1}$. Specifically, the ratio of population expected waiting times satisfies

$$\frac{W^{c\mu}_B}{W^{c\mu}_A} = \frac{\alpha_{B,1}(1-\rho) + \alpha_{B,2}}{\alpha_{A,1}(1-\rho) + \alpha_{A,2}}. \qquad (1.2)$$

Equation (1.2) gives a precise expression for the unfairness of the cµ-policy. It demonstrates that unfairness under the cµ-policy depends on the exogenously given parameters $\alpha_{p,t}$ that determine the service type composition of each of the populations, as well as the total utilization $\rho$. From (1.2), we see that the cµ-policy achieves equal wait times across populations if and only if $\alpha_{A,1} = \alpha_{B,1}$, i.e., the cµ-policy is a population fair policy if and only if the two populations have the same proportion of customers that are of service type-1. It follows that, unless the compositions of the two populations are perfectly balanced, the cµ-policy is guaranteed to generate some amount of long run inequity between population wait times. In addition, (1.2) implies that for high utilization the disparity between the two population wait times depends entirely on the population composition parameters. Somewhat surprisingly, this means that when utilization is high, the discrepancy between population wait times is roughly independent of the arrival rates of A and B populations to the service system.

As an illustration of (1.2), Figure 1.2 plots the ratio of population wait times for two different compositions. Comparing the two curves, we notice that the case of a higher proportion of type-1 customers in population-A leads to more unfairness in the cµ-policy and is represented by the higher curve (blue). We also observe that the gap between the two curves is small at low utilization and increases as utilization increases. Therefore, the amount of unfairness under the cµ-policy is sensitive not only to utilization level, but also to population composition. Since, under cµ prioritization, type-1 customers are served before type-2 customers and population-A has disproportionately more type-1 customers than population-B, population-A reaps the benefits of the cµ prioritization more so than population-B.
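The ratio in (1.2) is easy to verify numerically. The sketch below (hypothetical parameters, exponential service assumed) computes the per-type cµ waits, aggregates them into population waits via (1.1), and compares the resulting ratio against the closed form of Proposition 1.3.1.

```python
import numpy as np

# Hypothetical parameters: population A is more type-1-heavy than B.
lam_A = lam_B = 1.0
alpha_A1, alpha_B1 = 0.9, 0.2
mu = np.array([4.0, 2.0])

lam_t = np.array([alpha_A1 * lam_A + alpha_B1 * lam_B,
                  (1 - alpha_A1) * lam_A + (1 - alpha_B1) * lam_B])
rho_t = lam_t / mu
rho = rho_t.sum()
W0 = 0.5 * (lam_t * 2.0 / mu**2).sum()       # exponential service assumed

# Per-type waits under the c-mu policy (type 1 strictly prioritized).
W1 = W0 / (1 - rho_t[0])
W2 = W0 / ((1 - rho_t[0]) * (1 - rho))

# Population waits via (1.1), then the ratio from Proposition 1.3.1.
W_A = alpha_A1 * W1 + (1 - alpha_A1) * W2
W_B = alpha_B1 * W1 + (1 - alpha_B1) * W2
ratio_direct = W_B / W_A
ratio_formula = ((alpha_B1 * (1 - rho) + (1 - alpha_B1))
                 / (alpha_A1 * (1 - rho) + (1 - alpha_A1)))
print(ratio_direct, ratio_formula)           # the two agree, per (1.2)
```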
1.3.4 Fairness Constrained Delay Minimization Problem

Now that we have established the tension between fairness and efficiency, it is helpful to bring our fairness considerations into the foreground in conjunction with cost minimization. Noting the reasonableness of minimizing total delay that is found throughout the literature, we propose continuing to minimize overall expected delay but with the addition of the fairness constraints given in Definition 1.3.1. Specifically, the optimization problem we seek to solve is

$$\begin{aligned} \underset{\pi \in F}{\text{minimize}} \quad & \sum_{p\in\mathcal{P}}\sum_{t\in\mathcal{T}} \lambda_{p,t} W^\pi_{p,t} \\ \text{subject to} \quad & W^\pi_p = W^\pi_{\tilde{p}}, \quad \forall p, \tilde{p} \in \mathcal{P}. \end{aligned} \qquad (1.3)$$

We note that $F \subseteq \Pi$ is a predetermined set of feasible policies considered for the optimization problem. There is good reason for considering different feasible sets $F$ when solving (1.3). For example, in some applications, like hiring processes, it is possible to prioritize applications with higher skill levels (service type), but it is illegal to prioritize based on any protected characteristics (population membership). In other applications, it is acceptable to prioritize based on both service type as well as population membership. Therefore, the choice of feasible set $F$ may vary depending on the application at hand.

We choose to study the solutions to problem (1.3) under two different choices for $F$. We distinguish between policies that are permitted to prioritize, or differentiate, based on service type as well as population membership, and those that are only able to prioritize based on service type. We analyze the problem under these two policy classes to understand the value of utilizing customer population membership when scheduling fairly. The following definition captures those policies that only distinguish across service types.

Definition 1.3.2 A policy $\pi \in \Pi$ is population-unaware if it schedules arrivals only based on their service type. We let $\Pi_{\text{unaware}}$ denote the set of population-unaware policies. Any policy $\pi \in \Pi$ that is not population-unaware is population-aware.

We note that any population-unaware policy $\pi \in \Pi_{\text{unaware}}$ satisfies $W^\pi_{p,t} = W^\pi_{\tilde{p},t}$ for all populations $p, \tilde{p} \in \mathcal{P}$ and all service types $t \in \mathcal{T}$. In fact, any policy that satisfies this condition can be implemented by a population-unaware policy that achieves the same cost. In Section 1.4, we set $F = \Pi_{\text{unaware}}$ and discuss in detail optimal fair, population-unaware policies that schedule without utilizing the specific population information of each individual customer. We then expand our policy space in Section 1.5 and study problem (1.3) with $F = \Pi$, allowing for population-aware policies that may schedule arrivals using both their service type and population membership. The optimal, fair population-aware policy provides a benchmark for fair performance. Later we contrast the performance of the optimal policy in each case to quantify the value of considering population-aware policies. Since $\Pi_{\text{unaware}} \subset \Pi$, we expect the optimal solution of problem (1.3) with $F = \Pi$ to perform better than the optimal solution found when restricting the feasible set of (1.3) to population-unaware policies.

1.3.5 Methodology

To solve the fairness constrained optimization problem given in (1.3), we leverage two useful results regarding expected steady state wait times of priority queues. In particular, we find it useful to transform (1.3) from an optimization problem over feasible policies to an optimization problem over feasible wait time vectors. By performing this transformation, we are able to convert (1.3) into a linear program that is more amenable to our analysis. In this section, we describe in detail the so-called achievable region methodology established by Coffman and Mitrani (1980).
For simplicity, we take the feasible set of problem (1.3) to be $F = \Pi$ and note that the techniques described here also apply when $F = \Pi_{\text{unaware}}$ (see Section 1.4 for details). Define $\Pi_{\text{priority}}$ to be the set of all strict priority policies over the population-service pairs $\{(p,t) : p \in \mathcal{P}, t \in \mathcal{T}\}$. It turns out that any wait time vector corresponding to a policy in $\Pi$ can be written as a convex combination of the wait time vectors corresponding to these strict priority policies. To be more precise, consider the set of all wait time vectors that arise from the set of feasible policies $\pi \in \Pi$:

$$\mathcal{W}(\Pi) = \left\{ \left( W^\pi_{p,t} \right)_{p\in\mathcal{P},\, t\in\mathcal{T}} : \pi \in \Pi \right\}.$$

Coffman and Mitrani (1980) show that

$$\mathcal{W}(\Pi) = \operatorname{conv}\left\{ W^\pi = \left( W^\pi_{p,t} \right)_{p\in\mathcal{P},\, t\in\mathcal{T}} : \pi \in \Pi_{\text{priority}} \right\}, \qquad (1.4)$$

the convex hull of the wait times generated by each element of $\Pi_{\text{priority}}$. Equivalently, (1.4) says that $\mathcal{W}(\Pi)$ is a bounded polyhedron whose set of extreme points is $\{W^\pi : \pi \in \Pi_{\text{priority}}\}$, the wait time vectors of the strict priority policies on the population-service type pairs. So, any feasible wait time vector can be represented as a convex combination of wait time vectors corresponding to the strict priority policies, and in fact it follows that any feasible policy can be represented as a mixture of the strict priority policies. That is, given a vector $W \in \mathcal{W}(\Pi)$,

$$W = \sum_{\pi\in\Pi_{\text{priority}}} w_\pi W^\pi, \qquad \sum_{\pi\in\Pi_{\text{priority}}} w_\pi = 1, \qquad w \ge 0.$$

It follows that we can equivalently write problem (1.3) as the problem of deciding the optimal weighting for each wait time vector corresponding to priority policies $\pi \in \Pi_{\text{priority}}$. Note that there are $(P \times T)!$ ways to strictly prioritize the $P \times T$ population-service type pairs. Thus, the equivalent problem is

$$\begin{aligned} \min_{w \in \mathbb{R}^{(P\times T)!}} \quad & \sum_{\pi\in\Pi_{\text{priority}}} w_\pi \left( \sum_{p\in\mathcal{P}}\sum_{t\in\mathcal{T}} \lambda_{p,t} W^\pi_{p,t} \right) \\ \text{subject to} \quad & \sum_{\pi\in\Pi_{\text{priority}}} w_\pi \left( \sum_{t\in\mathcal{T}} \alpha_{p,t} W^\pi_{p,t} - \alpha_{p+1,t} W^\pi_{p+1,t} \right) = 0, \quad \forall p \in \mathcal{P}\setminus\{P\}, \\ & \sum_{\pi\in\Pi_{\text{priority}}} w_\pi = 1, \quad w \ge 0. \end{aligned} \qquad (1.5)$$

Note that for each priority policy $\pi \in \Pi_{\text{priority}}$, the corresponding wait time vector $W^\pi$ is computable by the closed form expression for strict priority wait times. For the formula, see Kleinrock (1976).

Alternatively, we can restate problem (1.5) as an optimization problem over the set of feasible wait time vectors. The equivalent optimization problem over feasible wait time vectors is:

$$\begin{aligned} \underset{W^\pi \in \mathcal{W}(\Pi)}{\text{minimize}} \quad & \sum_{p\in\mathcal{P}}\sum_{t\in\mathcal{T}} \lambda_{p,t} W^\pi_{p,t} \\ \text{subject to} \quad & \sum_{t\in\mathcal{T}} \alpha_{p,t} W^\pi_{p,t} = \sum_{t\in\mathcal{T}} \alpha_{\tilde{p},t} W^\pi_{\tilde{p},t}, \quad \forall p, \tilde{p} \in \mathcal{P}, \end{aligned} \qquad (1.6)$$

where $W^\pi = (W_{p,t})_{p\in\mathcal{P},\, t\in\mathcal{T}}$ denotes the wait time vector associated with policy $\pi \in \Pi$ in the feasible set $\mathcal{W}(\Pi)$. We make use of both formulations.

Now we discuss how to construct a policy that achieves wait time vector $W = \sum_{\pi\in\Pi_{\text{priority}}} w_\pi W^\pi \in \mathcal{W}(\Pi)$. Following Coffman and Mitrani (1980), let $\bar{\pi}$ be the following policy: at the end of each busy period, with probability $w_\pi$, use the corresponding strict priority policy $\pi$ until the end of the next busy period. Clearly, $\bar{\pi} \in \Pi$. Let $W^{\bar{\pi}}$ be the expected steady state wait time vector under policy $\bar{\pi}$. Then, as shown in Coffman and Mitrani (1980), $W^{\bar{\pi}} = W$. Following this construction, we see that given a feasible wait time vector, we can construct a policy that induces that wait time vector by mixing appropriately across the strict priority policies.
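The following sketch makes formulation (1.5) concrete for a hypothetical two-population, two-type instance with exponential service: it enumerates all $(P \times T)! = 24$ strict priority orderings, computes each vertex wait time vector with Cobham's non-preemptive M/G/1 priority formula, and solves for the optimal mixing weights with an off-the-shelf LP solver. Parameters and naming are illustrative assumptions.

```python
from itertools import permutations
from scipy.optimize import linprog

# Hypothetical 2-population, 2-type instance with uniform holding costs.
pops, types = ["A", "B"], [1, 2]
lam = {"A": 1.0, "B": 1.0}
alpha = {("A", 1): 0.9, ("A", 2): 0.1, ("B", 1): 0.2, ("B", 2): 0.8}
mu = {1: 4.0, 2: 2.0}
classes = [(p, t) for p in pops for t in types]
lam_c = {(p, t): alpha[p, t] * lam[p] for (p, t) in classes}
rho_c = {(p, t): lam_c[p, t] / mu[t] for (p, t) in classes}
W0 = 0.5 * sum(lam_c[c] * 2.0 / mu[c[1]] ** 2 for c in classes)  # exp. service

def priority_waits(order):
    """Cobham waits for one strict priority order over (population, type) pairs."""
    W, sigma = {}, 0.0
    for c in order:
        W[c] = W0 / ((1 - sigma) * (1 - sigma - rho_c[c]))
        sigma += rho_c[c]
    return W

orders = list(permutations(classes))       # the (P x T)! = 24 vertices of (1.4)
costs, gaps = [], []
for order in orders:
    W = priority_waits(order)
    costs.append(sum(lam_c[c] * W[c] for c in classes))
    W_pop = {p: sum(alpha[p, t] * W[p, t] for t in types) for p in pops}
    gaps.append(W_pop["A"] - W_pop["B"])   # the fairness row of (1.5)

# Problem (1.5): optimal mixing weights over the strict priority policies.
res = linprog(c=costs, A_eq=[gaps, [1.0] * len(orders)], b_eq=[0.0, 1.0],
              bounds=[(0, None)] * len(orders))
print(res.fun, min(costs))   # optimal fair cost vs. the unconstrained optimum
```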
A precursor to the feasible wait time region was Kleinrock's wait time conservation theorem. In essence, the wait time conservation theorem states that, in a priority queue, the sum of class level wait times weighted by their respective utilizations remains constant regardless of the policy that is implemented, so long as the policy is non-idling and non-forward looking. In our setting, the conservation theorem states that for any feasible policy $\pi \in \Pi$, the sum of waiting times satisfies

$$\sum_{p\in\mathcal{P}}\sum_{t\in\mathcal{T}} \rho_{p,t} W^\pi_{p,t} = \frac{\rho}{1-\rho} W_0, \qquad (1.7)$$

where the constant $W_0$ is given by $W_0 = \sum_{t\in\mathcal{T}} \lambda_t m_t / 2$, where $m_t$ denotes the second moment of the service time distribution for type $t \in \mathcal{T}$ and $\lambda_t = \sum_{p\in\mathcal{P}} \lambda_{p,t}$ is the effective arrival rate for service type $t \in \mathcal{T}$. For a proof of this result, we direct the reader to Kleinrock (1965). We utilize the wait time conservation theorem repeatedly throughout the chapter to obtain expressions for the optimal, fair wait time vector.

Finally, we note that the above methodology can be extended to settings with non-work conserving policies. Afèche (2013) utilizes the achievable wait time approach with non-work conserving policies. In this case, (1.7) does not hold with equality, but rather with inequality, with the right-hand term providing a lower bound on the sum of the waiting times. Therefore, in that more general setting one optimizes over an unbounded convex region of achievable wait time vectors, rather than a bounded polytope as in (1.6).
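Conservation (1.7) holds exactly at the strict priority vertices, which gives a convenient sanity check on any implementation of the wait time formulas. The sketch below (hypothetical parameters, exponential service assumed) verifies that the utilization-weighted sum of Cobham waits equals $\frac{\rho}{1-\rho}W_0$ for every one of the 24 strict priority orderings.

```python
from itertools import permutations

# Hypothetical 2-population, 2-type instance; exponential service times.
alpha = {("A", 1): 0.9, ("A", 2): 0.1, ("B", 1): 0.2, ("B", 2): 0.8}
mu = {1: 4.0, 2: 2.0}
classes = list(alpha)
lam_c = {c: alpha[c] * 1.0 for c in classes}          # lam_A = lam_B = 1
rho_c = {c: lam_c[c] / mu[c[1]] for c in classes}
rho = sum(rho_c.values())
W0 = 0.5 * sum(lam_c[c] * 2.0 / mu[c[1]] ** 2 for c in classes)

target = rho / (1.0 - rho) * W0
for order in permutations(classes):
    W, sigma = {}, 0.0
    for c in order:                                   # Cobham priority waits
        W[c] = W0 / ((1 - sigma) * (1 - sigma - rho_c[c]))
        sigma += rho_c[c]
    lhs = sum(rho_c[c] * W[c] for c in classes)
    assert abs(lhs - target) < 1e-9 * target          # (1.7) holds exactly
print(f"all orderings attain rho/(1-rho) * W0 = {target:.4f}")
```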
Moreover, we need not include all possible fairness equations linking each of the populations. In fact, it is necessary to include only $P-1$ fairness constraints to capture equality of wait times across all the populations. To encode the fairness constraints, we find it convenient to define the matrix $A \in \mathbb{R}^{(P-1) \times T}$ with entries $a_{p,t} = \alpha_{p,t} - \alpha_{p+1,t}$. Then, observe that $AW^{\pi} = 0$ guarantees that $W^{\pi}_p = \sum_{t \in \mathcal{T}} \alpha_{p,t} W^{\pi}_t = \sum_{t \in \mathcal{T}} \alpha_{p+1,t} W^{\pi}_t = W^{\pi}_{p+1}$ for each $p \in \{1, 2, \ldots, P-1\}$.

Putting the above remarks together, we get that problem (1.3) reduces to an optimization problem over fair, feasible wait time vectors $W^{\pi} \in \mathbb{R}^T$:
$$\underset{W^{\pi} \in \mathcal{W}(\Pi_{\text{unaware}})}{\text{minimize}} \;\; \sum_{t \in \mathcal{T}} \lambda_t W^{\pi}_t \qquad (1.8)$$
$$\text{subject to} \quad AW^{\pi} = 0, \qquad \sum_{t \in \mathcal{T}} \rho_t W^{\pi}_t = \frac{\rho}{1-\rho} W_0,$$
where, with slight abuse of notation, we let $\mathcal{W}(\Pi_{\text{unaware}})$ be the set of wait time vectors, with components corresponding to the wait time of each service type, generated by the policies that prioritize only based on service type: $\mathcal{W}(\Pi_{\text{unaware}}) = \{(W^{\pi}_t)_{t \in \mathcal{T}} : \pi \in \Pi_{\text{unaware}}\} \subset \mathbb{R}^T$. Noting that (1.7) specialized to unaware policies gives the conservation equation $\sum_{t \in \mathcal{T}} \rho_t W^{\pi}_t = \frac{\rho}{1-\rho} W_0$, and that every wait time vector $W^{\pi} \in \mathcal{W}(\Pi_{\text{unaware}})$ must satisfy wait time conservation (see Section 1.3.5), we get that any $W^{\pi} \in \mathcal{W}(\Pi_{\text{unaware}})$ satisfies the conservation constraint, which we state redundantly as a constraint in (1.8).

1.4.2 Fair, Population-unaware Policies May Not Perform Better Than FIFO

Before we discuss solutions of the general problem given by (1.8), to gain intuition we assume for the moment that there are two populations $\mathcal{P} = \{A, B\}$ and two service types $\mathcal{T} = \{1, 2\}$. In this case, there is only a single fairness constraint, namely
$$(\alpha_{A,1} - \alpha_{B,1}) W^{\pi}_1 + (\alpha_{A,2} - \alpha_{B,2}) W^{\pi}_2 = 0. \qquad (1.9)$$
Since there are only two service types it follows that, for $p \in \{A, B\}$, $\alpha_{p,2} = 1 - \alpha_{p,1}$. So, upon substituting this into the fairness constraint and simplifying, we see that either $\alpha_{A,1} = \alpha_{B,1}$, or $W^{\pi}_1 = W^{\pi}_2$. Therefore, either the populations are comprised of the same proportions of service types, in which case the $c\mu$-policy is feasible (as discussed in Section 1.3.2), or the wait times for each service type are the same. Note that the latter outcome is exactly the one achieved by the FIFO-policy. This discussion leads directly to the following proposition:

Proposition 1.4.1 For the setting with $P = 2$ and $T = 2$, provided that $\alpha_{A,1} \neq \alpha_{B,1}$, the only policies that are both population fair and population-unaware have the same waiting time vector as the FIFO-policy. In particular, in this setting, FIFO solves (1.3).

One way to view Proposition 1.4.1 is as an impossibility result that says there is no possibility of doing better than FIFO when we restrict ourselves to the set of fair, population-unaware policies in the $P = T = 2$ setting. In some sense, this shows that limiting ourselves to policies that are both population fair and population-unaware can be restrictive and costly. The algebra described above gives a straightforward proof of the fact that there is no possibility of improving on FIFO in this simple $P = T = 2$ setting. We can equivalently take a geometric perspective to see why Proposition 1.4.1 is true. Any fair, population-unaware policy of course satisfies the fairness constraint (1.9) as well as the conservation equation (1.7) specialized to population-unaware policies, which in this setting is just $\rho_1 W_1 + \rho_2 W_2 = \frac{\rho}{1-\rho} W_0$. Note that each of these equations can be represented by a line in two-dimensional space, as the sketch below verifies numerically.
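A minimal numerical illustration of this geometric picture, with hypothetical parameters: solving the two linear equations in $(W_1, W_2)$ returns the FIFO wait vector as the unique intersection point.

```python
import numpy as np

# Hypothetical P = T = 2 instance with alpha_{A,1} != alpha_{B,1}.
alpha_A1, alpha_B1 = 0.7, 0.3
rho1, rho2, W0 = 0.3, 0.4, 0.5
rho = rho1 + rho2

# Row 1: fairness line (1.9); note alpha_{p,2} = 1 - alpha_{p,1}.
# Row 2: conservation line rho1*W1 + rho2*W2 = rho/(1-rho) * W0.
M = np.array([[alpha_A1 - alpha_B1, (1.0 - alpha_A1) - (1.0 - alpha_B1)],
              [rho1, rho2]])
b = np.array([0.0, rho / (1.0 - rho) * W0])

W = np.linalg.solve(M, b)
print(W, W0 / (1.0 - rho))   # both components equal the FIFO wait W0/(1-rho)
```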
Moreover, since FIFO is both population fair and population-unaware, it lies on both of these lines, and thus the lines intersect. Also, provided that the proportion of type-1 customers is not the same for each population, the lines do not coincide, since for example the $c\mu$-policy is population-unaware but not population fair in this case. Therefore, the solution is unique, and thus no improvement on FIFO is possible.

The main takeaway from this special case is that it is not to be taken for granted that the set of feasible solutions for the population-unaware fairness problem (1.8) is nontrivial. In many cases, it could be that every feasible policy must cost the same as the FIFO-policy under these restrictions. Fortunately, as we will see in the following section, there are also many settings in which problem (1.8) is not trivial, that is, the feasible region is not the singleton $\{W^{\text{FIFO}}\}$. This requires additional dimensions of flexibility that we formalize next.

1.4.3 Existence of Non-FIFO, Fair, Population-Unaware Policies

We know that if $\pi \in \Pi$ is a fair, population-unaware policy, then its corresponding wait time vector must satisfy the system of equations $AW^{\pi} = 0$. So, any fair, feasible policy must have a wait time vector that lies in the solution space of $AW^{\pi} = 0$. The converse, however, is not true: a vector $W \in \mathbb{R}^T$ satisfying $AW = 0$ may not be feasible. Furthermore, we know that if $\pi$ is a population-unaware policy, then it must satisfy wait time conservation, i.e., $\sum_{t \in \mathcal{T}} \rho_t W^{\pi}_t = \frac{\rho}{1-\rho} W_0$. Again, while the converse of the statement is not true, it tells us that the set of feasible wait time vectors lies in the solution space given by the conservation equation. The key observation is that the FIFO-policy lies in the interior of the feasible region. So, by a convexity argument, if there exists a vector in the intersection of the two solution spaces that is distinct from $W^{\text{FIFO}}$, then there has to exist a wait time vector close to $W^{\text{FIFO}}$, close enough that it lies in the feasible region. We make this formal in the following theorem.

Theorem 1.4.1 Consider the equation system
$$AW = 0, \qquad \sum_{t \in \mathcal{T}} \rho_t W_t = \frac{\rho}{1-\rho} W_0. \qquad (1.10)$$
System (1.10) has at least one solution. If (1.10) has more than one solution, then there exists a non-FIFO wait time vector that is both fair and population-unaware. If, on the other hand, $W^{\text{FIFO}}$ is the unique solution of (1.10), then the only policies that are both population fair and population-unaware have the same waiting time vector as the FIFO-policy.

Observing that the system of equations given by (1.10) has $P$ equations in $T$ unknowns, we see that if the number of service types exceeds the number of populations, then the equation system is underdetermined. Since $W^{\text{FIFO}}$ is a solution to the equation system, the system is consistent, and therefore must have infinitely many solutions. This remark immediately leads to the following corollary.

Corollary 1.4.1 If the number of service types exceeds the number of populations, then there exists a policy corresponding to a non-FIFO waiting time vector that is both fair and population-unaware.

Corollary 1.4.1 does not have a converse. However, in many cases, we expect the converse to hold true. This is because when the number of populations is greater than or equal to the number of service types, there are at least as many equations as unknowns, and so elementary linear algebra leads us to expect a unique solution in this case, unless the matrix $A$ has too many linearly dependent rows.
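In practice, the dichotomy in Theorem 1.4.1 can be checked by a rank computation on the stacked system (1.10). The sketch below (illustrative parameters) returns True exactly when (1.10) has more than one solution, which by Theorem 1.4.1 is exactly when a non-FIFO fair, population-unaware wait time vector exists.

```python
import numpy as np

def non_fifo_fair_unaware_exists(alpha, rho_t):
    """Stack the (P-1) x T fairness matrix A on the conservation row and test
    for a nontrivial null space: since W^FIFO always solves (1.10), the system
    has more than one solution exactly when the stacked matrix has rank < T."""
    A = alpha[:-1] - alpha[1:]            # rows a_p = alpha_p - alpha_{p+1}
    M = np.vstack([A, rho_t])
    return np.linalg.matrix_rank(M) < M.shape[1]

# Two populations, three service types (illustrative values): underdetermined,
# so a non-FIFO fair, population-unaware policy exists (Corollary 1.4.1).
alpha = np.array([[0.5, 0.3, 0.2],
                  [0.4, 0.5, 0.1]])
rho_t = np.array([0.20, 0.25, 0.15])
print(non_fifo_fair_unaware_exists(alpha, rho_t))   # True
```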
Such degeneracy is unlikely unless parameters are chosen specially to achieve it. Thus, in many settings, when the number of populations is greater than or equal to the number of service types, any fair, population-unaware policy has the same cost as FIFO.

The simplest example illustrating the existence of a fair, unaware policy with a non-FIFO waiting time vector arises in the case of two populations and three service types. For concreteness, assume that $\mathcal{P} = \{A, B\}$ and $\mathcal{T} = \{1, 2, 3\}$. Since there are only two populations, as in the example of Section 1.4.2, there is only a single fairness constraint imposing equality of population-$A$ and population-$B$ wait times:
$$(\alpha_{A,1} - \alpha_{B,1}) W^{\pi}_1 + (\alpha_{A,2} - \alpha_{B,2}) W^{\pi}_2 + (\alpha_{A,3} - \alpha_{B,3}) W^{\pi}_3 = 0. \qquad (1.11)$$
Any feasible, population-unaware policy must also satisfy the wait time conservation equation
$$\rho_1 W^{\pi}_1 + \rho_2 W^{\pi}_2 + \rho_3 W^{\pi}_3 = \frac{\rho}{1-\rho} W_0.$$
In this setting, unlike in Section 1.4.2, the solution set of each of the fairness and wait-time conservation constraints forms a plane lying in three-dimensional space. These two planes intersect because $W^{\text{FIFO}}$ satisfies both constraints, as it is both population fair and population-unaware. If the populations have identical population composition vectors, then every population-unaware policy is fair, and so there are infinitely many fair, population-unaware policies with wait time vectors distinct from $W^{\text{FIFO}}$. Otherwise, there exist policies that are not fair, e.g., the $c\mu$-policy, and so these two planes are not the same; thus, the intersection must be a line. Since the boundary of the feasible set $\mathcal{W}(\Pi_{\text{unaware}})$ corresponds to mixtures of two or fewer strict priority policies, the FIFO-policy lies in the interior of the feasible set of wait time vectors, and it follows that there is a feasible neighborhood around $W^{\text{FIFO}}$ that is contained in the intersection of the two planes. Therefore, there are infinitely many population fair, population-unaware policies in the $P = 2$, $T = 3$ setting. Optimizing over this set, we can identify the optimal fair, population-unaware policy. The following proposition provides the formal solution.

Proposition 1.4.2 Define
$$K = \lambda_1 + \lambda_2 \frac{\rho_3 a_{A,1} - \rho_1 a_{A,3}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}} - \lambda_3 \frac{\rho_2 a_{A,1} - \rho_1 a_{A,2}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}},$$
$$b = \min\{W^{\pi}_1 \mid \pi \in \Pi \text{ is population fair and population-unaware}\},$$
and
$$\bar{b} = \max\{W^{\pi}_1 \mid \pi \in \Pi \text{ is population fair and population-unaware}\}.$$
For the setting with $P = 2$ and $T = 3$, the optimal solution for problem (1.8) is given by
$$W^{\pi^*}_1 = w, \qquad W^{\pi^*}_2 = \frac{a_{A,3}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}} \cdot \frac{\rho}{1-\rho} W_0 + \frac{\rho_3 a_{A,1} - \rho_1 a_{A,3}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}}\, w,$$
$$W^{\pi^*}_3 = -\frac{a_{A,2}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}} \cdot \frac{\rho}{1-\rho} W_0 - \frac{\rho_2 a_{A,1} - \rho_1 a_{A,2}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}}\, w,$$
where $w$ solves $\text{minimize}_{\,b \leq w \leq \bar{b}}\; Kw$.

1.4.4 Cost Comparison for Different Policies

Now that we know that, in many cases, we can find fair unaware policies that improve on the FIFO-policy, we will compare their performance with the less restrictive population-aware policies to get a sense of how much efficiency we lose by scheduling based on service type alone. For this experiment, we keep $P = 2$ and $T = 3$. We fix the service rates for each service type to be $\mu_1 = 4$, $\mu_2 = 3$ and $\mu_3 = 2$. We vary the population compositions $\alpha$ to consider two scenarios: similar compositions with $\alpha_A = (.5, .3, .2)$ and $\alpha_B = (.4, .5, .1)$, and dissimilar compositions with $\alpha_A = (.9, .05, .05)$ and $\alpha_B = (.3, .2, .5)$.

[Figure 1.3: Similar population compositions: population-aware matches $c\mu$ performance and population-unaware performs well.]

[Figure 1.4: Dissimilar population compositions: population-aware performs well, but population-unaware does not perform that well.]
We initialize the population arrival rates to be $\lambda_A = \lambda_B = 1$ and then scale these arrival rates to study the difference in performance between the policies as the utilization changes.

In Figure 1.3, we consider similar population compositions. As expected, we observe that the FIFO-policy performs poorly when compared with the $c\mu$-policy, having a cost increase of nearly 60% over the $c\mu$ cost when utilization is close to one. While the fair population-unaware policy performs better than FIFO, surprisingly the population-aware policy achieves the $c\mu$ cost.

In Figure 1.4, we consider dissimilar population compositions. In particular, nearly all of population-$A$ is of type-1, while population-$B$ is more evenly distributed across the three service types. We see that in this case the fair population-unaware policy does not provide much cost improvement over FIFO. Yet, the fair population-aware policy still matches $c\mu$ performance for utilization levels that are sufficiently high.

To understand intuitively why the cost improvements of the fair population-unaware policy are more significant when populations are more similar, consider the extreme case, when populations have exactly the same proportion of each service type. It follows that the fairness constraint is trivially satisfied. Then, the $c\mu$-policy becomes feasible for problem (1.8), and so we are able to achieve the best possible result. So, if two populations have similar proportions in each service type, we expect policies that are "close" to the $c\mu$-policy to be feasible, thus leading to larger performance gains. Conversely, if the populations are extremely dissimilar, for example if population-$A$ is entirely comprised of type-1 customers and population-$B$ is entirely comprised of type-2 and type-3 customers, then we must give the lower cost types 2 and 3 more priority over type-1 in order to maintain fairness across populations. This, of course, leads to significantly higher overall cost relative to $c\mu$. Unfortunately, this leads to the observation that population-unaware policies perform well precisely in cases where the $c\mu$-policy is not that unfair. On the other hand, the population-aware policy appears largely insensitive to the service type compositions of the populations when utilization is sufficiently high. We investigate this further in the next section.

1.5 Population-Aware Policy Design

We now consider policies that are aware of a customer's population and use this information to differentiate between customers of the same service type. Our main result shows that this class of policies can reap the entire benefit of the $c\mu$-policy while being fair at the same time, which demonstrates that incorporating customer population membership into scheduling decisions greatly improves the efficiency of fair policies. We will first analyze the model in the simplified two-population, two-service-type setting. Then, we will conclude the section by providing insights for solving the general problem.

1.5.1 Two Populations and Two Types

We first consider a simpler setting with two populations and two service types. For concreteness, assume that $\mathcal{P} = \{A, B\}$ and $\mathcal{T} = \{1, 2\}$. We assume, without loss of generality, that $\mu_1 \geq \mu_2$ and that $\alpha_{A,1} \geq \alpha_{B,1}$, so that population-$A$ has a proportion of type-1 customers greater than or equal to that of population-$B$. We solve the optimization problem (1.6) for this setting. First, in Section 1.5.1.1, we present the intuition as to how and when one can obtain the optimal $c\mu$-policy level of system cost while satisfying the fairness constraint.
In Section 1.5.1.2, we discuss how to compute the optimal fair policies in general, when the conditions necessary to achieve the optimal $c\mu$-policy level of system cost cannot be met; this helps us characterize the cost of fairness. In Section 1.5.1.3, we show that as the system utilization increases, the cost of fairness decreases and, further, for utilization near 100%, fair population-aware policies can always obtain the $c\mu$-performance, and thus there is no cost to fairness in those settings.

1.5.1.1 Cases with No Cost For Fairness

Recall that the $c\mu$-policy processes type-1 customers before type-2 customers, in a FIFO manner within each service class. But notice that this is just one way to process type-1 customers before type-2 customers. For example, the priority policy that processes customers in the order $(A,1)$, $(B,1)$, $(A,2)$, $(B,2)$ also gives priority to type-1 customers over type-2 customers, and thus preserves the $c\mu$-ordering specific to our setting. In fact, there are infinitely many ways to process customers while maintaining $c\mu$-prioritization across service types. This observation leads to the following definition.

Definition 1.5.1 A policy $\pi \in \Pi$ is said to be $c\mu$-preserving if it respects $c\mu$ prioritization of service types.

The idea is to leverage this ability to prioritize type-1 (resp. type-2) population-$B$ customers over type-1 (resp. type-2) population-$A$ customers to obtain a policy that satisfies the fairness constraint while still maintaining the overall optimality of the $c\mu$-policy. Note that if we can find a $c\mu$-preserving policy $\pi \in \Pi$ that satisfies $W^{\pi}_A = W^{\pi}_B$, then such a policy must be optimal for the constrained problem (1.3), since it is optimal for the unconstrained delay minimization problem, which is a relaxation of (1.3). Figure 1.5 illustrates the structure of the class of $c\mu$-preserving policies.

[Figure 1.5: Structure of fair $c\mu$-preserving policies. Type-1 customers are prioritized over type-2 customers.]

Let $c\mu_B \in \Pi$ be the $c\mu$-preserving policy that prioritizes population-$B$ whenever possible. That is, the $c\mu_B$-policy prioritizes $(B,1)$ customers over $(A,1)$ customers, and prioritizes $(B,2)$ customers over $(A,2)$ customers, while always prioritizing all type-1 customers over type-2 customers. The following theorem makes use of the $c\mu_B$-policy to provide a necessary and sufficient condition for the existence of a fair $c\mu$-preserving policy.

Theorem 1.5.1 Suppose without loss of generality that $\alpha_{A,1} \geq \alpha_{B,1}$. There exists a fair population-aware policy that achieves the cost of the $c\mu$-policy if and only if $W^{c\mu_B}_A \geq W^{c\mu_B}_B$.

The necessity of the condition in Theorem 1.5.1 can be understood as follows. First, we recall from (1.2) that $W^{c\mu}_B \geq W^{c\mu}_A$, and so to achieve population fairness we need to reduce the wait time of population-$B$, or equivalently, increase the wait times of population-$A$. To ensure unconstrained optimality, we must also prioritize all type-1 customers before type-2 customers. Now, suppose that within each service type we fully prioritize population-$B$ over population-$A$, which leads to the largest decrease (increase) in population-$B$ (population-$A$) wait times. If we still have $W^{c\mu_B}_B > W^{c\mu_B}_A$, then clearly it is not possible to obtain fairness while maintaining the unconstrained optimal system delay.
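The condition of Theorem 1.5.1 is easy to check numerically. The sketch below (hypothetical rates, exponential service assumed) computes the population-level waits under the $c\mu_B$ priority order $(B,1), (A,1), (B,2), (A,2)$ and compares them.

```python
def population_waits(order, lam, mu):
    """Population-level mean waits under a strict priority order
    (exponential service assumed, so E[S^2] = 2/mu^2)."""
    rho = {k: lam[k] / mu[k[1]] for k in lam}
    W0 = sum(lam[k] / mu[k[1]] ** 2 for k in lam)
    waits, s = {}, 0.0
    for k in order:
        s_prev, s = s, s + rho[k]
        waits[k] = W0 / ((1.0 - s_prev) * (1.0 - s))
    pops = {}
    for p in ('A', 'B'):
        lam_p = sum(lam[(p, t)] for t in (1, 2))
        pops[p] = sum(lam[(p, t)] / lam_p * waits[(p, t)] for t in (1, 2))
    return pops

lam = {('A', 1): 0.9, ('A', 2): 0.3, ('B', 1): 0.4, ('B', 2): 0.6}  # alpha_{A,1}=.75 >= alpha_{B,1}=.4
mu = {1: 4.0, 2: 2.0}
W = population_waits([('B', 1), ('A', 1), ('B', 2), ('A', 2)], lam, mu)  # the cmuB order
print(W['A'] >= W['B'])   # True here: a fair cmu-preserving policy exists
```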
The intuition behind the sufficiency of the condition stated in Theorem 1.5.1 is the following: if $W^{c\mu_B}_A \geq W^{c\mu_B}_B$, then since $W^{c\mu}_B \geq W^{c\mu}_A$, and since the set of $c\mu$-preserving wait time vectors is a convex subset of $\mathcal{W}(\Pi)$, there exists some wait time vector $W^{\pi}$ that is a convex combination of the two wait time vectors $W^{c\mu_B}$ and $W^{c\mu}$ and that satisfies the fairness constraint $W^{\pi}_A = W^{\pi}_B$. Since both $c\mu_B$ and $c\mu$ achieve the same optimal cost, the mixture policy $\pi$ does as well.

We conclude this subsection by characterizing the wait times for optimal fair policies. Provided the condition $W^{c\mu_B}_A \geq W^{c\mu_B}_B$ of Theorem 1.5.1 holds, we can compute the set of all fair $c\mu$-preserving wait times in closed form. This characterization is presented in the next proposition.

Proposition 1.5.1 Suppose $\alpha_{A,1} \geq \alpha_{B,1}$. If $W^{c\mu_B}_A \geq W^{c\mu_B}_B$, there exist constants $L \leq U$ such that for any $w \in [L, U]$, there exists a fair $c\mu$-preserving policy that achieves the wait times
$$W^{\pi}_{A,1} = w, \qquad W^{\pi}_{B,1} = \frac{C_1}{\rho_{B,1}} - \frac{\rho_{A,1}}{\rho_{B,1}}\, w,$$
$$W^{\pi}_{A,2} = \frac{\rho_{B,2}\alpha_{B,1}C_1 + \rho_{B,1}\alpha_{B,2}C_2 - \rho_{B,2}\left(\rho_{A,1}\alpha_{B,1} + \rho_{B,1}\alpha_{A,1}\right)w}{\rho_{B,1}\left(\rho_{A,2}\alpha_{B,2} + \rho_{B,2}\alpha_{A,2}\right)}, \qquad (1.12)$$
$$W^{\pi}_{B,2} = \frac{\rho_{B,1}\alpha_{A,2}C_2 - \rho_{A,2}\alpha_{B,1}C_1 + \rho_{A,2}\left(\rho_{A,1}\alpha_{B,1} + \rho_{B,1}\alpha_{A,1}\right)w}{\rho_{B,1}\left(\rho_{A,2}\alpha_{B,2} + \rho_{B,2}\alpha_{A,2}\right)},$$
where $C_1 = \frac{\rho_1 W_0}{1-\rho_1}$ and $C_2 = \frac{\rho_2 W_0}{(1-\rho_1)(1-\rho)}$.

While the formulas for each population-service type wait time are complicated, the most important consequence of Proposition 1.5.1 is that when there exists one fair, $c\mu$-preserving policy, there often exist infinitely many fair, $c\mu$-preserving policies. In the proposition, we pin down a fair $c\mu$-preserving policy by a choice of the $(A,1)$ waiting time, which can be varied between bounds $L$ and $U$. We note that $L$ and $U$ are derived from feasibility constraints on the wait time vectors, and as such are complicated and do not seem to provide much insight. Further discussion on the derivation of the bounds $L$ and $U$ can be found in the appendix.

[Figure 1.6: Different fair $c\mu$-preserving policies characterized by different values of parameter $w$. Illustration of Proposition 1.5.1 with $\lambda_A = \lambda_B = 1$, $\mu_1 = 3$, $\mu_2 = 2$, $\alpha_{A,1} = 0.53$, and $\alpha_{B,1} = 0.5$.]

Figure 1.6 illustrates how the wait times associated with the optimal policy change as the $(A,1)$ wait time varies between the lower and upper bounds $L$ and $U$. In the figure, we observe that as $W^{\pi}_{A,1}$ increases, $W^{\pi}_{B,1}$ decreases. The intuition is the following: for any $c\mu$-preserving policy, type-1 customers are prioritized before type-2 customers, and so within the type-1 service class, we mix priority between the $(A,1)$ customers and $(B,1)$ customers. Thus, by the conservation law given in (1.7), the higher the chosen wait time for $(A,1)$ customers, the lower the wait time of the $(B,1)$ customers. Similarly, within the slower type-2 class, the policy mixes priority between $(A,2)$ customers and $(B,2)$ customers. As we increase the $(A,1)$ wait time, in order to maintain fairness across populations, the $(A,2)$ customers receive higher priority over the $(B,2)$ customers, which results in a decrease in $(A,2)$ wait time and an increase in $(B,2)$ wait time. Effectively, Figure 1.6 shows that to maintain the fairness constraint, any increase in $W^{\pi}_{A,1}$ (resp. decrease in $W^{\pi}_{B,1}$) needs to be offset by an appropriate decrease in $W^{\pi}_{A,2}$ (resp. increase in $W^{\pi}_{B,2}$). At the point $x = 0.54$ in Figure 1.6 we see that the $(A,1)$ waiting time is equal to the $(B,1)$ waiting time, while the $(A,2)$ waiting time is strictly greater than the $(B,2)$ waiting time. A similar observation holds at the point $y = 0.62$.
At no value of $w$ are both pairs of waits the same, demonstrating that optimal population-aware policies lead to outcomes where there is inequity of population waiting times within service types. But given that there tend to be multiple population-aware policies that achieve the same outcome, this service-level inequity can be managed intentionally. We discuss this in further detail in Section 1.6.2.

1.5.1.2 Cases with Cost For Fairness

Now we consider the case when no $c\mu$-preserving policy is feasible for problem (1.3). Recall that a fair $c\mu$-preserving policy achieves the $c\mu$-cost by maintaining $c\mu$ priority across service types, serving type-1 before type-2. When no fair $c\mu$-preserving policy exists, we know by Theorem 1.5.1 that $W^{c\mu_B}_A < W^{c\mu_B}_B$. However, within the class of $c\mu$-preserving policies, the $c\mu_B$-policy minimizes population-$B$ wait time. When even the $c\mu_B$-policy fails to make population-$A$ wait more than population-$B$, we must look outside the class of $c\mu$-preserving policies to find an optimal fair policy.

Intuitively, the optimal fair policy should serve as many type-1 customers before type-2 customers as possible while maintaining fairness across populations, staying "as close" to a $c\mu$-preserving policy as possible. To do so, the optimal policy should serve $(B,1)$ first and serve $(A,2)$ last, so as to get as close to equality across populations as possible while maintaining the $c\mu$ ordering. Then, to obtain equality, the policy must mix priority between $(A,1)$ customers and $(B,2)$ customers. Figure 1.7 illustrates the described structure of the optimal fair policy. It turns out that this intuition is correct and can be made formal.

[Figure 1.7: Structure of the optimal, fair policy when no fair $c\mu$-preserving policy exists. Some type-2 customers are prioritized over type-1 customers.]

Proposition 1.5.2 Suppose $\alpha_{A,1} \geq \alpha_{B,1}$. If $W^{c\mu_B}_A < W^{c\mu_B}_B$, the optimal fair population-aware policy is characterized by
$$W^{\pi^*}_{A,1} = \frac{\rho_{B,2}\left(\alpha_{B,1}W^{c\mu_B}_{B,1} - \alpha_{A,2}W^{c\mu_B}_{A,2}\right) + \alpha_{B,2}\left(\frac{\rho}{1-\rho}W_0 - \rho_{B,1}W^{c\mu_B}_{B,1} - \rho_{A,2}W^{c\mu_B}_{A,2}\right)}{\rho_{B,2}\alpha_{A,1} + \rho_{A,2}\alpha_{B,2}},$$
$$W^{\pi^*}_{B,1} = \frac{W_0}{1-\rho_{B,1}},$$
$$W^{\pi^*}_{A,2} = \frac{W_0}{(1-\rho_{B,1}-\rho_{A,1}-\rho_{B,2})(1-\rho_{B,1}-\rho_{A,1}-\rho_{B,2}-\rho_{A,2})},$$
$$W^{\pi^*}_{B,2} = \frac{\alpha_{A,1}\left(\frac{\rho}{1-\rho}W_0 - \rho_{B,1}W^{c\mu_B}_{B,1} - \rho_{A,2}W^{c\mu_B}_{A,2}\right) - \rho_{A,1}\left(\alpha_{B,1}W^{c\mu_B}_{B,1} - \alpha_{A,2}W^{c\mu_B}_{A,2}\right)}{\rho_{B,2}\alpha_{A,1} + \rho_{A,2}\alpha_{B,2}}.$$

1.5.1.3 High Utilization Implies No Cost For Fairness

At first glance, it is not clear whether the condition in Theorem 1.5.1 is satisfied for a wide range of parameters. As we prove below, a fair $c\mu$-preserving policy is always feasible for (1.6) when utilization is high. This is fortunate because we recall from Section 1.4 that the $c\mu$-policy tends to be most unfair when utilization is high. In other words, when the $c\mu$-policy performs its worst with respect to population fairness, there likely exists an alternative fair policy that achieves the $c\mu$-cost.

[Figure 1.8: The cost of optimal, fair policies for varying utilization with service parameters $\mu_1 = 4$, $\mu_2 = 2$, and population-$B$ composition fixed at $\alpha_{B,1} = .1$.]

Defining $\rho_t = \sum_{p \in \mathcal{P}} \rho_{p,t}$ for $t \in \mathcal{T}$, the following corollary states a sufficient condition for the existence of a fair $c\mu$-preserving policy.

Corollary 1.5.1 Suppose $\alpha_{A,1} \geq \alpha_{B,1}$. If
$$\frac{\rho - \rho_1}{1 - \rho_1} \geq \frac{\alpha_{A,1} - \alpha_{B,1}}{1 - \alpha_{B,1}},$$
then $W^{c\mu_B}_A \geq W^{c\mu_B}_B$, which implies that there exists a fair population-aware policy that achieves the $c\mu$-cost.

There are some important observations to make about Corollary 1.5.1 that provide insight into the necessary and sufficient condition $W^{c\mu_B}_A \geq W^{c\mu_B}_B$.
First, note that, since $0 < \alpha_{B,1} \leq \alpha_{A,1} < 1$, we have that $\alpha_{B,2}^{-1}(\alpha_{A,1} - \alpha_{B,1}) < 1$, and so the sufficient condition $\frac{\rho - \rho_1}{1 - \rho_1} \geq \frac{\alpha_{A,1} - \alpha_{B,1}}{1 - \alpha_{B,1}}$ for $W^{c\mu_B}_A \geq W^{c\mu_B}_B$ becomes easier to satisfy when $\frac{\rho - \rho_1}{1 - \rho_1}$ is close to 1. This is equivalent to saying that the inequality is easier to satisfy when total utilization is high, since $\frac{\rho - \rho_1}{1 - \rho_1} \approx 1$ occurs if and only if $\rho \approx 1$. Second, observe that $\alpha_{B,2}^{-1}(\alpha_{A,1} - \alpha_{B,1})$ is closer to zero when $\alpha_{A,1} \approx \alpha_{B,1}$. This captures the common-sense notion that fairness is easier to satisfy when the populations have roughly the same service composition.

Corollary 1.5.2 If $\rho$ is sufficiently close to one, then there exists a fair policy that achieves the $c\mu$-cost.

We illustrate Corollary 1.5.1 and Corollary 1.5.2 using a numerical experiment. We fix all parameters and vary the arrival rate of population-$A$ from zero up to the maximum arrival rate that still satisfies stability. We plot the cost of the optimal fair policy relative to the best possible cost of the $c\mu$-policy, i.e., we illustrate the cost of fairness by plotting the percentage inefficiency of the optimal fair policy relative to the unfair $c\mu$-policy. Figure 1.8 depicts the results. We consider three different system configurations ($\alpha_{A,1}$ values), each depicted by a different curve. We observe that at zero utilization there is no cost to fairness and the ratio of costs is 1, and the cost of fairness becomes non-zero as utilization increases. However, for large enough utilization, the cost of fairness again becomes zero and the ratio of the costs becomes equal to 1, in line with Corollary 1.5.2.

1.5.2 Generalizing Insights

While our focus in this section has been on the two-population, two-service-type setting, several of the arising insights hold more generally. First, the rationale underlying Corollary 1.5.2 holds more generally. Specifically, the notion of balancing the wait times of different populations while preserving the $c\mu$-ordering across service types applies directly to any number of populations and service types. That is, we will prioritize populations within each service type differently (to achieve fairness) while maintaining $c\mu$-priority across the service types (to achieve efficiency). The intuition behind the existence of such policies for high enough utilization applies in general. As utilization approaches 100%, under a $c\mu$-preserving policy, the waiting times of the lowest service type would increase without bound. Further, if all population groups are represented in this lowest service type tier, then the overall growth in wait time can be balanced between the population groups to ensure fairness. While the lowest priority service type does bear nearly all of the waiting time when utilization is high, our policy ensures this burden is shared equally among the populations. As long as we have a positive fraction of customers of lowest $c\mu$-priority in each population, i.e., $\alpha_{p,T} > 0$ for all $p \in \mathcal{P}$, then as utilization approaches 100%, by carefully mixing between all population groups in the lowest service type, we can ensure the overall population wait times are matched.

To formalize this, define for each $\varepsilon > 0$ a perturbation of the server capacity $r_{\varepsilon} = \rho + \varepsilon$, converging to $\rho$ as $\varepsilon \to 0$. For each $\varepsilon > 0$ consider the service system where the server works at rate $r_{\varepsilon}$. This effectively scales the service times of each service type by a factor of $r_{\varepsilon}^{-1}$, which implies the first and second moments of the type-$t$ service time distribution are scaled to $1/(r_{\varepsilon}\mu_t)$ and $m_t/r_{\varepsilon}^2$, respectively.
Thus, as $\varepsilon \to 0$, the utilization of the system approaches 1. For $p \in \mathcal{P}$, let $\Pi_{p,T} \subset \Pi_{c\mu} \cap \Pi_{\text{priority}}$ be the set of strict priority, $c\mu$-preserving policies that give least priority to class $(p,T)$. It is straightforward to verify that $W^{\pi}_{p,T} = W^{\pi'}_{p,T}$ for all $\pi, \pi' \in \Pi_{p,T}$. For convenience, call this last-priority wait time $W^{\varepsilon}_{p,T}$. For each $p \in \mathcal{P}$, define weights $w^{\varepsilon,*}_1 = \frac{1}{1+\sum\cdots}$ [...] $c_1\mu_1 \geq c_2\mu_2 \geq \cdots \geq c_T\mu_T > 0$, so that whenever $1 \leq t < t' \leq T$, service type $t$ is given priority over service type $t'$ under the $c\mu$-policy.

1.6.1.1 Population Fair Policies and Their Existence with Non-uniform Holding Costs

When holding costs are allowed to differ across service types, the prior definition of fairness given by Definition 1.3.1 no longer captures the notion of equity across different populations. That is, just considering the waiting time of each population does not capture the average cost of waiting experienced by individuals from different populations. Instead of considering the expected steady state waiting time $W^{\pi}_p$ of population $p \in \mathcal{P}$ under policy $\pi$, we now consider the expected steady state waiting cost of a customer from each population. We denote this expected cost by $C^{\pi}_p$; it is given by
$$C^{\pi}_p = \sum_{t \in \mathcal{T}} \alpha_{p,t} c_t W^{\pi}_{p,t}. \qquad (1.13)$$
Using the expected population waiting cost (1.13) in place of the expected population waiting time (1.1), we define fairness in a fashion analogous to Definition 1.3.1. That is, we consider a policy to be population fair if each population has the same expected waiting cost.

Definition 1.6.1 A policy $\pi$ is population fair if $C^{\pi}_p = C^{\pi}_{\tilde{p}}$ for all populations $p, \tilde{p} \in \mathcal{P}$.

When holding costs are the same across service types, the FIFO-policy is population fair under Definition 1.6.1. However, our first observation is that FIFO may not be population fair when holding costs are different. In fact, the discrepancy in waiting costs under the FIFO-policy can be calculated explicitly when there are two populations and two types.

Proposition 1.6.1 For the FIFO-policy in the setting of two populations and two types, the expected population waiting costs are equal if and only if $\alpha_{A,1} = \alpha_{B,1}$ or $c_1 = c_2$. Specifically, the ratio of population expected waiting costs satisfies
$$\frac{C^{\text{FIFO}}_B}{C^{\text{FIFO}}_A} = \frac{\alpha_{B,1}(c_1 - c_2) + c_2}{\alpha_{A,1}(c_1 - c_2) + c_2}. \qquad (1.14)$$

Proposition 1.6.1 demonstrates that, in steady state, the ratio of expected waiting costs between the two populations under the FIFO-policy is a constant that does not depend on the system load. There are only two ways that the FIFO-policy can be fair: either the population compositions are the same or the holding costs are the same. In the latter case, we revert back to uniform holding costs, where we have established that FIFO is always a population fair policy.

We now consider the unfairness of the $c\mu$-policy when holding costs are nonuniform. The following proposition is a direct extension of Proposition 1.3.1.

Proposition 1.6.2 For the $c\mu$-policy in the setting of two populations and two service types, the expected population waiting costs are equal if and only if $\alpha_{A,1} = \alpha_{B,1}$ or $c_2 = (1-\rho)c_1$. Specifically, the ratio of population expected waiting costs satisfies
$$\frac{C^{c\mu}_B}{C^{c\mu}_A} = \frac{\alpha_{B,1}c_1(1-\rho) + \alpha_{B,2}c_2}{\alpha_{A,1}c_1(1-\rho) + \alpha_{A,2}c_2}. \qquad (1.15)$$

There are, essentially, two ways that the $c\mu$-policy can be population fair. Either the proportion of each population belonging to each service type is the same, or the holding costs are precisely balanced so that the extra time that one population spends waiting is offset by a lower waiting cost.
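Both ratios can be verified numerically. The sketch below uses the Figure 1.9 composition and cost parameters, together with the closed-form FIFO and two-class priority waits for an M/G/1 queue with exponential service (an assumption of this sketch):

```python
# Numeric check of the ratios (1.14) and (1.15) on an illustrative instance.
lam = {('A', 1): 0.9, ('A', 2): 0.1, ('B', 1): 0.2, ('B', 2): 0.8}
mu, c = {1: 4.0, 2: 2.0}, {1: 10.0, 2: 5.0}
rho = {k: lam[k] / mu[k[1]] for k in lam}
rho_tot = sum(rho.values())
W0 = sum(lam[k] / mu[k[1]] ** 2 for k in lam)   # exponential service: lam * E[S^2] / 2
alpha = {k: lam[k] / sum(lam[q] for q in lam if q[0] == k[0]) for k in lam}

# FIFO: every class waits W0 / (1 - rho); costs differ only through alpha and c.
W_fifo = W0 / (1 - rho_tot)
C_fifo = {p: sum(alpha[(p, t)] * c[t] * W_fifo for t in (1, 2)) for p in 'AB'}
print(C_fifo['B'] / C_fifo['A'])
print((alpha[('B', 1)] * (c[1] - c[2]) + c[2]) /
      (alpha[('A', 1)] * (c[1] - c[2]) + c[2]))      # matches (1.14)

# cmu: type-1 before type-2 (c1*mu1 >= c2*mu2), FIFO within each type.
rho1 = rho[('A', 1)] + rho[('B', 1)]
W1, W2 = W0 / (1 - rho1), W0 / ((1 - rho1) * (1 - rho_tot))
C_cmu = {p: alpha[(p, 1)] * c[1] * W1 + alpha[(p, 2)] * c[2] * W2 for p in 'AB'}
print(C_cmu['B'] / C_cmu['A'])
print((alpha[('B', 1)] * c[1] * (1 - rho_tot) + alpha[('B', 2)] * c[2]) /
      (alpha[('A', 1)] * c[1] * (1 - rho_tot) + alpha[('A', 2)] * c[2]))   # matches (1.15)
```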
To shed light on the second condition, suppose that type-1 customers have a much higher holding cost than type-2 customers. Under $c\mu$-priority, type-1 customers are served before type-2 customers. Although type-1 customers wait less time than type-2 customers on average, the waiting cost of type-1 customers relative to the waiting cost of type-2 customers makes it costly to belong to service type-1, even if service is quick. Therefore, the population with the higher proportion of customers belonging to service type-1 may actually have higher expected waiting costs. This is in stark contrast with what we observed earlier in the chapter, where belonging to the priority class led to better population outcomes, in the sense that the population with the higher proportion of type-1 customers always had lower average waiting times.

[Figure 1.9: The waiting cost disparity of the FIFO and $c\mu$ policies for varying utilization with holding cost parameters $c_1 = 10$, $c_2 = 5$, and population compositions fixed at $\alpha_{A,1} = .9$ and $\alpha_{B,1} = .2$.]

Proposition 1.6.1 and Proposition 1.6.2 imply that the population unfairness of the FIFO-policy may actually exceed the population unfairness of the $c\mu$-policy. Figure 1.9 illustrates a concrete example where FIFO is more unfair than the $c\mu$-policy for lower utilization levels.

When holding costs are uniform, we are guaranteed existence of a work-conserving policy that is population fair. This is because, as mentioned earlier, FIFO is population fair when holding costs are equal. However, as made clear by Proposition 1.6.1, this is no longer the case when holding costs differ. The following proposition gives precise conditions for the existence of a population fair policy in the two-population, two-service-type setting. Before we state the result, we introduce some notation. We let $\pi_A$ be the policy that prioritizes in the order $(A,1), (A,2), (B,2), (B,1)$ and $\pi_B$ the policy that prioritizes in the order $(B,1), (B,2), (A,2), (A,1)$.

Proposition 1.6.3 In the setting of two populations and two types, a population fair policy exists if and only if either $C^{\pi_A}_A \leq C^{\pi_A}_B$ and $C^{\pi_B}_A \geq C^{\pi_B}_B$, or $\alpha_{A,1} = \alpha_{B,1}$.

Proposition 1.6.3 is similar in flavor to Theorem 1.5.1. Note that $\pi_A$ is a cost-minimizing policy for population $A$ and a cost-maximizing policy for population $B$. Similarly, $\pi_B$ is a cost-minimizing policy for population $B$ and a cost-maximizing policy for population $A$. So, if $C^{\pi_A}_A > C^{\pi_A}_B$, then every policy $\pi$ satisfies $C^{\pi}_A > C^{\pi}_B$, making it impossible to achieve population fairness. A similar observation can be made if $C^{\pi_B}_A < C^{\pi_B}_B$.

1.6.1.2 Fair, Population-Unaware Policies with Non-uniform Holding Costs

In Section 1.4, we investigated the existence of fair, population-unaware policies that were permitted to schedule based on the service type of a customer, but not on a customer's population membership. When holding costs are uniform across service types, the FIFO-policy is always fair and population-unaware. However, as we saw in the previous section, when holding costs are different, the FIFO-policy may not be population fair. The following result characterizes the wait time vector of a population fair, population-unaware policy in the two-population, two-service-type setting.

Proposition 1.6.4 In the setting of two populations and two service types, provided that $\alpha_{A,1} \neq \alpha_{B,1}$, every fair, population-unaware policy achieves wait times
$$W_1 = \frac{\rho W_0}{(1-\rho)\left(\rho_1 + \frac{c_1}{c_2}\rho_2\right)}, \qquad W_2 = \frac{\rho W_0}{(1-\rho)\left(\frac{c_2}{c_1}\rho_1 + \rho_2\right)}.$$
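A direct numerical check of Proposition 1.6.4 (illustrative parameters): the stated wait times satisfy the conservation law, and cost fairness holds because $c_1 W_1 = c_2 W_2$, so every population's expected waiting cost collapses to the same number regardless of its composition.

```python
# Check the Proposition 1.6.4 wait times (illustrative parameters;
# rho_t aggregated over populations).
rho1, rho2, W0 = 0.3, 0.35, 0.4
c1, c2 = 10.0, 5.0
rho = rho1 + rho2

W1 = rho * W0 / ((1 - rho) * (rho1 + (c1 / c2) * rho2))
W2 = rho * W0 / ((1 - rho) * ((c2 / c1) * rho1 + rho2))

# Conservation law (1.7) restricted to type-level waits:
assert abs(rho1 * W1 + rho2 * W2 - rho / (1 - rho) * W0) < 1e-12

# Cost fairness: c1*W1 = c2*W2, so each population's expected cost is the
# same convex combination of a single value (the alpha's sum to one).
print(c1 * W1, c2 * W2)
```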
Similar to Proposition 1.4.1, in the case of two populations and two service types, we have that when there exists a fair, population-unaware policy, the policy is unique up to expected steady state waiting times. Note that the wait time formula given in Proposition 1.6.4 reduces to the FIFO wait time vector when $c_1 = c_2$ and reduces to the $c\mu$ wait time vector when $c_1/c_2 = 1 - \rho$. In fact, depending on how waiting costs are specified, the cost of the fair, unaware policy can vary between the cost of the $c\mu$-policy and the cost of the "reverse" $c\mu$-policy that prioritizes type-2 before type-1.

These observations make it difficult to extend Theorem 1.4.1 to the general holding cost setting, for a couple of reasons. First, the proof of Theorem 1.4.1 relies on the fact that the wait time vector corresponding to the FIFO-policy satisfies the definitions of population fairness and population-unawareness, and the fact that the FIFO wait time vector lies in the interior of the feasible wait time region. Furthermore, FIFO served as an obvious benchmark with which to compare other fair, population-unaware policies. Now, there is no obvious benchmark. Moreover, it is possible that under certain parameter conditions, the $c\mu$-policy is itself a fair, population-unaware policy, which is impossible to improve upon. However, we note that the spirit of Theorem 1.4.1 and Corollary 1.4.1 rings true in this setting. One can define an analogous system of equations by adjusting the fairness constraints to reflect equity in waiting cost, rather than waiting time. If there are many populations, then the equation system is over-determined, making it likely that there is at most one feasible wait time vector that satisfies the fairness constraints. However, if there are many service types, then the equation system is under-determined, meaning that if there exists one fair, unaware policy, then there likely are many more. The main takeaway is that having many service types makes it easier to construct fair, population-unaware policies. Although we focus on the case of fixed service types, a natural corollary of this takeaway is that a system designer able to control the menu of service types could, as an additional consideration toward fairness, offer more service types to enable population-unaware policies to achieve higher efficiency.

1.6.1.3 Fair, Population-Aware Policies with Non-uniform Holding Costs

Recall the definition of $c\mu$-preserving given in Definition 1.5.1, and recall that if a fair, $c\mu$-preserving policy exists, then population fairness can be achieved at the $c\mu$-cost. In Section 1.5.1.1, we gave precise conditions under which a fair $c\mu$-preserving policy exists when all holding costs are identical. In this subsection, we extend Theorem 1.5.1 and Corollary 1.5.1 to the setting of general holding costs. As in Section 1.5.1.1, we focus on the two-population, two-service-type setting. Let $c\mu_A$ be the $c\mu$-preserving policy that prioritizes population $A$ whenever possible, and let $c\mu_B$ be the $c\mu$-preserving policy that prioritizes population $B$ whenever possible. We are able to obtain the following necessary and sufficient condition for the existence of a fair, $c\mu$-preserving policy.

Theorem 1.6.1 In the setting of two populations and two service types, there exists a fair population-aware policy that achieves the cost of the $c\mu$-policy if and only if $C^{c\mu_A}_A \leq C^{c\mu_A}_B$ and $C^{c\mu_B}_A \geq C^{c\mu_B}_B$.
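Because population costs are affine in the busy-period mixing weight (Section 1.3.5), checking the condition of Theorem 1.6.1 and constructing a fair policy reduces to one linear equation; the intuition is spelled out in the next paragraph. The sketch below assumes the cost pairs $(C_A, C_B)$ under $c\mu_A$ and $c\mu_B$ have already been computed (the inputs shown are placeholder values):

```python
def fair_cmu_mixture(C_cmuA, C_cmuB):
    """Return the busy-period weight on cmuA that equalizes population costs,
    or None if the Theorem 1.6.1 condition fails. Inputs are (C_A, C_B) pairs."""
    dA = C_cmuA[0] - C_cmuA[1]          # C_A - C_B under cmuA (should be <= 0)
    dB = C_cmuB[0] - C_cmuB[1]          # C_A - C_B under cmuB (should be >= 0)
    if dA > 0 or dB < 0:
        return None                      # no cmu-preserving policy can be fair
    if dA == dB:
        return 1.0                       # both already fair
    return dB / (dB - dA)                # theta solves theta*dA + (1-theta)*dB = 0

print(fair_cmu_mixture((3.0, 4.0), (5.0, 4.5)))   # 0.333...: mix 1/3 cmuA, 2/3 cmuB
```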
The intuition behind Theorem 1.6.1 is the following: $c\mu_A$ is the $c\mu$-preserving policy that minimizes the waiting cost for population $A$ and maximizes the waiting cost for population $B$. If $C^{c\mu_A}_A > C^{c\mu_A}_B$, then any $c\mu$-preserving policy $\pi$ is such that $C^{\pi}_A > C^{\pi}_B$, which means no $c\mu$-preserving policy can attain fairness. Similar reasoning holds if $C^{c\mu_B}_A < C^{c\mu_B}_B$. However, if both $C^{c\mu_A}_A \leq C^{c\mu_A}_B$ and $C^{c\mu_B}_A \geq C^{c\mu_B}_B$, then there is some mixture of these two policies that satisfies population fairness. In this case, there exists a $c\mu$-preserving fair policy, which means fairness comes at no additional cost.

Recall Corollary 1.5.1, which guaranteed that a fair $c\mu$-preserving policy exists when utilization is sufficiently high in any two-population, two-service-type setting with uniform holding costs. Upon inspection of the necessary and sufficient condition of Theorem 1.6.1, it is straightforward to see that $C^{c\mu_B}_A$ and $C^{c\mu_A}_B$ become sufficiently large when $\rho$ is sufficiently close to 1. That is, Corollary 1.5.2 remains true in cases with nonuniform holding costs.

1.6.2 Choosing Among Fair, cµ-Preserving Policies

Recall from Section 1.5 that, in many cases, when a $c\mu$-preserving policy is feasible for problem (1.3), there are infinitely many optimal fair policies to choose from (see Proposition 1.5.1). Often, this gives a manager freedom to choose among many optimal, fair, population-aware policies. In this section, we discuss a particular way to choose from the set of $c\mu$-preserving fair policies. We restrict our attention to the two-population, two-type setting studied in Section 1.5.1.

[Figure 1.10: Selecting between different $c\mu$-preserving fair, population-aware policies. Parameters $\lambda_A = \lambda_B = 1$, $\mu_1 = 3$, $\mu_2 = 2$, $\alpha_{A,1} = .53$, $\alpha_{B,1} = .5$.]

Unfortunately, by prioritizing across populations in addition to service types, population-aware policies create some amount of inequity between populations within a particular service type. As demonstrated by Figure 1.6, optimal $c\mu$-preserving policies give more priority to population-$B$ within at least one service type to make up for the unfairness created by prioritizing type-1 customers before type-2 customers. Since this discrepancy can be seen as a form of unfairness, it can make sense to optimize across policies with respect to a secondary criterion that minimizes the discrepancy between population wait times within each service type. We can do this by using a loss function that captures the difference between wait times of the same service type. An example of such a loss function is
$$\ell(\pi) = \left|W^{\pi}_{A,1} - W^{\pi}_{B,1}\right| + \left|W^{\pi}_{A,2} - W^{\pi}_{B,2}\right|. \qquad (1.16)$$
We refer to the quantity $\ell(\pi)$ as the service-level unfairness of a policy $\pi \in \Pi$. Note that $\ell(\pi) = 0$ if and only if $\pi$ is population-unaware. One may also consider a weighted sum to capture differences between service types.

Recall from Proposition 1.5.1 that the fair $c\mu$-preserving wait times are parameterized by $w \in [L, U]$. We let $\pi^*(w)$ be the fair, $c\mu$-preserving policy associated with parameter $w \in [L, U]$. Then, for $w \in [L, U]$, $\ell(\pi^*(w))$ is the service-level unfairness associated with each fair, $c\mu$-preserving policy. Figure 1.10 is a plot of the service-level unfairness function $\ell$ on the set of fair, $c\mu$-preserving policies illustrated previously by Figure 1.6.
Notice that the two points $x$ and $y$ correspond to the intersection points $x = 0.54$ and $y = 0.62$ of the wait times in Figure 1.6, where the wait times of the $(A,1)$ and $(B,1)$ customers are equal and the wait times of the $(A,2)$ and $(B,2)$ customers are equal, respectively. As seen in the figure, the minimum of the function occurs at one of the two intersection points, meaning that the minimum occurs when either type-1 or type-2 customers are served in a FIFO manner. Thus, in many cases, we can obtain a policy that achieves the $c\mu$ cost and population fairness while minimally disturbing the population wait times within each service type, leading to a fairer $c\mu$-preserving policy.

1.7 Conclusion

In this chapter, we study fairness in the context of a queueing scheduling problem. In such systems, it is efficient to prioritize incoming customer work based on its service requirement and cost characteristics, with customers having higher holding cost or faster processing rate being prioritized, as in the classical $c\mu$-priority policy. In this work, we take the viewpoint that each individual customer should be considered as part of a broader population; for instance, customers making requests at a service firm may belong to one of many population groups in society that differ on attributes such as race, nationality, income level, etc. With this viewpoint, we note that priority policies may inadvertently inject inequity and unfairness across the population groups. We state a fairness-constrained optimization problem that seeks to minimize system costs or waiting times while ensuring fairness and equity across the various population groups.

We identify two classes of policies that can be used to achieve fairness: population-unaware and population-aware policies. These policies differ in whether their implementation requires knowledge of population membership, with population-unaware policies not needing such information. We find that population-unaware policies can be implemented to achieve fairness, and can dominate the FIFO-policy if the population groups are not very dissimilar in composition. On the other hand, population-aware policies provide us with a useful performance benchmark and help us understand the trade-off between fairness and efficiency, as we see how the optimal $c\mu$-policies can be modified to achieve fairness without loss in efficiency in high utilization settings. The difference in performance between aware and unaware policies demonstrates the value of scheduling according to population membership in addition to service type.

We have focused on analyzing population-level fairness from the perspective of average performance. Our notion of fairness requires waiting cost equity to hold across populations on average. Our analysis does not take into consideration items such as the variance of waiting costs or other behavioral and strategic factors that may influence a customer's perception of fair policies. We also do not take into consideration the perception of fairness from an individual perspective. Finally, our analysis (specifically Figure 1.8) suggests a counterintuitive tension between fairness and efficiency when resources are added to a high utilization system: optimal fair policies may actually become less efficient (relative to the $c\mu$-policy) as utilization decreases. In fact, we find that some population-types may actually have longer waits when utilization decreases in such a situation.
While we have only observed these phenomena numerically, we believe that taking these complexities into consideration formally is an interesting avenue for future research. Overall, our goal in this work is to shine light on the fact that we should take into account the societal impact of policies that focus on individual service-level prioritization, and further to illustrate that, with some small changes to the existing policies, we can achieve fairer societal outcomes without sacrificing much, or in some cases any, efficiency.

Chapter 2
Blockchain Mediated Persuasion

2.1 Introduction

Blockchain is a distributed ledger system that allows for efficient, verifiable, and immutable recording of transactions between two parties. This system can also be programmed to autonomously execute transactions through what are termed "Smart Contracts". Contracts within the blockchain are digitized and stored within transparent databases, safeguarded against alterations or deletions. Consequently, every agreement or task possesses a digital record and signature that is verifiable and shareable. This obviates the need for intermediaries such as lawyers or brokers, enabling seamless transactions. The main distinctive features of blockchain, as opposed to conventional databases, include its immutable nature, making ledger corruption nearly impossible, and its decentralized structure, negating the requirement for central oversight or third-party involvement. Therefore, this technology inherently establishes trust and reduces transactional friction by removing third-party incentives. Although blockchain has garnered significant attention for its transformative potential in both business and society, its adoption remains largely confined to the payments and cryptocurrency realms. In this chapter, we present a new blockchain application, leveraging its core attributes, that facilitates trustworthy communication between parties with conflicting incentives.

In the classic Bayesian Persuasion model studied by Kamenica and Gentzkow (2011), there are two parties: the first, called Sender, wishes to persuade the second, called Receiver, to take a desired action. Provided that Sender is ex-post better informed about the underlying state of the world, Sender can leverage their informational advantage by communicating with Receiver via a signal mechanism. By careful design of the mechanism, Sender can manipulate Receiver's posterior beliefs to obtain higher expected payoffs. However, Sender's ability to effectively manipulate Receiver's posterior beliefs largely hinges on the assumption that Sender can commit to a signal mechanism in a credible way. If Receiver does not believe that Sender can reliably commit to a mechanism, then all bets are off: the desired persuasion devolves into cheap talk.

One alternative approach is information mediation, where a credible mediator commits to a signal mechanism on behalf of Sender. For a mediator to be credible in the eyes of both Sender and Receiver, it is natural to require it to be transparent, unchanging, and reliable. Modern blockchain technology allows us to implement such a mediator via smart contracts. However, blockchain technology is costly, which means Sender must pay costs to communicate with Receiver. Surprisingly though, we find that this costly blockchain mediation performs better than free mediation in certain settings.

To make our main finding concrete, consider a Firm facing an investment decision in one of two projects, A and B.
Only one of the two projects will be successful, and the prior probability that A is the successful one is 1/4. For simplicity, we normalize costs so that the cost of investing in a project is \$0. If the Firm invests in the successful project, its profit will be \$3 million; otherwise it gets \$0. A Consultant, who can do research to find which of the two projects will be successful, offers their services advising the Firm but charges a fee equal to \$1 million. Without any communication between the Firm and the Consultant, the Firm would invest in project B and obtain an expected profit of $3 \cdot 3/4 = \$2.25$ million. If the Firm hires the Consultant, the Firm would obtain an expected profit equal to $3 - 1 = \$2$ million. Hence, without communication, the Consultant would not be hired.

Following the standard Bayesian Persuasion framework (Kamenica and Gentzkow, 2011), suppose the Consultant could credibly communicate with the Firm after they do their research but before the Firm hires them. The Consultant could use the following communication strategy that involves two messages, "Project B" and "Hire":
$$\pi(\text{"Project B"} \mid A) = 0, \quad \pi(\text{"Hire"} \mid A) = 1, \quad \pi(\text{"Project B"} \mid B) = \tfrac{1}{3}, \quad \pi(\text{"Hire"} \mid B) = \tfrac{2}{3}.$$
If the Firm receives the message "Project B", it can infer with certainty that the corresponding project will be successful and will follow the recommendation from the Consultant. On the other hand, if the Firm receives the message "Hire", it can update its beliefs and infer that Project A will be successful with probability 1/3. At that posterior belief, if the Firm invests in Project B without using the Consultant, its expected profit will be equal to $(2/3) \cdot 3 = \$2$ million, while if it hires the Consultant its expected profit will also be equal to $3 - 1 = \$2$ million, and hence the Firm hires the Consultant (note here that, as in the classical persuasion paradigm, we break ties in favor of the Consultant). This way, the Consultant's expected payment is equal to $\left(\tfrac{1}{4} \cdot 1 + \tfrac{3}{4} \cdot \tfrac{2}{3}\right) \cdot 1 = \$0.75$ million, the Firm's expected payoff from engaging with the Consultant is \$2.25 million, and hence the Firm is willing to engage in this communication. Note that, according to this strategy, with probability equal to $(3/4) \cdot (1/3) = 1/4$, the Consultant reveals the outcome of their research to the Firm free of charge. Essentially, this is the cost of commitment to a communication strategy: in order for the Firm to form beliefs in the way that we described above, the Consultant commits to the specific communication mechanism and the Firm trusts that they did so. Clearly though, this is not ex-post optimal for the Consultant; ex post, the Consultant would always send the "Hire" message, making the above equilibrium collapse.

As an alternative to the commitment assumption, consider a Smart Contract that does not have access to the true outcome of the Consultant's research but is able to (i) transparently commit to a mapping between the Consultant's report of the outcome of her research and a message to the Firm, and (ii) charge a fee that can be different for different messages. For our example, consider a contract that allows the Consultant to report A or B, sends messages that depend on the Consultant's report as follows,
$$\pi(\text{"Project B"} \mid A) = 0, \quad \pi(\text{"Hire"} \mid A) = \tfrac{3}{4}, \quad \pi(\text{"Project A"} \mid A) = \tfrac{1}{4},$$
$$\pi(\text{"Project B"} \mid B) = \tfrac{1}{2}, \quad \pi(\text{"Hire"} \mid B) = \tfrac{1}{2}, \quad \pi(\text{"Project A"} \mid B) = 0,$$
and charges the Consultant a fee of \$1 million whenever the message "Project A" is sent.
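The incentive arithmetic behind this contract, spelled out in the next paragraph, can also be checked mechanically; the following script reproduces the numbers of the example using exact rational arithmetic.

```python
from fractions import Fraction as F

# Numeric check of the smart-contract example above.
prior = {'A': F(1, 4), 'B': F(3, 4)}
pi = {'A': {'Project B': F(0), 'Hire': F(3, 4), 'Project A': F(1, 4)},
      'B': {'Project B': F(1, 2), 'Hire': F(1, 2), 'Project A': F(0)}}
fee_msg = 'Project A'        # the contract burns $1M when "Project A" is sent

# Consultant's expected payoff from reporting r (in $M): +1 per "Hire", -1 per fee.
payoff = {r: pi[r]['Hire'] - pi[r][fee_msg] for r in ('A', 'B')}
print(payoff)                # both 1/2: truthful reporting is incentive compatible

# Firm's posterior belief that A succeeds, given each message (truthful reports).
for s in ('Project B', 'Hire', 'Project A'):
    num = prior['A'] * pi['A'][s]
    den = sum(prior[w] * pi[w][s] for w in ('A', 'B'))
    print(s, num / den)      # 0, 1/3, 1

# Consultant's expected profit: revenue from hires minus burned fees.
p = {s: sum(prior[w] * pi[w][s] for w in ('A', 'B')) for s in ('Hire', fee_msg)}
print(p['Hire'] - p[fee_msg])    # 1/2, i.e. $0.5M without commitment
```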
For this contract, conditional on receiving each of the messages, the Firm forms exactly the same beliefs as before and acts accordingly. Most importantly, the Consultant is indifferent between reporting the true outcome of their research and misreporting, since their payoff from reporting "A" is equal to $\tfrac{3}{4} \cdot 1 - \tfrac{1}{4} \cdot 1 = \tfrac{1}{2}$, while their payoff from reporting "B" is equal to $\tfrac{1}{2} \cdot 1 = \tfrac{1}{2}$. Note that the Consultant's expected profit is equal to \$0.5 million without assuming commitment to the communication signaling mechanism!

In this chapter, we study the possible benefits of cost-differentiated information mediation that this simple example illustrates. We consider a model where Sender reports the state of the world to a mediator. The mediator generates a signal realization and charges Sender a corresponding fee. Receiver then updates their posterior based on the signal realization and takes an action. Our work argues that blockchain technology, and in particular smart contracts, enables the implementation of such transparent mappings between the reports and messages of different entities while enabling "fees" that depend on the realization of different messages.

In Section 2.4 we present our formal model of persuasion, introduce the concept of Blockchain Mediated Persuasion (BMP), and formulate the associated optimization problem. We show that, without loss of generality, one can consider straightforward and direct signal mechanisms, where Sender truthfully reports the state of the world and Receiver follows a recommended action distribution (see Proposition 2.4.1). This same result shows that we can without loss optimize over distributions on compatible belief-action pairs that satisfy the Bayes Plausibility and Payoff Plausibility conditions, which capture the incentive compatibility constraints for Receiver and Sender, respectively. In Section 2.5, we provide two geometric characterizations for the blockchain mediated persuasion problem. The first characterization describes the achievable payoff for a given prior belief $\mu_0$ in terms of a "lifted" space that captures Sender's incentive constraints (Theorem 2.5.1). The second geometric characterization uses a Lagrangian dual approach to describe the achievable payoff frontier for all prior beliefs $\mu_0$ as a linear rational function, demonstrating the loss in value for Sender compared to the affine functions in the classic Bayesian Persuasion problem (Theorem 2.5.2). In Section 2.6, drawing on our geometric characterizations of the achievable payoff region, we show that costly mediation with two price levels can benefit Sender arbitrarily more than free mediation (Theorem 2.6.1). Our results allow us to distinguish between two potentially valuable properties of BMP mechanisms: their ability to mediate and guarantee mappings between reports and messages, and their ability to price differentiate between signal realizations. Finally, in Section 2.7 we show that the two-price BMP mechanism is optimal for the information selling problem, and leverage this fact to obtain a characterization of optimal BMP mechanisms (Proposition 2.7.2). We conclude with a discussion on the welfare gains from cost-differentiated mediation.

2.2 Implementing Mediated Persuasion on the Blockchain

Before we continue, we take a moment to discuss how one might practically implement a mediated communication mechanism on a blockchain.
First, it is fairly straightforward to write a smart contract that contains the signal mechanism and corresponding message space. At a high level, the contract publicly and immutably stores the message space and signal function; Sender reports directly to the smart contract's interface; then, upon receiving this report, the contract executes the stored mechanism to generate a signal realization that is sent off to Receiver. Below we discuss two caveats that make the implementation less trivial than this high-level idea.

Persuasion, in general, requires randomization in order to achieve desired outcomes. Blockchains do not natively support pseudo-random number generation, and without reliable pseudo-random numbers we are not able to implement the optimal mechanism in essentially all non-trivial cases. To make matters worse, if the smart contract simply accepts Sender's report and computes the optimal mechanism, then the transparency of the blockchain makes it possible for Receiver to access the report from the smart contract. Of course, this would lead to a total breakdown of the persuasion mechanism. To avoid this, the state of the world must be encrypted and the computation of the signal realization must be done off-chain.

Fortunately, recent advances in oracle technology allow both of these problems to be mitigated. In simple terms, an oracle is a blockchain service that connects smart contracts to off-chain entities to achieve interoperability between the blockchain and the outside world. Oracles are capable of interacting with web APIs, generating verifiable random numbers (VRNs), and even performing off-chain computation. Smart contracts that utilize oracles are often called hybrid smart contracts. Two popular oracles that are capable of providing these services are Provable and Chainlink. Chainlink provides verifiable random numbers and automated execution of smart contracts, while Provable allows for encryption and decryption. Both oracles support off-chain computation. A key issue with oracles is that they are inherently centralized, and thus using an oracle seemingly defeats the benefits of decentralized ledger technology. This problem is known as the oracle problem. To solve the oracle problem, the Provable oracle relies on authenticity proofs that give both parties proof that the result has not been tampered with. Alternatively, Chainlink solves the oracle problem by using a decentralized model, where off-chain data is aggregated by many node operators, whereby consensus on the validity of the data is reached.

However, oracles and node operators do not work for free. On blockchain networks, there is a price that one must pay oracles for their services, as well as fees that must be paid to entice blockchain node operators to execute a smart contract. On Ethereum, this execution fee is referred to as the gas cost. Effectively, one can think of the gas cost as a fee that one must pay to use computational resources on the blockchain. In addition, it is possible to intentionally charge smart contract users fees via what is known as cash burning. Cash burning makes it possible to treat the cost structure as endogenous.

2.3 Literature Review

The persuasion framework considered in this chapter builds on the seminal work of Kamenica and Gentzkow (2011) on Bayesian persuasion. There has been much work on expanding the standard persuasion model to more complex settings and important applications.
2.3 Literature Review

The persuasion framework considered in this chapter builds on the seminal work of Kamenica and Gentzkow (2011) on Bayesian persuasion. There has been much work on expanding the standard persuasion model to more complex settings and important applications. See, for instance, Bergemann and Morris (2019) for a detailed survey of persuasion and other information design topics and applications. Persuasion and related approaches to information design have found many applications in the operations and applied economics literature. Examples include inventory systems (Allon, Bassamboo, and Randhawa, 2012; Yu et al., 2014), queueing systems (Veeraraghavan and Debo, 2008; Allon, Bassamboo, and Gurvich, 2011; Lingenbrink and Iyer, 2019), networks (Candogan and Drakopoulos, 2020), policy (Alizamir et al., 2020), Bayesian exploration (Kremer et al., 2014; Papanastasiou et al., 2017), and the management of political campaigns (Alonso and Câmara, 2016).

Much of the persuasion literature assumes that Sender is able to commit to the signaling mechanism. The commitment assumption allows one to effectively ignore strategic behavior from Sender, since once the signaling mechanism is designed, commitment means Sender simply reports the state of the world to the mechanism and the mechanism sends a signal to Receiver. There has been growing interest in relaxing the commitment assumption in the economics literature and bridging the gap between full-commitment (persuasion) and no-commitment (cheap talk) communication regimes.

Cheap talk models like the one first investigated by Crawford and Sobel (1982) assume that Sender has no commitment power. More specifically, cheap talk models assume that communication is costless, non-binding and non-verifiable. In these models there often exist equilibria where information is communicated and the parties benefit; however, there also always exists the so-called babbling equilibrium where no information is transmitted and payoffs for both parties are the same as if there were no communication. In fact, Lipnowski and Ravid (2020) show that Sender's highest achievable payoff in a cheap talk setting corresponds to the quasiconcave envelope of their payoff function, which can be much less than the payoff of committed persuasion.

To bridge this gap, researchers have considered models that weaken the commitment assumption of the classical persuasion model in some way. In a recent paper, Lipnowski, Ravid, and Shishkin (2022) characterize the achievable region when Sender may influence the message sent to Receiver with a given, possibly state-dependent, probability, thereby weakening the commitment assumption. Min (2021) studies a model where Sender's commitment to the mechanism binds with an exogenously given probability. This same model is also analyzed from an empirical perspective in Fréchette et al. (2022). Another approach to weakening the commitment assumption is to allow costly deviation from truthful reporting. For example, Nguyen and Tan (2021) and R. Li (2020) study models where Sender can report the state of the world strategically, but pays a price for lying. In a similar vein, Degan and M. Li (2021) consider a model where Sender pays a higher cost for communicating with more precision. Yingni Guo and Shmaya (2021) study a model where Sender designs a conditional distribution over the state space and pays a miscalibration cost for being inconsistent with the true distribution.

Our work relaxes the commitment assumption by allowing Sender to strategically report the state of the world to an intermediary. There has been other work that studies the effects of mediated communication.
Our model is most closely related to the mediator model studied in Salamanca (2021); however, our model and analysis differ as we allow for cost differentiation of signal realizations. Our model is also related to Arieli et al. (2022), which studies a model with multiple strategic mediators. Utilizing blockchain technology as a mediator was suggested by Best and Quigley (2023) to implement communication mechanisms in a repeated setting. Communication with mediators can be viewed as a constrained persuasion problem with incentive compatibility constraints for both Sender and Receiver, so the analysis found in the work of Doval and Skreta (2023) is closely related to our analysis. For this same reason, our work is related to studies on interim persuasion found in Perez (2014) and Koessler and Skreta (2023). Finally, costly blockchain mediation leverages cash burning to cost differentiate messages. The positive effects of cash burning in mechanism design have been considered in Hartline and Roughgarden (2008) and more recently by Fotakis et al. (2016).

2.4 Model

A critical assumption for the effectiveness of classical Bayesian Persuasion (Kamenica and Gentzkow, 2011) is Sender's ability to commit to the signal mechanism. In many applications the commitment assumption is unrealistic and, as such, the literature has explored different ways of relaxing this assumption (see Section 2.3). Here we propose a novel alternative to classical persuasion where we relax the classical commitment assumption and use blockchain to implement Blockchain Mediated Persuasion (BMP) mechanisms. These mechanisms can be implemented as smart contracts between Sender and Receiver, where Sender submits a (potentially fake) report of the realized state of the world and the contract produces a message through a well-defined (possibly randomized) mapping. Blockchain technology provides a reliable avenue for implementing such a commitment-free mechanism.

Figure 2.1: Illustration of the implementation of a signaling mechanism using blockchain technology. Step 1: Using the oracle's public key, Sender sends an encrypted report $\omega^r$ to the smart contract via the oracle, which triggers the smart contract. Step 2: The smart contract contains data $D = (\pi, \omega^r)$. The oracle generates a signal realization $s$ using the distribution $\pi(\cdot \mid \omega^r)$, along with verification that demonstrates the computations are trustworthy; Sender's wallet is charged a signal-dependent cost $\kappa_s$. Step 3: The smart contract asks the oracle to send the message $s$ to Receiver.

2.4.1 Blockchain Mediated Persuasion

There is an a-priori unknown state of the world $\omega \in \Omega$ and two players: Sender and Receiver. Receiver has a utility function $u(a,\omega)$ that may depend on her action $a \in A$ and the state of the world $\omega \in \Omega$. Sender has a utility function $v(a)$ that only depends on Receiver's action.¹ Since only the comparison between utilities affects the strategic behavior of Sender, we assume that $v(a) \geq 0$ for all $a \in A$. For simplicity, we assume the action space $A$ and the state space $\Omega$ are finite. As in the classical Bayesian Persuasion paradigm (Kamenica and Gentzkow, 2011), Sender wishes to persuade Receiver to take actions that increase her payoff via a communication mechanism. The class of mechanisms that will be our focus for the remainder of the chapter is formally defined below.
¹Note that this is less general than the full persuasion paradigm, but allows us to provide simple characterizations.

Definition 2.4.1 A Blockchain Mediated Persuasion (BMP) mechanism consists of a finite realization space $S$, a family of distributions $\{\pi(\cdot \mid \omega^r)\}_{\omega^r \in \Omega}$ over $S$, and a set of costs $\kappa_s$ for $s \in S$. We denote a BMP mechanism by $M = (\pi, S, \{\kappa_s\}_{s\in S})$. A BMP mechanism receives a report $\omega^r \in \Omega$ from Sender and transmits a message $s \in S$ to Receiver, generated according to $\pi$.

The timeline for the use of the mechanism is as follows. Initially, Sender and Receiver have a common prior $\mu_0 \in \Delta(\Omega)^{\circ}$ over the state of the world. Sender designs the BMP mechanism $M = (\pi, S, \{\kappa_s\}_{s\in S})$ and then observes the realized state of the world. Sender then reports $\omega^r$ to the BMP mechanism, which generates a message $s \in S$ according to $\pi(\cdot \mid \omega^r)$ and transmits it to Receiver. The BMP mechanism charges Sender the associated fee $\kappa_s$. Finally, Receiver takes a utility-maximizing action and the payoffs $u$ and $v$ for Receiver and Sender are realized.

We refer to $n = |S|$ as the number of cost levels of the BMP mechanism, and with some abuse of notation we denote by $\kappa_i$ the $i$th-smallest cost level. In reality, the costs $\{\kappa_s\}_{s\in S}$ can be interpreted as some combination of incidental gas fees and intentional cash burning. We frame the problem with an endogenous cost structure so that we may determine optimal BMP mechanisms. However, as we will see, our general analysis of BMP mechanisms applies to any given cost structure, thus handling exogenously given cost structures as well.

Regarding implementation, one can think of the signal structure $\pi$, the message space $S$ and the associated cost structure $\{\kappa_s\}_{s\in S}$ as being primitives of the smart contract created by Sender. Effectively, Sender would code these data in the smart contract implementation, making them fully transparent to Receiver. Then, Sender's report is encrypted and uploaded to the mechanism. As discussed in Section 2.2, the generation of the signal realization is done off-chain by the oracle because of the need for randomization. Finally, the oracle delivers the message to Receiver. The schematic representation of such an implementation is given by Figure 2.1.
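To make Definition 2.4.1 concrete, the following is a minimal Python sketch that stores a BMP mechanism $M = (\pi, S, \{\kappa_s\}_{s\in S})$ as data and simulates one use of it according to the timeline above. The state space, signal names and numeric values are illustrative assumptions, not quantities taken from the chapter.

```python
import numpy as np

# A BMP mechanism M = (pi, S, {kappa_s}) from Definition 2.4.1 stored as data.
states  = ["A", "B"]                          # Omega
signals = ["Project A", "Hire", "Project B"]  # S
pi = np.array([[0.5, 0.50, 0.00],             # pi(. | report = A)
               [0.0, 0.25, 0.75]])            # pi(. | report = B)
kappa = np.array([1.0, 0.0, 0.0])             # cost kappa_s per realization
assert np.allclose(pi.sum(axis=1), 1.0)       # each row is a distribution over S

kappa_sorted = np.sort(kappa)                 # kappa_i = i-th smallest of the n = |S| cost levels

def run_mechanism(report, rng):
    """One use: Sender reports, a message is drawn from pi(.|report), the fee is charged."""
    row = states.index(report)
    idx = rng.choice(len(signals), p=pi[row])
    return signals[idx], kappa[idx]           # (message to Receiver, fee charged to Sender)

msg, fee = run_mechanism("A", np.random.default_rng(0))
```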
2.4.2 Strategies and Solution Concept

Our solution concept is subgame perfect equilibrium. We denote by $\beta : \Omega \to \Delta(\Omega)$ Sender's reporting strategy and by $\delta : S \to \Delta(A)$ Receiver's strategy. Clearly, Sender will not engage with a BMP mechanism if the expected payoff is less than the no-communication default payoff $v_0$. Therefore, any BMP mechanism must satisfy the following Individual Rationality constraint,
$$v(M,\beta,\delta) = \sum_{\omega,\omega^r\in\Omega} \mu_0(\omega)\beta(\omega^r \mid \omega) \sum_{s\in S,\,a\in A} \pi(s \mid \omega^r)\delta(a \mid s)\big(v(a)-\kappa_s\big) \;\geq\; v_0, \qquad \text{(I.R.)}$$
which ensures that the expected payoff Sender garners from the mechanism is sufficiently high.

We implicitly assume that once Sender opts in to the mechanism, her report is always within $\Omega$. This is a reasonable assumption, as in principle the smart contract can charge a large amount to Sender for refusing to submit a valid report (similar to slashing² in Ethereum Proof of Stake).³ However, even though the Blockchain Mediated Mechanism enables Sender to commit to a mediated mapping between reports and messages, Sender always has the option to lie about the observed state of the world, and therefore there is no a priori guarantee that $\omega^r = \omega$. In general, Sender submits a report that maximizes her expected payoff, imposing a Sender optimality constraint,
$$\omega^r \in \arg\max_{x\in\Omega} \sum_{s\in S,\,a\in A} \pi(s \mid x)\delta(a \mid s)\big(v(a)-\kappa_s\big) \qquad \text{(S.O.)}$$
for all $\omega,\omega^r \in \Omega$ satisfying $\beta(\omega^r \mid \omega) > 0$. Finally, for a given mechanism $M$ and reporting strategy $\beta$, Receiver takes the optimal action from her perspective, imposing a Receiver optimality constraint,
$$a \in \arg\max_{a'\in A} \left[ \sum_{\omega,\omega^r\in\Omega} \mu_0(\omega)\beta(\omega^r \mid \omega)\pi(s \mid \omega^r)u(a',\omega) \right] \qquad \text{(R.O.)}$$
for all $s \in S$ and $a \in A$ satisfying $\sum_{\omega'\in\Omega} \pi(s\mid\omega') > 0$ and $\delta(a \mid s) > 0$.

²https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/rewards-and-penalties/
³From a practical point of view, a smart contract would be directly connected through an API to Sender's data and therefore the set of valid reports is pre-determined.

This set of constraints defines the following solution concept.

Definition 2.4.2 Given a BMP mechanism $M = (\pi,S,\{\kappa_s\}_{s\in S})$, the strategy profile $(\beta,\delta)$ is an $M$-Bayes Nash Equilibrium (BNE) if the (I.R.), (S.O.), and (R.O.) conditions hold. We say that a BMP mechanism $M$ attains value $v$ if there is an $M$-Bayes Nash equilibrium $(\beta,\delta)$ satisfying $v(M,\beta,\delta) = v$.

Sender's goal is therefore to design a Blockchain Mediated Mechanism that attains the highest value, i.e., to solve the following optimization problem:
$$\begin{aligned} \sup_{M,\beta,\delta} \quad & v(M,\beta,\delta) \\ \text{subject to} \quad & M = (\pi,S,\{\kappa_s\}_{s\in S}) \text{ is a Blockchain Mediated Mechanism}, \\ & (\beta,\delta) \text{ is an } M\text{-Bayes Nash Equilibrium}. \end{aligned} \qquad \text{(OPT-1)}$$

2.4.3 Simplifying the Optimization Problem

In this section we show that, without loss of optimality, we can focus on a subclass of BMP mechanisms that we refer to as generalized-straightforward and direct. These mechanisms essentially restrict the signal space to action distributions, and induce a strategy profile $(\beta,\delta)$ such that (i) Sender finds it optimal to truthfully report the realization of $\omega$ whenever they use the service, and (ii) Receiver finds it optimal to follow the recommended action distribution whenever they receive a message, inferring the recommended belief using Bayes rule conditional on the received signal.

Note that generalized-straightforward signals depart from the classic notion of a straightforward signal (cf. Kamenica and Gentzkow, 2011) in the case when Receiver is indifferent between two actions. For beliefs where Receiver is indifferent, in the classic persuasion paradigm, the agents follow the action that maximizes Sender's utility. Here, we generalize this idea and allow any mixing between indifferent actions. We present the formal definition below.

Definition 2.4.3 A Blockchain Mediated Mechanism is said to be generalized-straightforward and direct if $S \subseteq \Delta(A)$ and it induces a Bayes Nash equilibrium $(\beta,\delta)$ such that (i) for any $\omega \in \Omega$, we have $\beta(\omega\mid\omega) = 1$, and (ii) for any $s \in S \subseteq \Delta(A)$, we have $\delta(\cdot \mid s) = s(\cdot)$.

A consequence of (i) is that, for a generalized-straightforward and direct mechanism, Receiver's posterior belief upon seeing signal realization $s \in S$ is given by
$$\mu_s(\omega) = \frac{\pi(s \mid \omega)\mu_0(\omega)}{\sum_{\omega'\in\Omega} \pi(s \mid \omega')\mu_0(\omega')}.$$

A key conceptual contribution of the classical framework of Bayesian Persuasion (Kamenica and Gentzkow, 2011) is thinking of Sender's payoff as a function of Receiver's posterior beliefs. This mapping is more intricate in our case, since at beliefs where Receiver is indifferent between two or more actions, there are many different payoffs Sender can obtain depending on the action distribution that Receiver chooses. For this reason, we specify both the belief and a corresponding action distribution, and we say that a belief $\mu \in \Delta(\Omega)$ and action distribution $\delta \in \Delta(A)$ are compatible if $\delta$ is a Receiver-optimal action distribution at belief $\mu$. That is, $(\mu,\delta)$ are compatible if and only if $\delta \in \Delta(\hat{a}(\mu))$, where
$$\hat{a}(\mu) := \arg\max_{a\in A} \mathbb{E}_\mu[u(a,\omega)]. \tag{2.1}$$
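As a small illustration of these objects, the sketch below computes the posterior $\mu_s$ of a generalized-straightforward and direct mechanism via Bayes rule, along with the Receiver-optimal action set $\hat{a}(\mu)$ of (2.1), for finite $\Omega$ and $A$. The utility matrix and prior are illustrative assumptions.

```python
import numpy as np

mu0 = np.array([0.3, 0.7])        # prior over Omega = {A, B} (illustrative)
U = np.array([[1.0, 0.0],         # u(a, omega): rows = actions, columns = states
              [0.0, 1.0],
              [0.6, 0.6]])

def posterior(pi, mu0, s_idx):
    """Receiver's belief mu_s after signal s under a direct mechanism (Bayes rule)."""
    w = pi[:, s_idx] * mu0        # pi(s | omega) * mu0(omega), state by state
    return w / w.sum()

def a_hat(mu, tol=1e-9):
    """The Receiver-optimal action set a_hat(mu) of equation (2.1)."""
    payoffs = U @ mu              # E_mu[u(a, omega)] for every action a
    return np.flatnonzero(payoffs >= payoffs.max() - tol)

def compatible(mu, delta, tol=1e-9):
    """(mu, delta) is compatible iff delta puts mass only on actions in a_hat(mu)."""
    outside = np.setdiff1d(np.arange(len(delta)), a_hat(mu, tol))
    return bool(np.all(delta[outside] <= tol))
```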
For ease of exposition we define
$$U = \{(\mu,\delta) : \mu \in \Delta(\Omega),\; \delta \in \Delta(\hat{a}(\mu))\}$$
to be the set of all compatible pairs of beliefs and action distributions. For $(\mu,\delta) \in U$, we denote by
$$v(\mu,\delta) = \mathbb{E}_\delta[v(a)] \tag{2.2}$$
Sender's expected utility when Receiver follows the action distribution $\delta$. We note that it is helpful to think of elements of $U$ first and foremost as a set of beliefs; the inclusion of the compatible action distribution only makes the payoff $v(\mu,\delta)$ a well-defined function at each belief.

Now, to account for the different costs of signal realizations, we let $P = \{U_i\}_{i=1}^{n}$ be a partition of $U$ such that any belief-action pair $(\mu,\delta) \in U_i$ has cost $\kappa_i$, and let
$$v((\mu,\delta);P) = v(\mu,\delta) - \kappa_i, \quad \text{for } (\mu,\delta) \in U_i,$$
denote the net payoff that Sender receives upon generating belief-action pairs via the mechanism. We will refer to $P$ as a cost partition of $U$.

Essentially, any signal $\pi$ induces a distribution $\tau$ on $U$. In order for the signal $\pi$ to be generalized-straightforward, for any possible signal realization $(\mu,\delta)$ in the support of $\tau$, it must be the case that Receiver forms the belief $\mu$ and it is incentive compatible for her to choose the action distribution $\delta$. The latter is guaranteed by the definition of $U$, while to ensure the former, it is necessary and sufficient for $\tau$ to be Bayes Plausible (BP), that is, for every $\omega \in \Omega$,
$$\mathbb{E}_\tau[\mu(\omega) - \mu_0(\omega)] = 0. \qquad \text{(B.P.)}$$
For further discussion of the relationship between Bayes Plausible belief distributions and signals we direct the reader to Kamenica and Gentzkow (2011). From Sender's perspective, in Proposition 2.4.1 we derive a necessary and sufficient condition for directness in terms of the induced belief distribution. Specifically, we show that in order to guarantee directness, it is enough to ensure that the distribution $\tau$ on $U$, induced by a signal $\pi$, is Payoff Plausible (PP), i.e., there exists a cost partition $P$ such that
$$\mathbb{E}_\tau\left[\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) v((\mu,\delta);P)\right] = 0, \quad \text{for all } \omega,\omega' \in \Omega. \qquad \text{(P.P.)}$$

We conclude this section by showing that it is without loss of optimality, with respect to the optimization problem (OPT-1), to focus on signals that induce distributions $\tau$ that are Bayes Plausible and Payoff Plausible.

Proposition 2.4.1 The following are equivalent:
(i) There exists a BMP mechanism that attains value $v^*$.
(ii) There exists a generalized-straightforward and direct BMP mechanism that attains value $v^*$.
(iii) There exists a distribution $\tau \in \Delta(U)$ that is Bayes Plausible and Payoff Plausible with $\mathbb{E}_\tau v((\mu,\delta);P) = v^*$.

In light of Proposition 2.4.1, it suffices to present a BMP mechanism in terms of the associated distribution $\tau$ and partition $P$. Therefore, the optimization problem (OPT-1) for a given cost partition $P$ can be rewritten as
$$\begin{aligned} V(\mu_0;P) = \max_{\tau\in\Delta(U)} \quad & \mathbb{E}_\tau[v((\mu,\delta);P)] \\ \text{subject to} \quad & \mathbb{E}_\tau[\mu(\omega) - \mu_0(\omega)] = 0, \quad \forall \omega \in \Omega, \\ & \mathbb{E}_\tau\left[\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) v((\mu,\delta);P)\right] = 0, \quad \forall \omega \neq \omega'. \end{aligned} \qquad \text{(OPT-P)}$$
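Since, for a fixed cost partition, (OPT-P) is linear in $\tau$ (a fact the next section exploits), one can approximate it numerically by discretizing $U$ and solving a finite linear program. Below is a minimal sketch for a binary state space, where a belief is summarized by $\mu(A) \in [0,1]$ and the net payoff $v((\mu,\delta);P)$ is encoded as a function on the grid. The payoff and cost numbers are illustrative assumptions, and for simplicity the grid ignores mixing over $\delta$ at indifference beliefs.

```python
import numpy as np
from scipy.optimize import linprog

# A discretized version of (OPT-P) with |Omega| = 2, solved as an LP over tau.
mu0 = 0.2
mus = np.linspace(1e-3, 1 - 1e-3, 400)          # grid over Receiver beliefs mu(A)
# Illustrative net payoff v((mu, delta); P): payoff 1 on a middle interval,
# with a cost of 0.5 charged on the high-belief cell of the partition.
v_net = np.where((mus > 0.25) & (mus < 0.75), 1.0, 0.0) - np.where(mus > 0.75, 0.5, 0.0)

A_eq = np.vstack([
    np.ones_like(mus),                           # tau sums to one
    mus,                                         # Bayes Plausibility: E_tau[mu] = mu0
    (mus / mu0 - (1 - mus) / (1 - mu0)) * v_net, # Payoff Plausibility (P.P.)
])
b_eq = np.array([1.0, mu0, 0.0])

res = linprog(c=-v_net, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
print("V(mu0; P) approx.", -res.fun if res.success else None)
```

The degenerate distribution at $\mu_0$ is always feasible (both constraints hold trivially there), so the LP returns at least the no-information value.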
2.5 The Value of Blockchain Mediated Persuasion

Here we derive geometric characterizations of the solution of problem (OPT-P). We leverage the fact that, for a fixed cost partition $P$, (OPT-P) is a linear optimization problem in the belief-action distribution $\tau$ to derive both primal and dual characterizations of the optimal BMP mechanism under the given cost partition.

2.5.1 Primal Approach

We first observe that there is a parallel between our optimization problem above and the optimization problem stated in Corollary 1 of Kamenica and Gentzkow (2011). In the classical persuasion framework, one optimizes Sender's expected utility over all Bayes Plausible belief distributions. In contrast, in the absence of commitment one also has to account for the additional Payoff Plausibility constraints to guarantee that Sender adheres to direct reporting. With this observation in hand, given a belief $\mu$ and corresponding Receiver action $\delta$, for each $\omega,\omega' \in \Omega$, we define
$$y_{\omega,\omega'}((\mu,\delta);P) = \left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) v((\mu,\delta);P).$$
We let $y((\mu,\delta);P)$ denote the $|\Omega|\times|\Omega|$ matrix with entries $y_{\omega,\omega'}((\mu,\delta);P)$. Let
$$G_P = \{(\mu,\; y((\mu,\delta);P),\; v((\mu,\delta);P)) : (\mu,\delta) \in U\}, \tag{2.3}$$
and let $\mathrm{co}(G_P)$ be the convex hull of $G_P$.

Intuitively, given an element of $G_P$, the first component $\mu$ tracks the belief of Receiver. The second component, $y((\mu,\delta);P)$, tracks the integrands found in the constraints (P.P.) at belief $\mu$ and compatible action distribution $\delta$. The third component, $v((\mu,\delta);P)$, tracks Sender's payoff at belief $\mu$ and compatible action distribution $\delta$. If $(\mu_0,0,z) \in \mathrm{co}(G_P)$, then, since $\mathrm{co}(G_P)$ is the convex hull of a set in finite dimensions, it follows from Carathéodory's theorem that we can find finitely many elements of $G_P$ and corresponding weights that combine to $(\mu_0,0,z)$. By definition, these weights form a Bayes Plausible and Payoff Plausible distribution that achieves value $z$. With this geometric characterization in hand, intuitively, finding the largest such $z$ so that the point $(\mu_0,0,z)$ lies in $\mathrm{co}(G_P)$ will produce an optimal distribution on $U$. The following theorem formalizes this intuitive argument.

Theorem 2.5.1 The value of mediation (for a given cost partition $P$) is equal to
$$V(\mu_0;P) = \sup\{z : (\mu_0,0,z) \in \mathrm{co}(G_P)\}.$$

Figure 2.2: The set $\mathrm{co}(G_P)$ for a piecewise-constant payoff function. The prior is $\mu_0 = 0.2$. At the optimal solution, the signal mixes between the vertices of the orange plane, which correspond to the beliefs $\mu_1 = 0$, $\mu_2 = 0.4$ and $\mu_3 = 1$.

In order to develop intuition about Theorem 2.5.1, we illustrate the construction of $\mathrm{co}(G_P)$ in Figure 2.2 for a piecewise-constant payoff function, shown on the right-hand side of the figure. We first construct $G_P$ by "lifting" the payoff function into three-dimensional space, with the two dimensions of the payoff function corresponding to Bayes Plausibility and Sender's expected payoff, and the third dimension of the lifted space corresponding to Payoff Plausibility. Once $G_P$ is constructed, we find its convex hull in three-dimensional space, with each point in the hull achieved by a distribution $\tau$ in $\Delta(U)$. To satisfy feasibility, we locate points at which both (B.P.) and (P.P.) are satisfied. Finally, for optimality, we find the point with the highest payoff that remains inside $\mathrm{co}(G_P)$.

The figure suggests that mediation has the potential to benefit Sender (the red point corresponding to $V(\mu_0;P)$ lies above the payoff function), but the improvement depends on the structure of Sender's payoff function. As we will see later, Blockchain Mediated Persuasion allows us to change the shape of the payoff function. By carefully manipulating the payoffs, we can manipulate the convex region $\mathrm{co}(G_P)$, which in turn can lead to higher expected payoffs for Sender.

2.5.2 Dual Approach

While Theorem 2.5.1 provides a characterization of the value of mediation for a given prior belief $\mu_0$, it does not fully characterize the geometry of the achievable frontier, i.e., the equivalent of Corollary 2 in Kamenica and Gentzkow (2011).
In order to do so, we resort to the Lagrangian dual of problem (OPT-P) and obtain a geometric understanding of the shape of the achievable frontier for a general payoff function $v((\mu,\delta);P)$.

Theorem 2.5.2 The Lagrangian dual of problem (OPT-P) is equivalent to the following optimization problem:
$$\min_{g,\lambda,\rho} \; g \qquad \text{(OPT-D)}$$
$$\text{s.t.} \quad v((\mu,\delta);P) + \sum_{\omega} \lambda_\omega\big(\mu(\omega)-\mu_0(\omega)\big) - \sum_{\omega\neq\omega'} \rho_{\omega,\omega'}\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) v((\mu,\delta);P) \;\leq\; g, \quad \forall (\mu,\delta) \in U.$$
Moreover, given a solution $(g^*,\lambda^*,\rho^*)$ of the dual problem (OPT-D), it follows that $(\mu,\delta) \in U$ satisfies
$$v((\mu,\delta);P) \leq \frac{g^* - \sum_\omega \lambda^*_\omega(\mu(\omega)-\mu_0(\omega))}{1 - \sum_{\omega\neq\omega'} \rho^*_{\omega,\omega'}\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right)}, \quad \text{if } \sum_{\omega\neq\omega'} \rho^*_{\omega,\omega'}\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) < 1,$$
$$v((\mu,\delta);P) \geq \frac{g^* - \sum_\omega \lambda^*_\omega(\mu(\omega)-\mu_0(\omega))}{1 - \sum_{\omega\neq\omega'} \rho^*_{\omega,\omega'}\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right)}, \quad \text{if } \sum_{\omega\neq\omega'} \rho^*_{\omega,\omega'}\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) > 1.$$

Theorem 2.5.2 is best understood in juxtaposition to the analogous result in the classical full-commitment persuasion setting. In that original setting there is no counterpart to the Payoff Plausibility constraints, so problem (OPT-1) is simply given by
$$\max_{\tau\in\Delta(\Delta(\Omega))} \; \mathbb{E}_\tau[v_{sp}(\mu)] \quad \text{s.t.} \quad \mathbb{E}_\tau[\mu(\omega)-\mu_0(\omega)] = 0, \quad \forall \omega \in \Omega,$$
where $v_{sp}(\mu) = \max_{a\in\hat{a}(\mu)} v(a)$ is the sender-preferred payoff function at posterior belief $\mu$. For reference, see Corollary 1 of Kamenica and Gentzkow (2011). Letting $\kappa_\omega \in \mathbb{R}$, for each $\omega \in \Omega$, be the dual variables for the Bayes Plausibility constraints of the above optimization problem, and setting $\rho_{\omega,\omega'} = 0$ for all $\omega,\omega' \in \Omega$, it follows as a straightforward corollary of Theorem 2.5.2 that the Lagrangian dual of the classical problem is given by
$$\min_{\hat{v},\,\kappa_\omega} \; \hat{v} \quad \text{s.t.} \quad v_{sp}(\mu) \leq \hat{v} + \sum_{\omega\in\Omega} \kappa_\omega\big(\mu(\omega)-\mu_0(\omega)\big), \quad \forall \mu \in \Delta(\Omega).$$
This dual formulation is geometrically insightful, as it shows that the problem boils down to finding the "closest" affine function of the form $\hat{v} + \sum_\omega \kappa_\omega(\mu(\omega)-\mu_0(\omega))$ that lies above the sender-preferred payoff function $v_{sp}(\mu)$ at every belief $\mu$, providing an alternative derivation of the concave closure of $v_{sp}$. This geometric interpretation is presented in Figure 2.3 (left).

Theorem 2.5.2 provides a similar geometric insight, presented in Figure 2.3 (right): Sender's payoffs are no longer bounded above by an affine function, but rather bounded from above and below by the family of rational functions
$$\frac{g^* - \sum_\omega \lambda^*_\omega(\mu(\omega)-\mu_0(\omega))}{1 - \sum_{\omega\neq\omega'}\rho^*_{\omega,\omega'}\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right)}.$$
In other words, in the absence of commitment, the problem is equivalent to finding the "closest" rational function that bounds Sender's payoff from above whenever $\sum_{\omega\neq\omega'}\rho^*_{\omega,\omega'}\big(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\big) < 1$ and that bounds Sender's payoff from below whenever $\sum_{\omega\neq\omega'}\rho^*_{\omega,\omega'}\big(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\big) > 1$. Whenever the shadow price of the (P.P.) constraint is equal to zero, i.e., $\rho_{\omega,\omega'} = 0$ for all $\omega,\omega' \in \Omega$, this rational function collapses to the affine function $g^* - \sum_\omega \lambda^*_\omega(\mu(\omega)-\mu_0(\omega))$, recovering the characterization of the classical Bayesian Persuasion problem.

Figure 2.3: Characterization of the achievable frontier with commitment (left) and through a mediator (right).

2.6 Blockchain Mediated Information Selling

The existing literature has shown that information mediation need not provide any additional benefit to Sender over no communication. In fact, whether or not mediation provides a benefit depends entirely on the structure of Sender's payoff function (see, e.g., Salamanca, 2021). We will now demonstrate that the value of mediation alone can be arbitrarily smaller than the value of costly blockchain mediation.
In essence, BMP mechanisms can be decomposed into two properties: the ability to transparently guarantee the mapping between the report and the message, i.e., to serve as a mediator, and the ability to price-differentiate between signals through the fees of the associated smart contracts. In a simple setting, we show that, without Sender commitment, BMP mechanisms can improve Sender's payoff more than costless mechanisms that merely mediate the transmission of information.

2.6.1 Selling Information

We return to the example of a Consultant selling information on the investment options of a Firm and formalize the Information Selling game between the two players. Both the Firm (Receiver) and the Consultant (Sender) are uncertain about the quality of the two projects: either Project A will be successful ($\omega = A$) with probability $\mu_0$ or Project B will be successful ($\omega = B$) with probability $1-\mu_0$. We denote by $\Omega = \{A,B\}$ the set of all possible states of the world. The Consultant, after researching the two investment options, (ex-post) learns $\omega \in \{A,B\}$ and can communicate this information to the Firm.

The Firm can decide to invest in Project A ($a = A$), invest in Project B ($a = B$), or hire the Consultant ($a = H$) to obtain more information and learn $\omega$. We denote by $A = \{A,B,H\}$ the set of all possible actions. Investing in a successful project yields a return equal to $h$, while hiring the Consultant reveals the successful project in exchange for a payment equal to $e > 0$. Concretely, the Firm's utility can be written as
$$u(a,\omega) = \mathbb{1}(a = \omega)\,h + \mathbb{1}(a = H)(h - e).$$
Note that for simplicity we have assumed that the cost of investment is equal to 0. Prior to making this decision, the Firm collects all available information $I$ (i.e., her prior and the Consultant's message) and forms a posterior belief $\mu = \mathbb{P}(\omega = A \mid I)$ about the quality of the projects. The Consultant receives utility 0 if the Firm does not hire them and utility $e$ if the Firm does. Formally, the utility of the Consultant is given by $v(a) = \mathbb{1}(a = H)\,e$.

The following lemma presents the Firm's optimal action and the Consultant's value for each belief-action distribution pair. In Figure 2.4 we present the Consultant's value function as a function of the Firm's belief. Note here that in the case of indifference the Firm can use mixed strategies.

Figure 2.4: The value of the Consultant as a function of the Firm's belief.

Lemma 2.6.1 For any given belief $\mu$, the Firm's set of optimal actions $\hat{a}(\mu)$ and the Consultant's utility $v(\mu,\delta)$ are given by
$$\hat{a}(\mu) = \begin{cases} \{B\} & \text{if } \mu \in [0,\underline{\mu}), \\ \{B,H\} & \text{if } \mu = \underline{\mu}, \\ \{H\} & \text{if } \mu \in (\underline{\mu},\overline{\mu}), \\ \{A,H\} & \text{if } \mu = \overline{\mu}, \\ \{A\} & \text{if } \mu \in (\overline{\mu},1], \end{cases} \qquad\quad v(\mu,\delta) = \begin{cases} 0 & \text{if } \mu \in [0,\underline{\mu}), \\ \delta(H)\,e & \text{if } \mu = \underline{\mu}, \\ e & \text{if } \mu \in (\underline{\mu},\overline{\mu}), \\ \delta(H)\,e & \text{if } \mu = \overline{\mu}, \\ 0 & \text{if } \mu \in (\overline{\mu},1], \end{cases}$$
where $\delta \in \Delta(\hat{a}(\mu))$, $\underline{\mu} = e/h$ and $\overline{\mu} = 1 - \underline{\mu}$.

We assume that $e < h/2$, so that $\underline{\mu} < \overline{\mu}$ and there is an interval of beliefs where the Firm finds it optimal to hire the Consultant. Furthermore, when $\mu_0 \in [\underline{\mu},\overline{\mu}]$ and the Consultant does not communicate any information to the Firm, the obtained utility is equal to $e$, the largest possible, and hence there is no value in information exchange. This is why, for the rest of the chapter, we focus on the more interesting case of $\mu_0 \in (0,\underline{\mu})$. Note that the analysis of the case $\mu_0 \in (\overline{\mu},1)$ would be symmetric, and we omit it for simplicity of exposition.
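For reference, Lemma 2.6.1 translates directly into code. The following minimal Python sketch returns the Firm's optimal action set and the Consultant's value at any belief, with $\underline{\mu} = e/h$ and $\overline{\mu} = 1 - e/h$.

```python
def firm_actions(mu, e, h):
    """a_hat(mu) from Lemma 2.6.1, with mu_lo = e/h and mu_hi = 1 - e/h."""
    lo, hi = e / h, 1 - e / h
    if mu < lo:  return {"B"}
    if mu == lo: return {"B", "H"}   # exact indifference; equality is fine for a sketch
    if mu < hi:  return {"H"}
    if mu == hi: return {"A", "H"}
    return {"A"}

def consultant_value(mu, delta_H, e, h):
    """v(mu, delta): the Consultant earns e exactly when the Firm hires (a = H)."""
    lo, hi = e / h, 1 - e / h
    if mu < lo or mu > hi:
        return 0.0
    if mu in (lo, hi):
        return delta_H * e           # mixing weight delta(H) at an indifference belief
    return e
```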
2.6.2 Improvement due to Blockchain Mediated Persuasion

In this section, we show that it is possible for the Consultant to increase their payoff by using a simple BMP mechanism. Surprisingly, however, in order to increase their payoff, the Consultant must use a costly BMP mechanism. When signals are not costly, i.e., when the BMP mechanism serves purely as a mediator, the Consultant cannot increase their payoff.

We first present and analyze the performance of a simple class of BMP mechanisms that we refer to as BMP-2 mechanisms. BMP-2 mechanisms offer two price levels for signals, $\kappa_\ell$ and $\kappa_h$, with $\kappa_\ell < \kappa_h$; they thus both mediate and price-differentiate between signals. We denote by $\Delta\kappa = \kappa_h - \kappa_\ell$ the level of cost differentiation that the mechanism introduces. Theorem 2.6.1 establishes that BMP mechanisms have the potential to strictly improve Sender's expected payoff compared to the benchmark of no communication (in which case the Consultant would have received 0), even without assuming commitment to the signal.

Theorem 2.6.1 For the case of Information Selling, the following are true for the maximum value attained by a BMP-2 mechanism with price levels $\kappa_\ell$ and $\kappa_h$, with $\kappa_h \geq \kappa_\ell \geq 0$ and $\kappa_\ell < e$.
(i) There exists a range of prior beliefs such that Sender benefits from the optimal BMP-2 mechanism if and only if there is cost differentiation, $\Delta\kappa > 0$.
(ii) The maximum value is equal to
$$V_{BMP\text{-}2}(\mu_0,\kappa_\ell,\kappa_h) = \max\left(\frac{\Delta\kappa\,(e+\Delta\kappa)\,\underline{\mu}\,(1-\mu_0)}{(e+\Delta\kappa)(\underline{\mu}-\mu_0) + \Delta\kappa\,(1-\underline{\mu})\,\mu_0} - \kappa_h,\;\; 0\right).$$
(iii) The maximum value is achieved by a BMP-2 mechanism that sends one of 3 signals, charges $\kappa_\ell$ for 2 of the signals, and $\kappa_h$ for the third signal.

The improvement achieved by the optimal BMP mechanism stems not only from the ability of smart contracts to guarantee the mapping between the report received and the message sent, but also, as part (i) of Theorem 2.6.1 states, depends heavily on the flexibility of cost differentiation between signals. In fact, as explicitly demonstrated by part (ii) of the theorem, if $\Delta\kappa = 0$, then the optimal value achieved by the optimal BMP mechanism is zero. We isolate the effects of transparency and cost differentiation to study their benefits separately in the subsequent subsection.

Figure 2.5: The achieved Sender's value with Blockchain Mediated Persuasion, when the mechanism only charges for one signal.

We note that the optimal BMP-2 mechanism is very simple and sends one of only three messages. The optimal mechanism has the same structure as the example that we presented in the introduction: the mechanism sends one of three messages to the Firm, "Project A", "Project B" or "Hire". Two of these messages (namely "Project B" and "Hire") cost $\kappa_\ell$, and the message "Project A" costs $\kappa_h$. The conditional probabilities of sending each message are chosen so that it is incentive compatible for the Firm to take the corresponding action (hence satisfying (B.P.)) and optimal for the Consultant to truthfully reveal the outcome of her research (hence satisfying (P.P.)), while at the same time solving the Consultant's optimization problem (OPT-P) for all priors $\mu_0$, when constrained to the class of BMP-2 mechanisms.

Table 2.1: Signal realization distribution for the optimal BMP-2 mechanism when $\kappa_\ell = 0$. Writing $D = e(\underline{\mu}-\mu_0) + \kappa_h\underline{\mu}(1-\mu_0)$ for the common denominator:
$$\pi(\text{"Project A"} \mid A) = \frac{e(\underline{\mu}-\mu_0)}{D}, \qquad \pi(\text{"Hire"} \mid A) = \frac{\kappa_h\underline{\mu}(1-\mu_0)}{D}, \qquad \pi(\text{"Project B"} \mid A) = 0,$$
$$\pi(\text{"Project A"} \mid B) = 0, \qquad \pi(\text{"Hire"} \mid B) = \frac{\kappa_h(1-\underline{\mu})\mu_0}{D}, \qquad \pi(\text{"Project B"} \mid B) = \frac{e(\underline{\mu}-\mu_0) + \kappa_h(\underline{\mu}-\mu_0)}{D}.$$
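The closed form in Theorem 2.6.1(ii) and the distribution in Table 2.1 are straightforward to implement. The sketch below evaluates both, assuming $\mu_0 < \underline{\mu}$, the regime on which this section focuses; the parameter values in the sanity check are illustrative.

```python
def V_bmp2(mu0, kl, kh, e, h):
    """Maximum BMP-2 value, Theorem 2.6.1(ii); assumes mu0 < e/h."""
    mu_lo = e / h
    dk = kh - kl
    num = dk * (e + dk) * mu_lo * (1 - mu0)
    den = (e + dk) * (mu_lo - mu0) + dk * (1 - mu_lo) * mu0
    return max(num / den - kh, 0.0)

def table_2_1(mu0, kh, e, h):
    """pi(s | omega) for the optimal BMP-2 mechanism with kl = 0 (Table 2.1)."""
    m = e / h
    D = e * (m - mu0) + kh * m * (1 - mu0)   # common denominator
    return {"A": {"Project A": e * (m - mu0) / D,
                  "Hire": kh * m * (1 - mu0) / D,
                  "Project B": 0.0},
            "B": {"Project A": 0.0,
                  "Hire": kh * (1 - m) * mu0 / D,
                  "Project B": (e * (m - mu0) + kh * (m - mu0)) / D}}

# Sanity check: each row of Table 2.1 is a probability distribution.
rows = table_2_1(mu0=0.1, kh=2.0, e=1.0, h=4.0)
assert all(abs(sum(r.values()) - 1.0) < 1e-12 for r in rows.values())
```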
2.6.3 The value of cost differentiation

In order to explore the added benefit of cost differentiation, we focus on BMP-2 mechanisms with $\kappa_\ell = 0$ and $\kappa_h > 0$. Note that taking $\kappa_\ell = 0$ is without loss, since the optimal value of BMP-2 mechanisms given by Theorem 2.6.1 depends only on the cost differentiation $\Delta\kappa$ and the high price $\kappa_h$. As we showed in Theorem 2.6.1, the value achieved by the optimal BMP-2 mechanism is larger than 0 for a wide range of prior beliefs. The latter implies that the benefit of BMP mechanisms can be arbitrarily larger than that of mediation without cost differentiation.

Figure 2.5 geometrically illustrates how this improvement is achieved in the case of $\kappa_\ell = 0$ and $\kappa_h > 0$. Intuitively, two messages (namely "Project B" and "Hire") are offered to the Consultant for free (hence the Consultant's net utility is 0 and $e$, respectively), while the third signal (namely "Project A") costs the Consultant $\kappa_h$. This way the value function of the Consultant is reshaped so that it has three levels: $\ell = 0$, $e$ and $r = -\kappa_h$.

To crystallize the intuition, note that in the absence of cost differentiation the Consultant would ex-post be incentivized to report the state of the world (Project A or B) that is associated by the mechanism with the higher probability of sending the "Hire" message, since this is the only case in which she gets paid. On the other hand, the optimal BMP-2 mechanism (which charges $\kappa_h$ to the Consultant whenever the message "Project A" is sent) is structured according to Table 2.1. This way, whenever the Consultant reports A, which maximizes her probability of being hired, she runs the risk of the message being "Project A", in which case she pays the high cost $\kappa_h$. This flexibility offers an additional lever to the BMP mechanism to ensure directness. In other words, even though there is an explicit cost associated with charging the Consultant $\kappa_h$ whenever the "Project A" signal is sent, it also offers the implicit benefit of relaxing her incentive compatibility constraint (P.P.) and establishing trust. This, in turn, partially recovers the value of persuasion.

Figure 2.6: The achieved Sender's value as a function of the high cost $\kappa_h$, when $\kappa_\ell = 0$.

Figure 2.7: The achieved Sender's value as a function of the gas fees $g$, where $\kappa_\ell = g$ and $\kappa_h = 2g$.

Below, we rigorously present this effect by focusing on the two forces that are in play: the explicit cost vs. the shadow price of Payoff Plausibility.

Proposition 2.6.1 For the case of Information Selling, let $\tau(r) = \mu_0$ be the probability that the optimal BMP-2 mechanism sends the message "Project A" when $\Delta\kappa = 0$. Then
$$\left.\frac{\partial V_{BMP\text{-}2}(\mu_0;\kappa_\ell,\kappa_h)}{\partial \Delta\kappa}\right|_{\Delta\kappa=0} = \underbrace{-\tau(r)}_{\text{explicit cost}} \; + \; \underbrace{\rho^*\,\tau(r)\,(1-\mu_0)}_{\text{implicit benefit}},$$
where $\rho^*$ is the shadow price of the (P.P.) constraint of (OPT-P).

Proposition 2.6.1 provides an envelope-theorem-type characterization of the dual role of cost differentiation. Increasing cost differentiation (i.e., increasing $\Delta\kappa$) comes at an explicit cost associated with the fees paid when the more expensive signal is sent, which happens with probability $\tau(r)$. On the other hand, allowing this cost differentiation adds more flexibility to the mechanism, ensuring that Payoff Plausibility is satisfied: the Consultant is less inclined to manipulate their report in a way that leads to more messages that induce the action $H$. In general, the implicit benefit of differentiation is stronger than the explicit cost. In Figure 2.6, we show the dependence of the Consultant's value on $\kappa_h$ when $\kappa_\ell = 0$.
That being said, even in cases where fees are determined exogenously by gas costs, one can still attain adequate cost differentiation by artificially inflating the cost of one message over another, e.g., by sequentially executing smart contracts so that messages executed by later contracts in the sequence accumulate more gas fees. For example, if the gas fee for running the smart contract is equal to $g$, then messages executed by the first contract cost $\kappa_\ell = g$ and messages executed by the second contract cost twice as much, $\kappa_h = 2g$. In this case, using Theorem 2.6.1, the Consultant's value from the BMP-2 mechanism is given by
$$V_{BMP\text{-}2}(\mu_0,g,2g) = \frac{g(e+g)\,\underline{\mu}\,(1-\mu_0)}{(e+g)(\underline{\mu}-\mu_0) + g(1-\underline{\mu})\mu_0} - 2g.$$

Figure 2.7 illustrates how the Consultant's value changes as the gas fees increase and demonstrates a non-monotonicity that highlights the implicit and explicit roles of $g$. When $g = 0$, which corresponds to the case of free mediation, there is no benefit. On the other hand, as $g$ increases, cost differentiation kicks in and the Consultant's value increases. As discussed above, in this initial regime there is a Braess-paradox type of effect: the gas fee $g$, in addition to being an "expense" that the Consultant must pay to implement the mechanism, also serves as an instrument that guarantees her directness. However, eventually there are diminishing returns, and the cost of additional trustworthiness outweighs the additional benefit. This is why the total value drops when $g$ is large enough.
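The non-monotonicity in Figure 2.7 can be reproduced directly from the closed form above. The short sketch below sweeps the gas fee $g$ for illustrative parameter values ($e = 1$, $h = 4$, $\mu_0 = 0.2$, chosen so that $\mu_0 < \underline{\mu} = 0.25$); the value rises from zero, peaks, and falls back to zero as $g$ grows.

```python
# Reproducing the non-monotonicity of Figure 2.7 from the closed form above.
# Parameter values are illustrative: e = 1, h = 4 (so mu_lo = 0.25), mu0 = 0.2.
e, h, mu0 = 1.0, 4.0, 0.2
m = e / h
for g in [0.0, 0.05, 0.1, 0.2, 0.3, 0.5]:
    num = g * (e + g) * m * (1 - mu0)
    den = (e + g) * (m - mu0) + g * (1 - m) * mu0
    value = max(num / den - 2 * g, 0.0) if den > 0 else 0.0
    print(f"g = {g:.2f}: V = {value:.4f}")  # rises from 0, peaks near g ~ 0.2, falls to 0
```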
2.7 Optimal BMP Mechanisms

In light of Theorem 2.5.1, the optimization problem (OPT-1) gives rise to a family of convex hulls $\mathcal{G} = \{G_P : P \text{ is a partition of } U\}$. For each convex hull, we can find the optimal belief distribution; therefore, solving problem (OPT-1) amounts to selecting the convex hull that leads to the highest payoff. We state this formally below.

Corollary 2.7.1 The value of the optimal mechanism is given by
$$V(\mu_0) = \sup_{G_P\in\mathcal{G}} \; \sup\{z : (\mu_0,0,z) \in \mathrm{co}(G_P)\}.$$

In order to solve this problem in full generality one would have to consider all possible cost partitions of $U$, and we leave such a characterization to future work. On the other hand, for the Information Selling example, since the Firm is deciding between three actions, i.e., $|A| = 3$, intuitively it is enough to consider mechanisms that send three messages. Therefore, one could consider, without loss of optimality, all partitions of size 3 in order to obtain the optimal mechanism for this example, simplifying the optimization problem substantially. Surprisingly, for the case of Information Selling the optimal solution is even simpler: it is achieved by a BMP-2 mechanism, and hence it is enough to consider only partitions of size 2, as the next proposition establishes.

Proposition 2.7.1 Consider the Information Selling problem and an ordered set of costs $\{\kappa_1,\kappa_2,\ldots,\kappa_n\}$. The optimal BMP mechanism is a BMP-2 mechanism with price levels $\kappa_\ell = \kappa_1$ and $\kappa_h = \kappa_n$. Moreover, the optimal mechanism sends 3 messages, 2 of which cost $\kappa_\ell$ and 1 of which costs $\kappa_h$.

With this characterization in hand, we can readily obtain the optimal value achieved by any BMP mechanism, as the following proposition illustrates.

Proposition 2.7.2 For the Information Selling problem, when $\mu_0 \in [0,\underline{\mu})$ the supremum of (OPT-1) is equal to
$$V_{commit}(\mu_0)\,\frac{1-\underline{\mu}}{1-\mu_0},$$
where $V_{commit}(\mu_0) = e\mu_0/\underline{\mu}$ is the optimal value achieved by a mechanism that assumes Sender's commitment.

This characterization provides the optimality gap that arises due to the lack of trust between the Firm and the Consultant, and it is illustrated by Figure 2.8 below. Note that the statement of the proposition characterizes the supremum of the Consultant's value because the latter is achieved as the low cost of the BMP-2 mechanism equals 0, i.e., $\kappa_\ell = 0$, and the high cost grows arbitrarily, $\kappa_h \to \infty$. Note that even though the high cost grows to infinity, the probability of sending the costly "Project A" message shrinks to zero at a rate equal to
$$\mu_0\,\frac{e(\underline{\mu}-\mu_0)}{e(\underline{\mu}-\mu_0) + \kappa_h\underline{\mu}(1-\mu_0)},$$
and therefore the expected cost paid by the Consultant converges to a constant, namely
$$\frac{\mu_0}{\underline{\mu}}\,\frac{\underline{\mu}-\mu_0}{1-\mu_0}\,e.$$

We conclude this section by briefly discussing the social welfare implications, showing that social welfare increases despite the fact that the mechanism was optimized for the Consultant's value. In fact, the change in social welfare due to using the optimal BMP mechanism is equal to
$$e\,\frac{\mu_0}{\underline{\mu}}\,\frac{1-\underline{\mu}}{1-\mu_0} > 0.$$
For this calculation we used the fact that the social welfare when the optimal BMP mechanism is used is equal to
$$h - \frac{\mu_0}{\underline{\mu}}\,\frac{\underline{\mu}-\mu_0}{1-\mu_0}\,e,$$
since the Firm always makes the correct investment decision and the Consultant pays the messaging costs (note that payments between the Firm and the Consultant cancel out), while in the absence of the BMP mechanism the social welfare is equal to
$$W_0(\mu_0) = h(1-\mu_0) = h - \frac{\mu_0}{\underline{\mu}}\,e,$$
since the Firm invests in Project B, which is the correct investment decision with probability $1-\mu_0$, and the Consultant does not get hired.

Figure 2.8: The achieved Sender's value with the optimal BMP mechanism (red line) vs. the optimal committed persuasion mechanism (green line).

2.8 Conclusion

The persuasion literature demonstrates that it is possible for a well-informed party to convince rational people to take a desired action by strategically managing their beliefs. Persuasion can therefore be viewed as a low-cost technique that can garner potentially high payoffs. It makes sense, then, why tactics of persuasion are pervasive throughout the social, political and business worlds. However, persuasion can only take place if the communication channel through which the parties talk is reliable and trustworthy. By assuming commitment, we may be too optimistic regarding the difficulties and complexities that arise in real-life communication. People often have reason to be skeptical of each other, and this may lead to an inability to believe that Sender will honor their commitment to a signal mechanism. One alternative is utilizing a mediator that both parties trust to facilitate the communication. With the growing availability of blockchain technology, the implementation of such mediation has never been easier. However, as we have shown in this work, having a reliable mediator is not sufficient for beneficial persuasion. Rather, we see that costly mediation succeeds where free mediation fails. By requiring Sender to pay the mediator for different signal realizations, we effectively make it easier for them to report truthfully, which in turn empowers them to persuade Receiver more effectively.

Chapter 3
Learning Networks via Persuasion

3.1 Introduction

The Bayesian persuasion framework introduced by Kamenica and Gentzkow (2011) has become one of the most popular ways to model strategic information disclosure in the past decade.
The model considers settings in which an (ex-post) better-informed agent, called Sender, seeks to persuade a rational Bayesian agent, called Receiver, to take an action that maximizes Sender's own expected utility. By designing a signaling mechanism that strategically discloses information about the state of the world, Sender is often able to improve their expected payoff by communicating with Receiver.

Since the seminal work, there has been much interest in applying the persuasion framework to more complex settings. In particular, extending the model to settings with multiple receivers has been extensively explored in the literature. The importance of multi-receiver persuasion models is nearly self-evident: many applications in business, economics and engineering involve settings comprised of many agents. For example, a retailer wishes to convince buyers in a market to purchase its product, a politician wants to persuade voters to vote for her, a social media platform desires its users to engage with its content more frequently, and so on.

Multi-receiver Bayesian persuasion models have largely focused on (i) public signaling that communicates a single message to all receivers or (ii) private signaling that communicates a private signal to each receiver independently. In both of these settings, the receivers' network structure does not play a role. However, in many practical applications, receivers are embedded in a network in which they interact with each other. For example, customers living in the same household may discuss and share information with one another before making decisions. This information "leakage" or "spillover" has known consequences for the design of optimal signaling mechanisms.

There are several ways to incorporate information spillovers into the classic Bayesian persuasion framework. We consider a model where receivers are embedded in a directed network, where the directed edges correspond to information flow. That is, if there is an edge from receiver $i$ to receiver $j$, then $i$ shares the message they received from Sender with $j$. Note that the information that $i$ shares with $j$ may impact the latter's final decision, particularly when the message that $i$ received is more informative. If Sender knows the network structure, then she can design a mechanism that takes these information spillovers into account, thereby minimizing the chance that information leaked from one agent to another adversely impacts the latter's decision. Unfortunately, Sender does not always know the network structure, which makes designing an optimal signaling mechanism impossible.

In this chapter, we explore the possibility of constructing a sequence of signaling mechanisms that enable Sender to learn the receiver network structure over time. We focus on a binary model with two actions and two states of the world in a repeated setting. For concreteness, we consider a firm (Sender) selling a product of variable quality over an infinite time horizon to a finite market of customers (receivers). At each time, the firm commits to a signaling mechanism that sends a message conveying information about the product's quality to each customer. Customers share their messages with others in their directed (outgoing) neighborhoods.

We establish that learning is possible by constructing two signaling policies that sequentially learn the neighborhoods of each customer. The first policy, which we call the Sequentially Truthful (ST) policy, gives full information to one customer at each time step.
Customers that receive no information default to their prior and choose not to purchase the product. However, when the product quality is good, the buyer that receives the true state of the world communicates the good news to the customers in his neighborhood, causing them to purchase. Since only the customers in the informed customer's neighborhood purchase, the firm learns the informed customer's outgoing neighborhood. Repeating this process for each customer leads to learning the network in expected time linear in the number of customers.

The ST policy is able to quickly learn the network, but not every customer purchases when the product quality is good, leading to worse revenue outcomes than if the firm were to simply tell the truth to all customers at each time step. Because of this deficiency, we move beyond this simple policy and construct the Sequentially Correlated Persuasion (SCP) policy. This policy carefully correlates messages across customers so as to balance persuasion with learning. We prove that the SCP policy also learns the network in linear time. We demonstrate numerically that the SCP policy is able to garner benefits from persuasion, leading to high payoffs for Sender as they learn the network. The SCP policy greatly outperforms both the ST policy and classic Bayesian persuasion when graph connectivity is moderate.

Because of its ability to balance persuasion with learning, we believe the SCP policy can be successfully integrated into a learn-then-optimize framework. After the SCP policy learns the network, the optimal signaling mechanism can be designed and implemented for the remainder of the time horizon. In addition, we believe that similar techniques can be applied to more complex settings with larger state and action spaces.

The remainder of this chapter is structured as follows. First, we discuss the relevant literature in Section 3.2. Then we formally introduce our model in Section 3.3. The main results are stated in Section 3.4. We briefly discuss extensions and future directions in Section 3.5.

3.2 Literature Review

Our model builds on the work of Kamenica and Gentzkow (2011) on Bayesian persuasion. There has been much interest in the Bayesian persuasion model throughout management, economics, and engineering. See, for instance, Bergemann and Morris (2019) for a detailed survey of persuasion and its applications. We also refer the reader to the literature review provided in Section 2.3 of this dissertation for additional citations of recent applications.

In the classic Bayesian persuasion model, a sender strategically communicates information to a single receiver in order to persuade them to take a desirable action. There has been work extending the classic Bayesian persuasion model to multi-receiver settings, e.g., social networks (Candogan and Drakopoulos, 2020), crowd-sourcing (Kremer et al., 2014; Papanastasiou et al., 2017), retail (Drakopoulos, Jain, et al., 2020) and the management of political campaigns (Alonso and Câmara, 2016). Until recently, only two regimes were analyzed in the multi-receiver setting: (i) private signaling, where receivers only have access to their own signal, and (ii) public signaling, where all receivers get the same signal. Both of these regimes ignore possible network effects between receivers.

In reality, agents within a network are likely to interact with each other. In fact, such network effects in economic systems have been widely studied in the literature.
See, for instance, the work of Jackson (2008) for a comprehensive overview. In the context of information design, agents embedded within a network may communicate their signals with one another prior to taking an action. This so-called "information spillover" or "network communication" can affect the design of optimal persuasion mechanisms and their associated payoffs.

Our work is not the first to consider information spillovers. Several recent papers have incorporated spillovers into multi-receiver Bayesian persuasion models. For instance, Egorov and Sonin (2020) consider the problem of sending propaganda to agents that lie in a random social network: information flows from one node to another with a certain fixed probability, and agents can decide to subscribe or possibly learn the information from their neighbors. In contrast, Galperti and Perego (2019) consider a setting where a select number of agents in the network, called seeds, receive information; the information spills from the seeds to other agents through the links in the network. Candogan, Yifan Guo, et al. (2020) consider a network of agents, each with assigned "experiments", where each pair of connected agents leaks their experiments rather than signal realizations.

The idea of correlating signals in the presence of network communication has been considered by Kerman and Tenev (2021, 2023). As in our model, they study a setting where receivers are connected in a fixed network and neighboring receivers share the signals they received from the sender. The focus of their work is on deriving optimal signaling mechanisms given the fixed, known network. We extend their retailer example to a multi-period setting with an unknown network, where the focus is on learning the network rather than the design of a revenue-maximizing mechanism.

Our work seeks to understand whether one can learn the underlying network via Bayesian persuasion alone. To our knowledge, we are the first to consider this question. Since the existing techniques assume the network is known, we believe uncovering the network using the tools of information design will complement the existing literature on the topic. However, our work is related to other models that consider persuasion in a repeated setting. Castiglioni, Celli, et al. (2022) and Bernasconi et al. (2022) study the persuasion problem with a single receiver in the context of sequential decision making. Castiglioni, Celli, et al. (2020) study no-regret algorithms for an online Bayesian persuasion setting where a sender repeatedly interacts with a receiver whose type is chosen adversarially. Castiglioni, Marchesi, et al. (2021) extend the online setting to multiple receivers; however, they restrict attention to the case where there is a private signaling channel with each receiver (no leakage). Hahn et al. (2021) study prophet inequalities for the online Bayesian persuasion problem.

3.3 Model

We focus on a multi-receiver Bayesian persuasion model in a repeated setting with binary action and state spaces. This class of models is common throughout the persuasion literature, capturing a large number of applications including retail, voting and engagement on social media platforms. For concreteness, we will restrict our attention to a specific application of a firm (Sender) repeatedly selling a product to a market of customers (receivers) over an infinite horizon. We note that this extends the application introduced by Kerman and Tenev (2023) to a multi-period setting.
A firm repeatedly sells a product to a finite market of customers $i = 1,2,\ldots,N$ at a fixed price $p > 0$. Sales take place over an infinite time horizon with time periods denoted by $t = 1,2,3,\ldots$ In each time period, the firm sells its product to the customers in the market. We ignore inventory considerations by assuming that the supply of the product exceeds the size of the market; that is, in each time period there is enough product produced so that every customer that chooses to purchase will receive the product. However, we do assume an underlying uncertainty regarding the quality of the product produced in each period.

The product sold in time period $t$ is of either good or bad quality. The quality of the product in period $t$ is the realization of the random variable $Q_t \in \{0,1\}$. Product quality $Q_t$ has a time-homogeneous prior probability distribution given by $\mathbb{P}(Q_t = 1) = \mu_0$. This prior is common to the firm and the customers at every time $t$. To be clear, all of the product sold in time period $t$ is either good or bad.

At time step $t$, each of the $N$ customers can either choose to purchase the product or not. The customer action set is denoted by $A = \{b,n\}$, where $b$ stands for the "buy" action and $n$ stands for the "no buy" action. Customers are heterogeneous in their valuations $v_i > p$ for the product. The utility of customer $i$ is a function of their action $a \in A$ and the quality of the product $Q_t \in \{0,1\}$, given by
$$u_i(a,Q_t) = \begin{cases} v_i - p & a = b,\; Q_t = 1, \\ -p & a = b,\; Q_t = 0, \\ 0 & \text{otherwise.} \end{cases}$$
It is easy to see that, given full information, every customer will purchase the product if and only if the quality is good ($Q_t = 1$). However, the utility gained from purchasing a good product can vary across customers depending on their valuations. Each customer's utility function is time-homogeneous, in the sense that their underlying valuation for the product as well as the distribution of product quality remains the same in each time period $t$.

As in the classical Bayesian Persuasion paradigm of Kamenica and Gentzkow (2011), the firm is able to design a signaling mechanism that privately sends a message to each customer in the market. In our setting, we assume that the firm can commit to a signal mechanism at the start of each period $t = 1,2,\ldots$ This forms a policy over the selling horizon. Formally, a signaling policy is specified by a sequence of conditional probability distributions $\{\pi_t(\cdot \mid Q_t)\}$ over a signal realization space $S_t = S_{1,t}\times S_{2,t}\times\cdots\times S_{N,t}$, where $S_{i,t}$ is the signal realization space for customer $i$ at time $t$. We will refer to $(\pi_t, S_t)$ as the time-$t$ signal mechanism and, with slight abuse of notation, we may simply refer to a mechanism by its distribution $\pi_t$. We will denote a generic signaling policy by $\pi = \{(\pi_t, S_t)\}_{t=1}^{\infty}$.

Customers are embedded in a directed network $G$ with vertex set $V(G)$ and edge set $E(G)$. The vertex set $V(G)$ is simply the collection of customers $i = 1,\ldots,N$. If there is an edge from node $i$ to node $j$, then customer $i$ shares the message they receive from the firm with customer $j$ at every time $t$. We let $N^{out}_i(G)$ and $N^{in}_i(G)$ denote the sets of outgoing and incoming neighbors of customer $i$, respectively; we will refer to these sets as the outgoing and incoming neighborhoods of $i$. Accordingly, the set of messages that customer $i$ receives at time $t$ is
$$S_{i,t} = \{s_j : j \in N^{in}_i(G)\}.$$
Note that the time-$t$ signal realization set is independent of past realizations.
This means that customers have no memory of past messages; rather, they base their time-$t$ decisions only on the messages received from the present mechanism $(\pi_t, S_t)$. At time $t$, given knowledge of the signal mechanism $\pi_t$, each customer updates their belief about $Q_t$ according to Bayes rule upon seeing the messages in $S_{i,t}$:
$$\mathbb{P}(Q_t = 1 \mid S_{i,t}) = \frac{\pi_t(S_{i,t} \mid Q_t = 1)\,\mu_0}{\pi_t(S_{i,t} \mid Q_t = 1)\,\mu_0 + \pi_t(S_{i,t} \mid Q_t = 0)\,(1-\mu_0)},$$
where $\pi_t(S_{i,t} \mid Q_t) = \sum_{s\in S_t:\, s \text{ consistent with } S_{i,t}} \pi_t(s \mid Q_t)$ is the probability that $S_{i,t}$ is realized under $\pi_t$. It follows that the expected utility of customer $i$ at time $t$ is equal to
$$u_i(a \mid S_{i,t}) = \begin{cases} \mathbb{P}(Q_t = 1 \mid S_{i,t})\,v_i - p & a = b, \\ 0 & a = n. \end{cases}$$
Thus, customer $i$ purchases the product at time $t$ if and only if their posterior belief that the product quality is good is sufficiently high:
$$\mathbb{P}(Q_t = 1 \mid S_{i,t}) \geq \frac{p}{v_i}.$$
We let $a_i(S_{i,t}) = \arg\max_{a\in A} u_i(a \mid S_{i,t})$ be the action that maximizes customer $i$'s utility given message set $S_{i,t}$. To simplify exposition, we assume $v_i < p/\mu_0$ for all $i$, so that every customer's default action is to not purchase the product.
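The purchase rule above is easy to state in code. The following minimal sketch takes the message likelihoods $\pi_t(S_{i,t} \mid Q_t)$ as inputs and returns customer $i$'s posterior and action; the parameter values in the checks are illustrative.

```python
def posterior_good(mu0, lik_good, lik_bad):
    """P(Q_t = 1 | S_{i,t}) via Bayes rule; lik_* are pi_t(S_{i,t} | Q_t = 1 or 0)."""
    return lik_good * mu0 / (lik_good * mu0 + lik_bad * (1 - mu0))

def action(mu0, v_i, p, lik_good, lik_bad):
    """Customer i buys iff the posterior clears the threshold p / v_i."""
    return "b" if posterior_good(mu0, lik_good, lik_bad) >= p / v_i else "n"

# A fully revealing "good" message (lik_bad = 0) induces a purchase, while an
# uninformative one (lik_good = lik_bad) leaves the default "no buy" when v_i < p / mu0.
assert action(0.3, 2.0, 1.0, 1.0, 0.0) == "b"
assert action(0.3, 2.0, 1.0, 1.0, 1.0) == "n"
```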
3.4 Main Results

In this section, we state the main results of this chapter. Our primary goal is to construct policies for Problem (3.2) that learn the network quickly. As a secondary goal, we find a policy that is able to persuade customers to purchase more frequently along the way to learning the network. We first establish that learning is always possible. This is not obvious, as there are signaling policies that never learn the underlying network. We then give two signaling policies that learn G in expected time linear in the number of customers. We leverage a particular correlation structure across customers to induce linear learning times with higher revenue streams.

3.4.1 Learning Is Possible

A priori, it is not clear that there exist signaling policies that are able to uncover the underlying network G. For instance, consider the class of public signals, where S_i = S_j for all i, j and π_t(s_1,...,s_N | Q_t) = 0 whenever s_i ≠ s_j for some i, j. If one were to restrict policies to public signals in each period t, then all customers would always receive the same messages. This makes it impossible to learn whether i communicates with j because, in either case, j takes the same action, leaving every possible network estimate Ĝ compatible with every possible history at every time t. Therefore, a policy capable of learning G must be capable of generating data that reveal incompatibilities, as per Definition 3.3.1. Is the set of such policies empty? We define Π(G) to be the set of signaling policies that guarantee learning in finite time given any network G. Our first result states that such policies exist.

Proposition 3.4.1 The set of policies Π(G) = {π : E[T(G;π)] < ∞} is non-empty given any graph G.

To see why learning is always possible in this setting, consider the classic Bayesian persuasion mechanism that independently signals a straightforward action recommendation b or n to each customer i with the following probabilities:
\[
\pi_i^{\text{classic}}(b \mid Q = 1) = 1, \qquad
\pi_i^{\text{classic}}(b \mid Q = 0) = \frac{\mu_0}{1-\mu_0}\cdot\frac{v_i - p}{p}.
\]
We can construct an associated signaling policy by committing to π^classic at every time t.

Definition 3.4.1 The Classic Bayesian Persuasion (CBP) policy commits to π^classic at every time.

Let E_i be the event that customer i receives n and all customers j ≠ i receive b. Note that under π^classic, the n message is sent only when the quality of the product is bad. Thus, when E_i occurs, if j ∈ N_i^out(G), customer j learns that the quality of the product is bad and chooses not to purchase. On the other hand, if j ∉ N_i^out(G), then S_{j,t} consists only of b messages, and a quick calculation shows that j buys the product in this case. Therefore, customer j does not purchase if and only if i communicates their signal to j. Thus, when E_i occurs, N_i^out(Ĝ) = N_i^out(G) for every Ĝ ∈ Ĝ_t. If E_i occurs for each i, then N_i^out(G) is learned for each i, which results in learning G. This reasoning shows that the CBP policy is able to learn G.

[Figure 3.1: Illustration of the event E_1, with signals s_1 = n and s_2 = s_3 = s_4 = s_5 = b. We learn that customer 1's outgoing neighborhood is N_1^out(G) = {4,5} by observing customers 4 and 5 deviate from their recommended "buy" action.]

However, because CBP independently signals to each customer in the market, the probability that E_i occurs in any given time period t is small:
\[
P(E_i) = (1-\mu_0)\left(\prod_{j \neq i} \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j - p}{p}\right)\left(1 - \frac{\mu_0}{1-\mu_0}\cdot\frac{v_i - p}{p}\right),
\]
which goes to zero exponentially fast as N goes to infinity. This leads to the following proposition.

Proposition 3.4.2 Under the CBP policy, the events E_i take exponentially long to occur on average.

The events E_i may appear to be only one of many ways to learn G. However, it turns out that for some networks, the firm must observe E_i for all i in order to learn G.

Corollary 3.4.1 The CBP policy has worst-case expected learning time of order exponential in N, i.e., max_G E[T(G; CBP)] = O(c^N), where c is a constant greater than one.
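The exponential decay of P(E_i) is easy to see numerically. The following sketch evaluates the formula above; all parameter values are our own illustrative assumptions.

```python
# Numerical sketch: P(E_i) under the CBP policy shrinks exponentially in N.
# Parameter values below are illustrative assumptions, not from the text.

def p_Ei(valuations, i, p, mu0):
    """Probability that only customer i receives 'n' under pi^classic."""
    q = lambda v: mu0 / (1 - mu0) * (v - p) / p   # P(b | Q=0) for valuation v
    prob = 1 - mu0                                 # E_i can occur only if Q=0
    for j, v in enumerate(valuations):
        prob *= q(v) if j != i else (1 - q(v))
    return prob

p, mu0 = 1.0, 0.2
for N in (5, 10, 20):
    vals = [1.5 * p] * N                           # identical valuations 1.5p
    print(N, p_Ei(vals, 0, p, mu0))                # decays roughly like q**N
```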
3.4.2 The Sequentially Truthful (ST) Policy

In the previous section, we showed that a signal realization that gives a single customer knowledge of the state of the world has the potential to change all other customers' actions, and thus allows the firm to eventually learn the network. We now construct a simple policy that induces such events to occur in (expected) linear time. Define the signaling mechanism with realization spaces S_i = {b, n} and S_j = {∅} for j ≠ i, and distribution defined by
\[
\pi_j^{\text{truth},i}(\varnothing \mid Q_t) = 1 \ \ \forall j \neq i, \qquad
\pi_i^{\text{truth},i}(b \mid Q_t = 1) = 1, \qquad
\pi_i^{\text{truth},i}(n \mid Q_t = 0) = 1.
\]
The mechanism sends a truthful signal to customer i that fully reveals the product's quality, while giving no information to the other customers j ≠ i. Note that, under this mechanism, for all j ≠ i, customer j purchases if and only if j ∈ N_i^out(G) and Q_t = 1. Given this observation, we can construct a policy that sequentially learns each N_i^out(G).

Definition 3.4.2 Let t_1 < t_2 < ··· < t_N be the first N time periods in which Q_{t_i} = 1. Then the Sequentially Truthful (ST) policy commits to the mechanism π^{truth,i} for every time period within the interval (t_{i−1}, t_i], with t_0 = 0.

The following theorem characterizes the learning speed of the ST policy.

Theorem 3.4.1 The Sequentially Truthful (ST) policy learns G in expected time
\[
\mathbb{E}[T(G;\text{ST})] = \frac{N}{\mu_0}. \tag{3.3}
\]
Moreover, the policy achieves long-run average revenue at most N p µ0, with strict inequality when G is not complete.

Theorem 3.4.1 shows that there exist policies that can learn G in (expected) time linear in the number of customers, regardless of the structural properties of the network. However, the revenue generated under the ST policy is low because it makes no attempt to persuade customers to purchase the product. We balance learning with revenue generation in the subsequent section.
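Before moving on, here is a simulation sketch of the learning time in Theorem 3.4.1 (our own illustration; the parameter values are assumptions). Since the firm advances to the next customer after each good-quality period, T(G; ST) is the time of the N-th success of a Bernoulli(µ0) sequence.

```python
import random

# Simulation sketch of the ST policy's learning time: the firm reveals the
# true quality to one customer at a time and advances after each good period,
# so T(G; ST) is the N-th success time of a Bernoulli(mu0) sequence and
# E[T] = N / mu0. Parameters are illustrative assumptions.

def st_learning_time(N, mu0, rng):
    t, successes = 0, 0
    while successes < N:
        t += 1
        if rng.random() < mu0:      # Q_t = 1: the current target customer's
            successes += 1          # neighborhood is revealed (Lemma C.1)
    return t

rng = random.Random(0)
N, mu0 = 10, 0.2
avg = sum(st_learning_time(N, mu0, rng) for _ in range(10_000)) / 10_000
print(avg, N / mu0)                 # empirical mean ~ 50, matching N / mu0
```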
Let b be the signal realization that sends b to all customers. Finally, let n be the signal realization where every customer receives n. Define the mechanism (π i ,S i ) according to the following distribution: π i (b|Qt = 1) = 1, π i (b(π˜jN )|Qt = 0) = π˜jN , π i (b(π˜jk )|Qt = 0) = π˜jk −π˜jk+1 , π i (b −i |Qt = 0) = ei , π i (n|Qt = 0) = 1−π(b(π˜j1 )|Qt = 0)−π(b −i |Qt = 0) 90 Figure 3.3: Comparison of revenue generation across SCP, CBP and ST policies It is fairly straightforward to check that, when the firm commits to π i at time t, customer j will purchase if and only if n ̸∈ Sj,t . Therefore, when b −i is sent, Sender induces Ei to occur and the outgoing neighborhood of customer i is learned. The value ei can be interpreted as the maximum possible amount of probability with which Ei can be induced. The remaining probability mass we have to work with is given by π˜j . We then correlate b messages as much as possible to guard against customer leakage. To make the intuition more concrete, Figure 3.2 illustrates, for N = 4, the probability with which the messages are sent under the mechanism (π 3 ,S 3 ), conditional on Qt = 0. For each customer, the blue shaded area is equal to µ0 1−µ0 v j−p p , the marginal probability that b is sent to j under π classic conditional on Qt = 0. Note that this is the highest probability with which b can be sent so that j finds it optimal to follow the recommendation and purchase. The red shaded area is the complement probability that n is sent to j. First, we reserve probability ei to guarantee that the message b −3 eventually occurs and i’s neighborhood is learned. In this case, π˜1 > π˜3 > π˜2 > π˜1 = 0. We correlate messages so that, if for example customer 2 gets sent b, so do customers 1 and 3. 91 We are now ready to define the policy. Definition 3.4.3 The Sequentially Correlated Persuasion (SCP) policy commits to the mechanism (π 1 ,S 1 ) at time t = 0. Let t1 < t2 < ··· < tN be the times at which b −i is sent to the customers. The SCP policy commits to π i on intervals (ti−1,ti ], where t0 = 0. In short, the SCP policy will commit to π i until the event Ei occurs, at which point it will switch to the next mechanism π i+1 . Theorem 3.4.2 The Sequentially Correlated Persuasion (SCP) policy learns G in expected time E[T(G;SCP)] ≤ N (1− µ0)mini ei = O(N). (3.4) Moreover, it achieves long run average revenue strictly greater than N pµ0. Theorem 3.4.2 guarantees that the SCP policy learns the network in time linear in the number of customers. However, unlike the ST policy, because the SCP policy correlates buy messages, it is able to garner higher revenue. In Figure 3.3, we numerically show that the SCP policy greatly outperforms the ST policy and the CBP policy. We repeatedly simulated a scenario with 10 customers with valuations distributed uniformly in the interval [p,2p] and µ0 = .2. The network is a directed Erdos-Renyi graph. We ˝ varied the edge probability and averaged and plotted the payoffs generated from each policy. As the edge probability approaches one, all three policies do as well as a full-information policy that simply discloses the true state of the world at each time. This suggests that for highly connected networks, switching to an optimal public signaling mechanism may perform better. However, for moderately connected networks, the SCP policy performs well. 3.5 Conclusion Bayesian persuasion has become a powerful tool for persuading agents (receivers) to take certain actions. 
3.5 Conclusion

Bayesian persuasion has become a powerful tool for persuading agents (receivers) to take certain actions. However, when extending the model to multiple receivers, Sender should be wary of possible network spillover effects. If customers share their signals with neighbors in the network, then the classic Bayesian persuasion mechanism may perform poorly. In some cases, Sender may have knowledge of the network structure; when this is the case, modifications can be made to the mechanism to achieve better results. However, the structure of the network may not be known. We develop signaling policies that uncover the network structure in expected time linear in the number of receivers. Our guarantees hold for any underlying network structure, even those with unbounded degree. We show that designing mechanisms that provide a single agent with an informative signal that has the potential to change the behavior of the other agents allows Sender to uncover the neighborhood of the informed agent. We believe this key insight can be extended to more complex settings beyond the simple binary-state, binary-action model discussed in this chapter. In particular, we conjecture that linear-time learning is the best possible; however, this remains to be shown formally.

We focus primarily on learning the network and less so on payoff considerations. We establish that policies exist that learn the network, and we have also shown numerically that the SCP policy performs substantially better than the ST policy. However, as the sender uncovers the network, the set of compatible networks shrinks, and the SCP policy fails to utilize this information in real time. Is it possible to design mechanisms that leverage this information to garner better average payoffs, and will doing so naturally slow down learning? We believe that better understanding this exploration/exploitation trade-off is another interesting direction for future work.

References

Afèche, Philipp (2013). "Incentive-Compatible Revenue Management in Queueing Systems: Optimal Strategic Delay". In: Manufacturing & Service Operations Management 15 (3), pp. 423–443. DOI: 10.1287/msom.2013.0449.
Alizamir, Saed, Francis de Véricourt, and Shouqiang Wang (2020). "Warning Against Recurring Risks: An Information Design Approach". In: Management Science 66 (10), pp. 4612–4629. DOI: 10.1287/mnsc.2019.3420.
Allon, Gad, Achal Bassamboo, and Itai Gurvich (2011). "We will be right with you: Managing customer expectations with vague promises and cheap talk". In: Operations Research 59 (6), pp. 1382–1394. DOI: 10.1287/opre.1110.0976.
Allon, Gad, Achal Bassamboo, and Ramandeep Randhawa (2012). "Price as a Signal of Product Availability: Is it Cheap?" In: Social Science Research Network. DOI: 10.2139/ssrn.3393502.
Alonso, Ricardo and Odilon Câmara (2016). "Persuading voters". In: American Economic Review 106 (11), pp. 3590–3605. DOI: 10.1257/aer.20140737.
Arieli, Itai, Yakov Babichenko, and Fedor Sandomirskiy (2022). "Bayesian Persuasion with Mediators". In: arXiv. DOI: 10.48550/arXiv.2203.04285.
Armony, Mor and Amy R. Ward (2010). "Fair Dynamic Routing in Large-Scale Heterogeneous-Server Systems". In: Operations Research 58 (3), pp. 624–637. DOI: 10.1287/opre.1090.0777.
Artiga, Samantha, Kendal Orgera, and Olivia Pham (2020). "Disparities in health and health care: Five key questions and answers". In: Kaiser Family Foundation.
Ata, Baris, Yichuan Ding, and Stefanos Zenios (2020). "An Achievable-Region-Based Approach for Kidney Allocation Policy Design with Endogenous Patient Choice". In: Manufacturing & Service Operations Management 23 (1), pp. 36–54.
DOI: 10.1287/msom.2019.0807.
Atar, Rami, Chanit Giat, and Nahum Shimkin (2010). "The cµ/θ rule for many-server queues with abandonment". In: Operations Research 58 (5), pp. 1427–1439. DOI: 10.1287/opre.1100.0826.
Avi-Itzhak, Benjamin and Hanoch Levy (2004). "On Measuring Fairness in Queues". In: Advances in Applied Probability 36 (3), pp. 919–936.
Bansal, Nikhil and Mor Harchol-Balter (2001). "Analysis of SRPT Scheduling: Investigating Unfairness". In: Proceedings of the 2001 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pp. 279–290. DOI: 10.1145/378420.378792.
Bergemann, Dirk and Stephen Morris (2019). "Information Design: A Unified Perspective". In: Journal of Economic Literature 57 (1), pp. 44–95. DOI: 10.1257/jel.20181489.
Bernasconi, Martino, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti, and Francesco Trovò (2022). "Sequential Information Design: Learning to Persuade in the Dark". In: Advances in Neural Information Processing Systems 35.
Bertsimas, Dimitris, Vivek F. Farias, and Nikolaos Trichakis (2011). "The Price of Fairness". In: Operations Research 59 (1), pp. 17–31. DOI: 10.1287/opre.1100.0865.
— (2012). "On the Efficiency-Fairness Trade-off". In: Management Science 58 (12), pp. 2234–2250. DOI: 10.1287/mnsc.1120.1549.
— (2013). "Fairness, Efficiency, and Flexibility in Organ Allocation for Kidney Transplantation". In: Operations Research 61 (1), pp. 73–87. DOI: 10.1287/opre.1120.1138.
Bertsimas, Dimitris and John Tsitsiklis (1997). Introduction to Linear Optimization. Belmont, Massachusetts: Athena Scientific.
Best, James and Daniel Quigley (2023). "Persuasion for the Long Run". In: Journal of Political Economy. DOI: 10.1086/727282.
Candogan, Ozan and Kimon Drakopoulos (2020). "Optimal Signaling of Content Accuracy: Engagement vs. Misinformation". In: Operations Research 68 (2), pp. 497–515. DOI: 10.1287/opre.2019.1897.
Candogan, Ozan, Yifan Guo, and Haifeng Xu (2020). "On Information Design with Spillovers". In: Social Science Research Network. DOI: 10.2139/ssrn.3537289.
Castiglioni, Matteo, Andrea Celli, Alberto Marchesi, and Nicola Gatti (2020). "Online Bayesian Persuasion". In: Advances in Neural Information Processing Systems 33.
— (2022). "Bayesian Persuasion in Sequential Decision-Making". In: Proceedings of the AAAI Conference on Artificial Intelligence 36 (5), pp. 5025–5033. DOI: 10.1609/aaai.v36i5.20434.
Castiglioni, Matteo, Alberto Marchesi, Andrea Celli, and Nicola Gatti (2021). "Multi-Receiver Online Bayesian Persuasion". In: Proceedings of the 38th International Conference on Machine Learning.
Cayci, Semih, Swati Gupta, and Atilla Eryilmaz (2020). "Group-Fair Online Allocation in Continuous Time". In: Advances in Neural Information Processing Systems 33.
Coffman, E.G. and I. Mitrani (1980). "A Characterization of Waiting Time Performance Realizable by Single-Server Queues". In: Operations Research 28 (3-part-ii), pp. 810–821. DOI: 10.1287/opre.28.3.810.
Corbett-Davies, S., Johann D. Gaebler, Hamed Nilforoshan, Ravi Shroff, and Sharad Goel (2024). "The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning". In: The Journal of Machine Learning Research 24 (312), pp. 14730–14846.
Cox, D.R. and Walter L. Smith (1961). Queues. London: Methuen.
Crawford, Vincent P. and Joel Sobel (1982). "Strategic Information Transmission". In: Econometrica 50 (6), pp. 1431–1451. DOI: 10.2307/1913390.
Degan, Arianna and Ming Li (2021). "Persuasion with costly precision". In: Economic Theory 72 (3), pp. 869–908.
DOI: 10.1007/s00199-021-01346-9.
Doval, Laura and Vasiliki Skreta (2023). "Constrained Information Design". In: Mathematics of Operations Research 49 (1), pp. 78–106. DOI: 10.1287/moor.2022.1346.
Doval, Laura and Alex Smolin (2023). "Persuasion and Welfare". In: Journal of Political Economy. DOI: 10.1086/729067.
Drakopoulos, Kimon, Shobhit Jain, and Ramandeep Randhawa (2020). "Persuading Customers to Buy Early". In: Management Science 67 (2), pp. 828–853. DOI: 10.1287/mnsc.2020.3580.
Drakopoulos, Kimon, Irene Lo, and Justin Mulvany (2023). "Blockchain Mediated Persuasion". In: Proceedings of the 24th ACM Conference on Economics and Computation, p. 538. DOI: 10.1145/3580507.3597769.
Egorov, Georgy and Konstantin Sonin (2020). "Persuasion on networks". In: National Bureau of Economic Research w27631. DOI: 10.3386/w27631.
Fotakis, Dimitris, Dimitris Tsipras, Christos Tzamos, and Emmanouil Zampetakis (2016). "Efficient Money Burning in General Domains". In: Theory of Computing Systems 59 (4), pp. 619–640. DOI: 10.1007/s00224-016-9720-2.
Fréchette, Guillaume R., Alessandro Lizzeri, and Jacopo Perego (2022). "Rules and Commitment in Communication: An Experimental Analysis". In: Econometrica 90 (5), pp. 2283–2318. DOI: 10.3982/ECTA18585.
Galperti, Simone and Jacopo Perego (2019). "Games With Information Constraints: Seeds and Spillovers". In: Social Science Research Network. DOI: 10.2139/ssrn.3340090.
Guo, Yingni and Eran Shmaya (2021). "Costly miscalibration". In: Theoretical Economics 16 (2), pp. 477–506. DOI: 10.3982/TE3991.
Hahn, N., M. Hoefer, and R. Smorodinsky (2021). "Prophet Inequalities for Bayesian Persuasion". In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, pp. 175–181.
Hartline, Jason D. and Tim Roughgarden (2008). "Optimal mechanism design and money burning". In: Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, pp. 75–84. DOI: 10.1145/1374376.1374390.
Hu, Yue, Carri W. Chan, and Jing Dong (2021). "Optimal Scheduling of Proactive Service with Customer Deterioration and Improvement". In: Management Science 68 (4), pp. 2533–2578. DOI: 10.1287/mnsc.2021.3992.
Jackson, Matthew O. (2008). Social and Economic Networks. Princeton: Princeton University Press. DOI: 10.1515/9781400833993.
Kamenica, Emir and Matthew Gentzkow (2011). "Bayesian Persuasion". In: American Economic Review 101 (6), pp. 2590–2615. DOI: 10.1257/aer.101.6.2590.
Kerman, Toygar T. and Anastas P. Tenev (2021). "Persuading Communicating Voters". In: Social Science Research Network. DOI: 10.2139/ssrn.3765527.
— (2023). "Pitfalls of Information Spillovers in Persuasion". In: Social Science Research Network. DOI: 10.2139/ssrn.4641522.
Kleinrock, Leonard (1965). "A conservation law for a wide class of queueing disciplines". In: Naval Research Logistics 12 (2), pp. 181–192. DOI: 10.1002/nav.3800120206.
— (1976). Queueing Systems, Volume 2. New York: John Wiley & Sons.
Koessler, Frédéric and Vasiliki Skreta (2023). "Informed Information Design". In: Journal of Political Economy 131 (11), pp. 3186–3232. DOI: 10.1086/724843.
Kremer, Ilan, Yishay Mansour, and Motty Perry (2014). "Implementing the 'Wisdom of the Crowd'". In: Journal of Political Economy 122 (5), pp. 988–1012. DOI: 10.1086/676597.
Larson, Richard C. (1987). "OR Forum—Perspectives on Queues: Social Justice and the Psychology of Queueing". In: Operations Research 35 (6), pp. 895–905. DOI: 10.1287/opre.35.6.895.
Leslie, Emily and Nolan G. Pope (2017).
"The Unintended Impact of Pretrial Detention on Case Outcomes: Evidence from New York City Arraignments". In: The Journal of Law and Economics 60 (3), pp. 529–557.
Lewis, Robert (2021). "Waiting for Justice". In: calmatters.org.
Li, Run (2020). "Persuasion with Strategic Reporting". In: Social Science Research Network. DOI: 10.2139/ssrn.3536404.
Lien, Robert W., Seyed M.R. Iravani, and Karen R. Smilowitz (2014). "Sequential Resource Allocation for Nonprofit Operations". In: Operations Research 62 (2), pp. 301–317. DOI: 10.1287/opre.2013.1244.
Lingenbrink, D. and K. Iyer (2019). "Optimal Signaling Mechanisms in Unobservable Queues". In: Operations Research 67 (5), pp. 1397–1416. DOI: 10.1287/opre.2018.1819.
Lipnowski, Elliot and Doron Ravid (2020). "Cheap Talk with Transparent Motives". In: Econometrica 88 (4), pp. 1631–1660. DOI: 10.3982/ECTA15674.
Lipnowski, Elliot, Doron Ravid, and Denis Shishkin (2022). "Persuasion via Weak Institutions". In: Journal of Political Economy 130 (10), pp. 2705–2730. DOI: 10.1086/720462.
Ma, Will, Pan Xu, and Yifan Xu (2022). "Group-level Fairness Maximization in Online Bipartite Matching". In: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pp. 1687–1689.
Mandelbaum, Avishai and Alexander L. Stolyar (2004). "Scheduling flexible servers with convex delay costs: Heavy-traffic optimality of the generalized cµ-rule". In: Operations Research 52 (6), pp. 836–855. DOI: 10.1287/opre.1040.0152.
Manshadi, Vahideh, Rad Niazadeh, and Scott Rodilitz (2023). "Fair Dynamic Rationing". In: Management Science 69 (11), pp. 6818–6836. DOI: 10.1287/mnsc.2023.4700.
McLay, Laura A. and Maria E. Mayorga (2012). "A Dispatching Model for Server-to-Customer Systems That Balances Efficiency and Equity". In: Manufacturing & Service Operations Management 15 (2), pp. 205–220. DOI: 10.1287/msom.1120.0411.
Min, Daehong (2021). "Bayesian persuasion under partial commitment". In: Economic Theory 72, pp. 743–764. DOI: 10.1007/s00199-021-01386-1.
Nanda, Vedant, Pan Xu, Karthik Abinav Sankararaman, John P. Dickerson, and Aravind Srinivasan (2020). "Balancing the Tradeoff between Profit and Fairness in Rideshare Platforms during High-Demand Hours". In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, p. 131. DOI: 10.1145/3375627.3375818.
Nguyen, Anh and Teck Yong Tan (2021). "Bayesian persuasion with costly messages". In: Journal of Economic Theory 193, 105212. DOI: 10.1016/j.jet.2021.105212.
Papanastasiou, Yiangos, Kostas Bimpikis, and Nicos Savva (2017). "Crowdsourcing Exploration". In: Management Science 64 (4), pp. 1727–1746. DOI: 10.1287/mnsc.2016.2697.
Perez, Eduardo (2014). "Interim Bayesian Persuasion: First Steps". In: American Economic Review 104 (5), pp. 469–474. DOI: 10.1257/aer.104.5.469.
Piquero, Alex R. (2008). "Disproportionate Minority Contact". In: The Future of Children 18 (2), pp. 59–79.
Puha, Amber L. and Amy R. Ward (2019). "Scheduling an Overloaded Multiclass Many-Server Queue with Impatient Customers". In: INFORMS TutORials in Operations Research, INFORMS, pp. 189–217. DOI: 10.1287/educ.2019.0196.
Salamanca, Andrés (2021). "The value of mediated communication". In: Journal of Economic Theory 192, 105191. DOI: 10.1016/j.jet.2021.105191.
Veeraraghavan, Senthil and Laurens Debo (2008). "Joining Longer Queues: Information Externalities in Queue Choice". In: Manufacturing & Service Operations Management 11 (4), pp. 543–562. DOI: 10.1287/msom.1080.0239.
Ward, Amy R. and Mor Armony (2013).
"Blind Fair Routing in Large-Scale Service Systems with Heterogeneous Customers and Servers". In: Operations Research 61 (1), pp. 228–243. DOI: 10.1287/opre.1120.1129.
Wierman, Adam (2011). "Fairness and scheduling in single server queues". In: Surveys in Operations Research and Management Science 16 (1), pp. 39–48. DOI: 10.1016/j.sorms.2010.07.002.
Wierman, Adam and Mor Harchol-Balter (2005). "Classifying Scheduling Policies with Respect to Higher Moments of Conditional Response Time". In: ACM SIGMETRICS Performance Evaluation Review, pp. 229–240. DOI: 10.1145/1071690.1064238.
Yu, Man, Hyun-Soo Ahn, and Roman Kapuscinski (2014). "Rationing Capacity in Advance Selling to Signal Quality". In: Management Science 61 (3), pp. 560–577. DOI: 10.1287/mnsc.2013.1888.

Appendices

A Proofs for Chapter 1

Proof of Proposition 1.3.1 This is a special case of Proposition 1.6.2 with c_1 = c_2 = 1. See the proof of Proposition 1.6.2. ■

Proof of Theorem 1.4.1 First, observe that the wait time vector W^FIFO of the FIFO policy is a solution of (1.10). Now, assume that the vector W^1 ∈ R^T is also a solution with W^1 ≠ W^FIFO. If W^1 ∈ W(Π_unaware), then there is nothing to show. So, assume that W^1 ∉ W(Π_unaware), i.e., W^1 is not a feasible wait time vector. Note that, since W^1 is a solution, it lies in the convex subset of R^T generated by the fairness constraints, S_fair = {W ∈ R^T : AW = 0}, as well as in the convex subset of R^T generated by the conservation constraint, S_conserve = {W ∈ R^T : Σ_{t∈T} ρ_t W_t = (ρ/(1−ρ)) W_0}. Note that S_fair ∩ S_conserve is a convex subset of R^T containing the wait time vector of the FIFO policy. Furthermore, the boundary of W(Π_unaware) corresponds to policies that prioritize one or more classes. Therefore, the FIFO wait time vector W^FIFO lies in the relative interior of W(Π_unaware), so there exists a Euclidean ball B_FIFO ⊂ W(Π_unaware) containing W^FIFO. Since S_fair ∩ S_conserve is convex, for every k ∈ [0,1], k W^FIFO + (1−k) W^1 ∈ S_fair ∩ S_conserve. Taking 1 − k > 0 small enough, we obtain a vector W* ∈ S_fair ∩ S_conserve distinct from W^FIFO that also satisfies W* ∈ B_FIFO ⊂ W(Π_unaware). Conversely, if there exists a population-fair, population-unaware wait time vector W^1 ≠ W^FIFO, then W^1 is a solution to the equation system (1.10), showing that (1.10) has more than one solution. ■

Proof of Proposition 1.4.2 Using the fairness equation (1.11) together with the conservation equation ρ_1 W_1 + ρ_2 W_2 + ρ_3 W_3 = (ρ/(1−ρ)) W_0 to solve for W_2 and W_3 in terms of W_1 = w yields the wait time formulas stated in the proposition. These formulas are valid provided that w is chosen to be feasible; in particular, since W_1 = w and b̲ ≤ W_1 ≤ b̄, we must have b̲ ≤ w ≤ b̄. Substituting the expressions for the wait times into the fairness optimization problem for unaware policies (1.8) and noting that b̲ ≤ w ≤ b̄, we get the relaxed problem
\[
\underset{\underline{b} \le w \le \bar{b}}{\text{minimize}} \quad \lambda_1 w
+ \lambda_2\left[\frac{a_{A,3}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}}\cdot\frac{\rho}{1-\rho}W_0 + \frac{\rho_3 a_{A,1} - \rho_1 a_{A,3}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}}\, w\right]
- \lambda_3\left[\frac{a_{A,2}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}}\cdot\frac{\rho}{1-\rho}W_0 - \frac{\rho_2 a_{A,1} - \rho_1 a_{A,2}}{\rho_2 a_{A,3} - \rho_3 a_{A,2}}\, w\right].
\]
We can further simplify this problem by dropping constants from the objective. After doing so, the problem becomes minimize_{b̲ ≤ w ≤ b̄} K w, where K is as stated in Proposition 1.4.2. ■

Proof of Theorem 1.5.1 When c_1 = c_2 = 1, Theorem 1.6.1 says that there exists a fair, cµ-preserving policy if and only if W_A^{cµ_A} ≤ W_B^{cµ_A} and W_A^{cµ_B} ≥ W_B^{cµ_B}. The result follows since W_A^{cµ_A} ≤ W_B^{cµ_A} whenever α_{A,1} ≥ α_{B,1}. See the proof of Theorem 1.6.1 for details. ■
Proof of Proposition 1.5.1 Since W_A^{cµ_B} ≥ W_B^{cµ_B}, by Theorem 1.5.1 there exists a fair, cµ-preserving policy. Let π be any fair, cµ-preserving policy and let W^π be its corresponding wait time vector. For convenience, let C_1 = W_0 ρ_1/(1−ρ_1) and C_2 = W_0 ρ_2/((1−ρ_1)(1−ρ)). By the Kleinrock wait time conservation theorem, the wait time vector satisfies
\[
\sum_{p \in P}\sum_{t \in T} \rho_{p,t}\, W^{\pi}_{p,t} = \frac{\rho}{1-\rho}W_0. \tag{A.1}
\]
Since π is cµ-preserving, we may apply the Kleinrock wait time conservation theorem to the type-1 customers alone (Kleinrock, 1965). This gives
\[
\rho_{A,1}W^{\pi}_{A,1} + \rho_{B,1}W^{\pi}_{B,1} = \frac{\rho_1}{1-\rho_1}W_0 = C_1. \tag{A.2}
\]
From (A.1) and (A.2) we obtain
\[
W^{\pi}_{B,1} = \frac{1}{\rho_{B,1}}C_1 - \frac{\rho_{A,1}}{\rho_{B,1}}W^{\pi}_{A,1}, \tag{A.3}
\]
\[
W^{\pi}_{B,2} = \frac{1}{\rho_{B,2}}C_2 - \frac{\rho_{A,2}}{\rho_{B,2}}W^{\pi}_{A,2}. \tag{A.4}
\]
Substituting (A.3) and (A.4) into the fairness constraint yields
\[
\alpha_{A,1}W^{\pi}_{A,1} + \alpha_{A,2}W^{\pi}_{A,2} = \alpha_{B,1}\left(\frac{1}{\rho_{B,1}}C_1 - \frac{\rho_{A,1}}{\rho_{B,1}}W^{\pi}_{A,1}\right) + \alpha_{B,2}\left(\frac{1}{\rho_{B,2}}C_2 - \frac{\rho_{A,2}}{\rho_{B,2}}W^{\pi}_{A,2}\right).
\]
Solving for W^π_{A,2} in terms of W^π_{A,1} gives
\[
W^{\pi}_{A,2} = \frac{\rho_{B,2}\alpha_{B,1}C_1 + \rho_{B,1}\alpha_{B,2}C_2 - \rho_{B,2}\left(\rho_{A,1}\alpha_{B,1} + \rho_{B,1}\alpha_{A,1}\right)W^{\pi}_{A,1}}{\rho_{B,1}\left(\rho_{A,2}\alpha_{B,2} + \rho_{B,2}\alpha_{A,2}\right)}. \tag{A.5}
\]
Plugging (A.5) into equation (A.4) yields the desired expression for W^π_{B,2} in terms of W^π_{A,1}.

Now, the set W(Π_cµ) is a polyhedron whose extreme points correspond to the strict priority policies that preserve the cµ priority. Equivalently, any W^π ∈ W(Π_cµ) must satisfy the following inequalities:
\[
\frac{W_0}{1-\rho_{A,1}} \le W^{\pi}_{A,1} \le \frac{W_0}{(1-\rho_{B,1})(1-\rho_1)}, \qquad
\frac{W_0}{1-\rho_{B,1}} \le W^{\pi}_{B,1} \le \frac{W_0}{(1-\rho_{A,1})(1-\rho_1)},
\]
\[
\frac{W_0}{(1-\rho_1)(1-\rho_1-\rho_{A,2})} \le W^{\pi}_{A,2} \le \frac{W_0}{(1-\rho_1-\rho_{B,2})(1-\rho)}, \qquad
\frac{W_0}{(1-\rho_1)(1-\rho_1-\rho_{B,2})} \le W^{\pi}_{B,2} \le \frac{W_0}{(1-\rho_1-\rho_{A,2})(1-\rho)}. \tag{A.6}
\]
The lower and upper bounds L ≤ U are obtained by substituting the given formulae into (A.6) and solving for W^π_{A,1}. Four pairs of inequalities are obtained; we set L to be the maximum of the lower bounds and U to be the minimum of the upper bounds. ■

Proof of Proposition 1.5.2 If W_A^{cµ_B} < W_B^{cµ_B}, then by Theorem 1.5.1 no fair, cµ-preserving policy exists. We will show that any optimal fair policy must serve type-2 customers before type-1 customers, and that the policy does so as sparingly as possible, leading to a policy that serves (B,1) customers first and (A,2) customers last. To formalize this, consider the fairness-constrained optimization problem
\[
\underset{W \in \mathcal{W}(\Pi)}{\text{minimize}} \quad \lambda_{A,1}W_{A,1} + \lambda_{A,2}W_{A,2} + \lambda_{B,1}W_{B,1} + \lambda_{B,2}W_{B,2} \tag{A.7}
\]
\[
\text{subject to} \quad \alpha_{A,1}W_{A,1} + \alpha_{A,2}W_{A,2} - \alpha_{B,1}W_{B,1} - \alpha_{B,2}W_{B,2} = 0.
\]
Solve the fairness constraint for W_{B,2}:
\[
W_{B,2} = \frac{\alpha_{A,1}}{\alpha_{B,2}}W_{A,1} + \frac{\alpha_{A,2}}{\alpha_{B,2}}W_{A,2} - \frac{\alpha_{B,1}}{\alpha_{B,2}}W_{B,1}. \tag{A.8}
\]
Plugging (A.8) into the objective of problem (A.7) yields the equivalent objective
\[
\left(\lambda_{A,1} + \lambda_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}\right)W_{A,1} + \left(\lambda_{A,2} + \lambda_{B,2}\frac{\alpha_{A,2}}{\alpha_{B,2}}\right)W_{A,2} + \left(\lambda_{B,1} - \lambda_{B,2}\frac{\alpha_{B,1}}{\alpha_{B,2}}\right)W_{B,1}
= \lambda\,\alpha_{A,1}W_{A,1} + \lambda\,\alpha_{A,2}W_{A,2}.
\]
Therefore, problem (A.7) is equivalent to
\[
\underset{W \in \mathcal{W}(\Pi)}{\text{minimize}} \quad \alpha_{A,1}W_{A,1} + \alpha_{A,2}W_{A,2} \tag{A.9}
\]
subject to the same fairness constraint. Now, recall that any wait time vector W ∈ W(Π) must satisfy conservation: ρ_{A,1}W_{A,1} + ρ_{A,2}W_{A,2} + ρ_{B,1}W_{B,1} + ρ_{B,2}W_{B,2} = (ρ/(1−ρ)) W_0. Substituting (A.8) into the conservation equation yields
\[
\frac{\rho}{1-\rho}W_0 = \left(\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}\right)W_{A,1} + \left(\rho_{A,2} + \rho_{B,2}\frac{\alpha_{A,2}}{\alpha_{B,2}}\right)W_{A,2} + \left(\rho_{B,1} - \rho_{B,2}\frac{\alpha_{B,1}}{\alpha_{B,2}}\right)W_{B,1}. \tag{A.10}
\]
Solving (A.10) for W_{A,1} gives
\[
W_{A,1} = -\frac{\rho_{A,2} + \rho_{B,2}\frac{\alpha_{A,2}}{\alpha_{B,2}}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}\,W_{A,2}
- \frac{\rho_{B,1} - \rho_{B,2}\frac{\alpha_{B,1}}{\alpha_{B,2}}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}\,W_{B,1}
+ \frac{\frac{\rho}{1-\rho}W_0}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}. \tag{A.11}
\]
Plugging (A.11) into the objective of problem (A.9) results in the objective
\[
\alpha_{A,1}W_{A,1} + \alpha_{A,2}W_{A,2}
= \left(\alpha_{A,2} - \alpha_{A,1}\,\frac{\rho_{A,2} + \rho_{B,2}\frac{\alpha_{A,2}}{\alpha_{B,2}}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}\right)W_{A,2}
- \alpha_{A,1}\,\frac{\rho_{B,1} - \rho_{B,2}\frac{\alpha_{B,1}}{\alpha_{B,2}}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}\,W_{B,1}
+ \alpha_{A,1}\,\frac{\frac{\rho}{1-\rho}W_0}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}.
\]
Therefore, we consider the equivalent optimization problem
\[
\underset{W \in \mathcal{W}(\Pi)}{\text{minimize}} \quad \left(\alpha_{A,2} - \alpha_{A,1}\,\frac{\rho_{A,2} + \rho_{B,2}\frac{\alpha_{A,2}}{\alpha_{B,2}}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}\right)W_{A,2}
+ \alpha_{A,1}\left(\frac{\rho_{B,2}\frac{\alpha_{B,1}}{\alpha_{B,2}} - \rho_{B,1}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}\right)W_{B,1}
\]
subject to the fairness constraint α_{A,1}W_{A,1} + α_{A,2}W_{A,2} − α_{B,1}W_{B,1} − α_{B,2}W_{B,2} = 0. Since µ_1 ≥ µ_2, it follows that
\[
\alpha_{A,2} - \alpha_{A,1}\,\frac{\rho_{A,2} + \rho_{B,2}\frac{\alpha_{A,2}}{\alpha_{B,2}}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}
= \alpha_{A,2}\left(1 - \frac{\frac{\lambda_A}{\mu_2} + \frac{\lambda_B}{\mu_2}}{\frac{\lambda_A}{\mu_1} + \frac{\lambda_B}{\mu_2}}\right) \le 0,
\qquad \text{and} \qquad
\alpha_{A,1}\,\frac{\rho_{B,2}\frac{\alpha_{B,1}}{\alpha_{B,2}} - \rho_{B,1}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}}
= \alpha_{A,1}\,\frac{\frac{\lambda_{B,1}}{\mu_2} - \frac{\lambda_{B,1}}{\mu_1}}{\rho_{A,1} + \rho_{B,2}\frac{\alpha_{A,1}}{\alpha_{B,2}}} \ge 0.
\]
Therefore, the coefficient of W_{A,2} is nonpositive and the coefficient of W_{B,1} is nonnegative, meaning it is optimal to make W_{B,1} as small as possible and W_{A,2} as large as possible. It follows that any fair wait time vector with W_{B,1} = W^{cµ_B}_{B,1} and W_{A,2} = W^{cµ_B}_{A,2} is optimal; that is, any fair policy with corresponding wait time vector in the set
\[
\mathcal{W} = \left\{ \left(W^{\pi}_{p,t}\right)_{p \in P,\, t \in T} \in \mathcal{W}(\Pi) : W^{\pi}_{B,1} = W^{c\mu_B}_{B,1},\ W^{\pi}_{A,2} = W^{c\mu_B}_{A,2} \right\}
\]
must also be optimal. Therefore, it suffices to check that there exists a fair policy in W. First, note that W is a convex subset of W(Π). Observe that W^{cµ_B} ∈ W, and by assumption W_A^{cµ_B} < W_B^{cµ_B}. Furthermore, the policy π that prioritizes in the order (B,1),(B,2),(A,1),(A,2) is in the set W and clearly satisfies W^π_A ≥ W^π_B. So, as in the proof of Theorem 1.6.1, it follows from the Intermediate Value Theorem that there exists W^{π*} ∈ W that satisfies the fairness constraint. Using these observations leads to the following system of two equations in the two unknowns W^{π*}_{A,1} and W^{π*}_{B,2}:
\[
\alpha_{A,1}W^{\pi^*}_{A,1} - \alpha_{B,2}W^{\pi^*}_{B,2} = \alpha_{B,1}W^{c\mu_B}_{B,1} - \alpha_{A,2}W^{c\mu_B}_{A,2},
\]
\[
\rho_{A,1}W^{\pi^*}_{A,1} + \rho_{B,2}W^{\pi^*}_{B,2} = \frac{\rho}{1-\rho}W_0 - \rho_{B,1}W^{c\mu_B}_{B,1} - \rho_{A,2}W^{c\mu_B}_{A,2}.
\]
Solving this system yields the stated expressions for the optimal wait time vector. ■

Proof of Corollary 1.5.1 We can write down the wait times of the cµ_B policy using the standard formula for the wait times of head-of-the-line priority policies:
\[
W^{c\mu_B}_{B,1} = \frac{W_0}{1-\rho_{B,1}}, \qquad
W^{c\mu_B}_{A,1} = \frac{W_0}{(1-\rho_{B,1})(1-\rho_1)}, \qquad
W^{c\mu_B}_{B,2} = \frac{W_0}{(1-\rho_1)(1-\rho_1-\rho_{B,2})}, \qquad
W^{c\mu_B}_{A,2} = \frac{W_0}{(1-\rho_1-\rho_{B,2})(1-\rho)}.
\]
First, observe that W_A^{cµ_B} ≥ W_B^{cµ_B} if and only if
\[
\frac{\alpha_{A,1}}{(1-\rho_{B,1})(1-\rho_1)} + \frac{\alpha_{A,2}}{(1-\rho_1-\rho_{B,2})(1-\rho)} \ge \frac{\alpha_{B,1}}{1-\rho_{B,1}} + \frac{\alpha_{B,2}}{(1-\rho_1)(1-\rho_1-\rho_{B,2})},
\]
which, upon clearing denominators, holds if and only if
\[
\alpha_{A,1}(1-\rho_1-\rho_{B,2})(1-\rho) + \alpha_{A,2}(1-\rho_{B,1})(1-\rho_1) \ge \alpha_{B,1}(1-\rho_1)(1-\rho_1-\rho_{B,2})(1-\rho) + \alpha_{B,2}(1-\rho_{B,1})(1-\rho).
\]
Since α_{A,1} ≥ α_{B,1} and 1−ρ_1 < 1, the first term on the left-hand side is greater than or equal to the first term on the right-hand side. Therefore, a sufficient condition for the inequality to hold is α_{A,2}(1−ρ_{B,1})(1−ρ_1) ≥ α_{B,2}(1−ρ_{B,1})(1−ρ), which after some algebraic manipulation becomes ρ_2/(1−ρ_1) ≥ (α_{B,2}−α_{A,2})/α_{B,2}. Observing that α_{B,2} − α_{A,2} = α_{A,1} − α_{B,1}, α_{B,2} = 1 − α_{B,1} and ρ_2 = ρ − ρ_1 yields the desired expression. ■

Proof of Corollary 1.5.2 Corollary 1.5.2 follows immediately from Corollary 1.5.1. If ρ ≥ ((α_{A,1}−α_{B,1})/(1−α_{B,1}))(1−ρ_1) + ρ_1, then by Corollary 1.5.1, W_A^{cµ_B} ≥ W_B^{cµ_B}, and by Theorem 1.5.1 we get existence of a fair, cµ-preserving policy. ■
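As a numeric sanity check on the head-of-the-line wait time formulas used above, the following Python sketch (illustrative parameters only) verifies the Kleinrock conservation law that underlies equations (A.1)-(A.2).

```python
# Numeric sanity check (illustrative parameters): the head-of-the-line
# priority wait times W_k = W0 / ((1 - s_{k-1}) (1 - s_k)), where s_k is the
# cumulative utilization of the k highest-priority classes, satisfy the
# Kleinrock conservation law  sum_k rho_k W_k = rho W0 / (1 - rho).

rhos = [0.2, 0.15, 0.25, 0.1]        # per-class utilizations (assumed)
W0 = 1.0                              # mean residual work; any scale works
rho, s, waits = sum(rhos), 0.0, []
for r in rhos:
    waits.append(W0 / ((1 - s) * (1 - s - r)))
    s += r
lhs = sum(r * w for r, w in zip(rhos, waits))
print(lhs, rho * W0 / (1 - rho))      # both ~ 2.333 for these numbers
```

The identity holds for any utilization vector because ρ_k/((1−s_{k−1})(1−s_k)) = 1/(1−s_k) − 1/(1−s_{k−1}), so the sum telescopes to ρ/(1−ρ).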
Proof of Theorem 1.5.2 We prove the claim in the general setting with nonuniform holding costs and note that the result reduces to Theorem 1.5.2 upon setting c_t = 1 for all t ∈ T. Let Π_{p,T} = {π ∈ Π_cµ ∩ Π_priority : (p,T) has last priority} be the set of cµ-preserving, strict priority policies that give least priority to customers belonging to population p and service type T. Observe that the collection {Π_{p,T} : p ∈ P} forms a partition of Π_cµ ∩ Π_priority. Consider a sequence of queues with service capacity r_ε = ρ + ε; i.e., for ε > 0 we assume that the server serves customers at rate r_ε. For each population p ∈ P and service type t ∈ T, given a policy π ∈ Π, let W^{π,ε}_{p,t} be the expected steady-state wait time of (p,t)-customers when the server works at rate r_ε. Note that, when the server works at rate r_ε, the effective service rate for type t ∈ T is r_ε µ_t, and the effective second moment of the service time distribution is r_ε^{−2} m_t.

Claim 1: If π ∈ Π_priority is such that (p,t) does not have last priority, then W^{π,ε}_{p,t} = O(1) as ε → 0.

First,
\[
W^{\varepsilon}_0 = \sum_{t \in T} \frac{\lambda_t\, r_\varepsilon^{-2}\, m_t}{2} \to \frac{1}{\rho^2}W_0, \qquad \text{as } \varepsilon \to 0.
\]
Now, consider class (p,t). Let π ∈ Π_cµ and suppose that (p,t) has ith priority under π. Let ρ_j be the utilization of the jth priority class under policy π. Then, as ε → 0, the wait time of class (p,t) converges:
\[
W^{\pi,\varepsilon}_{p,t} = \frac{W^{\varepsilon}_0}{\left(1 - r_\varepsilon^{-1}\sum_{j=1}^{i-1}\rho_j\right)\left(1 - r_\varepsilon^{-1}\sum_{j=1}^{i}\rho_j\right)}
\to \frac{W_0/\rho^2}{\left(1 - \rho^{-1}\sum_{j=1}^{i-1}\rho_j\right)\left(1 - \rho^{-1}\sum_{j=1}^{i}\rho_j\right)} < \infty.
\]
Note that the limit is finite since, by assumption, i < P × T. ■

Claim 2: For all p ∈ P, W^{π,ε}_{p,T} = W^{π',ε}_{p,T} =: W^ε_{p,T} for all π, π' ∈ Π_{p,T}. This follows immediately from inspection of the cµ wait time formula:
\[
W^{\pi,\varepsilon}_{p,T} = \frac{W^{\varepsilon}_0}{\left(1 - r_\varepsilon^{-1}\sum_{j=1}^{P\times T - 1}\rho_j\right)\left(1 - r_\varepsilon^{-1}\rho\right)} = W^{\pi',\varepsilon}_{p,T}. \qquad \blacksquare
\]

Claim 3: For all p ∈ P and π ∈ Π_{p,T}, (W^{π,ε}_{p,T})^{−1} = O(ε) as ε → 0. Observe that, as ε → 0,
\[
\left(W^{\varepsilon}_{p,T}\right)^{-1} = \frac{\left(1 - r_\varepsilon^{-1}\sum_{j=1}^{P\times T-1}\rho_j\right)\left(1 - r_\varepsilon^{-1}\rho\right)}{W^{\varepsilon}_0}
= O(1)\left(1 - \frac{\rho}{\rho+\varepsilon}\right) = O(\varepsilon). \qquad \blacksquare
\]

It follows from Claims 1 and 2 that, as ε → 0, the population-p waiting cost under an arbitrary mixture over cµ-preserving priority policies can be expressed as
\[
C^{\varepsilon}_p = \sum_{\pi \in \Pi_{c\mu}} w^{\varepsilon}_\pi C^{\pi,\varepsilon}_p
= \sum_{\pi \in \Pi_{c\mu}} w^{\varepsilon}_\pi \sum_t \alpha_{p,t}\, c_t\, W^{\pi,\varepsilon}_{p,t}
= \sum_{\pi \in \Pi_{p,T}} w^{\varepsilon}_\pi\, \alpha_{p,T}\, c_T\, W^{\varepsilon}_{p,T} + O(1)
= \left(\sum_{\pi \in \Pi_{p,T}} w^{\varepsilon}_\pi\right)\alpha_{p,T}\, c_T\, W^{\varepsilon}_{p,T} + O(1).
\]
For convenience, set w^ε_p = Σ_{π∈Π_{p,T}} w^ε_π. Note that w^ε_p ≥ 0, and since {Π_{p,T}} forms a partition of Π_cµ ∩ Π_priority, Σ_p w^ε_p = 1. In words, w^ε_p is the total weight placed on strict, cµ-preserving priority policies that place (p,T) in last priority. Using this notation to rewrite the final line of the previous calculation, we get C^ε_p = w^ε_p α_{p,T} c_T W^ε_{p,T} + O(1) for each population p ∈ P. We now show that the weights given by
\[
w^{\varepsilon,*}_1 = \frac{1}{1 + \sum_{p \neq 1} \frac{\alpha_{1,T} W^{\varepsilon}_{1,T}}{\alpha_{p,T} W^{\varepsilon}_{p,T}}}, \qquad
w^{\varepsilon,*}_p = \frac{\alpha_{1,T} W^{\varepsilon}_{1,T}}{\alpha_{p,T} W^{\varepsilon}_{p,T}}\, w^{\varepsilon,*}_1,
\]
are "asymptotically fair" as utilization approaches one. The weights are strictly positive for all ε > 0, sum to one, and in light of Claim 3 converge to strictly positive values. Observe that, for all p ≠ 1,
\[
1 - \frac{C^{\varepsilon,*}_p}{C^{\varepsilon,*}_1}
= 1 - \frac{w^{\varepsilon,*}_p\, \alpha_{p,T}\, c_T\, W^{\varepsilon}_{p,T} + O(1)}{w^{\varepsilon,*}_1\, \alpha_{1,T}\, c_T\, W^{\varepsilon}_{1,T} + O(1)}
= 1 - \frac{w^{\varepsilon,*}_1\, \alpha_{1,T}\, c_T\, W^{\varepsilon}_{1,T} + O(1)}{w^{\varepsilon,*}_1\, \alpha_{1,T}\, c_T\, W^{\varepsilon}_{1,T} + O(1)}
= 1 - \frac{1+O(\varepsilon)}{1+O(\varepsilon)} = O(\varepsilon),
\]
as ε → 0. The result follows. ■
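The mechanics of Claims 1-3 are easy to see numerically. The following sketch (our own, with assumed utilizations) shows that as the capacity slack ε shrinks, all classes served before last keep O(1) waits while the last-priority class wait grows like 1/ε.

```python
# Numeric illustration of Claims 1-3 (illustrative parameters): as the
# capacity slack eps -> 0, every class served before last has O(1) wait,
# while the last-priority class wait grows like 1/eps.

rhos = [0.3, 0.4, 0.3]                        # utilizations, sum to rho = 1
for eps in (0.1, 0.01, 0.001):
    r = 1.0 + eps                             # server rate r_eps = rho + eps
    W0 = 1.0 / r**2                           # scaled residual work (Claim 1)
    s, waits = 0.0, []
    for rho_k in rhos:
        waits.append(W0 / ((1 - s / r) * (1 - (s + rho_k) / r)))
        s += rho_k
    print(eps, [round(w, 2) for w in waits])  # last entry ~ 1/eps
```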
Proof of Theorem 1.5.3 We prove Theorem 1.5.3 in the general setting with nonuniform holding costs. For this, we consider the more general version of problem (1.4):
\[
\min_{w \in \mathbb{R}^{(P\times T)!}} \quad \sum_{\pi \in \Pi_{\text{priority}}} w_\pi \left( \sum_{p \in P}\sum_{t \in T} \lambda_{p,t}\, c_t\, W^{\pi}_{p,t} \right)
\]
\[
\text{subject to} \quad \sum_{\pi \in \Pi_{\text{priority}}} w_\pi \left( \sum_{t \in T} \alpha_{p,t}\, c_t\, W^{\pi}_{p,t} - \alpha_{p+1,t}\, c_t\, W^{\pi}_{p+1,t} \right) = 0 \quad \forall p \in P \setminus \{P\},
\qquad \sum_{\pi \in \Pi_{\text{priority}}} w_\pi = 1, \qquad w \ge 0.
\]
Let (x, y) ∈ R^{|P|} be the dual variables associated with the constraints of the primal problem. Then the dual problem is given by
\[
\max_{(x,y) \in \mathbb{R}^{|P|}} \quad y
\qquad \text{subject to} \quad
\sum_{p=1}^{P-1} x_p \left( \sum_{t=1}^{T} \alpha_{p,t}\, c_t\, W^{\pi}_{p,t} - \alpha_{p+1,t}\, c_t\, W^{\pi}_{p+1,t} \right) + y \le \sum_{p=1}^{P}\sum_{t=1}^{T} \lambda_{p,t}\, c_t\, W^{\pi}_{p,t} \quad \forall \pi \in \Pi_{\text{priority}}.
\]
It suffices to show that the separation problem is solvable in time polynomial in P and T (see Chapter 8 of Bertsimas and Tsitsiklis, 1997). Given a dual vector (x, y), we must check whether it satisfies all the constraints of the dual and, if not, generate a violated constraint. Checking feasibility of (x, y) requires verifying the above inequality for every π ∈ Π_priority. This holds if and only if
\[
y \le \min_{\pi \in \Pi_{\text{priority}}} \left\{ \sum_{p \in P}\sum_{t \in T} \lambda_{p,t}\, c_t\, W^{\pi}_{p,t} - \sum_{p \in P \setminus \{P\}} x_p \sum_{t \in T} \left( \alpha_{p,t}\, c_t\, W^{\pi}_{p,t} - \alpha_{p+1,t}\, c_t\, W^{\pi}_{p+1,t} \right) \right\}. \tag{A.12}
\]
Define costs k_{p,t} by
\[
k_{p,t} = \begin{cases}
\dfrac{\lambda_{p,t}\, c_t - x_p\, \alpha_{p,t}\, c_t}{\lambda_{p,t}} & p = 1, \\[2mm]
\dfrac{\lambda_{p,t}\, c_t - \alpha_{p,t}\, c_t\,(x_p - x_{p-1})}{\lambda_{p,t}} & 1 < p < P, \\[2mm]
\dfrac{\lambda_{p,t}\, c_t + \alpha_{p,t}\, c_t\, x_{p-1}}{\lambda_{p,t}} & p = P.
\end{cases}
\]
It follows that the column-generation subproblem defined above is equivalent to the problem
\[
\min_{\pi \in \Pi_{\text{priority}}} \sum_{p=1}^{P}\sum_{t=1}^{T} k_{p,t}\, \lambda_{p,t}\, W^{\pi}_{p,t}. \tag{A.13}
\]
But observe that the solution to this problem is given by the cµ-policy with holding costs k. Finding the cµ-policy requires sorting the values k_{p,t} µ_t in descending order, which can be done in time polynomial in P and T. If (x, y) satisfies (A.12), then we are done; if not, we add the constraint corresponding to the minimizer of (A.13). Thus, we have solved the separation problem in time polynomial in P and T. ■
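The following Python sketch illustrates this separation oracle (all numbers are illustrative assumptions, and the helper name is our own): it builds the modified costs k_{p,t}, applies the cµ rule, evaluates the head-of-the-line wait times, and reports the minimizer of (A.13) as a violated constraint whenever its value falls below y.

```python
# Sketch of the separation oracle from the proof above (illustrative data):
# given dual values (x, y), build the modified costs k_{p,t}, solve the inner
# minimization with the c-mu rule (sort classes by k * mu, descending), and
# report a violated dual constraint if the minimum value drops below y.

def separation_oracle(lam, alpha, c, mu, x, y, W0=1.0):
    P, T = len(lam), len(lam[0])
    k = [[0.0] * T for _ in range(P)]
    for p in range(P):
        for t in range(T):
            if p == 0:                        # first population
                adj = -x[0] * alpha[p][t]
            elif p == P - 1:                  # last population
                adj = x[p - 1] * alpha[p][t]
            else:                             # middle populations
                adj = -(x[p] - x[p - 1]) * alpha[p][t]
            k[p][t] = c[t] + adj * c[t] / lam[p][t]
    # c-mu rule: serve classes in decreasing order of k_{p,t} * mu_t.
    order = sorted(((p, t) for p in range(P) for t in range(T)),
                   key=lambda pt: -k[pt[0]][pt[1]] * mu[pt[1]])
    s, val = 0.0, 0.0
    for p, t in order:                        # head-of-the-line wait times
        rho = lam[p][t] / mu[t]
        val += k[p][t] * lam[p][t] * W0 / ((1 - s) * (1 - s - rho))
        s += rho
    return (order, val) if val < y else None  # minimizer of (A.13), if violated

# Two populations, two types; x has one entry since |P| - 1 = 1.
lam = [[0.10, 0.20], [0.15, 0.10]]
alpha = [[0.4, 0.6], [0.6, 0.4]]
print(separation_oracle(lam, alpha, c=[1.0, 1.0], mu=[2.0, 1.0],
                        x=[0.5], y=0.4))
```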
Proof of Proposition 1.6.1 Observe that
\[
\frac{C^{\text{FIFO}}_A}{C^{\text{FIFO}}_B}
= \frac{\alpha_{A,1}\, c_1\, W^{\text{FIFO}} + \alpha_{A,2}\, c_2\, W^{\text{FIFO}}}{\alpha_{B,1}\, c_1\, W^{\text{FIFO}} + \alpha_{B,2}\, c_2\, W^{\text{FIFO}}}
= \frac{\alpha_{A,1}\, c_1 + \alpha_{A,2}\, c_2}{\alpha_{B,1}\, c_1 + \alpha_{B,2}\, c_2}
= \frac{\alpha_{A,1}(c_1 - c_2) + c_2}{\alpha_{B,1}(c_1 - c_2) + c_2}. \qquad \blacksquare
\]

Proof of Proposition 1.6.2 Recall that W^{cµ}_1 = W_0/(1−ρ_1) and W^{cµ}_2 = W_0/((1−ρ_1)(1−ρ)). Then observe that
\[
\frac{C^{c\mu}_A}{C^{c\mu}_B}
= \frac{\alpha_{A,1}\, c_1\, W^{c\mu}_1 + \alpha_{A,2}\, c_2\, W^{c\mu}_2}{\alpha_{B,1}\, c_1\, W^{c\mu}_1 + \alpha_{B,2}\, c_2\, W^{c\mu}_2}
= \frac{\alpha_{A,1}\, c_1\, \frac{W^{c\mu}_1}{W^{c\mu}_2} + \alpha_{A,2}\, c_2}{\alpha_{B,1}\, c_1\, \frac{W^{c\mu}_1}{W^{c\mu}_2} + \alpha_{B,2}\, c_2}
= \frac{\alpha_{A,1}\, c_1(1-\rho) + \alpha_{A,2}\, c_2}{\alpha_{B,1}\, c_1(1-\rho) + \alpha_{B,2}\, c_2}. \qquad \blacksquare
\]

Proof of Proposition 1.6.3 Given a wait time vector W ∈ W(Π), let C_p(W) = α_{p,1}c_1 W_{p,1} + α_{p,2}c_2 W_{p,2} for p ∈ {A,B}. Note that C_p : W(Π) → R is a continuous function on a convex set for each p ∈ {A,B}. Given a policy π ∈ Π, observe that C_p(W^π) = C^π_p. For W ∈ W(Π), define f(W) := C_A(W) − C_B(W), and observe that f is a continuous function on a convex set.

First, assume that C^{π_B}_A ≥ C^{π_B}_B and C^{π_A}_A ≤ C^{π_A}_B. Equivalently, C_A(W^{π_B}) ≥ C_B(W^{π_B}) and C_A(W^{π_A}) ≤ C_B(W^{π_A}). This implies f(W^{π_A}) ≤ 0 and f(W^{π_B}) ≥ 0. By the Intermediate Value Theorem, there exists W^π ∈ W(Π) such that f(W^π) = 0. Therefore, there exists a fair policy in Π.

Now, for the converse, assume that C^{π_A}_A > C^{π_A}_B. Recall that π_A is the policy that prioritizes in the order (A,1),(A,2),(B,2),(B,1). We claim that π_A minimizes C^π_A and simultaneously maximizes C^π_B. To prove this, consider the optimization problem minimize_{π∈Π} α_{A,1}c_1 W^π_{A,1} + α_{A,2}c_2 W^π_{A,2}. Multiplying by λ_A, we get the equivalent problem:
\[
\underset{\pi \in \Pi}{\text{minimize}} \quad \lambda_{A,1}\, c_1\, W^{\pi}_{A,1} + \lambda_{A,2}\, c_2\, W^{\pi}_{A,2} + \lambda_{B,1}(0)\, W^{\pi}_{B,1} + \lambda_{B,2}(0)\, W^{\pi}_{B,2}.
\]
This is just the standard head-of-the-line priority optimization problem. Since c_1 µ_1 ≥ c_2 µ_2 ≥ 0, we get that π_A is optimal. Now, consider the optimization problem maximize_{π∈Π} α_{B,1}c_1 W^π_{B,1} + α_{B,2}c_2 W^π_{B,2}. Multiplying by λ_B and switching from max to min, we get the equivalent problem:
\[
\underset{\pi \in \Pi}{\text{minimize}} \quad \lambda_{A,1}(0)\, W^{\pi}_{A,1} + \lambda_{A,2}(0)\, W^{\pi}_{A,2} + \lambda_{B,1}(-c_1)\, W^{\pi}_{B,1} + \lambda_{B,2}(-c_2)\, W^{\pi}_{B,2}.
\]
Again, this is a standard head-of-the-line priority optimization problem. Since −c_1 µ_1 ≤ −c_2 µ_2 ≤ 0, we get that π_A is optimal. This proves the claim. It follows that for all other policies π ∈ Π, C^π_A ≥ C^{π_A}_A > C^{π_A}_B ≥ C^π_B, which implies fairness is impossible. An identical argument applies to the case C^{π_B}_A < C^{π_B}_B. ■

Proof of Proposition 1.6.4 Let π be a fair, population-unaware policy, and let W^π be its corresponding wait time vector. Since π is population-unaware, we must have W^π_{A,t} = W^π_{B,t} = W^π_t for t ∈ {1,2}. Since π is fair, C^π_A = C^π_B. These two observations imply α_{A,1}c_1 W^π_1 + α_{A,2}c_2 W^π_2 = α_{B,1}c_1 W^π_1 + α_{B,2}c_2 W^π_2. Since α_{A,1} ≠ α_{B,1}, we get that c_1 W^π_1 = c_2 W^π_2. However, π must also satisfy wait time conservation: ρ_1 W^π_1 + ρ_2 W^π_2 = (ρ/(1−ρ)) W_0. Solving this system of equations for W^π_1 and W^π_2 yields the result. ■

Proof of Theorem 1.6.1 First, from the Kleinrock wait time conservation theorem, any cµ-preserving policy must have its corresponding wait time vector in the set
\[
\mathcal{W}(\Pi_{c\mu}) = \left\{ \left(W^{\pi}_{p,t}\right)_{p\in P,\, t\in T} : \pi \in \Pi_{c\mu} \right\}
= \left\{ \left(W^{\pi}_{p,t}\right)_{p\in P,\, t\in T} \in \mathcal{W}(\Pi) : \rho_{A,1}W_{A,1} + \rho_{B,1}W_{B,1} = \frac{\rho_1}{1-\rho_1}W_0 \right\}.
\]
Note that the set W(Π_cµ) is convex, since any mixture of two cµ-preserving wait time vectors is again cµ-preserving. We now proceed as in the proof of Proposition 1.6.3. Given a wait time vector W ∈ W(Π_cµ), let C_p(W) = α_{p,1}c_1 W_{p,1} + α_{p,2}c_2 W_{p,2} for p ∈ {A,B}. Note that C_p : W(Π_cµ) → R is a continuous function on a convex set for each p ∈ {A,B}. Given a policy π ∈ Π_cµ, observe that C_p(W^π) = C^π_p. For W ∈ W(Π_cµ), define f(W) := C_A(W) − C_B(W), a continuous function on a convex set.

First, assume that C^{cµ_A}_A ≤ C^{cµ_A}_B and C^{cµ_B}_A ≥ C^{cµ_B}_B. Equivalently, C_A(W^{cµ_A}) ≤ C_B(W^{cµ_A}) and C_A(W^{cµ_B}) ≥ C_B(W^{cµ_B}). This implies f(W^{cµ_A}) ≤ 0 and f(W^{cµ_B}) ≥ 0. By the Intermediate Value Theorem, there exists W^π ∈ W(Π_cµ) such that f(W^π) = 0. Therefore, there exists a fair policy in Π_cµ. For the converse, assume that C^{cµ_A}_A > C^{cµ_A}_B. Observe that cµ_A ∈ argmin{C^π_A : π ∈ Π_cµ} and cµ_A ∈ argmax{C^π_B : π ∈ Π_cµ}. Therefore, for any policy π ∈ Π_cµ, C^π_A ≥ C^{cµ_A}_A > C^{cµ_A}_B ≥ C^π_B, which implies there is no cµ-preserving policy that achieves waiting cost equity across populations A and B. An identical argument applies to the case C^{cµ_B}_A < C^{cµ_B}_B. ■

B Proofs for Chapter 2

B.1 Reduction of the Signal Space

Proof of Proposition 2.4.1 Note that (ii) implies (i). To prove that (i) implies (ii), let M be a BMP mechanism with signal space S that attains value v* by inducing the equilibrium (β,δ). For every s ∈ S, let p_s = {δ(a | s)}_{a∈A} denote the action distribution induced by signal realization s. Let {κ_1,...,κ_n} be the set of distinct price levels of M, and let S_i be the set of signals with cost κ_i. Let Γ_i = {p_s : s ∈ S_i} be the set of action distributions corresponding to signals in S_i. For γ_i ∈ Γ_i, we obtain the naturally induced prices κ_{γ_i} = κ_i. Set Γ = ∪_{i=1}^n Γ_i. Given γ_i ∈ Γ_i, for each ω ∈ Ω, define
\[
\tilde{\pi}(\gamma_i \mid \omega) = \sum_{s \in S_i :\, p_s = \gamma_i}\ \sum_{\omega^r \in \Omega} \beta(\omega^r \mid \omega)\, \pi(s \mid \omega^r).
\]
We claim that M̃ = (π̃, Γ, {κ_γ}_{γ∈Γ}) is a generalized-straightforward and direct BMP mechanism. For each ω ∈ Ω, let β̃(ω | ω) = 1, and for each a ∈ A and γ_i ∈ Γ, let δ̃(a | γ_i) = γ_i(a). We claim that (β̃, δ̃) is a Bayes Nash equilibrium. Observe that, given a report x ∈ Ω, Sender's payoff under π̃ is
\[
\sum_{\gamma \in \Gamma,\, a \in A} \tilde{\pi}(\gamma \mid x)\, \tilde{\delta}(a \mid \gamma)\left(v(a) - \kappa_\gamma\right)
= \sum_{i=1}^{n} \sum_{\gamma_i \in \Gamma_i,\, a \in A} \tilde{\pi}(\gamma_i \mid x)\, \tilde{\delta}(a \mid \gamma_i)\left(v(a) - \kappa_i\right)
= \sum_{i=1}^{n} \sum_{\gamma_i \in \Gamma_i,\, a \in A}\ \sum_{s \in S_i :\, p_s = \gamma_i}\ \sum_{\omega^r \in \Omega} \beta(\omega^r \mid x)\, \pi(s \mid \omega^r)\, p_s(a)\left(v(a) - \kappa_i\right)
\]
\[
= \sum_{\omega^r \in \Omega} \beta(\omega^r \mid x)\left( \sum_{s \in S,\, a \in A} \pi(s \mid \omega^r)\, \delta(a \mid s)\left(v(a) - \kappa_s\right) \right)
= \sum_{\omega^r \in \Omega} \beta(\omega^r \mid x)\, g(\omega^r \mid \omega)
\le \sum_{\omega^r \in \Omega} \beta(\omega^r \mid \omega)\, g(\omega^r \mid \omega)
= \sum_{\omega^r \in \Omega} \beta(\omega^r \mid \omega)\left( \sum_{s \in S,\, a \in A} \pi(s \mid \omega^r)\, \delta(a \mid s)\left(v(a) - \kappa_s\right) \right),
\]
where the final inequality follows from the fact that any ω^r ∈ supp β(· | ω) is a maximizer of g(· | ω). Therefore, this upper bound is achieved by truthful reporting, and so it is optimal for Sender to report truthfully.

Now let γ_i ∈ Γ_i be such that there exists ω' with π̃(γ_i | ω') > 0, and let a ∈ A with γ_i(a) = δ̃(a | γ_i) > 0. Note that for all s ∈ S_i such that p_s = γ_i, we have δ(a | s) = p_s(a) > 0, and since (β,δ) is a BNE for π, we must have
\[
a \in \arg\max_{a' \in A} \sum_{\omega,\, \omega^r \in \Omega} \mu_0(\omega)\, \beta(\omega^r \mid \omega)\, \pi(s \mid \omega^r)\, u(a', \omega).
\]
Now, the payoff that Receiver gets from action a' ∈ A under π̃, given Sender's direct strategy β̃, is
\[
\sum_{\omega,\, \omega^r \in \Omega} \mu_0(\omega)\, \tilde{\beta}(\omega^r \mid \omega)\, \tilde{\pi}(\gamma_i \mid \omega^r)\, u(a', \omega)
= \sum_{\omega \in \Omega} \mu_0(\omega)\, \tilde{\pi}(\gamma_i \mid \omega)\, u(a', \omega)
= \sum_{s \in S_i :\, p_s = \gamma_i} \left( \sum_{\omega,\, \omega^r \in \Omega} \mu_0(\omega)\, \beta(\omega^r \mid \omega)\, \pi(s \mid \omega^r)\, u(a', \omega) \right)
\le \sum_{s \in S_i :\, p_s = \gamma_i} \left( \sum_{\omega,\, \omega^r \in \Omega} \mu_0(\omega)\, \beta(\omega^r \mid \omega)\, \pi(s \mid \omega^r)\, u(a, \omega) \right),
\]
which shows that a is a maximizer for Receiver when signal γ_i is received.

Now we prove that (ii) implies (iii). Let π be a generalized-straightforward and direct signal with signal realizations s ∈ S ⊂ Δ(A) and corresponding beliefs µ_s ∈ Δ(Ω) given by
\[
\mu_s(\omega) = \frac{\pi(s \mid \omega)\,\mu_0(\omega)}{\sum_{\omega' \in \Omega} \pi(s \mid \omega')\,\mu_0(\omega')}.
\]
Observe that (µ_s, s) ∈ U for all s ∈ S. Let τ be the distribution induced by π, defined by
\[
\tau(\mu_s, s) = \sum_{\omega' \in \Omega} \pi(s \mid \omega')\, \mu_0(\omega') \qquad \forall s \in S.
\]
Let P be the induced cost partition comprised of the sets U_s = {(µ_s, s)} with cost κ_s. Then we have
\[
\mathbb{E}_\tau[\mu(\omega)] = \sum_{s \in S} \tau(\mu_s, s)\,\mu_s(\omega)
= \sum_{s \in S} \left( \sum_{\omega' \in \Omega} \pi(s \mid \omega')\,\mu_0(\omega') \right) \frac{\pi(s \mid \omega)\,\mu_0(\omega)}{\sum_{\omega' \in \Omega} \pi(s \mid \omega')\,\mu_0(\omega')} = \mu_0(\omega).
\]
So τ is Bayes Plausible. Furthermore, by the definition of τ and since π is direct, we see that
\[
\mathbb{E}_\tau\left[\left(\frac{\mu(\omega)}{\mu_0(\omega)} - \frac{\mu(\omega')}{\mu_0(\omega')}\right) v((\mu,\delta);\mathcal{P})\right]
= \sum_{s \in S} \tau(\mu_s, s)\left(\frac{\mu_s(\omega)}{\mu_0(\omega)} - \frac{\mu_s(\omega')}{\mu_0(\omega')}\right)\left(v(\mu_s, s) - \kappa_s\right)
= \sum_{s \in S} \left(\pi(s \mid \omega) - \pi(s \mid \omega')\right)\left(v(\mu_s, s) - \kappa_s\right)
= \sum_{s \in S,\, a \in A} \left(\pi(s \mid \omega) - \pi(s \mid \omega')\right) s(a)\left(v(a) - \kappa_s\right) = 0.
\]
Therefore, τ satisfies Payoff Plausibility. The value of τ is
\[
\mathbb{E}_\tau[v((\mu,\delta);\mathcal{P})] = \sum_{s \in S} \tau(\mu_s, s)\left(v(\mu_s, s) - \kappa_s\right)
= \sum_{s \in S,\, \omega \in \Omega} \left(v(\mu_s, s) - \kappa_s\right)\pi(s \mid \omega)\,\mu_0(\omega)
= \sum_{s \in S,\, \omega \in \Omega} \left( \sum_{a \in A} s(a)\, v(a) - \kappa_s \right)\pi(s \mid \omega)\,\mu_0(\omega) = v^*.
\]

Now we prove that (iii) implies (ii). Suppose τ is BP and PP with cost partition P and attains value v*. Index the support of τ by an index set S: supp τ = {(µ_s, δ_s)}_{s∈S}. Let {S_i}_{i=1}^n partition S such that s ∈ S_i if and only if κ_s = κ_i, if and only if (µ_s, δ_s) ∈ U_i. Then set
\[
\pi(\delta_s \mid \omega) = \frac{\tau(\mu_s, \delta_s)\,\mu_s(\omega)}{\mu_0(\omega)}. \tag{B.1}
\]
Since τ is BP, Σ_{s∈S} π(δ_s | ω) = 1 for each ω ∈ Ω. The belief upon receiving signal δ_s is
\[
P(\omega \mid \delta_s) = \frac{\pi(\delta_s \mid \omega)\,\mu_0(\omega)}{\sum_{\omega' \in \Omega} \pi(\delta_s \mid \omega')\,\mu_0(\omega')}
= \frac{\tau(\mu_s,\delta_s)\,\mu_s(\omega)}{\sum_{\omega' \in \Omega} \tau(\mu_s,\delta_s)\,\mu_s(\omega')} = \mu_s(\omega).
\]
Since δ_s ∈ Δ(â(µ_s)), it follows immediately that π is straightforward. Furthermore, since τ satisfies Payoff Plausibility, we have
\[
0 = \mathbb{E}_\tau\left[\left(\frac{\mu_s(\omega)}{\mu_0(\omega)} - \frac{\mu_s(\omega')}{\mu_0(\omega')}\right) v((\mu_s,\delta_s);\mathcal{P})\right]
= \sum_{s \in S} \left(\pi(\delta_s \mid \omega) - \pi(\delta_s \mid \omega')\right)\left(v(\mu_s,\delta_s) - \kappa_s\right)
= \sum_{s \in S,\, a \in A} \left(\pi(\delta_s \mid \omega) - \pi(\delta_s \mid \omega')\right)\delta_s(a)\left(v(a) - \kappa_s\right),
\]
which shows that π is direct. The value of the mechanism is seen to be v* by a calculation analogous to the one in the proof that (ii) implies (iii). ■
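The construction (B.1) is easy to check numerically. The following sketch (our own; the prior and support are assumed numbers) turns a Bayes-plausible belief-action distribution into a signal mechanism and verifies that each conditional distribution sums to one.

```python
# Sketch of construction (B.1) with assumed numbers: turn a belief-action
# distribution tau into a signal mechanism pi(delta_s | omega), and check
# that each conditional distribution sums to one.

mu0 = 0.3                                   # prior P(omega = A), assumed
support = [                                 # (posterior mu_s, weight tau_s)
    (0.0, 0.54),                            # e.g., recommend B
    (0.6, 0.40),                            # e.g., recommend H
    (1.0, 0.06),                            # e.g., recommend A
]
assert abs(sum(t * m for m, t in support) - mu0) < 1e-12   # Bayes plausible

for omega, prior in (("A", mu0), ("B", 1 - mu0)):
    belief = lambda m: m if omega == "A" else 1 - m
    pi = [t * belief(m) / prior for m, t in support]        # equation (B.1)
    print(omega, [round(q, 4) for q in pi], "sum =", round(sum(pi), 6))
```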
B.2 Constructing an Optimal Mechanism for the Information Selling Problem

Given a fixed set of cost levels, we show that there is a BMP-2 mechanism that performs weakly better than any other BMP mechanism. We do so by first constructing an upper bound on (OPT-P) for every cost partition P that partitions according to the fixed list of cost levels, and then by showing that the upper bound is achieved by a BMP-2 mechanism. We note that the following argument provides the basis of the proof of Theorem 2.6.1, as well as the proof of Proposition 2.7.1. Formally, we will prove the following proposition.

Proposition B.1 For the Information Selling problem, V_BMP-2(µ0; κ_1, κ_n) ≥ V(µ0; P) for any cost partition P with cost levels κ_1 ≤ κ_2 ≤ ··· ≤ κ_n.

Proof of Proposition B.1 Suppose we are given a cost partition P on the n price levels {κ_1,...,κ_n} with κ_1 ≤ κ_2 ≤ ··· ≤ κ_n. Recall that the optimal BMP mechanism given this cost partition P solves (OPT-P):
\[
V(\mu_0;\mathcal{P}) = \max_{\tau \in \Delta(\mathcal{U})}\ \mathbb{E}_\tau[v((\mu,\delta);\mathcal{P})]
\quad \text{s.t.} \quad \mathbb{E}_\tau[\mu - \mu_0] = 0, \qquad \mathbb{E}_\tau[(\mu - \mu_0)\, v((\mu,\delta);\mathcal{P})] = 0.
\]
Since |Ω| = 2, we take µ0, µ ∈ [0,1] to represent the belief that Project A is better; the P.P. constraint of (OPT-P) simplifies accordingly. Define U_mod = {(µ,δ,κ) : (µ,δ) ∈ U, κ ∈ {κ_1,...,κ_n}}, and define a modified payoff function v_mod by v_mod(µ,δ,κ) = v(µ,δ) + (κ_n − κ). We consider the following optimization problem:
\[
V_{\text{mod}}(\mu_0) = \max_{\tau \in \Delta(\mathcal{U}_{\text{mod}})}\ \mathbb{E}_\tau[v_{\text{mod}}(\mu,\delta,\kappa)]
\quad \text{s.t.} \quad \mathbb{E}_\tau[\mu - \mu_0] = 0, \qquad \mathbb{E}_\tau[(\mu - \mu_0)\, v_{\text{mod}}(\mu,\delta,\kappa)] = 0. \tag{Primal}
\]
Note that this problem has the same B.P. and P.P. constraints as (OPT-P), applied to the modified payoff function. In fact, (Primal) is a relaxation of (OPT-P), in which any of the costs may be applied at each compatible belief-action pair. We make this formal now.

Claim B.1 V_mod(µ0) − κ_n ≥ V(µ0; P).

Proof: First observe that v_mod(µ, δ, v(µ,δ) − v((µ,δ);P)) = v((µ,δ);P) + κ_n. Let τ be optimal for problem (OPT-P). Define the naturally induced distribution τ̃ ∈ Δ(U_mod) by τ̃(µ, δ, v(µ,δ) − v((µ,δ);P)) = τ(µ,δ) for all (µ,δ) ∈ supp τ, and zero otherwise. By inspection, τ̃ inherits Bayes Plausibility and Payoff Plausibility from τ, and thus is feasible for the modified problem. So τ̃ attains a value of V(µ0;P) + κ_n, and thus V_mod(µ0) − κ_n ≥ (V(µ0;P) + κ_n) − κ_n = V(µ0;P). ■

The Lagrangian dual of (Primal) is equivalent to the following problem:
\[
\min_{(\lambda,\rho,g) \in \mathbb{R}^3}\ g
\quad \text{s.t.} \quad v_{\text{mod}}(\mu,\delta,\kappa) + \lambda(\mu - \mu_0) - \rho(\mu-\mu_0)\, v_{\text{mod}}(\mu,\delta,\kappa) \le g \quad \forall (\mu,\delta,\kappa) \in \mathcal{U}_{\text{mod}}. \tag{Dual}
\]
For a proof that this is in fact the Lagrangian dual of the primal problem, note that the argument in the proof of Theorem 2.5.2, given below in Appendix B.3, applies here as well upon replacing v((µ,δ);P) by v_mod(µ,δ,κ). We now construct a feasible solution of (Dual) to obtain an upper bound on (Primal), and thus on the value of all BMP mechanisms with costs in the set {κ_1,...,κ_n}.

Claim B.2 Consider the dual variables
\[
\lambda^* = \frac{\hat{\ell}\,\hat{e}\,\underline{\mu}}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0}, \qquad
\rho^* = \frac{\hat{e} - \hat{\ell}(1-\underline{\mu})}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0}, \qquad
g^* = \frac{\hat{\ell}\,\hat{e}\,\underline{\mu}(1-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0},
\]
where ℓ̂ := κ_n − κ_1 is the highest possible modified payoff on [0, µ̲), and ê := e + (κ_n − κ_1) is the highest possible modified payoff on [µ̲, µ̄]. Then (λ*, ρ*, g*) is dual feasible.

Proof: Observe that ρ*(µ − µ0) = 1 if and only if
\[
\mu = \frac{1}{\rho^*} + \mu_0 = \frac{\hat{e}\,\underline{\mu}}{\hat{e} - \hat{\ell}(1-\underline{\mu})} =: \mu_c.
\]
It is straightforward to see that µ_c ∈ [µ̲, µ̄].
Furthermore, at this belief, the constraints of problem (Dual) reduce to λ*/ρ* ≤ g*:
\[
\frac{\hat{\ell}\,\hat{e}\,\underline{\mu}}{\hat{e} - \hat{\ell}(1-\underline{\mu})} \le \frac{\hat{\ell}\,\hat{e}\,\underline{\mu}(1-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0}
\iff \hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0 \le \left(\hat{e} - \hat{\ell}(1-\underline{\mu})\right)(1-\mu_0)
\iff (1-\underline{\mu})\left(\hat{e} - \hat{\ell}\right) \ge 0,
\]
which is true since ê = e + (κ_n − κ_1) ≥ κ_n − κ_1 = ℓ̂. Now, it suffices to show that
\[
v_{\text{mod}}(\mu,\delta,\kappa) \le \frac{g^* - \lambda^*(\mu-\mu_0)}{1 - \rho^*(\mu-\mu_0)} \quad \text{if } \rho^*(\mu-\mu_0) < 1, \qquad
v_{\text{mod}}(\mu,\delta,\kappa) \ge \frac{g^* - \lambda^*(\mu-\mu_0)}{1 - \rho^*(\mu-\mu_0)} \quad \text{if } \rho^*(\mu-\mu_0) > 1.
\]
First, observe that
\[
\frac{g^* - \lambda^*(\mu-\mu_0)}{1 - \rho^*(\mu-\mu_0)} = \frac{\hat{\ell}\,\hat{e}\,\underline{\mu}(1-\mu)}{\hat{e}(\underline{\mu}-\mu) + \hat{\ell}(1-\underline{\mu})\mu}
\]
is a rational function of µ with a vertical asymptote at µ = µ_c that takes the value ℓ̂ at µ = 0, ê at µ = µ̲, and 0 at µ = 1. Furthermore, this function is increasing on the interval [0, µ_c) and increasing on the interval (µ_c, 1]. Since ℓ̂ and ê are the highest possible modified payoffs on the intervals [0, µ̲) and [µ̲, µ̄], and µ_c ∈ [µ̲, µ̄], it follows that the rational function lies above v_mod(µ,δ,κ) when µ < µ_c. Similarly, since 0 is the lowest possible modified payoff on the interval [µ̄, 1] and a lower bound on the modified payoff on [µ̲, µ̄], it follows that the rational function lies below v_mod(µ,δ,κ) when µ > µ_c, which concludes the proof of the claim. ■

Now that we have a feasible solution of (Dual) providing an upper bound, we construct a feasible BMP mechanism that achieves the value g* − κ_n.

Claim B.3 Let l = (0, B), m = (µ̲, H) and r = (1, A). Consider the distribution τ* defined by
\[
\tau^*(l) = \frac{\hat{e}(\underline{\mu}-\mu_0)(1-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0}, \qquad
\tau^*(m) = \frac{\hat{\ell}\,\mu_0(1-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0}, \qquad
\tau^*(r) = \frac{\mu_0(\underline{\mu}-\mu_0)(\hat{e}-\hat{\ell})}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0},
\]
with price partition P* defined by U_1 = U \ {r} with corresponding cost κ_1, and U_n = {r} with corresponding cost κ_n. Then τ* satisfies B.P. and P.P. and achieves value g* − κ_n.

Proof: First observe that
\[
\tau^*(l) + \tau^*(m) + \tau^*(r) = \frac{\hat{e}(\underline{\mu}-\mu_0)(1-\mu_0+\mu_0) + \hat{\ell}\,\mu_0\left((1-\mu_0) - (\underline{\mu}-\mu_0)\right)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0} = 1.
\]
The denominators are nonnegative since µ0 < µ̲ < 1; using this together with ê ≥ ℓ̂, the numerators are also nonnegative, so τ* ≥ 0. Therefore, τ* as defined is a valid distribution on U. Now we check that τ* is Bayes Plausible:
\[
\mathbb{E}_{\tau^*}[\mu - \mu_0] = \tau^*(l)\cdot 0 + \tau^*(m)\,\underline{\mu} + \tau^*(r)\cdot 1 - \mu_0
= \frac{\hat{\ell}\,\mu_0(1-\mu_0)\underline{\mu} + \mu_0(\underline{\mu}-\mu_0)(\hat{e}-\hat{\ell})}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0} - \mu_0
= \mu_0\cdot\frac{\hat{\ell}\,\mu_0(1-\underline{\mu}) + \hat{e}(\underline{\mu}-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0} - \mu_0 = 0.
\]
Next we check that Payoff Plausibility is satisfied:
\[
\mathbb{E}_{\tau^*}[(\mu-\mu_0)\, v((\mu,\delta);\mathcal{P}^*)]
= \tau^*(l)(0-\mu_0)(-\kappa_1) + \tau^*(m)(\underline{\mu}-\mu_0)(e-\kappa_1) + \tau^*(r)(1-\mu_0)(-\kappa_n)
= \frac{\mu_0(1-\mu_0)(\underline{\mu}-\mu_0)\left(\hat{e}\kappa_1 + \hat{\ell}(e-\kappa_1) - e\kappa_n\right)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0} = 0,
\]
since direct expansion gives êκ_1 + ℓ̂(e−κ_1) = eκ_1 + κ_nκ_1 − κ_1² + eκ_n − eκ_1 − κ_1κ_n + κ_1² = eκ_n. So τ* is a feasible BMP-2 mechanism. Observe that
\[
\mathbb{E}_{\tau^*}[v((\mu,\delta);\mathcal{P}^*) + \kappa_n]
= \tau^*(l)(\kappa_n-\kappa_1) + \tau^*(m)(e-\kappa_1+\kappa_n) + \tau^*(r)(\kappa_n-\kappa_n)
= \frac{\hat{\ell}\,\hat{e}(\underline{\mu}-\mu_0)(1-\mu_0) + \hat{\ell}\,\hat{e}\,\mu_0(1-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0}
= \frac{\hat{\ell}\,\hat{e}\,\underline{\mu}(1-\mu_0)}{\hat{e}(\underline{\mu}-\mu_0) + \hat{\ell}(1-\underline{\mu})\mu_0} = g^*.
\]
Therefore, τ* achieves a value of g* − κ_n. ■

Clearly τ*, as defined above, corresponds to a feasible BMP-2 mechanism with low cost κ_1 and high cost κ_n. Moreover, by Claim B.1, this BMP-2 mechanism achieves the highest possible value of g* − κ_n = V_BMP-2(µ0; κ_1, κ_n) among all mechanisms with prices in the set {κ_1,...,κ_n}. This finishes the proof. ■
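Claim B.3 can also be verified numerically. The following sketch (our own; the fee e, the threshold µ̲, the prior, and the two prices are assumed values) checks that τ* is Bayes plausible, payoff plausible, and attains g* − κ_n.

```python
# Numeric check of Claim B.3 with assumed parameters: the BMP-2 distribution
# tau* is Bayes plausible, payoff plausible, and attains value g* - kappa_n.

e, mu_lo, mu0 = 1.0, 0.25, 0.1          # consultant fee e; mu_lo plays the
k1, kn = 0.05, 0.5                      # role of the lower threshold; prices
lhat, ehat = kn - k1, e + (kn - k1)
D = ehat * (mu_lo - mu0) + lhat * (1 - mu_lo) * mu0

# support: l = (0, B, cost k1), m = (mu_lo, H, cost k1), r = (1, A, cost kn)
tau = [ehat * (mu_lo - mu0) * (1 - mu0) / D,
       lhat * mu0 * (1 - mu0) / D,
       mu0 * (mu_lo - mu0) * (ehat - lhat) / D]
mus, pays = [0.0, mu_lo, 1.0], [0.0 - k1, e - k1, 0.0 - kn]

assert abs(sum(tau) - 1) < 1e-12
assert abs(sum(t * m for t, m in zip(tau, mus)) - mu0) < 1e-12         # B.P.
assert abs(sum(t * (m - mu0) * v
               for t, m, v in zip(tau, mus, pays))) < 1e-12            # P.P.
gstar = lhat * ehat * mu_lo * (1 - mu0) / D
print(sum(t * v for t, v in zip(tau, pays)), gstar - kn)  # both ~ 0.08433
```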
After inspection, we see that B is optimal when 0 ≤ µ ≤ e/h = µ, H is optimal when µ = e/h ≤ µ ≤ (1−e/h = µ, and A is optimal when µ = 1−e/h ≤ µ ≤ 1. The points of indifference are at µ = e/h and µ = 1−e/h, respectively. Therefore, the Consultant receives a payoff of 0 on the interval [0,µ)∪(µ,1]. At the belief µ the user may mix between actions B and H, which gives rise to payoffs eδ(H) for δ(H) ∈ [0,1]. At the belief µ the user may mix between actions H and A, which gives rise to payoffs eδ(H) for δ(H) ∈ [0,1]. ■ Proof of Theorem 2.5.1. Let τ ∗ be an optimal Bayes Plausible and Payoff Plausible distribution on U with support denoted by {(µm,δm)}. Let v ∗ be the value of τ ∗ . We proceed by showing that V(µ0;P) ≥ v ∗ and V(µ0;P) ≤ v ∗ . Observe that (µm,y((µm,δm);P), v((µm,δm);P)) ∈ GP for all m. Since τ ∗ is Bayes Plausible, we have ∑m τ ∗ (µm,δm)µm = µ0. Since τ ∗ is Payoff Plausible, we immediately see that ∑m τ ∗ (µm,δm)y((µm,δm);P) = 0. Finally, we have that v ∗ = ∑m τ ∗ (µm,δm)v((µm,δm);P). Therefore, (µ0,0, v ∗ ) ∈ co(G). By definition, V(µ0;P) ≥ v ∗ . Now, let (µ0,0,z) ∈ co(GP). By Caratheodory’s Theorem, there exists finitely many elements (µm,y((µm,δm);P), v((µm,δm);P)) ∈ GP with corresponding weights τm, satisfying 124 ∑m τm(µm,y((µm,δm);P), v((µm,δm);P) = (µ0,0,z). Notice that τ can be viewed as a distribution on U . It is Bayes Plausible since ∑m τmµm = µ0. Also, it is Payoff Plausible since for each ω,ω ′ ∈ Ω, ∑m τm µm(ω) µ0(ω) − µm(ω ′ ) µ0(ω′) v((µm,δm);P) = ∑m τmyω,ω′((µm,δm);P) = 0. Therefore, τ is a BP and PP distribution on U that achieves value z. By Proposition 1, there exists a generalized-straightforward and direct signal that achieves value z. We have shown that if (µ0,0,z) ∈ co(GP), then there exists a generalized-straightforward and direct signal that achieves value z. Since every generalized-straightforward and direct signal achieves value at most v ∗ , it follows that the set {z : (µ0,0,z) ∈ co(GP)} is bounded above by v ∗ . Therefore, V(µ0;P) ≤ v ∗ , which completes the proof. ■ Proof of Theorem 2.5.2. For ω ∈ Ω, let λω ∈ R be the dual variables corresponding to the Bayes Plausibility constraints. Similarly, for each ω ̸= ω ′ ∈ Ω let ρω,ω′ ∈ R be the dual variables corresponding to the Payoff Plausibility constraints. Then, the Lagrangian function is given by L(τ,λ,ρ) = Eτ [v((µ,δ);P)] +∑ ω λωEτ [µ(ω)− µ0(ω)]− ∑ ω̸=ω′ ρω,ω′Eτ µ(ω) µ0(ω) − µ(ω ′ ) µ0(ω′) v((µ,δ);P) = Eτ " v((µ,δ);P) +∑ ω λω(µ(ω)− µ0(ω))− ∑ ω̸=ω′ ρω,ω′ µ(ω) µ0(ω) − µ(ω ′ ) µ0(ω′) v((µ,δ);P) # . The Lagrangian dual function is given by g(λ,ρ) = max τ∈∆(U ) L(τ,λ,ρ), 125 Observe that, since the dual function is optimizing over distributions τ ∈ ∆(U ), it will simply place all weight on the set of maximizers of the integrand of the expectation. Therefore, the dual function g := g(λ,ρ) is equivalently given by g = max (µ,δ)∈U " v((µ,δ);P) +∑ ω λω(µ(ω)− µ0(ω))− ∑ ω̸=ω′ ρω,ω′ µ(ω) µ0(ω) − µ(ω ′ ) µ0(ω′) v((µ,δ);P) # . Note that, since g is a maximum, this relationship holds if and only if g is the smallest real number satisfying v((µ,δ);P) +∑ ω λω(µ(ω)− µ0(ω))− ∑ ω̸=ω′ ρω,ω′ µ(ω) µ0(ω) − µ(ω ′ ) µ0(ω′) v((µ,δ);P) ≤ g, ∀(µ,δ) ∈ U . In other words, the Lagrangian dual maxλ,ρ g(λ,ρ) can be equivalently written as max g,λ,ρ g s.t. v((µ,δ);P) +∑ ω λω(µ(ω)− µ0(ω))− ∑ ω̸=ω′ ρω,ω′ µ(ω) µ0(ω) − µ(ω ′ ) µ0(ω′) v((µ,δ);P) ≤ g, ∀(µ,δ) ∈ U . We obtain the second part of the result by rearranging the constraints so that v((µ,δ);P) is isolated to one side of the inequality. ■ Proof of Theorem 2.6.1. 
Proof of Theorem 2.6.1. The proof given in Appendix B.2 with cost levels $\{\kappa_\ell,\kappa_h\}$ proves (ii) and (iii). To prove (i), take the expression for the optimal payoff given by (ii) and observe that when $\mu_0=0$ the optimal payoff is $-\kappa_\ell\le0$, and when $\mu_0=\bar\mu$ the optimal payoff is $e-\kappa_\ell$. It is easy to see that the optimal payoff is increasing in $\mu_0$, and so as long as $\kappa_\ell<e$ and $\Delta\kappa>0$, there exists a region in $(0,\bar\mu)$ where the payoff is strictly positive, which concludes the proof. ■

Proof of Proposition 2.6.1. Note that
$$V_{\mathrm{BMP\text{-}2}}(\mu_0;\kappa_\ell,\kappa_h) = \frac{\Delta\kappa(e+\Delta\kappa)\bar\mu(1-\mu_0)}{(e+\Delta\kappa)(\bar\mu-\mu_0)+\Delta\kappa(1-\bar\mu)\mu_0} - \kappa_h = \frac{\Delta\kappa(e+\Delta\kappa)\bar\mu(1-\mu_0)}{e(\bar\mu-\mu_0)+\Delta\kappa\bar\mu(1-\mu_0)} - \kappa_h.$$
Hence, writing $\kappa_h = \kappa_\ell+\Delta\kappa$ and differentiating, the partial derivative of $V_{\mathrm{BMP\text{-}2}}(\mu_0;\kappa_\ell,\kappa_h)$ with respect to $\Delta\kappa$ is
$$\frac{\partial V_{\mathrm{BMP\text{-}2}}}{\partial\Delta\kappa} = \frac{\big[e(\bar\mu-\mu_0)+\Delta\kappa\bar\mu(1-\mu_0)\big]\big[e\bar\mu(1-\mu_0)+2\Delta\kappa\bar\mu(1-\mu_0)\big] - \big[\Delta\kappa(e+\Delta\kappa)\bar\mu(1-\mu_0)\big]\big[\bar\mu(1-\mu_0)\big]}{\big[e(\bar\mu-\mu_0)+\Delta\kappa\bar\mu(1-\mu_0)\big]^2} - 1.$$
Evaluating this derivative at $\Delta\kappa=0$, we get
$$\left.\frac{\partial V_{\mathrm{BMP\text{-}2}}}{\partial\Delta\kappa}\right|_{\Delta\kappa=0} = \frac{e^2\bar\mu(\bar\mu-\mu_0)(1-\mu_0)}{e^2(\bar\mu-\mu_0)^2} - 1 = \frac{\bar\mu(1-\mu_0)}{\bar\mu-\mu_0} - 1 = \frac{\mu_0(1-\bar\mu)}{\bar\mu-\mu_0},$$
which is positive whenever $\bar\mu>\mu_0$. From the proof in Appendix B.2, when $\Delta\kappa=0$ the optimal dual variable for the P.P. constraint is $\rho^* = 1/(\bar\mu-\mu_0)$, and $\tau(r)=\mu_0$ is the weight placed on $r$ in the optimal primal solution. Therefore, we see that
$$-\tau(r)+\rho^*\tau(r)(1-\mu_0) = -\mu_0 + \frac{\mu_0(1-\mu_0)}{\bar\mu-\mu_0} = \frac{-\mu_0\bar\mu+\mu_0^2+\mu_0-\mu_0^2}{\bar\mu-\mu_0} = \frac{\mu_0(1-\bar\mu)}{\bar\mu-\mu_0},$$
as desired. ■

Proof of Proposition 2.7.1. This follows directly from the proof of Proposition B.1 given in Appendix B.2 upon identifying $\kappa_\ell = \kappa_1$ and $\kappa_h = \kappa_n$. ■

Proof of Proposition 2.7.2. Proposition 2.7.1 says that we can restrict our attention to the class of BMP-2 mechanisms without loss of optimality. Moreover, the corresponding value of the optimal BMP-2 mechanism for the information selling problem is given explicitly by Theorem 2.6.1. It is straightforward to check that both the value and the belief-action distribution of BMP-2 mechanisms depend on the prices only through the high price $\kappa_h$ and the price difference $\Delta\kappa = \kappa_h-\kappa_\ell$ (indeed, for fixed $\kappa_h$, lowering $\kappa_\ell$ only increases $\Delta\kappa$, and the value is increasing in $\Delta\kappa$). Therefore, without loss of generality, we can take $\kappa_\ell = 0$, so that $\Delta\kappa = \kappa_h$. In light of these observations, the value of (OPT-1) is given by
$$\sup_{\mathcal P} V(\mu_0;\mathcal P) = \sup_{\kappa_h\ge0} V_{\mathrm{BMP\text{-}2}}(\mu_0;0,\kappa_h) = \sup_{\kappa_h\ge0}\left\{\frac{\kappa_h(e+\kappa_h)\bar\mu(1-\mu_0)}{(e+\kappa_h)(\bar\mu-\mu_0)+\kappa_h(1-\bar\mu)\mu_0} - \kappa_h\right\}$$
$$= \sup_{\kappa_h\ge0}\left\{\frac{\kappa_h(e+\kappa_h)\bar\mu(1-\mu_0) - \kappa_h(e+\kappa_h)(\bar\mu-\mu_0) - \kappa_h^2(1-\bar\mu)\mu_0}{(e+\kappa_h)(\bar\mu-\mu_0)+\kappa_h(1-\bar\mu)\mu_0}\right\} = \sup_{\kappa_h\ge0}\left\{\frac{\kappa_h(e+\kappa_h)\mu_0(1-\bar\mu) - \kappa_h^2(1-\bar\mu)\mu_0}{(e+\kappa_h)(\bar\mu-\mu_0)+\kappa_h(1-\bar\mu)\mu_0}\right\}$$
$$= \sup_{\kappa_h\ge0}\left\{\frac{\kappa_h\, e\,\mu_0(1-\bar\mu)}{(e+\kappa_h)(\bar\mu-\mu_0)+\kappa_h(1-\bar\mu)\mu_0}\right\} = \sup_{\kappa_h\ge0}\ \frac{e\,\mu_0(1-\bar\mu)}{\bar\mu(1-\mu_0)+\frac{e(\bar\mu-\mu_0)}{\kappa_h}} = \frac{e\,\mu_0(1-\bar\mu)}{\bar\mu(1-\mu_0)}.$$
Finally, note that by a straightforward application of the results of Kamenica and Gentzkow (2011), we get $V_{\mathrm{commit}} = e\mu_0/\bar\mu$ by noticing that the concave closure of the graph of the Sender's payoff function is the line connecting $(0,0)$ and $(\bar\mu,e)$. Combining this with the above limit expression yields the result. ■
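The supremum computed in the proof of Proposition 2.7.2 is approached monotonically as $\kappa_h\to\infty$. A minimal Python sketch, with illustrative parameter values, evaluating the closed-form value at increasing $\kappa_h$:

```python
# Numerical look at Proposition 2.7.2: with kappa_l = 0, the value
# V_BMP-2(mu0; 0, kappa_h) increases in kappa_h and approaches the supremum
# e*mu0*(1 - mu_bar) / (mu_bar*(1 - mu0)) as kappa_h grows without bound.
# Parameter values are illustrative assumptions.
e, mu_bar, mu0 = 1.0, 0.6, 0.3

def v_bmp2(kh: float) -> float:
    """Closed-form value of the optimal BMP-2 mechanism with prices (0, kh)."""
    num = kh * (e + kh) * mu_bar * (1 - mu0)
    den = (e + kh) * (mu_bar - mu0) + kh * (1 - mu_bar) * mu0
    return num / den - kh

limit = e * mu0 * (1 - mu_bar) / (mu_bar * (1 - mu0))
for kh in [0.1, 1.0, 10.0, 100.0, 1000.0]:
    print(f"kappa_h = {kh:8.1f}: value = {v_bmp2(kh):.6f}")
print(f"supremum (kappa_h -> infinity): {limit:.6f}")
```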
B.4 Calculations for Table 2.1

Here we compute the BMP mechanism $\pi$ given in Table 2.1. To do so, we leverage the optimal BMP-2 belief-action distribution already computed in Theorem 2.6.1. When $\kappa_\ell=0$ we have $\Delta\kappa=\kappa_h$, and this distribution further simplifies to
$$\tau(l) = \frac{(e+\kappa_h)(\bar\mu-\mu_0)(1-\mu_0)}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}, \qquad \tau(m) = \frac{\kappa_h\mu_0(1-\mu_0)}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}, \qquad \tau(r) = \frac{e\,\mu_0(\bar\mu-\mu_0)}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}.$$
We construct $\pi$ according to the construction given by (B.1) in the proof of Proposition 2.4.1 of Appendix B.1.

The Consultant sends the message "Project A" to persuade the Firm to select Project A. This corresponds to the belief-action pair $r=(1,A)$, which occurs with probability $\tau(r)$. Thus, the corresponding conditional probabilities of sending the "Project A" signal are
$$\pi(\text{Project A}\mid A) = \tau(r)\cdot\frac{1}{\mu_0} = \frac{e(\bar\mu-\mu_0)}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}, \qquad \pi(\text{Project A}\mid B) = \tau(r)\cdot\frac{0}{1-\mu_0} = 0.$$
Similarly, the Consultant sends the message "Hire" to persuade the Firm to hire. This corresponds to the belief-action pair $m=(\bar\mu,H)$, which occurs with probability $\tau(m)$. Thus, the corresponding conditional probabilities of sending the "Hire" signal are
$$\pi(\text{Hire}\mid A) = \tau(m)\cdot\frac{\bar\mu}{\mu_0} = \frac{\kappa_h\bar\mu(1-\mu_0)}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}, \qquad \pi(\text{Hire}\mid B) = \tau(m)\cdot\frac{1-\bar\mu}{1-\mu_0} = \frac{\kappa_h(1-\bar\mu)\mu_0}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}.$$
Finally, the Consultant sends the message "Project B" to persuade the Firm to select Project B. This corresponds to the belief-action pair $l=(0,B)$, which occurs with probability $\tau(l)$. Thus, the corresponding conditional probabilities of sending the "Project B" signal are
$$\pi(\text{Project B}\mid A) = \tau(l)\cdot\frac{0}{\mu_0} = 0, \qquad \pi(\text{Project B}\mid B) = \tau(l)\cdot\frac{1}{1-\mu_0} = \frac{(e+\kappa_h)(\bar\mu-\mu_0)}{e(\bar\mu-\mu_0)+\kappa_h\bar\mu(1-\mu_0)}.$$ ■

C Proofs for Chapter 3

C.1 Proof of Theorem 3.4.1

Lemma C.1 Under the $\pi^{\mathrm{true},i}$ mechanism, if $Q_t=1$, then $N^{\mathrm{out}}_i(\hat G) = N^{\mathrm{out}}_i(G)$ for all $\hat G\in\hat{\mathcal G}_t$.

Proof: Suppose $Q_t=1$. If $j\in N^{\mathrm{out}}_i(G)$, then $b\in S_{j,t}$. Since the $b$ message is sent only when $Q_t=1$, customer $j$ knows for certain that $Q_t=1$, and thus purchases the product. On the other hand, if $j\notin N^{\mathrm{out}}_i(G)$, then $S_{j,t}$ consists only of empty messages. Therefore, $j$ learns nothing about the state, resorts to their prior belief on $Q_t$, and does not purchase the product. Now suppose $(i,j)\in E(\hat G)$ but $(i,j)\notin E(G)$. Then $a_{j,t}=n$ but $a(\hat S_{j,t})=b$. Similarly, if $(i,j)\notin E(\hat G)$ but $(i,j)\in E(G)$, then $a_{j,t}=b$ but $a_j(\hat S_{j,t})=n$. In either case, $\hat G$ is incompatible with the data. The result follows. ■

Proof of Theorem 3.4.1. Define $g_t = |\{t'\le t : Q_{t'}=1\}|$ to be the number of times that the product is of good quality up to time $t$. By definition of the ST policy, each of the $\pi^{\mathrm{true},i}$ mechanisms has executed at some time $t_i$ with $Q_{t_i}=1$. Therefore, by Lemma C.1, $T(G;\mathrm{ST}) = \inf\{t : g_t = N\}$. This is simply the time required to achieve $N$ "successes" in Bernoulli trials with success probability $\mu_0$. Thus, $\mathbb E[T(G;\mathrm{ST})] = N/\mu_0$. When $Q_t=0$, no customers purchase, and when $Q_t=1$, $|N_i(G)|\le N$ customers purchase, with strict inequality for at least one $i$ when $G$ is not complete. This bounds the expected revenue above by $Np\mu_0$, with equality if and only if $G$ is complete. ■

C.2 Proof of Theorem 3.4.2

Lemma C.2 $\tilde\pi_j \le \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p}$ for all $j$.

Proof: This follows directly from the definition of $\tilde\pi_j$. If $j\neq i$, then
$$\tilde\pi_j = \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p} - e_i \le \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p}.$$
If $j=i$, then $\tilde\pi_j = \frac{\mu_0}{1-\mu_0}\cdot\frac{v_i-p}{p}$. ■
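The learning-time formula in Theorem 3.4.1 can be checked by simulation: under the ST policy, the time to learn each customer's out-neighborhood is a Geometric($\mu_0$) waiting time, so the total is $N/\mu_0$ in expectation. A small Monte Carlo sketch follows; the values of $N$, $\mu_0$, and the trial count are assumptions for illustration.

```python
# Monte Carlo sanity check of Theorem 3.4.1: under the ST policy, the seller
# runs pi^{true,i} for each customer i in turn, and mechanism i completes in
# the first period in which Q_t = 1. The learning time is therefore a sum of
# N independent Geometric(mu0) waiting times, with mean N / mu0.
import random

def st_learning_time(n_customers: int, mu0: float) -> int:
    """Periods until every pi^{true,i} has executed in a good-quality period."""
    t = 0
    for _ in range(n_customers):
        t += 1                          # run the current mechanism this period
        while random.random() >= mu0:   # repeat until a period with Q_t = 1
            t += 1
    return t

N, mu0, trials = 10, 0.3, 20000
avg = sum(st_learning_time(N, mu0) for _ in range(trials)) / trials
print(f"simulated E[T] = {avg:.2f}, theoretical N/mu0 = {N/mu0:.2f}")
```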
Lemma C.3 Under the $\pi^i$ signaling mechanism, customer $j$ purchases the product at time $t$ if and only if $n\notin S_{j,t}$.

Proof: Suppose that customer $j$ purchases the product, but, for the sake of contradiction, suppose also that there is another customer $k\in N^{\mathrm{in}}_j(G)$ with $s_k=n$. Then customer $j$ learns that $Q_t=0$, since under $\pi^i$ the message $n$ is sent only when the product quality is bad. Therefore, customer $j$ will not purchase the product, a contradiction.

Now suppose that every customer in $j$'s incoming neighborhood $N^{\mathrm{in}}_j(G)$ receives $b$. Let $j_k$ be the index of the customer with $j_k = \arg\min_x\{\tilde\pi_x : x\in N^{\mathrm{in}}_j(G)\}$.

Case 1: $i\in N^{\mathrm{in}}_j(G)$. Because every customer in $j$'s incoming neighborhood received a $b$ message, and $i\in N^{\mathrm{in}}_j(G)$, we have $s_i=b$, and so $n$ was not sent by the mechanism. It also follows that $b^{-i}$ was not sent by $\pi^i$; hence $b(\tilde\pi_{j_l})$ was sent for some $l$. Note that, by construction of $\pi^i$, since $j_k$ received $b$, it must be that $l\ge k$. Therefore, from $j$'s perspective,
$$\mathbb P(S_j\mid Q_t=0) = \sum_{l\ge k}\pi\big(b(\tilde\pi_{j_l})\mid Q_t=0\big) = \tilde\pi_{j_k} \le \tilde\pi_j \le \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p},$$
where the first inequality follows from the definition of $j_k$ and the second inequality follows from applying Lemma C.2.

Case 2: $i\notin N^{\mathrm{in}}_j(G)$. As in Case 1, since at least one customer received $b$, the mechanism did not send $n$. However, because $i\notin N^{\mathrm{in}}_j(G)$, it is possible that either $b^{-i}$ was sent or $b(\tilde\pi_{j_l})$ was sent for some $l\ge k$. Therefore, from $j$'s perspective, the probability that they have message set $S_j$ is
$$\mathbb P(S_j\mid Q_t=0) = \sum_{l\ge k}\pi\big(b(\tilde\pi_{j_l})\mid Q_t=0\big) + \pi\big(b^{-i}\mid Q_t=0\big) = \tilde\pi_{j_k} + e_i \le \tilde\pi_j + e_i = \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p},$$
where the inequality follows from the definition of $j_k$, and the final equality follows from the definition of $\tilde\pi_j$ along with the fact that $j\neq i$ in this case.

In either case, $\mathbb P(S_j\mid Q_t=0) \le \frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p}$. Therefore,
$$\mathbb P(Q_t=1\mid S_j) = \frac{\mathbb P(S_j\mid Q_t=1)\,\mu_0}{\mathbb P(S_j\mid Q_t=1)\,\mu_0 + \mathbb P(S_j\mid Q_t=0)\,(1-\mu_0)} \ge \frac{\mu_0}{\mu_0 + \frac{\mu_0}{1-\mu_0}\frac{v_j-p}{p}(1-\mu_0)} = \frac{p}{p+(v_j-p)} = \frac{p}{v_j},$$
from which it immediately follows that customer $j$ finds it optimal to purchase the product. ■

Lemma C.4 Suppose $\pi^i$ sends $b^{-i}$ at time $t$. Then every $\hat G\in\hat{\mathcal G}_t$ is such that $N^{\mathrm{out}}_i(\hat G)=N^{\mathrm{out}}_i(G)$.

Proof: Let $\hat G\in\hat{\mathcal G}_t$. When $b^{-i}$ is sent, it follows from Lemma C.3 that customer $j$ will purchase the product if and only if $j\notin N^{\mathrm{out}}_i(G)$.

Case 1: $j\in N^{\mathrm{out}}_i(G)$. Then $a_{j,t}=n$. If $j\notin N^{\mathrm{out}}_i(\hat G)$, then $a(\hat S_{j,t})=b$, a contradiction.

Case 2: $j\notin N^{\mathrm{out}}_i(G)$. Then $a_{j,t}=b$. If $j\in N^{\mathrm{out}}_i(\hat G)$, then $a(\hat S_{j,t})=n$, a contradiction.

Therefore, $N^{\mathrm{out}}_i(\hat G)=N^{\mathrm{out}}_i(G)$. ■

Proof of Theorem 3.4.2. The SCP policy commits to mechanism $\pi^i$ until the message $b^{-i}$ is sent, which happens with probability $e_i$ in any period with $Q_t=0$. Let $f_i$ be the expected time until $b^{-i}$ is sent. This is given by $f_i = 1 + (1-(1-\mu_0)e_i)f_i$, which implies $f_i = 1/((1-\mu_0)e_i)$. Therefore, the expected time it takes the SCP policy to send $b^{-i}$ for all customers $i$ is
$$\mathbb E[T(G;\mathrm{SCP})] = \sum_{i=1}^N \frac{1}{(1-\mu_0)e_i} \le \frac{N}{(1-\mu_0)\min_i e_i} = O(N).$$
Since all customers purchase when $Q_t=1$, and there is positive probability that all customers purchase when $Q_t=0$, it follows that the expected revenue exceeds $Np\mu_0$. ■

C.3 Remaining Proofs

Proof of Proposition 3.4.2. Under the classic Bayesian persuasion mechanism, the probability of $E_i$ occurring is
$$\mathbb P(E_i) = \mathbb P\big(b\text{ is sent to every }j\neq i\text{ and }n\text{ is sent to }i\big) = \pi^{\mathrm{classic}}(b,\ldots,n,\ldots,b\mid Q_t=1)\,\mu_0 + \pi^{\mathrm{classic}}(b,\ldots,n,\ldots,b\mid Q_t=0)\,(1-\mu_0)$$
$$= \pi^{\mathrm{classic}}(b,\ldots,n,\ldots,b\mid Q_t=0)\,(1-\mu_0) = (1-\mu_0)\left(\prod_{j\neq i}\pi^{\mathrm{classic}}_j(b\mid Q_t=0)\right)\pi^{\mathrm{classic}}_i(n\mid Q_t=0) = (1-\mu_0)\left(\prod_{j\neq i}\frac{\mu_0}{1-\mu_0}\cdot\frac{v_j-p}{p}\right)\left(1-\frac{\mu_0}{1-\mu_0}\cdot\frac{v_i-p}{p}\right).$$
Note that the expected time for $E_i$ to occur, which we denote by $f_i$, is given by $f_i = 1+(1-\mathbb P(E_i))f_i$, or $f_i = \mathbb P(E_i)^{-1}$. Since $\mathbb P(E_i)$ is a product of $N-1$ signal probabilities, each strictly less than one, $f_i$ grows exponentially in $N$. Therefore, the expected time for all $E_i$ to occur is exponential in $N$. ■
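To see the gap between Theorem 3.4.2 and Proposition 3.4.2 concretely, the sketch below evaluates both expected-time expressions for growing $N$ under assumed homogeneous valuations, price, and perturbation probabilities. For the classic mechanism it sums the individual waiting times $1/\mathbb P(E_i)$, a crude upper-bound proxy; each term alone is already exponential in $N$.

```python
# Comparing Theorem 3.4.2 (SCP: linear in N) with Proposition 3.4.2 (classic
# persuasion: exponential in N). Valuations v_j, price p, and perturbation
# probabilities e_i are illustrative assumptions.
import math

mu0, p = 0.3, 1.0
for N in [2, 4, 8, 16]:
    v = [1.5] * N                                     # valuations v_j > p
    e = [0.05] * N                                    # perturbations e_i
    q = [mu0 / (1 - mu0) * (vj - p) / p for vj in v]  # pi_j^classic(b | Q_t=0)

    t_scp = sum(1.0 / ((1 - mu0) * ei) for ei in e)   # Theorem 3.4.2
    t_cbp = 0.0
    for i in range(N):
        prod = math.prod(q[j] for j in range(N) if j != i)
        t_cbp += 1.0 / ((1 - mu0) * prod * (1 - q[i]))  # adds 1 / P(E_i)
    print(f"N = {N:2d}: SCP ~ {t_scp:10.1f} periods, classic ~ {t_cbp:16.1f}")
```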
Lemma C.5 Under the CBP policy, customer $j$ purchases at time $t$ if and only if $n\notin S_{j,t}$.

Proof: If $n\in S_{j,t}$, then there exists a customer $l\in N^{\mathrm{in}}_j(G)$ such that $s_l=n$. We have $\pi^{\mathrm{classic}}_l(n\mid Q_t=1)=0$, which means
$$\mathbb P(Q_t=1\mid S_{j,t}) = \frac{\prod_{k\in N^{\mathrm{in}}_j(G)}\pi^{\mathrm{classic}}(s_k\mid Q_t=1)\,\mu_0}{\prod_{k\in N^{\mathrm{in}}_j(G)}\pi^{\mathrm{classic}}(s_k\mid Q_t=1)\,\mu_0 + \prod_{k\in N^{\mathrm{in}}_j(G)}\pi^{\mathrm{classic}}(s_k\mid Q_t=0)\,(1-\mu_0)} = 0,$$
so customer $j$ does not purchase. On the other hand, if $n\notin S_{j,t}$, then every customer $k\in N^{\mathrm{in}}_j(G)$ received $b$. It follows that
$$\mathbb P(Q_t=1\mid S_{j,t}) = \frac{\mu_0}{\mu_0 + \prod_{k\in N^{\mathrm{in}}_j(G)}\frac{\mu_0}{1-\mu_0}\frac{v_k-p}{p}\,(1-\mu_0)} \ge \frac{\mu_0}{\mu_0 + \frac{\mu_0}{1-\mu_0}\frac{v_j-p}{p}(1-\mu_0)} = \frac{p}{v_j}.$$
So customer $j$ purchases in this case. ■

Proof of Corollary 3.4.1. Take $G=K_N$, the complete directed graph on $N$ vertices. Then, as a consequence of Lemma C.5, at each time step $t'\le t$ either all customers receive $b$ and buy the product, or some customer receives $n$ and no one buys the product. Suppose $E_i$ has not occurred by time $t$. Let $\hat G$ be $G$ minus all outgoing edges $(i,j)$ for $j\neq i$. We claim that $\hat G$ is compatible with the data at time $t$. To see this, we consider the following three cases.

Case 1: $i$ receives $b$ and some other customer receives $n$. Then $a_{j,t'} = a(\hat S_{j,t'}) = n$ for all $j$.

Case 2: $i$ receives $n$ and some other customer receives $n$. Then $a_{j,t'} = a(\hat S_{j,t'}) = n$ for all $j$.

Case 3: $i$ receives $b$ and all other customers receive $b$. Then $a_{j,t'} = a(\hat S_{j,t'}) = b$ for all $j$.

Note that these cases are exhaustive, as the case in which $i$ is the only customer to receive $n$ is impossible under the assumption that $E_i$ has not occurred at time $t'\le t$. Therefore, in any case, $a_{j,t'} = a(\hat S_{j,t'})$ at every $t'\le t$. It follows that $\hat G\in\hat{\mathcal G}_t$. Therefore, $T(G;\pi^{\mathrm{classic}})$ is equal to the first time that all $E_i$ have occurred. By Proposition 3.4.2, this takes exponentially long in expectation. The result follows. ■
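Corollary 3.4.1 can also be illustrated by direct simulation on $K_N$: draw signals from the classic mechanism each period and record the first time every event $E_i$ has occurred. The sketch below uses assumed homogeneous valuations and illustrative parameters; even for small $N$ the simulated identification time grows quickly.

```python
# Simulation of Corollary 3.4.1 on the complete graph K_N under the classic
# mechanism: the network is identified only once every event E_i (customer i
# alone receives n) has occurred. Parameters mu0, p, v are illustrative
# assumptions; valuations are homogeneous to keep the sketch short.
import random

mu0, p, v = 0.3, 1.0, 1.5
q = mu0 / (1 - mu0) * (v - p) / p           # pi_j^classic(b | Q_t = 0)

def time_until_all_events(N: int) -> int:
    """Periods until E_i has occurred for every customer i."""
    seen, t = set(), 0
    while len(seen) < N:
        t += 1
        if random.random() < mu0:            # Q_t = 1: everyone receives b
            continue
        signals = [random.random() < q for _ in range(N)]  # True means b
        if sum(signals) == N - 1:            # exactly one n message: some E_i
            seen.add(signals.index(False))
    return t

for N in [2, 3, 4, 5]:
    avg = sum(time_until_all_events(N) for _ in range(500)) / 500
    print(f"N = {N}: simulated identification time ~ {avg:.0f} periods")
```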
Abstract
This work explores optimizing service systems with fairness considerations, harnessing blockchain technology for trustworthy mediation, and leveraging information design for network learning.