ESSAYS ON SERVICE SYSTEMS

by

Dongyuan Zhan

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BUSINESS ADMINISTRATION)

August 2015

Copyright 2015 Dongyuan Zhan

To my parents.

Acknowledgments

I want to express my deepest gratitude to my advisor, Professor Amy R. Ward, who is the most patient, inspiring, and devoted mentor. I am very fortunate to have had her continuous guidance since day one of this PhD program. I am highly grateful to Professors Sampath Rajagopalan, Ramandeep Randhawa, Leon Zhu, and Sha Yang from the USC Marshall School, who constantly gave me valuable advice and support during my PhD study. I owe much to Professors Gideon Weiss from the University of Haifa, Seyed Emadi from the University of North Carolina at Chapel Hill, and Josh Reed from New York University, from whom I learned a lot during our collaborations.

Contents

Acknowledgments
List of Tables
List of Figures
Abstract
1 On the Generalized Drift Skorokhod Problem in One Dimension
  1.1 Introduction
  1.2 The Generalized Drift Skorokhod Problem Solution (in One Dimension)
  1.3 Reflected Ornstein-Uhlenbeck (O-U) Processes
    1.3.1 Computing the First Hitting Time
    1.3.2 Uniform Integrability
  1.4 Approximating the Transient Distribution of the GI/GI/1+GI and M/M/N/N Queues
    1.4.1 The GI/GI/1+GI Queue
    1.4.2 The M/M/N/N Queue
2 Threshold Routing to Trade Off Waiting and Call Resolution in Call Centers
  2.1 Introduction
    2.1.1 Literature Review
  2.2 The Model and Problem Formulation
    2.2.1 The Halfin-Whitt Limit Regime
  2.3 The Approximating Diffusion Control Problem (DCP)
    2.3.1 The DCP Solution When J = 2
    2.3.2 The DCP Solution When J > 2
  2.4 The Proposed Reduced Pools Threshold (RPT) Control
    2.4.1 The RPT Control When J = 2
    2.4.2 The RPT Control When J = 3
    2.4.3 The RPT Control With General J
  2.5 The Efficient Frontier
  2.6 Conclusions and Future Research
3 Compensation and Staffing to Trade Off Speed and Quality in Large Service Systems
  3.1 Introduction
    3.1.1 Literature Review
  3.2 A Compensation Model
  3.3 The Symmetric Equilibrium Service Rate
    3.3.1 The M/M/1+M Queue
    3.3.2 The M/M/N+M Queue
  3.4 Staffing and Compensation
    3.4.1 The System Manager Objective
    3.4.2 The Lower Bound (Centralized) Problem
    3.4.3 A Limiting First Best Solution
    3.4.4 The Proposed Staffing and Compensation
  3.5 Economically Optimal Limit Regimes
    3.5.1 Providing Servers with Idle Time
    3.5.2 Letting Customers Abandon
    3.5.3 Simultaneous Customer Waiting and Server Idling
    3.5.4 The Proposed Staffing and Compensation
  3.6 Conclusions
References
A Technical Appendix to Chapter 2
  A.1 Further Details on the Simulation Study
  A.2 Proofs
B Technical Appendix to Chapter 3

List of Tables

2.1 The two pool RPT control
2.2 The pool priority order depending on system idleness
2.3 The three pool RPT control
3.1 The optimal regimes under different utility functions and cost structures
A.1 Summary of Tables 1 and 2 in Appendix B in Mehrotra et al. (2012)

List of Figures

1.1 Simulated and approximated results for the M/M/1+M queueing model
1.2 Simulated and approximated results for the GI/GI/1+GI queueing model
1.3 Simulated and approximated results for the M/M/N/N queueing model
2.1 An inverted-V call center model with callbacks
2.2 The RPT control when p1 > p2 > p3 in a 3-pool system
2.3 The reduction of pool j depends on its location
2.4 The idleness allocation in each pool
2.5 The RPT control of a 5-pool system
2.6 Simulated comparison between RPT and QIR controls in a 2-pool system
2.7 Simulated comparison between RPT, heuristic, and QIR controls in a 3-pool system
2.8 Simulated comparison between RPT and QIR in a 2-pool system with lognormal service times
3.1 Performance of the proposed equilibrium approximation
3.2 The limiting equilibrium service rate
3.3 Performance comparison of three policies
3.4 The approximate symmetric equilibrium (equilibria) vs. the staffing level
3.5 Performance comparison of four policies
A.1 The trade-off between waiting time and call resolution in a 2-pool system
A.2 Simulated comparison between RPT and QIR in a 2-pool system with Gamma service times
B.1 The equivalence of two summations
B.2 The graph of terms comparison

Abstract

My dissertation applies queueing models to analyzing and designing service systems. It contains three chapters. The first chapter investigates the transient behavior of service systems, whereas the second and third chapters focus on steady state. Both the second and third chapters consider the speed-quality trade-off in large service systems. The second chapter studies the optimal routing policy for heterogeneous servers, whereas the third chapter searches for the jointly optimal compensation and staffing policy for homogeneous but strategic servers who respond to incentives.

The first chapter, "On the Generalized Drift Skorokhod Problem in One Dimension," is joint work with Professors Josh Reed and Amy Ward, published in the Journal of Applied Probability (2013). We show how to write the solution to the one-dimensional generalized drift Skorokhod problem in terms of the supremum of the solution of a tractable unrestricted integral equation (that is, an integral equation with no boundaries). As an application of our result, we equate the transient distribution of a reflected Ornstein-Uhlenbeck (O-U) process to the first hitting time distribution of an unreflected O-U process. Then, we use this relationship to approximate the transient distribution of the GI/GI/1+GI queue in conventional heavy traffic and the M/M/N/N queue in a many-server heavy traffic regime. We validate the transient behavior approximation by simulation.

The second chapter, "Routing to Minimize Waiting and Callbacks in Large Call Centers," is joint work with Professor Amy Ward, published in Manufacturing & Service Operations Management (2014). The setting is an inverted-V queueing model with heterogeneous server pools, differentiated by their service speed and by their quality, as measured by call resolution. We propose a threshold policy that routes customers to faster servers when the system is congested and otherwise routes customers to slower servers with better quality (lower callback probability). We show both analytically and by using simulation that the policy is nearly Pareto optimal for large systems; that is, the percentage of callbacks would be larger for any other routing policy achieving a smaller average waiting time.

One important factor that was not explored in the second chapter is server incentives. This component is addressed in the third chapter, "Compensation and Staffing to Trade Off Speed and Quality in Large Service Systems," which is also joint work with Professor Amy Ward. We consider homogeneous servers who have flexibility in choosing their service speeds, depending on the compensation scheme and the competition among them.
We describe the server behavior by a symmetric Nash equilibrium and study how the manager can design both staffing and compensation so as to affect the servers’ speed and the corresponding quality, thereby minimizing the system costs on staffing, queueing and quality. We solve a fluid approximation problem when the system is large. Interestingly, different optimal regimes, including critically-loaded, quality-driven, efficiency-driven and mixed regimes, arise with different server utility functions and system cost structures. x Chapter 1 On the Generalized Drift Skorokhod Problem in One Dimension 1.1 Introduction The Skorokhod problem was originally introduced by Skorokhod Skorokhod (1961) in order to study continuous solutions to stochastic differential equations with a reflecting boundary at zero. Definition (Skorokhod Problem). Given a process X 2 D([0;1);R), we say that the pair of pro- cesses (Z;L)2 D 2 ([0;1);R) satisfy the Skorokhod problem forX if the following four conditions are satisfied, 1. Z(t) =X(t) +L(t) fort 0, 2. Z(t) 0 fort 0, 3. L is non-decreasing withL(0) = 0, 4. R 1 0 1fZ(t)> 0gdL(t) = 0. It is well known that for each X 2 D([0;1);R), the unique solution (Z;L) = ((X); (X)) to the Skorokhod problem is Z(t) = X(t) + sup 0st X(s)_ 0 and L(t) = sup 0st X(s)_ 0: (1.1) In subsequent papers, the Skorokhod problem has been extended to multiple dimensions and also to include both smooth and non-smooth domains (see, for example, Chaleyat-Maurel et al Chaleyat-Maurel et al. (1980), Dupuis and Ishii Dupuis and Ishii (1991), Harrison and Reiman Harrison and Reiman (1981), 1 Ramanan Ramanan (2006), Tanaka Tanaka (1979)), although we do not treat such cases in the present paper. There is a useful integral representation of the one-dimensional Skorokhod problem solution (see Anantharam and Konstantopoulos Anantharam and Konstantopoulos (2011)). There is also an explicit solu- tion to the (one-dimensional) Skorokhod problem when there is an upper boundary (see Kruk et al Kruk et al. (2007) Kruk et al. (2008))) and to the (one-dimensional) Skorokhod problem in a time-dependent interval (see Burdzy et al Burdzy et al. (2009)). In this paper, we study a generalization of the one-dimensional Skorokhod problem that incorporates a state-dependent drift. Definition (Generalized Drift Skorokhod Problem in One Dimension). Given a process X 2 D([0;1);R) with X(0) = 0 and a Lipschitz continuous function f : R + ! R, we say that the pair of processes (Z;L)2D 2 ([0;1);R) satisfy the Skorokhod problem forX with state dependent drift function f if the following four conditions are satisfied, 1. Z(t) =X(t) R t 0 f(Z(s))ds +L(t) fort 0, 2. Z(t) 0 fort 0, 3. L is non-decreasing withL(0) = 0, 4. R 1 0 1fZ(t)> 0gdL(t) = 0. The unique solution to the generalized drift Skorokhod problem in one dimension can be written in terms of the solution to the Skorokhod problem following a standard construction; see, for example Zhang Zhang (1994). Specifically, set (Z;L) = ( (M(X)); (M(X))); (1.2) forM : D ([0;1);R)! D ([0;1);R) the mapping that setsM(X) = V forV that solves the integral equation V (t) =X(t) Z t 0 f ( (V ) (s))ds; for allt 0: (1.3) Note that sincef :R + !R is a Lipschitz continuous function, a standard Picard iteration shows that there exists a unique solution to (1.3). 
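To make the construction in (1.1)-(1.3) concrete, the following Python sketch discretizes a driving path X on a grid, runs the Picard iteration for (1.3), and then applies the reflection map (1.1) as in (1.2). It is illustrative only: the drift function f(z) = gamma*z and all parameter values are our own choices (the linear drift anticipates the reflected O-U processes of Section 1.3) and are not taken from the paper.

```python
import numpy as np

def reflection_map(x):
    """One-dimensional Skorokhod map (1.1): Z = X + L with L(t) = sup_{s<=t} (-X(s)) v 0."""
    l = np.maximum(np.maximum.accumulate(-x), 0.0)
    return x + l, l

def generalized_skorokhod(x, f, dt, tol=1e-10, max_iter=1000):
    """Solve the generalized drift Skorokhod problem for a discretized path x:
    Picard iteration for the integral equation (1.3), then reflection as in (1.2)."""
    v = x.copy()
    for _ in range(max_iter):
        z, _ = reflection_map(v)
        # left-endpoint Riemann sum for int_0^t f(Z(s)) ds
        drift = np.concatenate(([0.0], np.cumsum(f(z)[:-1]) * dt))
        v_new = x - drift
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    z, l = reflection_map(v)
    return z, l

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, n = 10.0, 10_000
    dt = T / n
    mu, sigma, gamma = 0.2, 1.0, 0.5  # illustrative values only
    # driving path X(t) = sigma*B(t) + mu*t with X(0) = 0
    x = np.concatenate(([0.0], np.cumsum(mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n))))
    z, l = generalized_skorokhod(x, lambda z: gamma * z, dt)  # linear f, as in Section 1.3
    print("min Z =", z.min(), "| L nondecreasing:", bool(np.all(np.diff(l) >= -1e-12)))
```

On a bounded horizon the iteration converges for any Lipschitz f, mirroring the Picard argument referenced above; the printed checks confirm that conditions 2 and 3 of the definition hold along the discretized path.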
The fact that ( (M(X)); (M(X))) solves the Skorokhod problem 2 forM(X) (and so satisfies conditions 1-4 in the definition of the Skorokhod problem) shows that condi- tions 1-4 in the definition of the generalized drift Skorokhod problem are satisfied. The uniqueness of the representation (1.2) follows from the uniqueness of the mappingsM and (; ). Next, we observe that the solution Z can be represented in terms of an unrestricted integral equation (that is, an integral equation with no boundaries). Specifically, note from (1.3) that V (t)V (s) =X(t)X(s) Z t s f ( (V ) (u))du: Since whenX(0) = 0, (V )(u) = sup 0ru V (u)V (r); if we define R(s;t) =V (t)V (s); then R(s;t) =X(t)X(s) Z t s f sup 0ru R(r;u) du: (1.4) Finally, it follows from (1.2) and the above displays that Z(t) = sup 0st R(s;t): (1.5) However, the integral equation (1.4) is not tractable. In this paper, we establish how to representZ in terms of the solution to a tractable unrestricted integral equation. Specifically, we establish that Z(t) = sup 0st Z s (ts); t 0; (1.6) forZ s =fZ s (t);t 0g that solves Z s (t) = X(s +t)X(s) Z t 0 f e (Z s (u))du; (1.7) 3 and f e : R! R any extension of f : R + ! R that preserves the Lipschitz continuity of f. For one example, letf e (x) = f(0) ifx < 0 andf e (x) = f(x) ifx 0. It is interesting to observe that it follows from (1.5) and (1.6) that sup 0st R(s;t) = sup 0st Z s (ts) As an application of the representation (1.6), we show how to use (1.6) to write the transient distribution of a reflected Ornstein-Uhlenbeck (O-U) process in terms of the first hitting time distribution of an unre- flected O-U process, which additionally yields a uniform integrability result for reflected O-U processes. Such a result can also be derived using duality theory (see, for example, Cox and Rosler Cox and Rosler (1983) and Sigman and Ryan Sigman and Ryan (2000)); however, the proof methodology is much dif- ferent, because there is no sample path representation that is equivalent to (1.6) in either Cox and Rosler (1983) or Sigman and Ryan (2000). Because the reflected O-U process has been shown to approximate the GI=GI=1 +GI and M=M=N=N queues (see Ward and Glynn Ward and Glynn (2005) and Srikant and Whitt Srikant and Whitt (1996)), we see that the transient distribution of the number-in-system process for theGI=GI=1 +GI andM=M=N=N queues can be approximated by the first hitting time distribution of an O-U process (that is not reflected). The remainder of this paper is organized as follows. Section 1.2 proves (1.6). Section 1.3 applies (1.6) in the context of a reflected O-U process. Section 1.4 performs simulation studies that support approximating the transient distribution of the number-in-system process for theGI=GI=1 +GI andM=M=N=N queues with the first hitting time distribution of an O-U process (that is not reflected). 1.2 The Generalized Drift Skorokhod Problem Solution (in One Dimension) In this section, we establish (1.6). Theorem 1.2.1. Let (Z;L) be the unique solution to the generalized Skorokhod problem forX withX(0) = 0, and with state dependent drift functionf that is Lipschitz continuous. For eachs 0, letZ s be defined as in (1.7). Then, for eacht 0, Z(t) = sup 0st Z s (ts): 4 Proof. We first claim that for each 0st, Z s (ts) Z(t): To see this, first recall from (1.7) thatZ s (ts) is the solution to the equation Z s (u) = X(s +u)X(s) Z u 0 f e (Z s (v))dv; (1.8) evaluated at the pointu = ts, wheref e : R7! R is an arbitrary Lipschitz extension off : R + 7! R. 
Next, it is straightforward to see from part (1) of the definition of the generalized Skorokhod problem that Z(t) is the unique solution to the equation Z(s +u) =Z(s) + (X(s +u)X(s) +L(s +u)L(s)) Z u 0 f e (Z(s +v))dv; (1.9) foru 0, also evaluated at the pointu = ts (note that in (1.9) we have replacedf byf e ). Subtracting (1.8) from (1.9) we therefore obtain that (Z(s +u)Z s (u)) = Z(s) +L(s +u)L(s) Z u 0 (f e (Z(s +v))f e (Z s (v)))dv; foru 0. Note also that by the Lipschitz continuity off e , we have that for some constantK > 0, (Z(s +u)Z s (u))Z(s) +L(s +u)L(s)K Z u 0 jZ(s +v))Z s (v)jdv; (1.10) foru 0. Now consider the solutionW s =fW s (u);u 0g to the ordinary differential equation W s (u) = Z(s) +L(s +u)L(s)K Z u 0 jW s (v)jdv; u 0: (1.11) We claim that W s (u) = Z(s)e Ku + Z u 0 e K(vu) dL(s +v); u 0: This may be verified by noting that W s (u) 0 for u 0, since Z(s) 0 and L is a non-decreasing function. Subtracting (1.11) from (1.10) and using Gronwall’s inequality, it follows thatZ(s+u)Z s (u) 5 W s (u) 0 and soZ(s +u)Z s (u), which, evaluating atu =ts, yieldsZ s (ts)Z(t), our desired result. We have therefore shown that Z(t) sup 0st fZ s (ts)g: (1.12) It now remains to reverse the direction of the inequality in (1.12). In order to do so, it suffices to show that there exists at least one points ? such thatZ s ?(ts ? ) = Z(t). Lets ? = supfs t : Z(s) = 0g be the last time at which the processZ hit zero. Note thats ? is well defined sinceZ(0) = 0. Also note that L(s) =L(s ? ) forss ? . Thus, by (1.9), we have that Z(s ? +u) = X(s ? +u)X(s ? ) Z u 0 f e (Z(s ? +v))dv; u 0; (1.13) and so,Z(s ? +u) = Z s ?(u) for 0uts ? , and, in particularZ(t) = Z s ?(ts ? ), which completes the proof. 1.3 Reflected Ornstein-Uhlenbeck (O-U) Processes In this section we let the processX in the definition of the generalized Skorokhod problem be a Brownian motion with constant drift and infinitesimal variance 2 defined on a suitable probability space ( ;F;P). We also setf(x) = x forx 0, for some 2 R. The resulting processZ, defined sample pathwise as the solution to the generalized Skorokhod problem forX andf, is referred to as a (;; ) reflected O-U process, that has initial conditionZ(0) = 0. It is immediate that the following definition of a reflected O-U process is equivalent to the prescription given above. Definition (Reflected O-U Process). Let B = fB(t);t 0g be a standard Brownian motion defined on a probability space ( ;F;P) and let > 0, and ; 2 R. We say that the process Z is a (;; ) reflected O-U process if the following four conditions are satisfiedP-a.s. 1. Z(t) =B(t) +t R t 0 Z(s)ds +L(t) fort 0, 2. Z(t) 0 fort 0, 6 3. L is non-decreasing withL(0) = 0, 4. R 1 0 1fZ(t)> 0gdL(t) = 0. Now for eachs 0, recall from (1.7) the definition of the associated unreflected processes Z s (u) = (B(s +u) +(s +u)) (B(s) +s)) Z u 0 Z s (v)dv; u 0, where here we have set X(t) = B(t) +t, and we take the natural extension f e (x) = x for x2R. 
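As a numerical sanity check of Theorem 1.2.1 in this O-U setting, the sketch below simulates one driving path X(t) = sigma*B(t) + mu*t on a grid, builds the reflected O-U path Z with a reflected Euler scheme, advances every unreflected copy Z_s of (1.7) with the same increments, and compares Z(T) with sup_{0<=s<=T} Z_s(T-s). The parameter values are arbitrary illustrations, and agreement is only up to discretization error.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, gamma = 0.3, 1.0, 1.0      # illustrative (mu, sigma, gamma) values only
T, n = 5.0, 4000
dt = T / n
dX = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)  # increments of X(t) = sigma*B(t) + mu*t

# reflected Euler scheme for the (mu, sigma, gamma) reflected O-U process Z
z = 0.0
for k in range(n):
    z = max(z + dX[k] - gamma * z * dt, 0.0)

# unreflected processes Z_s of (1.7): the copy started at s_j = j*dt sees only the
# increments after s_j, so after step n, zs[j] approximates Z_{s_j}(T - s_j)
zs = np.zeros(n + 1)
for k in range(n):
    zs[: k + 1] += dX[k] - gamma * zs[: k + 1] * dt

print("Z(T) from the reflected scheme      :", z)
print("sup_s Z_s(T - s) from Theorem 1.2.1 :", zs.max())
```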
For clarity of exposition in the sequel, we now holdt 0 fixed and define the new process Y t (u) = Z tu (u); 0ut: SincefY t (u); 0utg is just the processfZ s (ts); 0stg run backwards in time, it follows that sup 0ut fY t (u)g = sup 0st fZ s (ts)g; and so from Theorem 1.2.1 we have that ifZ is a (;; ) reflected O-U process, then Z(t) = sup 0ut fY t (u)g: (1.14) In preparation for our next result, we now say that a processX is a (;; ) O-U process starting from X(0) (note the absence of reflection here) if it is the unique strong solution to the stochastic differential equation X(t) = X(0) +B(t) +t Z t 0 X(s)ds; fort 0, whereB is a standard Brownian motion. We then make the following claim regarding the process fY t (u); 0utg. Proposition 1.3.1. fe u Y t (u); 0 u tg is equal in distribution to a (;; ) O-U process on [0;t] which starts from zero. Proof. First note that sinceX(t) = B(t) +t is a Brownian motion with infinitesimal variance 2 and constant drift, it follows that for eachs 0, the processX s =fX(s+t)X(s);t 0g is also Brownian 7 motion with the same parameters and so we have that for eachs 0, the processZ s =fZ s (u);u 0g is an O-U process whose explicit solution is given by Z s (u) = (= )(1e u ) + Z u 0 e (vu) dB s (v); u 0; whereB s =fB(s +t)B(s);t 0g. SettingY t (u) =Z tu (u), it therefore follows that Y t (u) = (= )(1e u ) + Z u 0 e (vu) dB tu (v): However, sincedB tu (v) =dB(tu +v), the change with respect tov, it follows that making the change of variables =uv, we have that the above becomes Y t (u) = (= )(1e u ) + Z u 0 e (vu) dB(tu +v) = (= )(1e u ) + Z 0 u e dB(t) = (= )(1e u ) Z u 0 e dB(t): However, it clear that the above, as a process, is also equal in distribution to (= )(1e u ) + Z u 0 e t dB(t); u 0: Multiplying both sides of the above bye u , we then obtain that e u Y t (u) = (= )(1e u ) + Z u 0 e (tu) dB(t); which is just an Ornstein-Uhleneck process on [0;t] with infinitesimal variance 2 , constant drift and linear drift . 8 The following is now our main result of this section which relates the distribution of the supremum appearing in (1.14) to the first hitting distribution of an O-U process. Let x = infft 0 :U(t) =xg; whereU =fU(t);t 0g is an O-U process with parameters (; x +; ) and started from 0. In other words, x is the first hitting time ofx byU. We then have the following proposition. Proposition 1.3.2. For eacht 0, P (Z(t)x) = P ( x t): Proof. Note that for eachx 0 sup 0ut Y t (u)x = finffu :Y t (u)xgtg = finffu :e u Y t (u)e u xgtg = finffu :x(1e u ) +e u Y t (u)xgtg: Now, by Proposition 1.3.1,fx(1e u ) +e u Y t (u);u 0g is simply an O-U process with infinitesimal variance 2 , constant drift x + and linear drift . The result then follows immediately. Sigman and Ryan Sigman and Ryan (2000) establish an equivalent result to Proposition 1.3.2; however, their proof methodology is much different. In particular, Sigman and Ryan (2000) relates the transient distribution of any continuous-time, real-valued stochastic process that can be defined recursively (either explicitly in discrete time or implicitly in continuous time, through the use of an integral equation) to the ruin time of a dual risk process. There is no result in Sigman and Ryan (2000) that is equivalent to Theorem 1.2.1, which is the basis for our proof of Proposition 1.3.2. 1.3.1 Computing the First Hitting Time In order to use Proposition 1.3.2 to compute P (Z(t) x), it is necessary that the distribution of x is known. 
Fortunately, there are various results in the literature available for computing the first hitting time 9 distributions of O-U processes. Linetsky Linetsky (2004) provides a spectral expansion for the first hitting time of O-U processes and the results of Alili et al Alili et al. (2010) provide three different means to compute various probabilities associated with this hitting time. In what follows, we use the results in Alili et al. (2010). Letp (;; ) x 0 !x denote the density of the distribution of x for a (;; ) O-U process, so that we may write P ( x t) = Z t 0 p (;; ) x 0 !x (s)ds; t 0: (1.15) Alili et al. (2010) shows how to calculatep (1;0; ) x 0 !x when > 0. Since we are interested in the more general case, we first expressp (;; ) x 0 !x in terms ofp (1;0; ) x 0 !x . In order to do this, note that since a (;; ) O-U process starting fromx 0 has the same distribution as a (1;=; ) O-U process starting fromx 0 =, it follows that p (;; ) x 0 !x (t) = p (1;=; ) x 0 =!x= (t); t 0: (1.16) Next, Remark 2.5 in Alili et al. (2010) shows that p (1;=; ) x 0 =!x= (t) = p (1;0; ) x 0 ==( )!x==( ) (t); t 0: (1.17) Whenx= = 0, the above expression may be immediately evaluated because p (1;0; ) !0 (t) = jj p 2 sinh(t) 3=2 exp 2 e t 2sinh(t) + t 2 ; (1.18) as is found in Pitman and Yor Pitman and Yor (1981) and reproduced in (2.8) in Alili et al. (2010). Otherwise, when x= 6= 0, one must appeal to one of the three representations in Alili et al. (2010) (one that hinges on an eigenvalue expansion, one that is an integral representation, and one that is given in terms of a functional of a 3-dimensional Bessesl bridge) in order to computeP ( x t). 10 To compute the transient distribution of the (;; ) reflected O-U processZ, we first apply Proposition 3.2, and then use the distributional equalities (1.16) and (1.17) as follows P (Z(t)x) = P ( x t) (1.19) = Z t 0 p (; x+; ) 0!x (s)ds = Z t 0 P (1; x+ ; ) 0! x (s)ds = Z t 0 p (1;0; ) x ! (s)ds: We double-check the calculation (1.19) by recalling that it also follows Sigman and Ryan (2000). Specif- ically, Proposition 4.3 in their paper establishes that P (Z(t)x) =P R t ; (1.20) where R is the first time a (;; ) O-U process with initial pointx> 0 becomes negative. To see that (1.19) and (1.20) are equivalent, first observe that P ( R t) = Z t 0 p (;; ) x!0 (s)ds = Z t 0 p (1; ; ) x !0 (s)ds = Z t 0 p (1;0; ) x ! (s)ds; where the second and third equalities follow from (1.16) and (1.17). Then, since symmetry implies that p (1;0; ) x ! (s) =p (1;0; ) x ! (s); we conclude thatP ( x t) =P ( R t). 1.3.2 Uniform Integrability It is well known (see, for example, Proposition 1 in Ward and Glynn Ward and Glynn (2003b)) that if > 0, then for a (;; ) reflected O-U process, Z(t)) Z(1) as t!1, where Z(1) is a normal random 11 variable with mean= and variance 2 =(2 ) conditioned to be positive. We now show that the sequence of random variablesfZ(t);t 0g is uniformly integrable as well. Proposition 1.3.3. If > 0, then for a (;; ) reflected O-U process started at the origin, the sequence of random variablesfZ(t);t 0g is uniformly integrable. Proof. First note that without loss of generality we may assume that = 1 since otherwise we may rescale. Now recall that by Proposition 1.3.2, it follows thatP (Z(t) x) = P ( x t), where x = infft 0 : U t =xg, whereU t is an O-U process with parameters (1; x +; ) which is started from 0. Hence, is suffices to show that there exists a functiong integrable onR + such thatP ( x t)g(x) for allx;t 0. 
Next, it follows from (1.19) that P ( x t) = Z t 0 p (1;0; ) = x!= (s)ds: Remark 2.4 in Alili et al. (2010) 1 shows that p (1;0; ) = x!= (s) = exp 2 2 x 2 s !! p (1;0; ) = x!= (s) Hence P ( x t) = exp x 2 2 x Z t 0 exp( s)p (1;0; ) = x!= (s)ds exp x 2 2 x ; where the last inequality follows since Z 1 0 p (1;0; ) = x!= (s)ds = 1: 1 We note that there is a missing negative sign in the display appearing in Remark 2.4 in Alili et al. (2010); specifically, the correct equation is p () x!a (t) = exp (a 2 x 2 t) p () x!a (t): 12 Finally, since for > 0, Z 1 0 exp x 2 2 x < 1; the proof is complete. 1.4 Approximating the Transient Distribution of the GI=GI=1 +GI and M=M=N=N Queues In this section, we perform simulation studies that support using the first hitting time distribution of an Ornstein-Uhlenbeck (O-U) process (that is not reflected) to approximate the transient distribution of the number-in-system process for theGI=GI=1+GI queue (Section 1.4.1) and theM=M=N=N queue (Section 1.4.2). 1.4.1 TheGI=GI=1+GI Queue TheM=M=1 +M queueing model assumes that customers arrive according to a Poisson process with rate to an infinite waiting room service facility, that their service times form an i.i.d. sequence of exponential random variables having mean 1=> 0, and that each customer independently reneges if his service has not begun within an exponentially distributed amount of time that has mean 1= > 0. Theorem 2 in Ward and Glynn Ward and Glynn (2003a) supports approximating the number-in-system processQ =fQ(t);t 0g by a ( p 2;; ) reflected O-U processZ. The more generalGI=GI=1+GI queueing model assumes that the customer arrival process is a renewal process with rate, the service time distribution is general with mean 1=, and that each customer indepen- dently reneges if his service has not begun within an amount of time that is distributed according to some probability distribution functionF . In the case thatF has a density andF 0 (0) > 0 is finite, Theorem 3 in Ward and Glynn Ward and Glynn (2005) combined with the arguments in the proof of Theorem 2 in Ward and Glynn (2003a) shows that Q may be approximated by a ( p 2;;F 0 (0)) reflected O-U process. 13 Figure 1.1: Simulated and approximated results for theM=M=1 +M queueing model 0 20 40 60 80 100 t 0.2 0.4 0.6 0.8 1.0 PHQHtL>0L Simulation (a)P(Q(t) 1), = 0:001 0 20 40 60 80 100 t 0.2 0.4 0.6 0.8 1.0 PHQHtL>0L Simulation (b)P(Q(t) 1), = 0:01 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (c) P-P plot for P(Q(10) < x), = 0:001 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (d) P-P plot for P(Q(10) < x), = 0:01 Fix = = 0:5. If = 0:001, thenP R = 3:41%; if = 0:01, thenP R = 9:79%, whereP R is the steady-state percentage of arriving customers that renege. Note that this is consistent with the approximation forQ in the previous paragraph since the value of the density of an exponential random variable at 0 is equal to its rate. Our results in Section 1.3 (specifically, Proposition 1.3.2 and equation (1.19)) then imply for the M=M=1 +M case that P (Q(t)x) P (Z(t)x) (1.21) = Z t 0 p (1;0; ) x p 2 ! p 2 (s)ds; whenQ(0) = 0. For theGI=GI=1+GI case, one may replace withF 0 (0) in the above. Hence we have an approximation for the transient distribution for the number-in-system process in a GI/GI/1+GI queue. 
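When lambda = mu, so that the constant drift of the approximating reflected O-U process vanishes, the approximation amounts to numerically integrating the explicit density (1.18). The sketch below implements that computation; it assumes SciPy is available for the quadrature, and the standardization of the starting point to x/sqrt(2*lambda) follows our reading of (1.16)-(1.19), so it should be treated as an illustrative reconstruction rather than the code used for the figures.

```python
import numpy as np
from scipy.integrate import quad

def hitting_density(t, alpha, gamma):
    """Density (1.18) of the first time a (1, 0, gamma) O-U process started at alpha hits 0."""
    s = np.sinh(gamma * t)
    return (abs(alpha) / np.sqrt(2.0 * np.pi) * (gamma / s) ** 1.5
            * np.exp(-gamma * alpha ** 2 * np.exp(-gamma * t) / (2.0 * s) + gamma * t / 2.0))

def p_queue_at_least(x, t, lam, gamma):
    """Approximation (1.21) for P(Q(t) >= x) in an M/M/1+M queue with lambda = mu,
    reneging rate gamma, and Q(0) = 0, via the reflected O-U / hitting-time identity."""
    alpha = x / np.sqrt(2.0 * lam)   # standardized distance between level x and the barrier at 0
    prob, _ = quad(hitting_density, 0.0, t, args=(alpha, gamma))
    return prob

if __name__ == "__main__":
    lam = 0.5                        # lambda = mu = 0.5, as in Figure 1.1
    for gamma in (0.001, 0.01):
        vals = [round(p_queue_at_least(x, 100.0, lam, gamma), 4) for x in (1, 5, 10)]
        print(f"gamma = {gamma}: P(Q(100) >= x) for x = 1, 5, 10 ->", vals)
```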
Note that the theory in Ward and Glynn (2003a) and Ward and Glynn (2005) suggests that the approximation in (1.21) will be good when and are close, and when is small compared to and (that is, the percentage 14 of customers reneging is not too large). For related work, we refer the interested reader to Fralix Fralix (2012), who derives the time-dependent moments of anM=M=1 +M queue, and then uses those to obtain the time-dependent moment expressions for reflected O-U. We now proceed to verify the approximation (1.21) in an M=M=1 +M model via simulation. Note that even in the case of aM=M=1 +M model, the problem of finding an exact expression for its transient distribution appears to be very difficult (as is suggested by the computations in Whitt Whitt (1999), which provide some performance measure expressions in terms of transforms for a many server model with reneg- ing). Figure 1.1 shows that the approximation (1.21) is very accurate, both for calculating the probability that the system is non-empty for a range oft values, and for finding the entire distribution ofQ(t) for a fixed t. The simulation results shown are averaged over 10,000 runs, stopped at the relevant time value. Note that we chose = so that we could use the very simple expression (1.18) when computingP (Z(t) x). When6= , there is another source of error that comes into the approximation (1.21) that is due to the methodology in Alili et al. (2010) for computing the hitting time density function of an O-U process. Figure 1.2 verifies the approximation (1.21) in aGI=GI=1+GI queueing model. Note that the relevant approximating reflected O-U process is exactly the same as in theM=M=1 +M queueing model in Fig- ure 1.1, (a) and (c). We observe that the transient distribution approximation is good for “medium”t but not for “small”t. (The simulation results in Ward and Glynn Ward and Glynn (2005) imply that the approxima- tion is good for “large”t, when the system is close to its steady-state.) TheGI=GI=1 +GI queue that we simulated had simulated steady-state mean number-in-system 18.12, and simulated mean number-in-system at timest = 100,t = 200, andt = 500 of 7.73, 10.43, and 14.43 respectively. Then, the displayed P-P plots forP (Q(t) < x) in Figure 1.2 are such that the transient distribution is relevant (and not the steady-state distribution). 1.4.2 TheM=M=N=N Queue TheM=M=N=N queueing model assumes that customers arrive at rate> 0 in accordance with a Poisson process to a service facility withN servers and no additional place for waiting, and that their service times form an i.i.d. sequence of exponential random variables with mean 1=. Any arriving customer that finds 15 Figure 1.2: Simulated and approximated results for theGI=GI=1 +GI queueing model 0 100 200 300 400 500 t 0.2 0.4 0.6 0.8 1.0 PHQHtL>0L Simulation (a)P(Q(t) 1) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (b) P-P plot forP(Q(100)<x) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (c) P-P plot forP(Q(200)<x) 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (d) P-P plot forP(Q(500)<x) The inter-arrival and service time distributions are Gamma(2,2) and the reneging distributionF is uniform on [0; 1000] N customers in the system is blocked from receiving service, and so is lost. Suppose that we let the number of servers in the system be a function of the arrival rate, and assume that N = + p for2R: (1.22) Then, Srikant and Whitt Srikant and Whitt (1996) shows that N Q p )Z; as!1; 16 whereZ is a ( p 2;;) RO-U process. 
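Before stating the resulting approximation, we note that the "Simulation" curves in Figures 1.1-1.3 can be reproduced with a short Monte Carlo routine: average many independent runs of the underlying Markov chain, each stopped at the time of interest. The sketch below does this for the M/M/N/N model started full, with Q(0) = N; the parameters shown match panel (a) of Figure 1.3, and the run count of 10,000 matches the experiments described above.

```python
import numpy as np

def mmnn_nonfull(lam, mu, N, t, runs=10_000, seed=2):
    """Monte Carlo estimate of P(Q(t) < N) for the M/M/N/N (Erlang loss) model with Q(0) = N:
    simulate the birth-death chain up to time t and average over independent runs."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(runs):
        q, clock = N, 0.0
        while True:
            rate = (lam if q < N else 0.0) + mu * q
            clock += rng.exponential(1.0 / rate)
            if clock > t:
                break
            # the next event is an arrival w.p. lam/rate (only possible when q < N), else a departure
            if q < N and rng.random() < lam / rate:
                q += 1
            else:
                q -= 1
        hits += (q < N)
    return hits / runs

if __name__ == "__main__":
    # parameters matching panel (a) of Figure 1.3: lambda = 0.5, mu = 0.05, N = 10
    for t in (10.0, 50.0, 100.0):
        print(f"t = {t}: P(Q(t) < N) ~= {mmnn_nonfull(0.5, 0.05, 10, t):.3f}")
```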
Hence our results in Section 1.3 (specifically, Proposition 1.3.2 and equation (1.19)) imply that P (Q(t)x) = P NQ(t) p Nx p (1.23) P Z(t) Nx p = Z t 0 p (1;0;) N2 +x 2 p ! N p (s)ds; whenQ(0) =N. Figure 1.3: Simulated and approximated results for theM=M=N=N queueing model 0 20 40 60 80 100 t 0.2 0.4 0.6 0.8 1.0 PHQHtL<NL Simulation Approximation (a) P(Q(t) < N), = 0:5, = 0:05, andN = 10 0 20 40 60 80 100 t 0.2 0.4 0.6 0.8 1.0 PHQHtL<NL Simulation Approximation (b)P(Q(t) < N), = 1, = 0:05, and N = 20 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (c) P-P plot forP(Q(10)x), = 0:5, = 0:05, andN = 10 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 (d) P-P plot forP(Q(10) x), = 1, = 0:05, andN = 20 Figure 1.3 compares simulated results for theM=M=N=N queue to values obtained using the approxi- mation in (1.23). We see that the approximation becomes more accurate asN becomes larger, which is as expected. Note that by (1.22) this also implies that the utilization is close to 1. The simulation results shown are the average over 10,000 runs, stopped at the relevant time value. 17 Chapter 2 Threshold Routing to Trade Off Waiting and Call Resolution in Call Centers 2.1 Introduction Speed and quality are two key measures to evaluate the operational performance of a manufacturer or a service provider. In general, these two measures are not independent. For example, it is often the case that the quality of a good or service degrades as the production or service speed increases. In other words, there is a trade-off between quality and speed. In a call center environment, agent heterogeneity drives the speed-quality trade-off. One measure of agent heterogeneity is the average time an agent spends talking to each customer, which determines the ser- vice speed. Another measure is an agent’s resolution probability; that is, the percentage of calls successfully handled by that agent that do not result in a callback (a follow-up call to the call center for the same problem, because that agent did not adequately answer the customer’s question). The callback is a sign of poor ser- vice quality, because it indicates that the agent did not properly answer the customer’s question. Sometimes, agents with slow service speed also have low resolution probability, because they are inexperienced. Then, there is no conflict between speed and quality when routing arriving customers to available agents. However, sometimes, agents with slow service speed have high resolution probability. This is because slow service speed can result from the agents spending a longer time listening to, and interacting with their customers. Then, there is the following speed-quality trade-off when routing arriving customers to available agents: should the agent with faster speed or with higher resolution probability serve the customer? To answer this question, it is necessary to know the performance objective. One primary objective in call center management is to minimize the steady state average waiting time. Then, the routing rule that ranks agents according to their effective service rate (their service rate times their resolution probability), and sends an arriving customer to the available agent with the highest effective service rate has been shown 18 to perform very well (see de Vericourt and Zhou (2005) for an exact analysis in an inverted-V model that includes callbacks and Armony (2005) for an asymptotic analysis). However, Mehrotra et al. 
(2012) shows that such a routing rule may perform very poorly with respect to objectives that involve the call resolution (that is, the overall percentage of arriving calls that do not result in a callback), and the importance of such objectives is recognized in Hart et al. (2006). For example, if the system manager wishes to maximize the call resolution, then the routing rule that ranks agents according to their resolution probability is likely to outperform the aforementioned rule that ranks agents according to their effective service rates. Note that we differentiate between an agent’s resolution probability (the probability that that agent successfully resolves a call) and the overall system call resolution (the probability that an arbitrary arriving call is successfully resolved). Our objective is to develop routing rules that take into account the dual performance objectives of mini- mizing the steady-state average waiting time and maximizing the steady-state call resolution. More specifi- cally, we would like to find a routing rule that lies on the efficient frontier with respect to these two perfor- mance measures. That is, any routing rule that has a lower average waiting time (higher call resolution) than ours must also have a lower call resolution (higher average waiting time). Our ideal is to consider this objective in the context of a call center that has customers calling with different types of requests and agents that are heterogeneous with respect to their service speeds and res- olution probabilities, even within the context of the same call type (as is true in the call center dataset in Mehrotra et al. (2012)). However, finding an analytic solution to that problem is very ambitious (as evi- denced by the fact that the results in Mehrotra et al. (2012) for call centers with heterogeneous customers and heterogeneous agents are all attained via carefully designed simulation studies). Fortunately, the sim- pler problem that assumes homogeneous customers and heterogeneous agents can be viewed as a building block to solving the heterogeneous customer and heterogeneous agent problem. This is because the analysis in Gurvich and Whitt (2009b) and Ward and Armony (2013) for two different call center routing problems that involve heterogeneous customers and agents shows that for large call centers both respective problems separate into one for an inverted-V model (homogeneous customers and heterogeneous agents) and one for a V model (heterogeneous customers and homogeneous agents). More specifically, that separation is true in the many server quality and efficiency driven (QED) regime, first formally defined in Halfin and Whitt (1981). Therefore, we consider the aforementioned objective in the context of a call center that has homoge- neous customers, and agents that are heterogeneous with respect to their service speeds and call resolution 19 probabilities (i.e., an inverted-V model with multiple agent pools). Furthermore, we assume that the call center operates in the QED regime. We note that the QED regime is a natural regime to consider because it is a regime in which average waiting times are small and agents are highly utilized, and can arise as an economically optimal operating regime; see Borst et al. (2004). For the inverted-V model, we solve the diffusion control problem (DCP) that approximates (in the QED regime) the optimization problem whose objective is to minimize a weighted sum of the steady state average waiting time and the steady state average callback rate. 
We argue that any solution to this problem has associated average waiting time and call resolution on the efficient frontier. The solution to the DCP shows that some pools should always have routing priority, and so can be considered ”reduced”. The remaining pools have state-dependent priorities, that are determined using a threshold control. The DCP solution translates into our proposed routing control, the Reduced Pools Threshold (RPT) control. Finally, we use simulation to evaluate the performance of our proposed RPT control. We complete this introduction with a brief review of the most relevant literature. Then, in Section 2.2, we present our model and problem formulation. In Section 2.3, we formulate and solve the approximating DCP. In Section 2.4, we propose the RPT control resulting from the DCP solution. In Section 2.5, we evaluate the performance of the RPT control via simulation. In Section 2.6, we make concluding remarks and propose directions for future research. The proofs of all our results can be found in Appendix A, as well as more on the simulation study. 2.1.1 Literature Review There is a large literature on call centers. We focus on the papers most relevant to ours, and refer the reader to the survey papers Gans et al. (2003) and Aksin et al. (2007) for a broad review of the call center literature. Our work is motivated by Mehrotra et al. (2012). Their very insightful observation is that focusing solely on routing calls to minimize average waiting time produces routing controls that ignore the fact that agent resolution probabilities are different. Their paper proposes several different call routing rules, and performs comprehensive simulation work (using data obtained from real-life call center) in order to identify an efficient frontier. However, there is no analytic justification of the efficient frontier. Unfortunately, performing an exact analysis to identify points located on the efficient frontier is pro- hibitively difficult. Therefore, we perform an asymptotic analysis in the Halfin-Whitt many-server QED 20 limit regime. This approach is common in the call center literature. The most closely related papers are those which assume QED staffing and optimize system performance with respect to a given performance objective, such as Armony and Ward (2010), Atar (2005), Atar et al. (2004), Gurvich and Whitt (2009b), Harrison and Zeevi (2004), Tezcan and Dai (2010), and Weerasinghe and Mandelbaum (2013). However, none of these papers use callbacks as a performance objective. Our ideal would have been to analyze the model of Mehrotra et al. (2012) in the Halfin-Whitt many- server QED limit regime, and to identify points located on the efficient frontier analytically in that regime. However, the model of Mehrotra et al. (2012) has multiple call types, multiple agent pools, and the agent service speed depends on both the agent pool and the call type. Then, formulating and solving an approx- imating DCP is hard. Therefore, we restrict our model to a single customer type and many agent pools (inverted-V model), which can be approximated by a one-dimensional DCP and so is analytically tractable. For work on models with multiple call types and multiple agent pools in which the agent service speed may depend on both the agent pool and the call type when there are no callbacks, we refer readers to Perry and Whitt (2009, 2011). Our performance objective combines the dual goals of high speed (minimizing average waiting time) and good quality (maximizing call resolution). 
There are many papers that discuss speed-quality trade-offs. Recent work includes Alizamir et al. (2013), Anand et al. (2011), and Kostami and Rajagopalan (2014). However, there is no concept in those papers of poor service quality resulting in a follow-up service request (retrial), or a callback in the call center terminology. This concept is key to our model, because we view the callback as an important measure of service quality. Retrials appear in many other fields rather than call centers. In the manufacturing setting, the models of Lovejoy and Sethuraman (2000) and Lu et al. (2009) both have speed-quality trade-offs and allow for rework. However, the first paper focuses on when to bring in overtime workers and the second focuses on endogenized routing schemes, neither of which matches our focus. In health care, readmissions are akin to callbacks in call centers. The model of Chan et al. (2014) studies a similar speed-quality trade-off, specifically, a trade-off between the service rate and the probability of readmission to Intensive Care Units. Their control is the service speed and their analysis is based on fluid approximation whereas our control is routing and our analysis is based on diffusion approximation. 21 2.2 The Model and Problem Formulation Consider an inverted-V model withJ pools and callbacks, as shown in Figure 2.1. The model is a parallel server system with a single customer class and J agent types, with each type in its own agent pool, all capable of fully handling customers’ service requirements. Customers arrive to the system according to a Poisson process with rate, and are served in the order of their arrival. Service times are independent and exponential, and the mean service time of a customer served by an agent from poolj is 1= j ,j2J := f1; 2; ;Jg. There is a probability 1p j ,j2J , that a customer who completes service with a poolj agent has not had his or her request adequately handled. In that case, the customer immediately calls back, meaning s/he immediately returns to the system as an arriving customer for another independent service. We assume the available real-time information is unsophisticated, and does not differentiate between customers calling for the first time and those calling back, even though such knowledge can be learned by later analysis of the recorded data (so that the agents can be separated into pools). Then, any customer calling back may be routed to the same pool as his previous service, or another pool, and the service rate and resolution probabilities are again the pool-dependent service rates and resolution probabilities j andp j ,j2J . We let~ and~ p represent the vectors of pool service rates and resolution probabilities. Figure 2.1: An inverted-V call center model with callbacks p 1 µ 1 < p 2 µ 2 <…<p J µ J ... p J 1-p J 1-p 1 µ 1 ,N 1 µ J ,N J Poisson arrive with rate λ p 1 Routing Control There areN j agents in poolj2J , and we defineN := P j2J N j . We assume p 1 1 <p 2 2 <<p J J ; (2.1) 22 meaning that the pools are labeled according to the order of their effective service ratesp j j . Customers that arrive to the system when more than one agent pool has idle agents can be routed to any of the pools with idle agents. One natural routing control is thep-rule, which routes customers to the available pool that has the highest effective service rate. Another natural control is thep-rule, which routes customers to the available pool that has the highest resolution probability. 
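Both rules are easy to state operationally. As an illustration, the following discrete-event sketch simulates the inverted-V model with immediate callbacks and estimates the per-call average wait and the overall call resolution under either rule; the two-pool parameters are hypothetical, chosen so that pool 1 is slower but has the higher resolution probability, and are not taken from the paper's simulation study.

```python
import heapq, random

def simulate(lam, mu, p, N, route, T=10_000.0, seed=3):
    """Discrete-event simulation of the inverted-V model with immediate callbacks (Figure 2.1).
    route(avail) picks a pool index from the pools that currently have an idle agent.
    Returns (average wait per call attempt, overall call resolution)."""
    random.seed(seed)
    idle = list(N)                          # idle agents per pool
    queue = []                              # arrival times of waiting calls (FIFO)
    events = [(random.expovariate(lam), "arr", 0)]
    waits, resolved, served = [], 0, 0
    while events:
        t, kind, j = heapq.heappop(events)
        if t > T:
            break
        if kind == "arr":
            heapq.heappush(events, (t + random.expovariate(lam), "arr", 0))
            queue.append(t)
        else:                               # service completion at pool j
            idle[j] += 1
            served += 1
            if random.random() < p[j]:
                resolved += 1
            else:                           # unresolved: an immediate callback rejoins the queue
                queue.append(t)
        while queue and any(idle):          # non-idling: assign waiting calls to idle agents
            k = route([i for i in range(len(mu)) if idle[i] > 0])
            idle[k] -= 1
            waits.append(t - queue.pop(0))
            heapq.heappush(events, (t + random.expovariate(mu[k]), "dep", k))
    return sum(waits) / len(waits), resolved / served

if __name__ == "__main__":
    mu, p, N = (1.0, 2.0), (0.95, 0.80), (6, 5)      # hypothetical 2-pool parameters
    lam = 0.9 * sum(pj * mj * nj for pj, mj, nj in zip(p, mu, N))
    rules = {"p*mu-rule": lambda avail: max(avail, key=lambda j: p[j] * mu[j]),
             "p-rule":    lambda avail: max(avail, key=lambda j: p[j])}
    for name, rule in rules.items():
        w, res = simulate(lam, mu, p, N, rule)
        print(f"{name}: average wait = {w:.3f}, call resolution = {res:.3f}")
```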
Thep-rule is an intuitive rule for minimizing the average waiting time, whereas thep-rule focuses on maximizing call resolution. The issue is that under thep-rule the call resolution may be low and under thep-rule the average waiting time may be high. In particular, there may be a trade-off between minimizing average waiting time and maximizing the call resolution. Our objective is to devise a routing control that optimally trades off these two competing objectives. Denote by := (; ~ N) a routing control for the system with arrival rate and staffing vector ~ N := (N 1 ;N 2 ; ;N J ). Note that we omit the arguments and ~ N when it is clear from the context which arguments should be used. Lett 0 be an arbitrary time point. We denote byZ j (t;) the number of busy agents in poolj2J at timet andI j (t;) := N j Z j (t;) the number of idle agents in poolj2J . Then, the instantaneous rate at which customers served by poolj agents call back is (1p j ) j Z j (t;). The number of customers waiting to be served at timet isQ(t;), and the total number of customers in the system is X(t;) :=Q(t;) + X j2J Z j (t;): The total number of idle agents is I(t;) := X j2J I j (t;): We omit the time argument when we refer to the entire process. We uset =1 whenever we refer to a process in steady state. Also, we omit from the notation unless it is necessary to avoid confusion between different routing controls. Let be the set of all non-anticipating, non-preemptive, non-idling controls under which a unique steady state forX,Q, andZ j ,j2J exists (which implies a unique steady state exists forI j ,j2J , andW as well). Non-anticipating (roughly speaking) means that a control cannot require knowledge of the future. By non-preemptive, we mean that once a call is assigned to a particular agent, it cannot be transferred to another 23 agent, nor can it be preempted by another call. Non-idling controls are those under which there can never simultaneously be waiting customers and idle agents. We assume that the system load is less than one := P j2J p j j N j < 1: (2.2) Then, the system is stable in the following sense. Proposition 2.2.1. When (2.2) holds, a unique steady-state forX,Q, andZ j ;j2J; exists under any non- idling stationary Markovian control. Otherwise, there does not exist a steady-state under any stationary Markovian control. For any2 , the steady state effective arrival rate to the system, inclusive of callbacks, is e (1;) := + X j2J (1p j ) j E [Z j (1;)]; and the steady-state call resolution is: p(1;) := e (1;) : The total expected amount of time a customer spends waiting (including any time spent waiting after calling back) is E[W (1;)] = E[W (1;)] p(1;) ; where E[W (1;)] = E[Q(1;)] e (1;) is the expected amount of time an arriving customer (new or callback) has to wait before reaching an agent. Our objective is to find a routing control ? 2 whose associated average waiting timeE[W (1; ? )] and call resolution are on the efficient frontier. To do this, we letc> 0 and solve C ? := minimize 2 cE[W (1;)] + X j2J (1p j ) j E[Z j (1;)]: (2.3) 24 Next, we observe that any routing control ? 2 that achieves the minimum in (2.3) for a givenc is on the efficient frontier. To see this, suppose that ~ 2 is such thatE[W (1; ~ )]<E[W (1; ? )]. We claim it is also the case that p(1; ~ )<p(1; ? ): (2.4) To see this, first note that for any2 , X j2J (1p j ) j E[Z j (1;)] = p(1;) : Then, if (2.4) does not hold, it must also be the case that X j2J (1p j ) j E[Z j (1; ~ )] X j2J (1p j ) j E[Z j (1; ? 
)]; and so cE[W (1; ~ )] + X j2J (1p j ) j E[Z j (1; ~ )]<cE[W (1; ? )] + X j2J (1p j ) j E[Z j (1; ? )]; which contradicts the definition of ? . Therefore, if we can solve (2.3) for anyc 0, then by varyingc from 0 to1, we produce a set of points on the efficient frontier. Although proving the continuity of that set is much more difficult, looking ahead in the paper, our results for systems with large do support such continuity; see Remark 2.3.1 at the end of Section 2.3. The problem (2.3) is a very complex Markov decision problem. Therefore, instead of solving (2.3) exactly, we formulate and solve the diffusion control problem (DCP) that arises as an approximation to (2.3) in the Halfin-Whitt many-server limit regime (Section 2.3). We then use the solution to the approximating diffusion control problem to motivate a proposed routing control (Section 2.4), and we test the performance of that control via simulation (Section 2.5). Before specifying the Halfin-Whitt limit regime, we interpret the parameterc in (2.3). We first observe that the objective function (2.3) is equivalently expressed in terms of the queue-length as: C ? = minimize 2 cE [Q(1;)] + X j2J (1p j ) j E[Z j (1;)]: (2.5) 25 This follows from Little’s Law, because E[W (1;)] = e (1;)E[W (1;)] =E[Q(1;)]: Then, the parameter c determines the relative cost of customers queueing in comparison with customers calling back. Finally, it is worthwhile to comment on some of our modeling assumptions. The assumption of Poisson arrivals and exponential service times is common in the literature, and is made for analytic tractability. However, the structure of the control we propose may still be good, even without these assumptions, and we perform a supporting numerical study in Section 2.5 (see Figure 2.8). Next, the assumption that callbacks are immediate is a simplification. This is reasonable for situations in which the customer can quickly assess whether or not the agent’s answer is helpful. However, in general, callbacks are not immediate. Third, we have assumed that the service time and the fact that the call is a callback are independent. This is consistent with our assumption that the customers in the queue are not differentiated by whether or not they are callbacks. Lastly, our objective function assumes that callbacks are negative and should be minimized, which is reasonable in, for example, a call center that performs technical support. However, in the context of a call center that generates revenue (and probably also under the assumption that there may be delay before the callback occurs), callbacks can be positive, because they are an indication that the customer is returning to buy another item. Then, it is of interest to maximize callbacks. 2.2.1 The Halfin-Whitt Limit Regime We consider a sequence of systems indexed by the arrival rate, and let!1. The service rates j ;j2J are held fixed. The associated total number of agentsN increases as increases. The routing control in the system with arrival rate is :=(;N ). Our convention is to superscript all processes and quantities associated with the system having arrival rate by. Assumption 2.2.1. 
(The Halfin-Whitt Limit Regime) (i) There is heavy traffic; specifically, the number of agents in each pool satisfies lim !1 p j j N j =a j for eachj2J; wherea j > 0 and X j2J a j = 1: 26 (ii) There is square-root safety staffing; specifically, lim !1 P j2J p j j N j p = for some finite > 0: The Halfin-Whitt regime is thought of as a quality and efficiency driven (QED) regime because (see Halfin and Whitt (1981) and the extension to the Inverted-V model in Armony (2005)): 1. The system load ! 1 and the limiting fraction of delayed customers is strictly between 0 and 1 (efficiency); 2. The average waiting time of delayed customers is small, of order 1 p (quality). Note that part (i) of Assumption 2.2.1 guarantees that := = P j2J p j j N j ! 1 as !1. Fur- thermore, the limiting fraction of agents in each pool is positive sinceN j =N ! a j =(p j j ) P j2J a j =(p j j ) > 0 as !1, for allj2J . The condition > 0 in part (ii) implies the stability condition (2.2) holds for all large enough. The quantity p is commonly referred to as the safety staff. Finally, it is useful to define the scaled processes ^ Q (t) := Q (t) p ; ^ I j (t) := I j (t) p ; j2J; and ^ X (t) := X (t)N p : 2.3 The Approximating Diffusion Control Problem (DCP) We begin by specifying the DCP that arises under Assumption 2.2.1, when formally passing to the limit in the control problem (2.3), or equivalently (2.5), as the arrival rate increases to infinity. We define x + := max(0;x),x := min(0;x) for anyx2<. Under any non-idling control,Q = X N + , it follows that E h Q (1) i = p E ^ X (1) + : Next, we assume that there is a reduction in problem dimensionality, so that in some sense ^ I j v j ^ X ^ X ; (2.6) 27 for some function v :< ! 8 < : (v 1 ;:::;v J ) : 0v j 1 for allj2J and X j2J v j = 1 9 = ; ; (2.7) where< := (1; 0]. The function v specifies the division of idle agents into pools; for example, if there are I = ^ X idle agents, the percentage from pool j is v j ( ^ X ). We do not specify a rigorous weak convergence statement from which (2.6) follows, although our simulation results in Section 2.5 are consistent with this simplification 1 . Then, the problem (2.5) can be approximated as C ? p 0 B @minimize 2 0 B @ cE h ^ X (1; ) + i P j2J (1p j ) j E h v j ^ X 1; ^ X 1; i 1 C A 1 C A + X j2J (1p j ) j N j ; (2.8) so that specifying the diffusion ^ X that approximates ^ X in the Halfin-Whitt limit regime yields the relevant DCP. The process (X ;I 1 ; ;I J ) is a multidimensional Markov process. Since we assume that callbacks return immediately, X does not change when a callback occurs. Then, X increases by 1 at rate and decreases by 1 at a state-dependent rate P j2J p j j (N j I j (t)). The infinitesimal drift is lim h#0 1 h E h ^ X (t +h) ^ X (t)j ^ X (t); ^ I 1 (t);:::; ^ I J (t) i = P j2J p j j N j p + X j2J p j j ^ I j (t); which by (2.6) can by approximated by P j2J p j j N j p + X j2J p j j v j ^ X (t) ^ X (t) : The infinitesimal variance is lim h#0 1 h E ^ X (t +h) ^ X (t) 2 j ^ X (t); ^ I 1 (t);:::; ^ I J (t) = + P j2J p j j N j P j2J p j j ^ I j (t) p ; 1 For example, the approximation (2.6) is supported under any QIR control defined in Gurvich and Whitt (2009a) by their state-space collapse result Theorem 3.1, when there are no callbacks. We conjecture the approximation also holds when there are callbacks. 28 which again by (2.6) can be approximated by + P j2J p j j N j P j2J p j j v j ^ X (t) ^ X (t) p : Since from Assumption 2.2.1, P j2J p j j N j p ! and + P j2J p j j N j ! 
2; as!1; we expect that under any control for which (2.6) holds in an appropriate sense, then when is large, ^ X can be approximated by ^ X that solves the stochastic equation ^ X(t) = ^ X(0) + Z t 0 m ^ X(s);v ^ X(s) ds + p 2B(t) (2.9) for m(x;v) = + X j2J p j j v j (x)x : We write ^ X(;v) to denote that ^ X solves (2.9) under the controlv2V. In a slight abuse of notation, we have not specified the extension of the controlv = (v 1 ;:::;v J ) to the case thatx> 0, because then there is no control decision. The relevant DCP follows by replacing ^ X in (2.8) by ^ X and is C ? := minimize v2V 0 @ cE h ^ X(1;v) + i X j2J (1p j ) j E h v j ^ X(1;v) ^ X (1;v) i 1 A ; (2.10) whereV is the space of all functions as in (2.7). We expect that C ? p C ? + X j2J (1p j ) j N j : (2.11) We denote byC(v) the expression to be minimized in (2.10). A controlv ? 2V is optimal ifC ? =C(v ? ) C(v) for allv2V. The key to solving the DCP is the following verification theorem. 29 Theorem 2.3.1. LetV be defined as in (2.7). Suppose there exists a twice-continuously differentiable func- tionV :R!R and a constantd that solve min v2V 8 < : V 00 (x) +m(x;v)V 0 (x) +cx + X j2J (1p j ) j v j (x)x 9 = ; =d for allx2R: (2.12) Also assume there existb 1 ;b 2 2R such thatjV (x)jb 1 x 2 +b 2 for allx2R,E h ^ X(0) 2 i <1. Then, v ? (x) = argmin v2V 8 < : V 00 (x) +m(x;v)V 0 (x) +cx + X j2J (1p j ) j v j (x)x 9 = ; (2.13) is an optimal control and C(v ? ) =d: That is, if ^ X satisfies (2.9) under some admissible controlv2V, thenC(v)C(v ? ). The optimal controlv ? is found by solving (2.12) and (2.13). We first solve it whenJ = 2, in Subsection 2.3.1, and then whenJ > 2, in Subsection 2.3.2. 2.3.1 The DCP Solution WhenJ = 2 For the case thatJ = 2, from (2.13), argmin v2V 8 < : V 00 (x) +m(x;v)V 0 (x) +cx + 2 X j=1 (1p j ) j v j (x)x 9 = ; = argmin v2V 8 < : 2 X j=1 p j j V 0 (x) (1p j ) j v j (x)x 9 = ; = 8 > < > : (1; 0) ifp 1 1 V 0 (x) (1p 1 ) 1 p 2 2 V 0 (x) (1p 2 ) 2 (0; 1) otherwise : We conclude that for T (1; 2) := (1p 2 ) 2 (1p 1 ) 1 p 2 2 p 1 1 ; 30 an optimal control is v ? (x) = 8 > < > : (1; 0) ifV 0 (x)T (1; 2) (0; 1) otherwise ; x< 0: (2.14) Observe thatv ? is a static priority control when eitherV 0 (x) > T (1; 2) for allx < 0, orV 0 (x) T (1; 2) for allx< 0. Otherwise,v ? is a state-dependent dynamic control. Recalling the assumptionp 1 1 <p 2 2 , whenp 1 p 2 , the pool 2 agents have both a faster effective service rate and a higher resolution probability. Then, intuition suggests that pool 2 agents should never be idled; i.e., an optimal control isv ? (x) = (1; 0) for allx< 0. To establish this rigorously, it is sufficient to show that the functionV :R!R and constant d that solve V 00 (x)V 0 (x) +cx =d; x 0 V 00 (x) ( +p 1 1 x)V 0 (x) + (1p 1 ) 1 x =d; x< 0 ; (2.15) also solve (2.12) and satisfy the conditions of Theorem 2.3.1. For this, note that it is straightforward to find analytic expressions for the unique functionV and constantd that solve (2.15) such thatV is twice- continuously differentiable. Then, it is also straightforward to show that that functionV hasV 0 (x) increas- ing inx andV 0 (x)! 1p 1 p 1 asx!1. Sincep 1 p 2 , 1p 1 p 1 T (1; 2). Therefore,V 0 (x) T (1; 2), which from (2.14) impliesv ? (x) = (1; 0) for allx< 0 is an optimal control. The next question is: what happens whenp 1 > p 2 ? Now, there is a trade-off: the pool 1 agents have a higher resolution probability, but the pool 2 agents have a higher effective service rate. 
Hence there is no reason to expect that in general a static priority control will be optimal. However, whenc is low, so that more importance is placed on call resolution than customer waiting, there is a natural intuition that suggests the static priority control that always idles pool 2 agents is optimal; i.e., v ? (x) = (0; 1) for all x < 0. Similar methodology as that described in the preceding paragraph suggests finding the unique functionV and constant d that solve (2.15), with p 2 and 2 replacing p 1 and 1 , such that V is twice-continuously differentiable. Then, algebra shows that when cC := 1 (p 1 p 2 ) p 2 (p 2 2 p 1 1 ) 2 0 @ 1 + p p 2 2 p p 2 2 p p 2 2 1 A ; (2.16) we haveV 0 (0)T (1; 2). SinceV 0 (x) increasing inx, we conclude from (2.14) thatv ? (x) = (0; 1) for all x< 0 is an optimal control. 31 Finally, it remains to derive the optimal controlv ? whenp 1 > p 2 andc > C. Intuitively, whenx is larger, the system is crowded, and so we want to idle the slower pool. On the other hand, whenx is smaller, the system is not crowded, and so we want to idle the faster pool, which has a worse resolution. Based on this intuition, we search for an optimal control within the class of threshold controls, defined as v L (x) := (1fLx< 0g;1fx<Lg): Now, to use Theorem 2.3.1 to establish thatv L is an optimal control, we want to solve for the function V :R!R, constantd and threshold levelL in V 0 (x) = 8 > > > > > < > > > > > : V 0 0 (x) ifx 0 V 0 1 (x) ifLx< 0 V 0 2 (x) ifx<L (2.17) such that V is twice-continuously differentiable and has V 0 (x) < T (1; 2) when x <L and V 0 (x) T (1; 2) whenLx< 0. This follows from Theorem 2.3.2 in Section 2.3.2 (which shows that a threshold control is optimal for the generalJ-pool system, possibly with 0 or infinite threshold levels). In summary, a solution to the DCP whenJ = 2 is v ? (x) = 8 > > > > > < > > > > > : (1; 0) ifp 1 p 2 (0; 1) ifp 1 >p 2 andcC (1fLx< 0g;1fx<Lg) ifp 1 >p 2 andc>C : (2.18) We end this subsection by performing a sensitivity analysis on the parameterC defined in (2.16). Lemma 2.3.1. Whenp 1 1 <p 2 2 andp 1 >p 2 , @C @p 1 > 0; @C @ 1 > 0; @C @p 2 < 0; @C @ 2 < 0; @C @ > 0: The first two inequalities in Lemma 2.3.1 imply that as long asp 1 1 remains belowp 2 2 , increasingp 1 or 1 will increaseC; i.e., the optimal control routes to pool 1 in a larger part of the parameter space. This 32 is not surprising since then either pool 1’s resolution probability or its effective service rate has improved. Similarly, the next two inequalities imply that as long as p 2 remains below p 1 , increasing p 2 or 2 will decrease C; i.e., the optimal control routes to pool 2 in a larger part of the parameter space. Again, this is not surprising since then either pool 2’s resolution probability or its effective service rate has improved. The intuition for the last inequality is that increasing increases the safety staff, resulting in reduced queue- length, so that the optimal control routes to pool 1 in a larger part of the parameter space. 2.3.2 The DCP Solution WhenJ > 2 Recall from the last subsection that whenp 1 > p 2 and the queue-length costc is large (c > C defined in (2.16)), a threshold control is optimal. There is a similar optimal solution in theJ-pool case, except that a threshold control can have multiple thresholds. The threshold values are used to determine which pool should be idled at timet> 0, given the value of ^ X(t). The number of thresholds is determined by the value ofc, and can be any integer value between 0 andJ 1. 
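To make the threshold structure just described concrete, the short sketch below maps a vector of thresholds and the (negative) diffusion state to the single pool that is idled; this is the structure formalized as Definition 2.3.1 below. It is only an illustration: the threshold values in the example are hypothetical, and in the analysis they come from the function $V$ via Theorem 2.3.2.

```python
def idled_pool(x, L):
    """
    Map the diffusion state to the pool that is idled under a threshold control
    with thresholds 0 <= L_1 <= ... <= L_{K-1} (cf. Definition 2.3.1 below).

    x : diffusion state; for x < 0, -x is the scaled total number of idle agents
    L : sequence of K-1 nondecreasing thresholds
    Returns the 1-indexed pool that is idled, or None when x >= 0 (no idle agents).
    """
    if x >= 0:
        return None
    bounds = [0.0] + list(L)              # L_0 := 0
    for j in range(1, len(bounds)):       # pools 1, ..., K-1
        if -bounds[j] <= x < -bounds[j - 1]:
            return j
    return len(bounds)                    # pool K is idled when x < -L_{K-1}

# Hypothetical thresholds for a 3-pool system:
print(idled_pool(-0.4, L=(1.0, 2.5)))   # little idleness   -> idle pool 1
print(idled_pool(-1.7, L=(1.0, 2.5)))   # moderate idleness -> idle pool 2
print(idled_pool(-3.0, L=(1.0, 2.5)))   # much idleness     -> idle pool 3
```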
In the case that the number of thresholds is zero, a static priority control is optimal. In this subsection, we use Theorem 2.3.1 to verify that a threshold control is optimal, and to determine the number of thresholds. SupposeV (x) andd satisfy the conditions of Theorem 2.3.1, and define j ? (x) := min ( argmin j2J fV 0 (x)p j j (1p j ) j g ) : (2.19) Then, argmin v2V 8 < : V 00 (x) +m(x;v)V 0 (x) +cx + X j2J (1p j ) j v j (x )x 9 = ; = argmin v2V 8 < : X j2J p j j V 0 (x) (1p j ) j v j (x )x 9 = ; =e j ? (x) ; wheree j is theJ-dimensional vector that has 0’s everywhere except for a 1 in thejth position. It follows from (2.13) thatv ? (x) =e j ? (x) is an optimal control. We begin by arguing intuitively that there exists a pool setK ? J havingK pools such that the pools in the setJK ? are never idled (j ? (x) = 2JK ? for anyx< 0). We will later verify rigorously that this assertion is correct. First, observe that if there exist two poolsi andj such thatj > i andp j p i , then 33 poolj has both the higher effective service rate and the higher resolution probability, and so should never be idled. In other words, the pools in the setK ? can be re-labeled so that p 1 1 <p 2 2 <<p K K andp 1 >p 2 >>p K : Then, for any two pools in the setK ? , there is a trade-off between resolution probability and effective service rate. Next, consistent with the definition ofT (1; 2) in Section 2.3.1, let T (i;j) := (1p j ) j (1p i ) i p j j p i i ;i<j2J (2.20) measure the ratio of the difference between the call back rates and the resolution rates of poolsi andj. For i < j < k2J , T (i;j) < T (j;k) suggests that poolj combines the strengths of poolsi andk; that is, its resolution is not too much worse than that of pool i (implying the numerator of T (i;j) is small) and its effective service rate is not too much worse than that of pool k (implying the denominator of T (j;k) is small). Hence pool j should never be idled. This can also be seen algebraically by first noting that T (i;j)T (i;k)T (j;k) (see the proof of Corollary 2.3.3 in the EC), and then observing that: WhenV 0 (x)T (i;k), alsoV 0 (x)T (i;j), which is equivalent to p j j V 0 (x) (1p j ) j p i i V 0 (x) (1p i ) i ; WhenV 0 (x)<T (i;k), alsoV 0 (x)<T (j;k), which is equivalent to p j j V 0 (x) (1p j ) j >p k k V 0 (x) (1p k ) k : The definition ofj ? in (2.19) then implies thatj ? (x)6= j for allx 0. We conclude that the pools in the setK ? should be able to be re-labeled so that p 1 1 <p 2 2 <<p K K p 1 >p 2 >>p K T (1; 2)>T (2; 3)>>T (K 1;K) : (2.21) 34 Our next theorem shows that for a system with pool setK ? having parameters that satisfy (2.21), a threshold control is optimal, and specifies the number of strictly positive threshold levels. Definition 2.3.1 (Threshold Control). For a system with K pools, given K 1 thresholds in a vector ~ L = (L 1 ;L 2 ; ;L K1 ) having 0L 1 L 2 L K1 , a threshold control is defined as v ~ L (x) = (1fL 1 x< 0g;1fL 2 x<L 1 g; ;1fL K1 x<L K2 g;1fx<L K1 g) Theorem 2.3.2. For each c > 0, there exists an increasing function V 0 (x) and a constant d that satisfy the conditions of Theorem 2.3.1 withV :=V(K ? ) defined as in (2.7), except using the setK ? instead ofJ . Furthermore, there exists a sequence of finite constants C 0 := 0<C 1 <C 2 <<C K1 <C K :=1 such that ifc2 (C k1 ;C k ] for somek2K ? , then the threshold controlv ~ L withKk nonzero thresholds is optimal v ? =v ~ L where L i = 8 > < > : V 0 1 (T (i;i + 1)) i =k;k + 1; ;K 1 0 i = 1; 2; ;k 1 and 0<L k <<L K1 . 
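The ordering conditions in (2.21), together with the dominance and $T(i,j)$ comparisons discussed above, suggest a simple screening procedure for identifying the reduced pool set $\mathcal{K}^{\star}$: starting from the pools sorted by effective service rate, repeatedly discard any pool that is dominated in both metrics, or that "interpolates" its neighbors in the sense $T(i,j) \le T(j,k)$. The sketch below is our own paraphrase of that iterative screening (not a restatement of the formal results that follow), and the numerical pool parameters are the illustrative values used later in Section 2.5.

```python
def T(pools, i, j):
    """T(i, j) from (2.20); pools[m] = (mu_m, p_m)."""
    mu_i, p_i = pools[i]
    mu_j, p_j = pools[j]
    return ((1 - p_j) * mu_j - (1 - p_i) * mu_i) / (p_j * mu_j - p_i * mu_i)

def reduced_pool_set(pools):
    """
    Screen out pools that should never be idled, so that the retained pools
    satisfy the ordering (2.21). `pools` must be sorted so that p_j * mu_j is
    strictly increasing. Returns the 1-indexed labels of the retained pools.
    """
    idx = list(range(len(pools)))
    changed = True
    while changed:
        changed = False
        # A pool with a larger effective service rate AND a (weakly) larger
        # resolution probability than its predecessor is never idled: drop it.
        for a in range(len(idx) - 1):
            if pools[idx[a + 1]][1] >= pools[idx[a]][1]:
                idx.pop(a + 1)
                changed = True
                break
        if changed:
            continue
        # A middle pool j with T(i, j) <= T(j, k) combines the strengths of its
        # neighbors i and k, so it is never idled: drop it.
        for a in range(1, len(idx) - 1):
            if T(pools, idx[a - 1], idx[a]) <= T(pools, idx[a], idx[a + 1]):
                idx.pop(a)
                changed = True
                break
    return [m + 1 for m in idx]

# Pools (mu_j, p_j) with p*mu increasing, as in Figure 2.7(a) of Section 2.5:
pools = [(3.0, 0.99), (6.0, 0.80), (15.0, 0.50)]
print(reduced_pool_set(pools))   # [1, 3]: pool 2 is excluded, i.e., never idled ("reduced")
```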
The parametersC 1 ; ;C K1 andL 1 ; ;L K1 can be found by a sequence of one-dimensional searches. It is intuitive that asc increases, the threshold levels should increase, because it implies that the pools with the lower effective service rate will idle more. Furthermore, as c becomes very large, the optimal control will be a static priority control that only idles the pool with the smallest effective service rate. Lemma 2.3.2. Whenc>C 1 ,L i is increasing inc, for alli2f1; 2; ;K 1g. Whenc!1,L i !1. Note that Lemma 2.3.2 is only relevant forc>C 1 because otherwise all threshold levels are zero. The optimal control for theJ-pool system follows from the optimal control for theK ? pool system. This can be seen from the following two Corollaries to Theorem 2.3.2. 35 Corollary 2.3.3. Assume the pools satisfy (2.1), and for some i < j < k2J , T (i;j) T (j;k). If a functionV and a constantd satisfy the conditions of Theorem 2.3.1 andv J 0 is defined by (2.13) withV(J 0 ) forJ 0 :=Jfjg replacingV defined in (2.7), then an optimal control (whenV(J 0 ) does not replaceV) is v ? = (v J 0 1 ; ;v J 0 j1 ; 0;v J 0 j+1 ; ;v J 0 J1 ): Corollary 2.3.4. Assume the pools satisfy (2.1), and that there exists somei < j <J such thatp j p i . If a function V and a constant d satisfy the conditions of Theorem 2.3.1 and v J 0 is as defined in (2.13) withV(J 0 ) forJ 0 :=Jfjg replacingV defined in (2.7), then an optimal control (whenV(J 0 ) does not replaceV) is v ? = (v J 0 1 ; ;v J 0 j1 ; 0;v J 0 j+1 ; ;v J 0 J1 ): Finally, Corollaries 2.3.3 and 2.3.4 can be used iteratively to determine a setK ? J such that the pools in the setJK ? are never idled. Furthermore, the pools satisfy (2.21) under an appropriate re-labeling. In summary, the optimal control for theJ-pool system follows fromv ? given in Theorem 2.3.2 for theK-pool system. An optimal control for theJ-pool system is written as follows. Let :J!K ? be a mapping such that (j) = k ifj2K ? and poolj is thek th pool under the re-labeling that satisfies (2.21); and(j) = 0 if j = 2K ? . Letv K ? be the control specified in Theorem 2.3.2 over the control setV(K ? ). Definev ? to be the vector having components v ? j = 8 > < > : v K ? (j) if(j)6= 0 0 otherwise ;j2J: Theorem 2.3.5. The controlv ? is an optimal control for theJ-pool system. In general,v ? has a threshold structure. Note that ifp 1 = min j2J fp j g, thenK ? =f1g, so thatv ? = e 1 : Pool 1 is always idled because it has the lowest resolution probability and the lowest effective service rate, hence almost all the idle agents are in Pool 1. Also, when c is small enough (c < C K1 ), v ? always idles the pool with the re-labeled indexK in (2.21). This is because the cost of queue-length is small and the re-labeled pool K has the lowest resolution probability of the pools inK ? . Finally, since our initial assumption only assumed pools were ordered according to the values of their effective service rates, so that 36 p 1 1 <<p J J , Theorem 2.3.5 gives an optimal control for the entire parameter space (; j , andp j , j2J ) of the DCP for theJ-pool system. Remark 2.3.1. Recall that in Section 2.2 we showed that any routing control ? 2 that achieves the mini- mum in the original objective (2.3) is on the efficient frontier with respect to the steady-state average waiting time and call resolution. Similarly, any controlv2V that achieves the minimum in (2.10) is on the efficient frontier. Ifl 1 ;l 2 ; ;l K1 defined through equations (6)-(13) in the EC are continuous inc, then it follows thatE[ ^ X(1;v ? 
) + ] is continuously decreasing inc and P j2J (1p j ) j E[v j ( ^ X(1;v ? )) ^ X(1;v ? ) ] is continuously increasing inc . Noting that thep-rule solves (2.8) whenc = 0 and thep-rule solves (2.8) as c becomes arbitrarily large, it follows that the set of points found by varyingc from 0 to1 in (2.8) generates the entire efficient frontier. 2.4 The Proposed Reduced Pools Threshold (RPT) Control In this section, we translate the optimal controlv ? that solves the approximating DCP in Section 2.3 into a routing control for the original system. More specifically, when the total number of agents at timet > 0, ^ I(t), is positive, the controlv ? determines the poolj ? (t) that should have the lowest routing priority (which is the pool corresponding to the component ofv ? ^ X (t) that is 1). Because the controlv ? is a threshold control on the reduced pool setK ? J , we name our proposed control the reduced pools threshold (RPT) control. We expect that at any timet> 0, almost all idle agents are from the poolj ? (t) and all other pools have very few idle agents. The structure of the RPT control can be stated without reference to the DCP or its solutionv ? , and this is our objective in this section. In particular, for any given set of parameter values, we provide the number of potentially non-zero threshold levels, which is between 0 andJ 1. The threshold values can be set either as the values in Theorem 2.3.2 (which gives the reduced pool DCP solution), multiplied by p , or they can be found by numeric search. The use of Theorem 2.3.2 also evidences the number of strictly positive threshold values, according to the value ofc. We state the structure of the RPT control first in the case thatJ = 2 (Section 2.4.1), then in the case thatJ = 3 (Section 2.4.2), and finally for anyJ (Section 2.4.3). For intuition, it is helpful to recall that the effective service rates are ordered from lowest to highest, as in (2.1). 37 Table 2.1: The two pool RPT control Parameter Values Pool priority order p 1 p 2 2 1 (p-rule equalsp-rule) p 1 >p 2 &cC 1 2 (p-rule) p 1 >p 2 &c>C T (L) 2.4.1 The RPT Control WhenJ = 2 The RPT control is either thep-rule, thep-rule, or a threshold control, depending on the system parameters. The threshold control T (L) forL > 0 assigns a newly arriving customer at timet > 0 to an idle agent according to the priority rule (ij denotes pooli has higher priority than poolj in routing): T (L) := 8 > < > : 1 2 ifI(t)>L 2 1 ifI(t)L ; and, otherwise, in the case that no agent is free, the customer queues. Table 2.1 provides the RPT control for a 2-pool system, where the value ofC is given in (2.16) in Section 2.3.1. Note that whenc is small, so that minimizing queue-length is much less important than minimizing callbacks, the RPT control is the p-rule. For largerc, the RPT control uses a thresholdL to trade off service rate and call resolution. It follows from Lemma 2.3.2 that asc goes to1,L tends to1, so that the RPT control will become equivalent to the p-rule. 2.4.2 The RPT Control WhenJ = 3 It is helpful to let RPT (j 1 ;j 2 ), forj 1 ;j 2 2J denote the 2-pool RPT control having pool priority order defined in Table 2.1, where we pretend the system consists of only the poolsj 1 andj 2 , with the number of idle agentsI(t) redefined asI j 1 (t) +I j 2 (t). Next, the two threshold control T (L 1 ;L 2 ) forL 2 >L 1 > 0 andM := L 1 +L 2 2 assigns a newly arriving customer at timet> 0 to an idle agent according to the priority order given in Table 2.2. 
In the case that no agent is free, the customer queues. In particular, the pool priority order places more emphasis on resolution probabilities as the number of idle agents increases. The value ofM is chosen so that the RPT control can be thought of as giving priority in accordance with thep-rule excluding pool 2 whenI(t) is lower and thep-rule excluding pool 2 whenI(t) is larger. Pool 2 is excluded because it is the pool that should be idled when we follow the DCP solution. 38 Table 2.2: The pool priority order depending on system idleness System state Pool priority order 0<I(t)L 1 3 2 1 (p-rule) L 1 <I(t)M 3 1 2 M <I(t)L 2 1 3 2 I(t)>L 2 1 2 3 (p-rule) Table 2.3 provides the RPT control for a 3-pool system, where the values of C 1 and C 2 are given in Theorem 2.3.2 in Section 2.3.2. AlthoughC 1 does not explicitly appear in the table, it is necessary to know C 1 in order to use RPT (j 1 ;j 2 ) to interpret the pool priority order. Also, when reading Table 2.3, it is helpful to recall thatT (i;j);i < j2J defined in (2.20) measures the ratio of the difference between the callback rates and resolution rates of poolsi andj. Table 2.3: The three pool RPT control Parameters Pool priority order Intuitive Explanation p 3 minfp 1 ;p 2 g 3 RPT (1; 2) Pool 3 dominates at least one pool in both metrics; it should never be idled. p 2 p 1 >p 3 2 RPT (1; 3) Pool 2 dominates pool 1 in both metrics; it should never be idled. p 1 >p 2 >p 3 & T (1; 2)T (2; 3) 2 RPT (1; 3) T (1; 2) T (2; 3) suggests that pool 2 has a resolution not too much worse than pool 1 and an effective service rate not too much slower than pool 3; it should never be idled. p 1 >p 2 >p 3 & T (1; 2)>T (2; 3) &cC 2 1 RPT (2; 3) The cost of queue-length is small, pool 1 has the best resolution should never be idled. p 1 >p 2 >p 3 & T (1; 2)>T (2; 3) &c>C 2 T (L 1 ;L 2 ) The cost of queue-length is high enough that no pool will ever always have priority. Observe that in 4 of 5 cases, the RPT control gives one pool the highest priority, meaning agents in that pool are not idled. We consider that pool to be “reduced”. Then, the remaining pools are ordered according to the RPT control defined for two pools. In the last case, no pool can be reduced. Figure 2.2 provides illustrations of the RPT control whenp 1 1 <p 2 2 <p 3 3 andp 1 >p 2 >p 3 , for both the subcases thatT (1; 2)T (2; 3) andT (1; 2)>T (2; 3). In Figure 2.2(a), pool 2 is reduced, so that 39 the priority order of pools 1 and 3 is in accordance with the two pool RPT control in Table 2.1. Notice that whencC 1 2 , pools 1 and 3 are ordered as in thep-rule. Otherwise, their priority order is determined by a threshold (L 1 ) on the number of idle agents. In Figure 2.2(b), pool 2 is not reduced. Then, whencC 1 , the priority order of the pools is as thep-rule. Whenc2 (C 1 ;C 2 ], pool 1 is never idled, and the priority order of pool 2 and pool 3 is determined by a single threshold (L 2 ). Finally, whenc>C 2 , there are two thresholds (L 1 andL 2 ) used to determine the priority of the 3 pools. Furthermore, when pool 2 has the lowest priority, we use a middle thresholdM (defined latter in Definition 2) to determine the priority order of pools 1 and 3 (according to either thep-rule excluding pool 2 when the number of idle agents is lower, and thep-rule excluding pool 2 when the number of the idle agents is larger). 
Figure 2.2: The RPT control whenp 1 >p 2 >p 3 in a 3-pool system 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 c The number of idle agents 2≻1≻ 3 2≻3≻ 1 C 1 L 1 (a)T(1;2)T(2;3), so pool 2 is reduced. 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 45 c The number of idle agents 1≻2≻ 3 1≻3≻ 2 3≻1≻ 2 3≻2≻ 1 C 1 C 2 L 1 L 2 M (b) T(1;2) > T(2;3), so pool 2 may or may not be reduced, depending onc. In (a),~ = (3; 6; 15),~ p = (0:99; 0:8; 0:5), ~ N = (25; 25; 25), and = 0:9. In (b), we changep 2 to 0.6 and keep all other parameter values unchanged. Remark 2.4.1 (Why is the RPT control in the last case of Table 2.3 not defined recursively?). The reader may wonder if the threshold levelsL 1 andL 2 can be determined separately, using the appropriate thresholds from RPT (1; 2) and RPT (2; 3). This does not work, because when we chooseL 1 andL 2 , we must account for “joint” pool performances. In other words, when there are 2 thresholds involved, it is not possible to “reduce” a pool and rely on the two pool RPT control. Remark 2.4.2 (Why does the pool reduction require conditions onT (i;j)?). We explain the intuition for why pool 2 is reduced when T (1; 2) T (2; 3) graphically, using Figure 2.3. First, for each pool j, we 2 Note that when pool 2 is reduced as in Figure 2.2(a),C1 is defined asC in (2.16), except thatp2 and2 are replaced byp3 and 3. 40 draw a point ( j ;p j j ) on thep plane. Next we draw the line segment connecting ( 1 ;p 1 1 ) and ( 3 ;p 3 3 ), defined by p = p 3 3 p 1 1 3 1 + 1 3 (p 1 p 3 ) 3 1 ;2 [ 1 ; 3 ]: (2.22) Claim: (Verified at the end of this remark.) The condition T (1; 2) < T (2; 3) is equivalent to the point ( 2 ;p 2 2 ) lying on the left of the line (2.22) shown in Figure 2.3. Any point (;p) on the line segment (2.22) represents a virtual pool of agents that can be viewed as a combination of agents from pools 1 and 3. For example, if = 1 + (1) 3 for some2 (0; 1), then, from (2.22), p = p 1 1 + (1)p 3 3 . Idling one agent from the virtual (;p) pool is equivalent to idling pool 1 agents and (1) pool 3 agents. Suppose ( 2 ;p 2 2 ) locates on the left of the line segment, and consider the choice between idling one pool 2 agent and one virtual pool agent. If there is a virtual pool that has worse performance, then we will idle an agent from that virtual pool. Define a virtual pool ( m ;p m m ) by m = 1 + (1) 3 using such thatp 2 2 = p 1 1 + (1)p 3 3 . Then,p m m = p 2 2 . Furthermore, as can be seen from Figure 2.3, 2 < m . Hencep 2 > p m . In summary, the virtual pool has worse performance because it has the same effective service rate, and lower resolution probability. We prefer to idle the virtual pool agent. The pool 2 agent is not idled, and so should have routing priority – meaning pool 2 routing priorities need not be assigned dynamically. Pool 2 is reduced. Figure 2.3: The reduction of poolj depends on its location 0.5 1 1.5 2 2.5 3 3.5 0.5 1 1.5 2 2.5 μ pμ (μ 1 ,p 1 μ 1 ) (μ 3 ,p 3 μ 3 ) T(1,2)<T(2,3) zone T(1,2)>T(2,3) zone Pool 2 is reduced Pool 2 is NOT reduced 1 1.5 2 2.5 3 3.5 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 μ pμ (μ 1 ,p 1 μ 1 ) (μ 3 ,p 3 μ 3 ) T(1,2)<T(2,3) zone T(1,2)>T(2,3) zone Pool 2 is reduced Pool 2 is NOT reduced 41 Verification of the Claim: First note thatT (1; 2) = (1p 2 ) 2 (1p 1 ) 1 p 2 2 p 1 1 <T (2; 3) = (1p 3 ) 3 (1p 2 ) 2 p 3 3 p 2 2 is equivalent to 2 1 p 2 2 p 1 1 < 3 2 p 3 3 p 2 2 : (2.23) Consider an example of pool 2 lying at the left of the line segment connecting pool 1 and pool 3. 
If 2 minf 1 ; 3 g, then the left hand side of (2.23) is nonpositive, whereas the right hand size is positive, (2.23) is valid. If 1 < 2 < 3 or 1 > 2 > 3 , then (2.23) is equivalent to: p 2 2 p 1 1 2 1 > p 3 3 p 2 2 3 2 ; meaning that the slope of the line segment connecting pool 1 and pool 2 is larger than the slope of the line segment connecting pool 2 and pool 3, this can be seen from Figure 2.3. Remark 2.4.3 (Idleness allocation). When the RPT control has a threshold structure (with non-zero and non-infinite thresholds), the steady-state idleness allocation in all the pools is strictly between 0 and 1; see Figure 2.4. Then, the RPT control exactly coincides with the (limiting) threshold control shown in Armony and Ward (2010) to minimize average waiting time subject to a steady-state fairness constraint (when the in Armony and Ward (2010) is replaced byp). This shows that considering the dual objective of minimizing queue-length (equivalently customer waiting) and callback rate can lead naturally to a fair control, provided the parameters satisfy appropriate conditions (in particular, the last row of Table 2.1 whenJ = 2 and the last row of Table 2.3 whenJ = 3). 2.4.3 The RPT Control With GeneralJ The RPT control whenJ = 3 can be naturally extended toJ > 3. We observe in the case ofJ = 3 that in 4 of 5 divisions of the parameter space, there is one pool that always has the highest priority in routing. In the generalJ pool case, the highest priority may be given to a set of poolsJK ? , whereK ? is defined as in Section 2.3.2. Specifically,K ? J is the set ofKJ pools whose indices can be re-labeled so that in addition to the assumed condition p 1 1 <p 2 2 <<p K K ; 42 Figure 2.4: The idleness allocation in each pool 0 2 4 6 8 10 0 10 20 30 40 50 60 70 80 90 100 c Percentage idleness idleness in pool 1 idleness in pool 2 (a) J = 2, ~ = (3;6);~ p = (0:99;0:9), and ~ N = (25;25). 0 20 40 60 80 100 120 0 10 20 30 40 50 60 70 80 90 100 c Percentage idleness idleness in pool 1 idleness in pool 2 idleness in pool 3 (b) J = 3,~ = (3;6;15),~ p = (0:99;0:6;0:5), and ~ N = (25;25;25). it is also true that T (1; 2)>T (2; 3)>>T (K 1;K) (2.24) p 1 >p 2 >>p K K : (2.25) The priority order of the pools in the setK ? is determined by a threshold control with threshold levels L 0 := 0L 1 L 2 L K1 <L K :=1. The number of strictly positive threshold levels depends on the value ofc, and exactly corresponds to the number of positive threshold levels in Theorem 2.3.2. (Note that the issue of positive threshold levels did not arise in Sections 2.4.1 and 2.4.2, because we explicitly separated out these cases in Tables 2.1 and 2.3.) Definition 2.4.1 (The Reduced Pools Threshold (RPT) Control). Upon the arrival of a customer at time t> 0, if no agent is idle, the customer queues; otherwise, the customer will be routed to an available agent in accordance with the following (potentially state-dependent) priority order for the pools: The pools in the setJK ? have the highest priority, equally; Poolj ? that satisfiesL j ? 1 <I(t)L j ? has the lowest priority; 43 The order of the remaining pools is determined by thep-rule ifI(t) M, and is otherwise deter- mined by thep-rule, where M := 8 > > < > > : P K1 j=1 L j P K1 j=1 1fL j >0g if P K1 j=1 1fL j > 0g> 1 0 otherwise : The value M is set to be in the middle of the positive thresholds values. 
This is consistent with pri- oritizing the pools with higher resolution probability when there are many idle agents, and prioritizing the pools with higher effective service rate when there are few idle agents. Note that the value ofM is set in accordance with our judgement, because the DCP solution gives no guidance on the priority ordering of the remaining pools in the third bullet point. Figure 2.5: The RPT control of a 5-pool system 0 1 2 3 4 5 6 0 5 10 15 c The number of idle agents L 4 L 3 L 2 L 1 M 1≻2≻ 3≻4≻5 1≻2≻ 3≻5≻4 1≻2≻ 4≻5≻3 5≻4≻ 2≻1≻3 5≻4≻ 3≻1≻2 5≻4≻ 3≻2≻1 5≻3≻ 2≻1≻4 C 1 C 2 C 3 C 4 Figure 2.5 shows the RPT control when J = 5 and the conditions (2.24) and (2.25) are satisfied (in addition the assumed upfront ordering of the pools from lowest to highest effective service rate in (2.1)). We see that whencC 1 , the RPT control has no threshold and degenerates top-rule. Asc becomes larger, the RPT control has more and more thresholds and more and more pools may sometimes be idled. When c>C 4 , the RPT control has 4 thresholds, meaning that each pool is sometimes idled. When the system state is between the thresholdsL j andL j+1 , poolj is idled and the remaining pools are prioritized in accordance with thep-rule when the number of idle agents exceedsM and are otherwise prioritized in accordance with thep-rule. Asc!1, all the thresholds go to1, and the RPT controls behaves like thep-rule. In general, the RPT control is a dynamic control that uses thresholds to determine pool priorities. The RPT control becomes a static control that always assigns the lowest priority to the pool with the largest index inK ? whenK ? has cardinalityK > 1 andcC 1 , orK = 1, 44 2.5 The Efficient Frontier We expect that the RPT control is on the efficient frontier as becomes large. This is because the RPT control is motivated by the controlv ? in Section 2.3 that solves the diffusion control problem that arises as an approximation to (2.3) as becomes large. Our objective in this section is to use simulation to show that the RPT control is indeed on the efficient frontier. This requires comparison controls. Thep-rule and thep-rule are natural comparison controls. However, thep-rule results in most all of the idle agents being from pool 1, and thep-rule results in most all the idle agents being from the pool with the smallest p j , j2J . Therefore, we also consider controls in the literature that can produce an entire range of idleness allocations for each pool, anywhere from 0 to 1. One such control is the family of queue-and-idleness ratio (QIR) routing controls introduced in Gurvich and Whitt (2009a). We consider a simple QIR control for the inverted-V model that is specified in terms of static idleness ratiosf j 2 [0; 1];j2J; having P j2J f j = 1. Then, upon the arrival of a customer at time t> 0, the QIR control routes the customer to an available agent in pool j ? :=j ? (t)2 argmax j2J;I j (t)>0 fI j (t)f j I(t)g; and if there are no agents available, the customer queues. Note that when is large, consistent with the results of Gurvich and Whitt (2009a) when there are no callbacks, our simulations show that I j (t)f j I(t); for allj2J: This confirms our expectation that the reduction in problem dimensionality assumed in the informal deriva- tion of the DCP in the first paragraph of Section 2.3 holds. Another natural comparison control is the longest-weighted-idleness control in Definition 3 in Ward and Armony (2013), which extends the longest-idle-server-first control in Atar (2008). 
This control routes an arriving customer to the agent that has the longest weighted idle time. Theorem 5 in Ward and Armony (2013) shows that this control is asymptotically equivalent to QIR. A final natural comparison control is the randomized-most-idle (RMI) routing control in Mandelbaum et al. (2012). This control routes an arrival to one of the pools with available agents, with probability that 45 equals the fraction of idle agents. The RMI control is asymptotically equivalent to a QIR control; see Corollary 1 in Mandelbaum et al. (2012). It follows that to show the RPT control is on the efficient frontier, it is reasonable to begin by simulating only the RPT and QIR controls. We choose the simulation parameters based on the empirical data used by Mehrotra et al. (2012). In their electronic companion Appendix B, they describe a call center data set in which there are 4 call types and 228 agents, with each agent capable of handling all 4 call types. The agents are clustered into 20 pools according to their service speed and resolution probability for each call type. Each pool has 1 to 39 homogeneous agents. To be consistent with our model, we focus on one call type. We simulate the inverted-V model with 2 and 3 pools, and our choices for the service speeds and resolution probabilities are guided by the empirical observations in Table A.1 in Appendix A.1. Specifically, when J = 2, we let (~ ;~ p) = ((3; 6); (0:99; 0:90)); ((3; 12); (0:99; 0:50)) and when J = 3, we let (~ ;~ p) = ((3; 6; 15); (0:99; 0:80; 0:50)); ((3; 6; 15); (0:99; 0:60; 0:50)). We fix the size of each pool to be 25, in the same order with the data in Mehrotra et al. (2012). (See the EC for details on the simulation study setup, and summary details on the aforementioned data.) Figure 2.6: Simulated comparison between RPT and QIR controls in a 2-pool system 0.926 0.927 0.928 0.929 0.93 0.931 0.932 0.933 0.934 0.95 1 1.05 1.1 1.15 1.2 1.25 1.3 Call resolution Average waiting time (min) RPT QIR (a)~ = (3;6);~ p = (0:99;0:90); ~ N = (25;25), and = 0:9 0.585 0.59 0.595 0.6 0.605 0.61 1.4 1.45 1.5 1.55 1.6 1.65 1.7 1.75 1.8 1.85 Call resolution Average waiting time (min) RPT QIR (b) ~ = (3;12);~ p = (0:99;0:50); ~ N = (25;25), and = 0:9 All the figures reporting our simulation results (Figures 2.6, 2.7, and 2.8) display the steady-state call resolutionp(1;) on the x-axis and the steady-state average waiting timeE[W (1;)] in minutes on the y-axis. Then, the figures capture the resulting decrease in call resolution of the system would experience for any given decrease in average waiting time. 46 We begin by reporting our simulation results of a 2-pool inverted-V model in Figure 2.6. For this, we vary c from 0 to1 in order to produce a family of RPT controls with threshold levels that vary so that each pool experiences idle time allocation between 0 and 1. Second, for the QIR control, we let f 1 2 (0:0; 0:1;:::; 1:0) andf 2 = 1f 1 , so that again the idle time allocation experienced by each pool varies between 0 and 1. We plot the simulated performance for each control from both families of RPT controls and QIR controls. Note the point that corresponds to the highest call resolution and highest average waiting time shows the performance of the p-rule (RPT control with threshold 0, or QIR control with f 1 = 0), and the point that corresponds to the lowest call resolution and lowest average waiting time shows the performance of thep-rule (RPT control with threshold1, or QIR control withf 1 = 1). 
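Before turning to the results, it may help to see the routing decision that each of the two control families makes upon an arrival in the 2-pool setting of Figure 2.6: the RPT control applies the threshold rule of Table 2.1 to the total number of idle agents, while QIR routes to the pool whose idleness most exceeds its target share. The following sketch is only illustrative; the threshold $L$ and the ratios $f_j$ in the example calls are placeholders, not the values used in our simulations.

```python
def rpt_route_two_pools(idle, L):
    """
    RPT routing in a 2-pool system (last row of Table 2.1): pool 1 has the better
    resolution probability, pool 2 the faster effective service rate. Returns the
    pool (1 or 2) that receives the arrival, or None if the customer must queue.
    idle : (I_1(t), I_2(t)) current idle-agent counts
    L    : threshold on the total number of idle agents
    """
    total_idle = sum(idle)
    if total_idle == 0:
        return None                                   # no agent free: the customer queues
    order = (1, 2) if total_idle > L else (2, 1)      # p-rule above L, p*mu-rule at or below L
    for pool in order:
        if idle[pool - 1] > 0:
            return pool

def qir_route(idle, f):
    """
    QIR routing (Gurvich and Whitt 2009a): among pools with an idle agent,
    pick the one maximizing I_j(t) - f_j * I(t).
    """
    total_idle = sum(idle)
    if total_idle == 0:
        return None
    candidates = [j for j in range(len(idle)) if idle[j] > 0]
    return 1 + max(candidates, key=lambda j: idle[j] - f[j] * total_idle)

# Illustrative calls (threshold and ratios are placeholders):
print(rpt_route_two_pools(idle=(3, 2), L=4))   # many idle agents relative to L -> favor pool 1
print(rpt_route_two_pools(idle=(3, 2), L=8))   # few idle agents relative to L  -> favor pool 2
print(qir_route(idle=(3, 2), f=(0.5, 0.5)))
```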
We observe that in both Figures 2.6(a) and (b), the RPT control is on the efficient frontier and the QIR control is not. This is not surprising given that the diffusion approximation in (2.9) associated with the RPT control hasv ? (x) = (1fLx< 0g;1fx<Lg) and the one associated with the QIR control hasv(x) = (f 1 ; 1f 1 ). (We provide additional simulation results regarding the effect of larger and smaller values of on the efficient frontier in the EC.) The efficient frontier in Figure 2.6 (and in all the figures in this section) is small in the sense that the aver- age waiting times are all within half a minute and the call resolutions are all within 0.03. This is consistent with the difference in average speed of answer and call resolution shown in Figure 1 in Mehrotra et al. (2012) when we compare the performance of thep-rule and thep-rule. (The remaining routing rules shown there do not apply to the inverted-V model.) Still, the trade-off decision between emphasizing waiting time and call resolution is important because: (1) small improvements in the experience of each individual customer can lead to large improvements in the performance of the entire system, and (2) developing the efficient frontier for the inverted-V model analytically in the QED regime can be a stepping stone to developing the efficient frontier in the more general model with heterogeneous customers and heterogeneous agents. Recalling that the simplep-rule is focused on maximizing the call resolution and the simplep-rule is focused on minimizing the average waiting time suggests the following routing heuristic: when there are many idle agents (I(t) > M), route customers in accordance with thep-rule, but when there are few idle agents (I(t)M), route customers in accordance with thep-rule. Then, by varying the value ofM from 0 to1, we can find all combinations of call resolution and average waiting time that are achievable by this heuristic routing control. Note that in a 2-pool system, this heuristic control assigns the same priority orderings as the RPT control whenM is set to equal the threshold levelL 1 . It is reasonable to set the value 47 ofM as in Definition 2.4.1 (and then potentially adjust its value to achieve better performance). Note that whenJ 3, the priority ordering of the heuristic control and the RPT control do not in general coincide. Figure 2.7 compares the RPT controls, the heuristic controls and the QIR controls in a 3-pool system. Figure 2.7(a) hasT (1; 2) T (2; 3) so pool 2 is reduced; Figure 2.7(b) hasT (1; 2) > T (2; 3) so no pool is reduced. To generate the figures, we first vary the value of c between 0 and1 . Then, for each given c, we solve for the optimal thresholds and generate the plots for the RPT control. We varyM from 0 to1 to generate plots for the heuristic control. For the QIR control, we generate (f 1 ;f 2 ;f 3 ) as a three-dimensional grid with even spacing 0.2. More specifically, we choose the ratios from the setf(f 1 ;f 2 ;f 3 )jf 1 +f 2 +f 3 = 1 and ;f j is a multiple of 0:2;j = 1; 2; 3g, which results in 21 possible vectors (f 1 ;f 2 ;f 3 ). Figure 2.7: Simulated comparison between RPT, heuristic, and QIR controls in a 3-pool system 0.62 0.625 0.63 0.635 0.64 0.645 0.65 0.655 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 Call resolution Average waiting time (min) RPT QIR Heuristic RPT extension (a) ~ = (3;6;15);~ p = (0:99;0:8;0:5); ~ N = (25;25;25), and = 0:9. (Pool 2 is reduced.) 
0.57 0.575 0.58 0.585 0.59 0.595 0.6 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 1.05 1.1 Call resolution Average waiting time (min) RPT QIR Heuristic (b) ~ = (3;6;15);~ p = (0:99;0:6;0:5); ~ N = (25;25;25), and = 0:9. (Pool 2 is not reduced.) We again observe that the RPT control is on the efficient frontier. However, there is a caveat. Recall that Remark 2.3.1 notes that any solution to the approximating diffusion control problem generates the entire efficient frontier asc varies from 0 to1. However, for that is not large enough, there is some additional thought needed when one or more pools are reduced. For example, in Figure 2.7(a), pool 2 is reduced, and so the RPT control always gives pool 2 the highest priority in routing. Its performance is closest to that of thep-rule when its associated threshold is infinite so that the RPT control has static priority ordering 2 3 1. In order to achieve the same performance as thep-rule in a manner that is consistent with the approximating diffusion control problem solution, we “extend” the RPT control using an additional threshold (which we vary from 0 to1) that trades-off the higher priority pools 2 and 3, keeping pool 1 48 as the lowest priority pool. 3 This is consistent with the solution to the approximating diffusion control problem because that solution only specifies which pool should remain idle. Finally, we note that this issue does not arise when no pools are reduced, as in Figure 2.7(b), because then all the thresholds being 0 is consistent with thep-rule and all the thresholds being1 is consistent with thep-rule. The performance of the heuristic control appears to be slightly worse in Figure 2.7(b) than in Figure 2.7(a). This is consistent with the observation that the RPT and heuristic controls sometimes give different pools the lowest routing priority in Figure 2.7(b), but not in Figure 2.7(a). In particular, the heuristic control either prioritizes the pools as 3 2 1 or 1 2 3. In Figure 2.7(b), the RPT control sometimes gives pool 2 the lowest priority in routing (as shown in Figure 2.2(b)); however, in Figure 2.7(a), pool 2 always has the highest priority in routing and the lowest priority pool is either 1 or 3 (as shown in Figure 2.2(a)). Next, when we compare the QIR and RPT controls, the QIR controls that are closest to the efficient frontier are those having idleness allocations (f 1 ;f 2 ;f 3 ) that are similar to the RPT control. For Figure 2.7(a),f 2 = 0 matches the idleness allocation of the RPT control. This is because the RPT control always gives pool 2 the highest priority, and so there are rarely idle servers from pool 2. For Figure 2.7(b), the RPT idleness allocations are as shown in Figure 2.4(b) in Remark 2.4.3. It is interesting to see that the QIR controls with matching idleness allocations sometimes achieve performance very near or on the efficient frontier. This is not surprising around both endpoints of the efficient frontier, because the RPT, QIR, and heuristic controls mimic the p-rule or p-rule. However, in general, we cannot expect the RPT and QIR controls to have the same performance, because their resulting diffusion approximations are different. We leave the question of when and how close the performance of the RPT and QIR controls can be for future research. Finally, we hypothesize that the threshold structure of the RPT control is good, even when the service times are not exponential. First, the identification of the set of pools that should have highest priority in routing (the setJK ? 
) depends only on the mean service times and resolution probabilities, and not on their variances. Hence, it is reasonable to expect that this set of pools is robust to changes in the distributional assumptions on inter-arrival and service times (even though the approximating DCP relies on the assumption of exponential service times). Second, Figure 2.8 shows that the RPT control continues to be on the efficient frontier when service times follow a lognormal (rather than an exponential) distribution. (Footnote 3, attached to the discussion of Figure 2.7(a) above: similarly, the performance of the RPT control is closest to that of the p-rule when its associated threshold is 0, so that the RPT control has static priority ordering 2 ≻ 1 ≻ 3; however, pools 1 and 2 are "close enough" that there is no obvious performance difference between the p-rule and the RPT control.) Note that the statistical study of a call center in Brown et al. (2005) showed that service times followed a lognormal distribution. We continue to assume Poisson arrivals, which is a simplification of the finding in Brown et al. (2005) that the aforementioned call center had an arrival process consistent with a time-varying Poisson process. This can be reasonable in practice, because one way to handle control problems when arrivals are time-varying is to split the time horizon into fixed-size intervals and assume a constant arrival rate over each interval.

In Figures 2.8(a) and (b), we investigate lognormally distributed service times, simulating two systems with different coefficients of variation. In Figure 2.8(a), we assume the service times of pool 1 agents follow the lognormal distribution $F_1 \sim \mathrm{LogNormal}(-2, 1.5^2)$; that is, if $X_1$ has distribution $F_1$, then $\log X_1$ has a normal distribution with mean $-2$ and variance $1.5^2$. The service times of pool 2 agents follow the lognormal distribution $\mathrm{LogNormal}(-3, 1.5^2)$. Both distributions have the same coefficient of variation, $\sqrt{\exp(1.5^2) - 1} = 2.91$. The mean service rates of pool 1 and pool 2 are 2.4 and 6.52, respectively. We vary the threshold from 0 to $\infty$, which corresponds to varying $c$ from 0 to $\infty$, to draw the points corresponding to the RPT controls, and we vary $f_1$ in the QIR control from 0 to 1 to draw the points corresponding to the QIR controls. In Figure 2.8(b), we let the service times of pool 1 agents follow $\mathrm{LogNormal}(-1, 0.5^2)$ and the service times of pool 2 agents follow $\mathrm{LogNormal}(-2, 0.5^2)$. The mean service rates are still 2.4 and 6.52, respectively, whereas the coefficients of variation of the two distributions both change to $\sqrt{\exp(0.5^2) - 1} = 0.533$. We see that the RPT controls remain on the efficient frontier. Furthermore, when the coefficient of variation is larger, there is more variation in the average waiting time, but the variation in call resolution is not much affected. (These insights also hold when service times follow a Gamma distribution; see the additional simulation results in the EC.)

2.6 Conclusions and Future Research

We have proposed a reduced pools threshold (RPT) routing control for call centers with homogeneous customers and heterogeneous agents that trades off the dual performance objectives of minimizing the average waiting time and minimizing the callback rate (equivalently, maximizing the call resolution, the overall percentage of arriving calls that are successfully served and do not result in callbacks).
Figure 2.8: Simulated comparison between RPT and QIR in a 2-pool system with lognormal service times. (a) Lognormal service times with CV = 2.91; (b) lognormal service times with CV = 0.533. In both panels, $\vec{p} = (0.99, 0.90)$, $\vec{N} = (25, 25)$, and the load is 0.9; each panel plots the average waiting time (min) against the call resolution for the family of RPT controls and the family of QIR controls.

The pools are reduced because, depending on the system parameters, there may be some pools that always have priority when routing, and so almost never have idle agents. The remaining pools have routing priorities that are dynamically determined using thresholds on the number of idle agents. Then, those pools sometimes have idle agents, and sometimes not, depending on the system state.

We have shown that our proposed control is on the efficient frontier with respect to average waiting time and call resolution. We have done this analytically, by solving the relevant approximating diffusion control problem, and through simulation. The approximating diffusion control problem is analytically tractable because it is one-dimensional. The first important insight coming from the DCP solution is understanding how to reduce the pools; that is, identifying which pools should never have idle agents. We have provided a simple set of parameter conditions (see (2.21)) to separate out the pools that should never have idle agents. Then, depending on the importance of the customer waiting in relation to call resolution, there may be additional pools that should be reduced. The second important insight coming from the DCP solution is understanding how to dynamically determine the routing priorities for the remaining pools (that have not been reduced). In particular, this determination is made using a control that has a simple threshold structure.

The main practical implication of our work is that focusing on minimizing average waiting time as the sole performance objective may not deliver the best customer experience. In particular, when the performance objective is to minimize the average waiting time (corresponding to $c$ being very large), almost all of the idle agents will be the slowest agents. However, it may be the case that some slow agents have high service quality, because they have high resolution probability. Then, the customer experience suffers, because
This is because there are cases in the literature (as mentioned in the Introduction) when the approximating diffusion control problem sepa- rates into one for a V-model and one for an inverted-V model. There is an additional complication in that the rate at which calls of different types are directed to the different agent pools must first be determined before the approximating diffusion control problem can be formulated, and those rates will affect the system performance. Another interesting direction for future research is to consider models that trade-off service quality and service speed in other application domains. For example, in emergency departments, arriving patients are often routed according to the severeness of their condition, determined at triage. In general, urgent cases receive immediate care while non-urgent cases may have the option of waiting for physician attention, or being seen only by a nurse (which could be considered lower quality care). The routing of non-urgent patients may benefit from the use of thresholds to determine when the system is so crowded that non-urgent patients should not wait to be seen by physicians. Our model assumes that service speeds and resolution probabilities are fixed. However, in reality, agents learn (and so, over time, may increase their service speeds and/ or resolution probabilities). It is of interest to develop models that incorporate agent learning when agents are heterogeneous. Arlotto et al. (2014) is one recent work in this direction; however, their focus is on hiring and retention rather than routing. It is also true that agents make their own individual trade-off decisions between how long to spend with a customer and what service quality to give that customer. This presents the challenge of providing an 52 analytic model for how agents make such decisions that is relevant to the call center environment. Such a model could use routing to incentivize agent behavior, or it could use agent compensation. However, this requires care, because it is easy for agent incentive schemes to backfire. For example, Figure 19 in Gans et al. (2003) documents agents hanging up on customers. In the next chapter, we focus on how to use incentives for employees to improve the operational performance of service systems. 53 Chapter 3 Compensation and Staffing to Trade Off Speed and Quality in Large Service Systems 3.1 Introduction The service sector occupies a central position in the U.S. economy. For example, it has grown from 53.3% of GDP in 1999 to to 61.6% in 2014 (Bureau of Economic Analysis (2014)). In August 2014, the service sector activity accelerated to 6.5-year high (Mutikani (2014)). Not surprisingly, there is much research focused on service system design. One common assumption is that employees work at fixed rates. However, recent empiric work by Buell et al. (2014), Song et al. (2014) and Shunko et al. (2014) demonstrates that system-design related incentives can affect service speed and/or quality. In this paper, we build a theoretic model to investigate such an effect. The central question when designing a service system is: how many employees should be staffed and what should be their compensation? This is because for many service systems the most significant percent- age of their operating costs is labor. Hence there are many studies (e.g., Milkovich and Newman (2004), Garnett et al. (2002), Borst et al. (2004)) on compensation and staffing. 
However, these two problems are very often studied separately; for example, in the aforementioned book and papers, the studies on how to structure compensation ignore staffing, and the studies on staffing ignore compensation design. The issue is that the compensation affects employee motivation, which affects the throughput rate of completed tasks, which affects the staffing required to handle a given workload. On the other hand, the staffing level affects the competition between the employees and their service speed. In other words, the problem of determining how many employees to staff and how to structure their compensation is inherently a joint problem. To study this joint problem, we begin with a classic model used to inform staffing decisions that ignores employee compensation. Then, we enhance the model to incorporate the employee economic incentives. More specifically, we consider an M/M/N+M queueing model except we do not assume fixed service rates 54 that are exogenously given. Instead, we assume a compensation scheme that rewards both volume and qual- ity, and suppose that each employee, (henceforth called server, to be consistent with the queueing nomen- clature), works so as to maximize his or her compensation. Now, the number of servers that must be staffed to meet a given performance metric inherently depends on the compensation scheme, because that compen- sation scheme is a major factor in determining each server’s throughput, or service, rate. The implicit assumption is that the servers have discretion in how long they take to complete each service. This is natural in the professional and complex service work performed by highly skilled workers (such as engineers and physicians), as modeled in Hopp et al. (2007). This can also be true in a revenue- generating call center environment, in which the servers can choose whether or not to upsell (that is, to entice the customer to buy more), and how hard to try to upsell (Armony and Gurvich (2010)). Then, it is natural to assume that the longer a server spends on a service, the more likely that service is to be successful. For a fixed staffing level, the service rates emerge as a Nash equilibrium solution of the game defined by the server compensation structure. When the staffing level is low, so that there are almost always customers waiting to be served, each server will serve at the rate that exactly balances his rewards for volume and quality. On the other hand, when the staffing level is higher, there may not always be customers waiting to be served. Then, there is competition between the servers for customers, which may cause them to increase their service rate. The compensation structure must account for this. The idea of having low staffing may seem appealing to the system manager, because the staffing costs are also low. The problem is that then the customer wait time can become unacceptably long, leading customers to abandon. Hence the system manager must somehow penalize the agents for customer abandonments, as well as for failed services. How the system manager chooses to incorporate this additional penalty in the server compensation structure depends on his own cost structure, and, in particular, the relative emphasis that cost structure places on abandonments vs failed services. Our contribution in this paper is to understand the relationship between the equilibrium service rates, the staffing level, and the compensation offered by the system manager. 
We do this by first characterizing an equilibrium service rate, and deriving a simple approximation for that rate when the system is large. Then, we solve for the cost-minimizing staffing level and compensation structure, under different assumed cost structures for the system manager. Since the system manager cannot directly control the service rate, this means finding a first best solution. We show when a first best solution staffs so that the system operates in the quality-driven (QD) regime, the critically loaded regime, and the efficiency-driven (ED) regime, as 55 defined in Garnett et al. (2002). Very interestingly, we also see a situation in which a first best solution has the system simultaneously operate in a quality-driven regime and in an efficiency-driven regime - that is, the servers experience idle time even while customers abandon. The remainder of this paper is organized as follows. First, we review some related literature. Then, in Section 3.2, we specify the compensation structure and the resulting game. Section 3.3 characterizes a symmetric equilibrium service rate that solves this game when the arrival rate and staffing level are large. We assume a linear cost structure for the system manager, and show that this causes the system to operate in the critically loaded regime in Section 3.4. Then, we vary the servers’ utility function and the system manager’s cost structure to discover the optimality of a panorama of limit regimes (QD, ED, and a novel combination of those two) in Section 3.5. Finally, in Section 3.6 we make some concluding remarks. 3.1.1 Literature Review Our model assumes there is a trade-off between speed and quality that can be critical to the customer expe- rience of the service. This is true in the call center application settings in the recent papers by Mehrotra et al. (2012) and Zhan and Ward (2014). This is also true in many other application settings, such as health- care and manufacturing (Lovejoy and Sethuraman (2000), Kostami and Rajagopalan (2014), Alizamir et al. (2013)). The difference between these papers and ours is that none of the aforementioned papers model the service rate as a decision made by a selfish, utility-maximizing server. The papers Hopp et al. (2007), Anand et al. (2011), and Lu et al. (2009) all analyze models in which the server accounts for both speed and quality when choosing the service time that maximizes his utility. Hopp et al. (2007) allow for dynamic decisions, whereas Anand et al. (2011) and Lu et al. (2009) restrict to a static service rate choice, as we do. None of those papers consider the problem of how to staff systems with a large number of servers. Our focus on staffing naturally leads to a fluid analysis with a speed-quality trade-off, that is methodologically more similar to Chan et al. (2014). One main difference is that in that paper the servers are restricted to have two possible speeds, whereas we allow for a continuum of service speeds. The service rates chosen by the servers are those that maximize their expected compensation. In other words, the service rates are the solution to a queueing game. There is a large literature on queueing games, and we refer the reader to Hassin and Haviv (2003) for the pre-2003 papers. Much of that literature assumes 56 fixed service rates, and focuses on how customers that strategically decide whether or not to queue for service, and where to queue, affect system performance. 
Some exceptions (that is, papers that allow the service rates to be the solution to a game) are Kalai et al. (1992), Gilbert and Weng (1998), Cachon and Harker (2002), Cachon and Zhang (2007), Debo et al. (2008), and Geng et al. (2013). Still, the maximum number of servers in all of the aforementioned papers is two.

The problem of how a system manager influences employee behavior can be thought of as a principal-agent problem. In that literature, pioneered by Akerlof (1970), Spence (1973), Rothschild and Stiglitz (1976) and Holmstrom (1979), the agents are usually assumed to be risk averse. This is because when the agents are risk neutral, it is generally easy to use payment incentives to transfer the risk to the agents, assuming there is no free-riding or competition problem (if there is, see Holmstrom (1982)). However, the staffing decision complicates this risk transfer. In relation to this literature, our paper is the first that endogenizes the staffing decision into the incentive policy.

The spirit of our analysis is most similar to that in Maglaras and Zeevi (2003), Armony and Maglaras (2004), Maglaras and Zeevi (2005), Allon and Gurvich (2010), Allon et al. (2014), and Gopalakrishnan et al. (2014), all of whom use large-system asymptotics to tackle service system design problems. However, with the exception of Gopalakrishnan et al. (2014), none of these papers focus on the effect of servers in the same firm competing for incoming customers. In Gopalakrishnan et al. (2014), this competition emerges in a fixed-wage or volunteer model, meaning that the service rate chosen by each server maximizes a non-payment-based utility. In this paper, the competition emerges because each server's compensation is increasing in the number of customers served.

The solution concept we use to determine the service rates is a pure strategy Nash equilibrium. Then, at equilibrium, our model is equivalent to a standard M/M/N+M queueing model with homogeneous customers and servers. Garnett et al. (2002) show that this system can be divided into three regimes, depending on the staffing level: one in which servers enjoy idle time and no customers abandon, another in which servers have no idle time and no customers abandon, and a final one in which servers have no idle time and customers abandon. This division is reminiscent of the one in Borst et al. (2004) for an M/M/N queueing model. One main objective in this paper is to show how the aforementioned regimes emerge in response to the compensation scheme that is chosen by the system manager in order to minimize cost. The fact that different regimes emerge contrasts with Ren and Zhou (2008), who have a similar cost structure to ours, but find that the critically loaded regime is optimal. The difference is that Ren and Zhou (2008) focus on designing an optimal call center outsourcing contract. Since that contract is between two firms, the finer details of how the outsourcing firm manages its employees are ignored.

3.2 A Compensation Model

The objective of this paper is to understand the relationship between compensation incentives, server performance, and operational costs in a service setting with discretionary task completion. In particular, servers have flexibility over the speed at which they work, but with the following trade-off: faster speed results in lower quality work, whereas slower speed results in higher quality work. The system manager does not monitor servers and cannot directly dictate the speed at which each server works.
However, the manager can influence this speed through compensation incentives. The trade-off between work speed and quality is captured through a function $p$ that specifies the probability of a successful service, defined as the customer being satisfied. To make that definition concrete, we assume that any unsuccessful service triggers a customer complaint, so that the manager observes each service failure. The assumption that $p$ is decreasing captures settings in which customer satisfaction is positively correlated with service time. For example, when the service corresponds to problem diagnosis, more time spent understanding the problem often translates to a higher probability that the root cause of the problem is discovered (and it is natural to equate customer satisfaction with correct problem diagnosis). We further assume that $\mu p(\mu)$ is strictly concave in $\mu$, so that, compared with working at a constant service rate, a server who dynamically varies his speed cannot achieve more successful services. Note that if $p(\mu)$ is concave, then $\mu p(\mu)$ is strictly concave, but not vice versa.

Assumption 3.2.1. (Properties of $p$). Each server is capable of completing services at any rate $\mu \in [\underline{\mu}, \bar{\mu}]$, where $0 < \underline{\mu} < \bar{\mu} < \infty$. The function $p : [\underline{\mu}, \bar{\mu}] \to [0,1]$ is twice continuously differentiable and monotone decreasing. Furthermore, $\mu p(\mu)$ is strictly concave on $[\underline{\mu}, \bar{\mu}]$.

We assume a simple linear compensation scheme in which each server is paid $P_S$ per completed service and penalized $P_F$ for an unsuccessful, or failed, service. Then, under the assumption that there are always service tasks waiting to be completed, the hourly payment to a server working at service rate $\mu$ is
$$P(\mu) := \mu\bigl(P_S - P_F(1 - p(\mu))\bigr). \qquad (3.1)$$
From Assumption 3.2.1, $P(\mu)$ is strictly concave on $[\underline{\mu}, \bar{\mu}]$, and so there is a unique service rate that maximizes the expected payment,
$$\mu^{\star} := \operatorname*{argmax}_{\mu \in [\underline{\mu}, \bar{\mu}]} P(\mu). \qquad (3.2)$$
The compensation function in (3.1) is the potential payment. The word potential is used to emphasize the fact that this payment is received only when there is always work. In reality, there may or may not be a request waiting every time a server completes a service; in other words, it is unrealistic to expect to keep servers busy completing service requests 100% of the time. Then, when there are $N$ servers and $\mu$ is the service rate vector representing all the servers, the expected hourly payment a server $i \in \{1, \ldots, N\}$ working at service rate $\mu_i$ receives is
$$U_i(\mu) := P(\mu_i)\, B_i(\mu), \qquad (3.3)$$
where $B(\mu)$ is the busy-time percentage vector and $U(\mu)$ is the expected payment vector. Later, in Section 3.5, we will see the effect of the servers also valuing non-monetary benefits, such as idle time.

The important difference between (3.1) and (3.3) is that a server who wishes to maximize (3.1) need only worry about his or her own behavior, and so will serve at rate $\mu^{\star}$ in (3.2), whereas a server who wishes to maximize (3.3) must worry about the behavior of others. This is because the busy-time percentage of a given server $i$, $B_i(\mu)$, depends on the number of servers staffed and their service rates. For example, server $i$'s busy-time percentage, and therefore also his or her expected payment, may decrease if server $j$ speeds through the services required, leaving few for server $i$. The system manager can encourage or discourage this competition by staffing more or fewer servers, and by controlling how the work is routed to the servers. The preceding discussion suggests that it is natural to model the servers as strategic players in a non-cooperative game, with each server having utility function (3.3).
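To make the objects in (3.1) and (3.2) concrete, the following minimal numerical sketch evaluates the potential payment $P(\mu)$ and locates its maximizer $\mu^{\star}$. The success function $p(\mu) = e^{-0.2\mu}$ mirrors the form used in the chapter's later numerical illustrations; the payment values and the rate interval below are illustrative assumptions, not quantities fixed by the model.

```python
# Minimal sketch of the potential hourly payment P(mu) in (3.1) and its
# maximizer mu* in (3.2).  The success function p, the payment parameters,
# and the rate interval are illustrative choices.
import numpy as np
from scipy.optimize import minimize_scalar

mu_lo, mu_hi = 0.5, 8.0          # assumed service-rate interval
P_S, P_F = 10.0, 9.5             # reward per completion, penalty per failure

def p(mu):
    """Probability a service performed at rate mu succeeds (decreasing in mu)."""
    return np.exp(-0.2 * mu)

def P(mu):
    """Potential hourly payment (3.1): earned only while always busy."""
    return mu * (P_S - P_F * (1.0 - p(mu)))

# Under Assumption 3.2.1, P is strictly concave, so a bounded scalar
# optimizer recovers the unique maximizer mu*.
res = minimize_scalar(lambda mu: -P(mu), bounds=(mu_lo, mu_hi), method="bounded")
mu_star = res.x
print(f"mu* = {mu_star:.3f},  P(mu*) = {P(mu_star):.3f}")
```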
The service rates the servers choose will be a Nash equilibrium of this game. In particular, an equilibrium is a service rate vector that satisfies U i () = max v2[;] U i ( 1 ; ; i1 ;v; i+1 ; ; N ); i2f1;:::;Ng: (3.4) We denote by E a vector of equilibrium service rates. In this paper we focus on finding a symmetric equilibrium service rate, in which all components of the equilibrium service rate vector are identical. This is natural since the servers have the same utility functions. A symmetric equilibrium service rate vector E depends only on the payment ratio P R := P F P S : This can be seen by observing that the hourly payment in (3.1) is equivalently written as P () =P S (1P R (1p())): Then, there are many values ofP F andP S that result in the same E . The system manager will want to choose the values that result in the least possible payment to the servers, subject to the servers receiving enough money that they do not quit. We begin in Section 3.3 by characterizing E as a function ofP R , and then, in Sections 3.4 and 3.5, assume that the servers have an alternative employment option and incorporate an individual rationality constraint when deriving the system manager’s optimal staffing and compensation decisions. To calculate a symmetric equilibrium rate E from (3.4), we must first determine the busy-time percent- age vectorB(). For that, we must specify how service requests arrive, and how they are routed to these servers. In a slight abuse of notation, when all servers work at the same rate, we let represent both the service rate vector and the rate at which each of the servers work. Then, since the busy-time percentage and the expected utility of every server is the same, we may think ofB() andU() as either one-dimensional functions or vectors according to the context. 60 3.3 The Symmetric Equilibrium Service Rate We adopt a classical model for service systems: the M=M=N +M, or Erlang-A, queue, except that the service rate is not specified exogenously. Specifically, the arrival rate to the system is> 0, the impatience parameter is > 0, and each server determines the service rate by maximizing his or her expected hourly payment (3.3). We assume the longest-idle-server-first (LISF) routing rule; that is, when a request arrives and more than one employee is available, the employee that has been idle the longest is the one designated to serve the request. This, as well as the longest-cumulative idleness variant, is a common routing rule in the literature (e.g., Atar (2008), Ward and Armony (2013), Atar et al. (2011), Reed and Shaki (2014)), because it seems fair and many organizations care about fairness (e.g., Kahneman et al. (1986), Cohen-Charash and Spector (2001), Colquitt et al. (2001)). Although the LISF routing can be difficult to analyze, its behavior is identical to random routing. (See Lemma 3.3.1.) For this system, the vectorB() in (3.3) is very complicated. Hence the problem (3.4) is analytically intractable. Fortunately, for large systems, we can solve (3.4) asymptotically. We begin in Section 3.3.1 by calculating the equilibrium service rate when N = 1. Then, in Sec- tion 3.3.2, we provide a simple approximation for the symmetric equilibrium service rate, and justify it asymptotically, when and N are large. The important insight is that when the staffing N is such that N ? , the servers can attain their maximum paymentP ( ? ) in large systems, but otherwise cannot. 
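The claim that the symmetric equilibrium depends on $(P_S, P_F)$ only through the ratio $P_R$ can be checked directly: $P_S$ enters the payment only as a multiplicative factor, so scaling both payments together changes no server's best response. The sketch below illustrates this with the always-busy maximizer $\mu^{\star}$ of (3.2); the same invariance applies to the full utility (3.3). The success function and payment values are illustrative assumptions.

```python
# Small check that the payment maximizer depends on (P_S, P_F) only through
# the ratio P_R = P_F / P_S.  Parameter values are illustrative.
import numpy as np
from scipy.optimize import minimize_scalar

mu_lo, mu_hi = 0.5, 8.0
p = lambda mu: np.exp(-0.2 * mu)

def mu_star(P_S, P_F):
    pay = lambda mu: mu * (P_S - P_F * (1.0 - p(mu)))
    return minimize_scalar(lambda mu: -pay(mu), bounds=(mu_lo, mu_hi),
                           method="bounded").x

for P_S, P_F in [(10.0, 9.5), (20.0, 19.0), (1.0, 0.95)]:   # same P_R = 0.95
    print(f"P_S={P_S:5.1f}, P_F={P_F:5.2f}  ->  mu* = {mu_star(P_S, P_F):.4f}")
```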
3.3.1 The M/M/1+M Queue In this simple setting with only one server, an equilibrium service rate defined by (3.4) maximizes the payment in (3.3), so that E = argmax 2[;] P ()B(): In particular, the server does not compete with other servers for customers, and needs only worry about his or her own service speed. The trade-off is that, as the server increases speed, there are more completed services, but also less successfully completed services. To calculate E , it is necessary to calculate the steady-state busy-time percentageB(). TheM=M=1+ M queue is a birth-and-death process with birth rate and death rate + [n 1] + when there are n 61 customers in the system, where [x] + = maxfx; 0g. This process is positive recurrent for any positive values of,, and, and so always possesses a steady-state distribution. Then, standard analysis shows that B() = Pr(there are customers in the system) = 1 1 1 + P 1 i=1 Q i1 k=0 +k : A unique interior maximum occurs if there exists exactly one that satisfies the first order condition (FOC) U 0 () =P ()B 0 () +P 0 ()B() = 0; (3.5) and is a maximum. If the FOC does not have a solution, the maximum will be achieved at the boundary points. Although the expression forB() is complicated, it is possible to characterize E as a function of the payment ratioP R . From (3.5) we have P R = B() +B 0 () (1p()) (B() +B 0 ())p 0 ()B() : Define the right-hand side as a functionr(). Proposition 3.3.1. (TheM=M=1 +M equilibrium service rate). a. IfP R >r(), then E =. b. Ifr()<P R <r(), then E is the unique solution to (3.5) and E 2 (;). c. If 0P R r(), then E =. Furthermore, E is non-increasing inP R , and continuously decreasing inP R when E 2 (;). It follows from Proposition 3.3.1 that the server can be influenced to choose any service rate in [;] by appropriately adjusting the ratioP R . 3.3.2 The M/M/N+M Queue WhenN > 1, the payment serveri receives is affected by the service rates chosen by all the other servers, through the termB i () in (3.3). As mentioned at the beginning of this section, this term is very complicated. However, it becomes more manageable when we focus on symmetric equilibrium service rates. In that case, 62 it is sufficient to focus on the utility of a tagged server working at rate 1 , when all the other servers work at rate. Without loss of generality, we let server 1 be this tagged server, and in a slight abuse of the notation in (3.3), write U( 1 ;) =P ( 1 )B( 1 ;); whereU andB, whose arguments are understood as server 1 working at 1 and all the other servers working at, are the steady-state utility and busy-time percentage of server 1 respectively. If E 2 (;), the first order necessary condition for E to be a symmetric equilibrium service rate is that it is a solution to @U( 1 ;) @ 1 1 = = 0: (3.6) In order to solve (3.6), we require the first derivative ofB( 1 ;). Unfortunately, this is complicated, as can be seen from the following lemma. Lemma 3.3.1. Under either LISF or random routing, we have B( 1 ;) = P N1 i=0 ( ) i 1 i! + ( ) N1 1 (N1)! P 1 i=1 Q i k=1 (N1)+ 1 +k P N1 i=0 ( ) i 1 i! + 1 P N1 i=0 ( ) i Ni i! + ( ) N1 1 (N1)! P 1 i=1 Q i k=1 (N1)+ 1 +k To have some intuition, P N1 i=0 ( ) i 1 i! corresponds to the states when the tagged agent is busy and the system has no queue; 1 P N1 i=0 ( ) iNi i! corresponds to the states when the tagged agent is idle and there- fore the system has no queue; ( ) N1 1 (N1)! P 1 k=1 Q k i=1 (N1)+ 1 +i corresponds to the states when the system has a queue. 
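In contrast to the $N$-server expression in Lemma 3.3.1, the single-server quantities of Section 3.3.1 are easy to evaluate numerically. The sketch below truncates the M/M/1+M birth-death chain to compute $B(\mu) = 1 - \pi_0$ and then maximizes $P(\mu)B(\mu)$ to obtain the equilibrium rate described in Proposition 3.3.1. The arrival rate, abandonment rate, payments, and rate interval are illustrative assumptions.

```python
# Sketch of the M/M/1+M equilibrium rate mu_E: the single server maximizes
# P(mu) * B(mu), where B(mu) = 1 - pi_0 for the birth-death chain with
# birth rate lam and death rate mu + (n-1)^+ * gamma in state n.
import numpy as np
from scipy.optimize import minimize_scalar

lam, gamma = 4.0, 0.5            # arrival and abandonment rates (assumed)
mu_lo, mu_hi = 0.5, 8.0
P_S, P_F = 10.0, 9.5

p = lambda mu: np.exp(-0.2 * mu)
P = lambda mu: mu * (P_S - P_F * (1.0 - p(mu)))

def busy_fraction(mu, n_max=500):
    """B(mu) = 1 - pi_0, truncating the infinite state space at n_max."""
    ratios = lam / (mu + gamma * np.arange(n_max))           # lam / (mu + k*gamma)
    probs_unnorm = np.concatenate(([1.0], np.cumprod(ratios)))  # pi_n / pi_0
    return 1.0 - 1.0 / probs_unnorm.sum()

res = minimize_scalar(lambda mu: -P(mu) * busy_fraction(mu),
                      bounds=(mu_lo, mu_hi), method="bounded")
print(f"mu_E = {res.x:.3f},  equilibrium utility = {-res.fun:.3f}")
```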
Instead of solving (3.6) directly, we focus on solving (3.6) approximately, when and N are large. This provides a proposed approximate symmetric equilibrium (Section 3.3.2). We then prove that for large enough systems, there exists a solution to (3.6) for any given ratioP R 0 that is a symmetric equilibrium, and that symmetric equilibrium is asymptotically identical to the aforementioned approximate equilibrium (Section 3.3.2). An Approximate Equilibrium We develop an approximate symmetric equilibrium by developing an approximation forB( 1 ;). When N, this is straightforward, because all agents are almost always busy, and soB( 1 ;) 1, for any 63 1 2 [;]. Hence ? in (3.2) is a natural candidate for an approximate equilibrium. On the other hand, when < N, this is more complicated, because every server experiences some idle time. Then even though server 1 cannot influence the overall system idle time distribution, server 1 can influence his or her own busy and idle time percentages. To flesh out the intuition for how to approximateB( 1 ;) when < N 1 , we appeal to regenerative process theory (and make this rigorous subsequently). First, for any serveri, letX n i ;n = 1; 2; , represent the server’s service time for thenth service; letY n i ;n = 1; 2; , represent the server’s idle time between the nth service and then + 1st service (which could be 0). We can think of the sequencef P t n=1 X n i +Y n i ;t = 1; 2;g as regenerative times. Then, it follows from regenerative process theory that B( 1 ;) = E[X 1 1 ] E[X 1 1 ] +E[Y 1 1 ] ; (3.7) whereE[X 1 1 ] = 1 1 . Any serveri2f2; ;Ng has busy-time percentage E[X 1 i ] E[X 1 i ] +E[Y 1 i ] N ; whereE[X 1 i ] = 1 . We can calculate E[Y 1 i ] N ;i2f2;:::;Ng: Although server 1 can control his or her own average service time, server 1 cannot control the idle time between serving requests due to the LISF routing rule, and so E[Y 1 1 ] N : (3.8) Finally, it follows from (3.7) and (3.8) that B( 1 ;) 1 1 N + 1 1 = + 1 N 1 : (3.9) The function ^ B( 1 ;) := + 1 h N 1 i + (3.10) 64 is 1 when N and is consistent with the approximation in (3.9) when < N. This leads to the approximate server 1 utility ^ U( 1 ;) :=P ( 1 ) ^ B( 1 ;) (3.11) We are interested in finding ^ E such that ^ U(^ E ; ^ E ) = max v2[;] ^ U(v; ^ E ): (3.12) SinceP ( 1 ) and ^ B( 1 ;) are both decreasing in 1 when 1 > ? , no server would choose a service rate larger than ? . When N ? , we have ^ B( 1 ;) = 1. Therefore, ? , which maximizesP (), is the unique symmetric equilibrium. When N < ? , similar to (3.6), the first order necessary condition is @ ^ U( 1 ;) @ 1 1 = = 0: That is, P 0 () 2 P () N = 0: (3.13) Proposition 3.3.2. (An Approximate Equilibrium). WhenP R 0, there exists a unique solution to (3.12) and it satisfies ^ E = ? if N ? ; and ^ E 2 max N ; ; ? ; otherwise; where ^ E solves (3.13) when ^ E > max N ; and is non-decreasing in N . Remark 3.3.1. If P R 1 1p() , then P () = P S (1P R (1p())) P S (1P R (1p())) 0 for all2 [;]. This suggests that the servers prefer quitting to continuing their employment. We will address this issue in Section 3.4, where we impose a voluntary participation constraint, that ensures the system manager pays the servers a given minimum expected hourly wage. We expect ^ E to be close to a symmetric equilibrium service rate E when andN are large. Figure 3.1 verifies this statement numerically. 
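A minimal sketch of Proposition 3.3.2 follows, using the ingredients of Figure 3.1(a) ($\lambda = 100$, $N = 20$, $p(\mu) = e^{-0.2\mu}$, $P_S = 10$) together with an assumed $P_F$ and rate interval. When $\lambda/N \ge \mu^{\star}$ the approximate equilibrium is $\mu^{\star}$; otherwise it is the root of the first-order condition (3.13), reconstructed here as $P'(\mu)\mu^2 - P(\mu)(\mu - \lambda/N) = 0$, bracketed on $[\max(\lambda/N, \underline{\mu}), \mu^{\star}]$.

```python
# Sketch of the approximate symmetric equilibrium of Proposition 3.3.2.
import numpy as np
from scipy.optimize import brentq, minimize_scalar

lam, N = 100.0, 20
mu_lo, mu_hi = 0.5, 8.0
P_S, P_F, k = 10.0, 9.5, 0.2

p  = lambda mu: np.exp(-k * mu)
P  = lambda mu: mu * (P_S - P_F * (1.0 - p(mu)))
dP = lambda mu: P_S - P_F * (1.0 - p(mu)) - mu * P_F * k * p(mu)   # P'(mu)

mu_star = minimize_scalar(lambda mu: -P(mu), bounds=(mu_lo, mu_hi),
                          method="bounded").x

def foc(mu):
    # First-order condition (3.13): P'(mu) mu^2 - P(mu) (mu - lam/N) = 0
    return dP(mu) * mu**2 - P(mu) * (mu - lam / N)

if lam / N >= mu_star:
    mu_hat = mu_star                     # overloaded: servers are always busy
else:
    # Underloaded: Proposition 3.3.2 places the root in this bracket.
    mu_hat = brentq(foc, max(lam / N, mu_lo), mu_star)

print(f"mu* = {mu_star:.3f},  approximate equilibrium mu_hat_E = {mu_hat:.3f}")
```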
65 Figure 3.1: Performance of the proposed equilibrium approximation 9 9.5 10 10.5 11 2.5 3 3.5 4 4.5 5 5.5 6 6.5 7 7.5 8 P F Service rate μ ⋆ >λ/N =5 μ ⋆ <λ/N =5 underload overload μ ⋆ μ E ˆ μ E (a) = 100;N = 20 9 9.5 10 10.5 11 2.5 3 3.5 4 4.5 5 5.5 6 6.5 7 7.5 8 P F Service rate μ ⋆ >λ/N =5 μ ⋆ <λ/N =5 underload overload μ ⋆ μ E ˆ μ E (b) = 1000;N = 200 p() = exp(0:2), = 0:1, andP S = 10 Asymptotic Analysis The next step is to show mathematically that there exists a symmetric equilibrium service rate that is close to ^ E when andN are large. To do this, we consider a sequence of systems with increasing arrival rate. Our convention, when we refer to any process or quantity associated with the system having arrival rate, is to superscript the appropriate symbols by. We assume the staffing level as a function of satisfies N =b +o(); forb 0: (3.14) The notation E represents a symmetric equilibrium service rate in the system with arrival rate andN servers. The impatience parameter is independent of and so is not superscripted. We redefine ^ B( 1 ;) by substituting N withb. The following proposition establishes that the function ^ B( 1 ;) is equivalent asymptotically toB ( 1 ;). Proposition 3.3.3. Under the staffing (3.14), lim !1 sup 1 2[;] B ( 1 ;) ^ B( 1 ;) = 0: Proposition 3.3.3 supports approximating server 1’s utility by (3.11), which implies that ^ E that solves (3.12) should be close to the equilibrium service rate. 66 Theorem 3.3.1. (The Symmetric Equilibrium in Large Systems). When is large enough, under the staffing (3.14), for any fixed compensation ratioP R 0, there exists a symmetric equilibrium service rate E 2 ; ? . Furthermore, E ! ^ E ; as!1; where ^ E = ^ E (b;P R ) is as defined in Proposition 3.3.2 except that 1 b replaces N . Remark 3.3.2. The intuition of the approximation from Section 3.3.2 does not require service times to be exponential. Ideally, for generally distributed service times with mean 1=, the same approximation should also work. However, without the Markovian assumption, we cannot rigorously show the existence of a symmetric equilibrium, or the convergence to the approximate equilibrium. Define the system loading function := N E . For the staffing (3.14), it follows from Proposi- tion 3.3.2 and Theorem 3.3.1 that the system is underloaded in the sense that ! 1 b^ E < 1 as!1; ifb> 1 ? ; and is overloaded in the sense that ! 1 b ? 1 as!1; ifb 1 ? : This observation is reflected in the limiting percentage of time the servers are busy. Corollary 3.3.2. Under the staffing (3.14), for any fixedP R , lim !1 B ( E ) = 8 > < > : 1 b^ E < 1 ifb> 1 ? 1 ifb 1 ? : The implication of Theorem 3.3.1 and its corollary is that the servers prefer to be always busy. This is because when the system is overloaded, servers attain their maximum possible expected hourly payment, defined through (3.1) and (3.2). More precisely, since lim !1 U ( E ) =P (^ E ) lim !1 B ( E ); 67 ifb 1 ? , so that the system is overloaded, then as!1, U ( E )! P ( ? ), the maximum possible expected hourly payment; otherwise, if b > 1 ? , the limit in the above display will be strictly less than P ( ? ). The system manager is happy to have industrious servers, that always want to work. However, the service rate that is most desirable to the servers may or may not be the service rate that is most desirable to the system manager. In Sections 3.4 and 3.5, we investigate how the system manager can determine the staffing and compensation that is to his or her best advantage. 
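Corollary 3.3.2 can also be checked numerically. The sketch below computes the exact steady-state busy fraction of an M/M/N+M queue in which all servers work at a fixed common rate (a simplification of the equilibrium setting) and compares it with the limiting value $\min\{\lambda/(N\mu), 1\}$; the parameter values are illustrative.

```python
# Numerical check of Corollary 3.3.2 for a symmetric-rate M/M/N+M queue:
# the exact busy fraction E[min(Q, N)]/N versus the limit min(lam/(N mu), 1).
import numpy as np

def busy_fraction_mmn_m(lam, mu, N, gamma, n_max=5000):
    """Steady-state fraction of time a server is busy (all servers at rate mu)."""
    n = np.arange(1, n_max + 1)
    death = np.minimum(n, N) * mu + np.maximum(n - N, 0) * gamma
    log_pi = np.concatenate(([0.0], np.cumsum(np.log(lam) - np.log(death))))
    pi = np.exp(log_pi - log_pi.max())
    pi /= pi.sum()
    states = np.arange(n_max + 1)
    return float(np.dot(np.minimum(states, N), pi)) / N

gamma, mu = 0.1, 5.0
for lam, N in [(100, 18), (100, 25), (1000, 180), (1000, 250)]:
    exact = busy_fraction_mmn_m(lam, mu, N, gamma)
    limit = min(lam / (N * mu), 1.0)
    print(f"lam={lam:5d}, N={N:4d}:  exact B = {exact:.3f},  limit = {limit:.3f}")
```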
3.4 Staffing and Compensation The system manager cannot directly control the equilibrium service rate. However, the system manager can control the equilibrium service rate through the staffing and compensation decisions. Those decisions are influenced by the cost structure faced by the system manager. We would like to understand the resulting staffing level and compensation when that cost structure includes all of staffing, failed service, and aban- donment costs, and the compensation is sufficient to prevent the servers seeking alternative employment. In this section, we study a linear cost structure (and this assumption is generalized in Section 3.5). We begin in Section 3.4.1 by formulating the system manager’s objective, in terms of an optimization problem. That problem is lower bounded by one in which the system manager can directly control the service rate, and we formulate that lower bound problem in Section 3.4.2. A first best solution is attained when the solution to that lower bound problem is achieved. We show in Section 3.4.3 that a linear staffing policy combined with a compensation ratio that transfers the cost of service failure, but not the abandonment cost, to the agents is a first best solution asymptotically, as!1. Finally, in Section 3.4.4, we propose the staffing and compensation parameters for a finite sized system. We evaluate the policy numerically and show that it is very nearly first best. 3.4.1 The System Manager Objective The staffing costs are increasing in N, and so the system manager would like to hire as few servers as possible. The problem is that as the staffing level decreases, the time a request must wait before being served increases, which leads to abandonments. Furthermore, the number of failed services increases if the 68 agents decide to increase their service rate in order to serve more requests. Neither of these outcomes is desirable, and we assume the system manager is penalized for them. In particular, we assume there is a cost to the system manager ofc A per customer abandonment andc F per failed service. The parametersc A andc F control the emphasis placed on serving all requests as opposed to the emphasis placed on serving all those requests successfully. This cost structure is consistent with the model in Ren and Zhou (2008). The cost faced by the system manager depends on the equilibrium service rate E = E (N;P R ). The staffing cost is NU( E ), provided that the servers accept the resulting expected hourly payment U( E ). Specifically, for risk neutral servers that have an alternative employment option with expected hourly pay- mentc S per hour, the individual rationality (IR) constraint U( E )c S (3.15) must hold. Next, the steady-state number of busy agents isNB( E ) and the steady-state abandonment rate is E NB( E ) 0, so that, the expected hourly cost due to customer abandonments and service failures is c A ( E NB( E )) +c F (1p( E )) E NB( E ): The objective of the system manager is to minimizeC(N;P S ;P F ) :=NU( E ) +c A ( E NB( E )) +c F (1p( E )) E NB( E ) subject to:U( E )c S ;N2f0; 1;g;P S > 0;P F 0: (3.16) From Theorem 3.3.1, for any fixedN, the equilibrium service rate E is determined by the ratioP R = P F P S . If a feasible vector (N;P S ;P F ) for (3.16) satisfiesU( E ) =P S (1P R (1p( E )))B( E )>c S , by decreasingP S and keeping the ratioP R = P F P S constant, we achieve a smaller objective value. Therefore, any solution to (3.16) must have the IR condition (3.15) binding. 
Thus, (3.16) can be rewritten as: minimizeC(N;P R ) :=c S N +c A ( E NB( E )) +c F (1p( E )) E NB( E ) subject to:N2f0; 1;g;P R 0: (3.17) 69 If (N;P R ) solves (3.17), then (N;P S ;P F ) where P S = c S (1P R (1p( E ))) E B( E ) andP F =P R P S (3.18) solves (3.16). The values in (3.18) are found by solving forP S inU( E ) = c S . Hence it is sufficient to focus on solving (3.17) in order to find a staffing level and compensation scheme that minimizes the system manager’s cost. 3.4.2 The Lower Bound (Centralized) Problem The objective (3.17) is lower bounded by the following problem in which the system manager directly controls the service rate: minimizeC LB (N;) :=c S N +c A (NB()) +c F (1p())NB() subject to:N2f0; 1;g;2 [;]: (3.19) This is because any solution (N;P R ) to (3.17) has associated equilibrium service rate E such that (N; E ) is feasible for (3.19). However, if (N;) solves (3.19), there may or may not be a feasible (N;P R ) for (3.17) that achieves the same cost. Hence minC(N;P R ) minC LB (N;): Definition 3.4.1. A solution (N;P R ) to (3.17) is first best if (N; E (N;P R )) is a solution to (3.19). The key difference between (3.16), (3.17) and (3.19) is that (3.16) and (3.17) depend on the server utility maximization problem (3.4) while the solution to (3.19) does not. To gain some intuition into the relationship between (3.4), (3.16), (3.17) and (3.19), we fixN = 1, and solve the fixed-N versions of these problems. By the fixed-N version, we mean that the minimization in the objective function is only overP S andP F in (3.16),P R in (3.17), and in (3.19). Note that the staffing costs are fixed so we can drop them in the objectives (assumingU( E ) =c S ). Also note that the definition of first best solution is modified to the P R 0 such that E (N;P R ) is a solution to fixed-N version of (3.19). 70 Example 3.4.1. (The M/M/1+M First Best Compensation). FixN = 1. The system manager would like the server to work at the rate that solves the fixed-N version of (3.19) min 2[;] c A (B()) +c F (1p())B(): (19’) To see the relationship between (19’) and the server’s maximization problem in (3.4) when N = 1, we convert the minimization in (19’) to a maximization, and find that it is equivalent to max 2[;] c A 1 c F c A (1p()) B()c A : Recalling from (3.4) that the equilibrium service rate E maximizes P S (1P R (1p()))B(); it follows that P R = c F c A implies the servers choose to work at rate E (1;P R ) (whose existence is guaranteed by Proposition 3.3.1) that equals the that solves (19’). In other words, the solutionP R = c F c A to the fixed-N version of (3.17) is a first best solution. Finally, choosingP S andP F such that P F P S =P R andU( E (1;P R )) =c S ensures the IR condition is satisfied, from which we find that P S = c A c S (c A c F (1p()))B() ; andP F = c F c S (c A c F (1p()))B() ; (3.20) where solves (19’). The (P S ;P F ) in (3.20) solves the fixed-N version of (3.16), and has the same minimum objective function value as the fixed-N version of (3.17). The intuition behind the first best solution for the M/M/1+M system given in Example 3.4.1 is as follows: noting each service completion avoids abandonment cost c A and each service failure incurs cost c F , the system manager transfers the proportional risk of service failure and abandonment to the server, and then sets the expected hourly payment as low as possible to satisfy the IR constraint. 
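The fixed-$N$ reasoning of Example 3.4.1 is easy to reproduce numerically. The sketch below solves (19') by direct search, sets the payment ratio to $c_F/c_A$, and then backs out $(P_S, P_F)$ from the binding IR constraint, as in (3.20). The cost parameters, the success function, and the rate interval are illustrative assumptions.

```python
# Sketch of Example 3.4.1: with N = 1, the manager's preferred rate solves
# (19'), the induced payment ratio is P_R = c_F/c_A, and (P_S, P_F) follow
# from making the IR constraint bind.
import numpy as np
from scipy.optimize import minimize_scalar

lam, gamma = 4.0, 0.5
mu_lo, mu_hi = 0.5, 8.0
c_S, c_A, c_F = 5.0, 10.0, 10.0      # outside wage, abandonment, failure costs

p = lambda mu: np.exp(-0.2 * mu)

def busy_fraction(mu, n_max=500):
    """B(mu) = 1 - pi_0 for the M/M/1+M birth-death chain."""
    ratios = lam / (mu + gamma * np.arange(n_max))
    probs_unnorm = np.concatenate(([1.0], np.cumprod(ratios)))
    return 1.0 - 1.0 / probs_unnorm.sum()

# (19'): minimize c_A*(lam - mu*B(mu)) + c_F*(1 - p(mu))*mu*B(mu) over mu.
cost = lambda mu: (c_A * (lam - mu * busy_fraction(mu))
                   + c_F * (1.0 - p(mu)) * mu * busy_fraction(mu))
mu_fb = minimize_scalar(cost, bounds=(mu_lo, mu_hi), method="bounded").x

P_R = c_F / c_A                                    # first best payment ratio
B_fb = busy_fraction(mu_fb)
P_S = c_S / ((1.0 - P_R * (1.0 - p(mu_fb))) * mu_fb * B_fb)   # IR binds, as in (3.20)
P_F = P_R * P_S
print(f"first best rate = {mu_fb:.3f},  P_S = {P_S:.3f},  P_F = {P_F:.3f}")
```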
In a revenue-generating 71 service system, where each successful service generates $R, the parametersc A andc F both equalR, so that (3.20) proposes a fixed commission for each successful service. This is a commonly adopted compensation scheme. For example, this arises in the Pay-Per-Call-Resolved contract in Ren and Zhou (2008). The natural approach to finding a first best solution to (3.17) is to mimic the strategy in Example 3.4.1; that is, to first determine (N;) that solves (3.19), then findP R that causes the equilibrium service rate to equal that, and finally to use the IR condition to uniquely determineP S andP F . This would be simpler if the compensation ratio c F c A in Example 3.4.1 could always be used to translate between a solution (N;) to (3.19) and an equilibrium service rate E N; c F c A that solves (3.17). Unfortunately, this is not the case, because when servers must compete for jobs, there is an incentive for them to speed up. The intuition is best captured by the well-known proverb: “The early bird gets the worm.” Proposition 3.4.1. (Server Speed-up). Suppose (N;) solves (3.19),N > 1 and2 (;). Then, under the compensation ratio c F c A , E >: Remark 3.4.1. If the routing rule is to assign the incoming customer randomly to each server no matter whether the server is busy or idle, then the system is decoupled, or “unpooled”, toN M/M/1+M systems and the optimal compensation ratio is still c F c A . However, from Proposition 3.4.1, when we use that compensation ratio for the system under the LISF routing rule, the servers are working faster than the optimum due to the competition between them. The implication is that when we have multiple servers, the first best solution to (3.17) should have a largerP R than c F c A to counteract the speed-up behavior from the competition. This supports the empiric observation in Song et al. (2014) that decoupling can be beneficial. As seen in Section 3.3, it does not appear possible to obtain an analytic expression for the equilibrium service rate. Therefore, there is no hope to obtain an analytic solution to (3.17). Furthermore, the problem (3.19), which is simpler than (3.17), cannot be solved exactly, due to the complicated expression forB(). Therefore, we lower our aspirations and search for a policy that is first best for large systems, as the arrival rate becomes large. 72 3.4.3 A Limiting First Best Solution As in Section 3.3.2, we consider a sequence of systems with increasing arrival rate. We begin by solving the limiting lower bound problem (3.19), as becomes large. Then, we focus on finding a compensa- tion ratio P R and a sequence of staffing levelsfN : 0g such that the objective function in (3.17), C(N ;P R ), is close to the solution to the lower bound problem (3.19). In order to solve the limiting lower bound problem (3.19), it is necessary to determine the first order growth term ofN . Lemma 3.4.1. For any fixed> 0, IfN =w(), then lim inf !1 C LB (N ;) =1. IfN =o(), then lim inf !1 C LB (N ;) =c A . Intuitively, Lemma 3.4.1 is due to the fact that if the staffing is too small (N =o()), then the abandonment costs dominate. On the other hand, the staffing costs are too high if the staffing is too high (N =w()). The question that arises from Lemma 3.4.1 is: when staffing grows linearly in the arrival rate, can the cost grow at a rate less thanc A ? To answer this question, we analyze the family of linear staffing policies N =b +o(); forb> 0; defined as in (3.14). Proposition 3.4.2. (Linear Staffing Limiting Cost). 
Under a staffing policy that satisfies (3.14), for any fixed > 0, lim !1 C LB (N ;) =c S b +c A max(1b; 0) +c F (1p()) min (b; 1): (3.21) In contrast to the difficulty in solving (3.19), it is straightforward to find theb and that minimize the limiting lower bound cost. Lemma 3.4.2. The solution ( ^ b; ^ ) to minimize (3.21) satisfies: ^ b = 1 ^ and ^ := argmin 2[;] c S +c F (1p()); 73 if ^ c LB <c A ; (3.22) where ^ c LB := c S ^ +c F (1p(^ )). In fact, ^ c LB is not only the lower bound for the limiting cost, but also a lower bound for the cost per customer in the prelimit system. When the service rate is , serving a customer takes 1= time units on average, which translates toc S = in salary. The expected service failure cost isc F (1p()). This suggests that the cost of serving one customer is at leastc S = +c F (1p()), which is minimized at ^ . Theorem 3.4.1. (Lower Bound Cost). The solution to (3.19) and therefore also the solution to (3.16) and (3.17), are lower bounded by ^ c LB . The next step is to find a compensation ratioP R and a sequence of staffing levelsfN : 0g under whichC(N ;P R ) is close to the solution to the limiting lower bound cost ^ c LB in Theorem 3.4.1. For that, it is natural to start by assuming the same staffing level; that is to assumeN satisfies (3.14) withb = ^ b. Then, the hope is that there exists a compensation ratioP R such that the solution to the server maximization problem in (3.4) equals ^ . The following lemma shows that this will be the case if the servers are always busy as becomes large. Lemma 3.4.3. IfP R = c F ^ c LB , then ? (P R ) = ^ for ? defined as in (3.2). Furthermore, if ^ 2 (;), then c F ^ c LB is the only value forP R under which ? (P R ) holds; otherwise, there may be many such values. The servers have no incentive to deviate from the rate ? when there are enough jobs, and settingb = 1 ? ensures that in the limit this is the case. Hence we arrive at a staffing level and compensation ratio that has associated cost that is very close to ^ c LB . Theorem 3.4.2. (Limiting First Best Solution). In the decentralized control problem, set b = 1 ? (P R ) = 1 ^ ; andP R = c F ^ c LB : Under the staffing (3.14), lim !1 C(N ;P R ) = ^ c LB : 74 3.4.4 The Proposed Staffing and Compensation For a system with arrival rate, Theorem 3.4.2 motivates the proposed staffing 1 N = ^ ; (3.23) and having the compensation ratio P R = c F ^ c LB . Then, to find P S and P F , we let the IR condition binds, which, as in (3.18), leads to P S = c S (1P R (1p( E ))) E B ( E ) andP F =P R P S : However, computing E is difficult. Fortunately, recalling that ^ = ? (P R ). Theorem 3.3.1 and its corollary establish that E is well-approximated by ^ . Then, after substituting for E in the above expression forP S , and also using the definition for ^ c LB in Lemma 3.4.2, we find that c S (1P R (1p(^ )))^ B (^ ) = c S 1 ^ c LB (^ c LB c F (1p(^ )))^ B (^ ) = ^ c LB B (^ ) : This motivates the proposed compensation P S = ^ c LB B (^ ) andP F = c F B (^ ) : (3.24) Lemma 3.4.4. Under the proposed staffing and compensation in (3.23) and (3.24), E ^ , which implies the IR condition (3.15) is satisfied for each. When!1, P S ! ^ c LB ; P F !c F : The proposed staffing in (3.23) is such that the servers are almost always busy (B (^ ) 1), and there are few abandonments. Hence the servers approximately achieve their maximum paymentP ( ? ), and the system manager does not need to transfer the cost of abandonment to the servers. 
Instead, the system manager fully transfers the cost of service failure and then sets the payment for successful service high 1 Any staffingN = ^ +o() will be a limiting first best solution. Without a finer analysis, it is not possible to determine the optimal second order term. 75 enough so that the servers do not seek the alternative employment option having expected paymentc S per hour (that is, the IR condition holds). The limiting equilibrium service rate ? depends only onc S andc F (and the functionp) sinceP R = c F ^ c LB and ^ c LB = ^ c LB (c S ;c F ). Whenc S is high, the manager wants the agents to work faster to lower the staffing cost; whenc F is high, the manager wants the agents to work slower to lower the service failure cost. The following proposition makes this intuition concrete. Proposition 3.4.3. ? is non-decreasing inc S and non-increasing inc F . If ? 2 (; ), then the mono- tonicity is strict. Example 3.4.2. Supposep() = 1k,2 [0; 1=k], wherek> 0. Then, under the compensation (3.24), from (3.2), we find ? = 8 > > < > > : q c S kc F ; if c S c F 1 k 1 k ; if c S c F > 1 k : In words, ask becomes larger, so that the risk of service failure becomes higher, the servers slow down. Example 3.4.3. Supposep() = exp(k),2 [0; 1=k], wherek> 0. Notep() is concave on [0; 1=k]. Although we cannot solve for ? exactly, we can solve for it numerically. As in Example 3.4.2, it is the ratio c S c F that matters. Figure 3.2 confirms the intuition in Example 3.4.2; that is, ask becomes larger, so that the risk of service failure becomes higher, the servers slow down. Figure 3.2: The limiting equilibrium service rate ? 0 0.5 1 1.5 2 0 1 2 3 4 5 6 7 c S /c F μ ⋆ (c F /ˆ c LB ) k=0.1 k=0.2 p() = exp(k). Of course, Proposition 3.4.3 and Examples 3.4.2 and 3.4.3 provide properties of the limiting equilib- rium service rate. It is important to understand how the proposed staffing and compensation in (3.23) and 76 (3.24) perform in a finite sized system. Figure 3.3 compares the predicted performance of the cost ^ c LB , the equilibrium service rate ? , the staffing N = ^ +o(), and the compensation ratio P R = c F ^ c LB , to the centralized solution found numerically by solving the lower bound problem (3.19). For further comparison purposes, we also plot the performance when the compensation scheme is given by (3.20), the exact solu- tion in the M/M/1+M setting, under the staffing policyN = l ? m . We see that our proposed staffing and compensation in (3.23) and (3.24) is very nearly a first best solution, while the M/M/1+M compensation is not. This confirms the intuition from Proposition 3.4.1 that competition encourages servers to speed up. Note thatc S < 1:84 in Figure 3.3 to make sure the cost to serve a customer is lower than the abandonment Figure 3.3: Performance comparison of three polices 4 6 8 10 12 14 16 500 550 600 650 700 750 800 850 900 950 1000 c S Cost Proposed staffing & compensation M/M/1+M compensation Opt solution to centralized system 4 6 8 10 12 14 16 0 50 100 150 200 250 300 350 400 450 500 550 c S Staff Number Proposed staffing & compensation M/M/1+M compensation Opt solution to centralized system = 1000, = 0:1,p() = exp(0:2), andc A =c F = 10. cost (that is, the condition (3.22) is satisfied). We see that the proposed contract performs very closely to the optimal centralized control contract. Whenc S is larger, the proposed staffing level is decreasing, making the competition between the servers less intense. 
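The proposed policy of Section 3.4.4 is straightforward to compute. The sketch below evaluates $\hat{\mu}$ and $\hat{c}_{LB}$, the staffing (3.23), the ratio $P_R = c_F/\hat{c}_{LB}$, and the limiting compensation values from (3.24), using the ingredients of Figure 3.3 ($\lambda = 1000$, $p(\mu) = e^{-0.2\mu}$, $c_A = c_F = 10$) together with an assumed value of $c_S$ and an assumed rate interval.

```python
# Sketch of the proposed staffing and compensation of Section 3.4.4.
import math
import numpy as np
from scipy.optimize import minimize_scalar

lam = 1000.0
mu_lo, mu_hi = 0.5, 8.0
c_S, c_A, c_F = 8.0, 10.0, 10.0      # c_S is an assumed value in Figure 3.3's range

p = lambda mu: np.exp(-0.2 * mu)
per_customer_cost = lambda mu: c_S / mu + c_F * (1.0 - p(mu))

res = minimize_scalar(per_customer_cost, bounds=(mu_lo, mu_hi), method="bounded")
mu_hat, c_LB = res.x, res.fun
assert c_LB < c_A, "condition (3.22): serving a customer must beat losing one"

N = math.ceil(lam / mu_hat)          # proposed staffing (3.23)
P_R = c_F / c_LB                     # compensation ratio of Theorem 3.4.2
P_S, P_F = c_LB, c_F                 # limiting values of (3.24) as lam -> infinity
print(f"mu_hat = {mu_hat:.3f}, c_LB = {c_LB:.3f}, N = {N}, "
      f"P_R = {P_R:.3f}, P_S -> {P_S:.3f}, P_F -> {P_F:.3f}")
```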
Therefore, the proposed compensation is more and more close to the M/M/1+M optimal compensation. Remark 3.4.2. Our linear compensation scheme can be generalized to a scheme that each server gets a fixed salarykC S , where 0k< 1; for each service completion, the server is rewarded (1k)P S , and for each service failure, the server is penalized by (1k)P F , whereP S andP F are defined in (3.24). Then,P R is not changed and the IR condition is still satisfied. The generalized compensation scheme can incentivize the servers to works at exactly the same service rate and achieves close to the first best. 77 3.5 Economically Optimal Limit Regimes Our analysis in Section 3.4 shows that when the system manager faces a linear cost structure, then, in order to minimize the total costs, the manager will choose a staffing level and compensation structure under which the system is critically loaded. In other words, as the system arrival rate becomes large, no customers will abandon, and the servers will continuously work at the payment-maximizing rate ? , having no desire for idleness. There are several ways to criticize this model. For one, servers are not machines, and, therefore, they generally do attach value to having some idle time. On the system manager side, there is no clear reason that service failure and abandonment costs should be linear. In this section, we relate various assumptions on the server utility function and the system manager cost structure to the salient properties that can emerge from our large-system analysis, which is summarized in Table 3.1. Table 3.1: The optimal regimes under different utility functions and cost structures Abandonment Cost Utility Function Optimal Regime Limiting Customer Property Limiting Server Property Linear No Idleness value Critically loaded No abandonments No idleness Linear Idleness value Quality-driven No abandonments Some idleness Generalized No idleness value Efficiency-driven Some abandonments No idleness Generalized Idleness value Mixed Some abandonments Some idleness First, in Section 3.5.1, we enhance the server utility in (3.3) to incorporate a value for idle time. Then, we provide conditions under which a limiting first best solution places the system in a quality-driven regime, in which the servers have idleness. Next, in Section 3.5.2, we return to the original server utility function in (3.3), and vary our assumptions on the system manager’s cost structure. We specify the conditions under which an efficiency-driven regime, in which servers have no idleness and there are customer abandonments, emerges from the limiting first best solution. Section 3.5.3 combines the model variants in Sections 3.5.1 78 and 3.5.2 to find when a limiting first best solution simultaneously allows customers to abandon and pro- vides servers with idleness. Finally, in Section 3.5.4, we numerically compare the performance of different policies resulting in different regimes in an example system that we analyze in the large-system limit. Throughout this section, we use the same notation as in Sections 3.3 and 3.4, except that we redefine the server utility function and/or the failed service and abandonment costs. In contrast with the situation when servers only value their payment (that is, when the utility function is defined by (3.3)), it is no longer true that the equilibrium service rate vector E depends only onP R . Instead, for a givenP R , there may be multiple E ’s. 
Therefore, throughout this section, when we state our results that provide an approximation for E and a limiting first best solution (that is, the results analogous to Theorems 3.3.1 and 3.4.2), we state them in terms ofP S andP F instead ofP R . It is no longer the case that we can decouple the specification of a limiting first best solution in terms ofP R , and its translation using the IR condition to a limiting first best solution in terms ofP S andP F . 3.5.1 Providing Servers with Idle Time Suppose the servers value idle time, and the utility they derive from being idle is a decreasing function U I (x) of the busy-time percentagex. Our objective in this section is to understand how much the servers must value their idle time in order that the system manager will want to ensure they have some. In other words, when will the system operate in a quality-driven regime and when will it operate in a critically loaded regime as in Section 3.4? We assume the marginal utility of extra idle time is decreasing in total idle time, and that the utility of no idle time is zero. Then,U I is a concave decreasing function havingU I (1) = 0. Now, we assume that the servers derive their utility from both their payment and their idleness, so that U i () :=P ( i )B i () +U I (B i ());i2f1; ;Ng: (3.25) In comparison with (3.3), because of the added term that involvesU I , a symmetric equilibrium service rate vector depends on bothP S andP F and not solely onP R . Although it is not possible to provide a simple closed-form expression for the symmetric equilibrium service rate, as in Section 3.3, we can provide a simple approximate expression for systems with large arrival 79 rates, when the staffing levelN is asymptotic tob for someb > 0, as in (3.14). With a slight abuse of notation, we redefine the approximate utility of server 1 in (3.11) in Section 3.3.2 so that ^ U( 1 ;) :=P ( 1 ) ^ B( 1 ;) +U I ( ^ B( 1 ;)); (3.26) where ^ B( 1 ;) = + 1 [b1] + is as defined in (3.10). Similar to Section 3.3.2, we are interested in finding ^ E to satisfy ^ U(^ E ; ^ E ) = max v2[;] ^ U(v; ^ E ); (3.27) which is exactly the same with (3.12) except that ^ U is newly defined. Similar to (3.13), algebra on the first order condition gives P 0 () 2 1 b P () +U 0 I 1 b = 0: (3.28) Solving (3.27) is much more complicated than (3.13) and we cannot guarantee a unique solution, as seen in the following example. Example 3.5.1. Figure 3.4 shows the symmetric equilibrium (equilibria) as the staffing level varies. In the first subfigure, there is a unique solution on [0; ? ]; in the second subfigure, when b2 (0:965; 1), there are three symmetric equilibria. In contrast to Proposition 3.3.2, when servers value idle time, the second subfigure shows increasingb can increase ^ E (whenb> 1 ? = 1). Even though increasing the staffing level gives the servers more idle time, it can still incentivize the servers to work faster to create even more idle time. This can be seen fromB( 1 ;) = +(b1) 1 because @B( 1 ;) @ 1 can increase inb. The following theorem shows that ^ E should well approximate the equilibrium service rate when is large. It is similar to Theorem 3.3.1, where the utility function (3.3) is replaced by (3.25). Theorem 3.5.1. (The Symmetric Equilibrium When Servers Value Idle Time). When is large enough, under the staffing (3.14), for any fixedP S > 0;P F 0, there exists a symmetric equilibrium service rate E . Ifb> 1 ? (P F =P S ) , then any solution ^ E to (3.27) has ^ E > 1 b : Ifb 1 ? (P F =P S ) , then ^ E = ? 
is a solution to (3.27). Furthermore, if the solution ^ E is unique, then E ! ^ E > 1 b : 80 Figure 3.4: The approximate symmetric equilibrium (equilibria) ^ E vs the staffing levelN =b +o() m ` E 0.5 1.0 1.5 2.0 1.5 1.6 1.7 1.8 1.9 2.0 b (a)PS = 20,PF = 10 m ` E 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 0.98 1.00 1.02 1.04 1.06 1.08 1.10 1.12 b (b)PS = 10,PF = 10 p() = 1 0:5,2 [0; 2],U I (x) = 10 p 1x. Theorem 3.5.1 shows that the system manager can guarantee the system operates in a quality-driven regime by staffing enough servers (b> 1 ? ). However, when there are few servers (b 1 ? ), then the system manager may not be able to control whether the system operates in a critically loaded or an efficiency- driven regime, because the limiting equilibrium service rate ^ E may not be unique. Fortunately, there are conditions under which the linear cost structure in Section 3.5.1 guarantees that the quality-driven regime is optimal, as is shown next. System Manager’s Objective and Lower Bound Problem. After characterizing the servers’ behavior, we model the manager’s decision by using the same cost structure as our model in Section 3.4.1, wherec S ,c A andc F are defined. Then, the system manager’s objective is: minimizeNP ( E )B( E ) +c A ( E NB( E )) +c F (1p( E )) E NB( E ) subject to:P ( E )B( E ) +U I (B( E ))c S ;N2f0; 1;g;P S > 0;P F 0: We assume U I (0) < c S so that the current or alternative employment gives the server more utility than unemployment. Arguments similar to those in Section 3.4.1 show that the IR condition should be binding. 81 Then the hourly salary the manager has to pay to each server to prevent him or her from seeking alternative employment isc S U I (B( E )). Analogous to (3.16) and (3.17), the objective of the manager is minimizeC(N;P S ;P F ) := (c S U I (B( E )))N +c A ( E NB( E )) +c F (1p( E )) E NB( E ) subject to:P ( E )B( E ) +U I (B( E )) =c S ;N2f0; 1;g;P S > 0;P F 0: (3.29) Similar to Section 3.4.2, the minimum cost to (3.29) is lower bounded by the following problem in which the system manager directly controls the service rate: minimizeC LB (N;) := (c S U I (B()))N +c A (NB()) +c F (1p())NB() subject to:N2f0; 1;g;2 [;]: (3.30) As in Proposition 3.4.2, under the staffing policy (3.14), we have: lim !1 C LB (N ;) = c S U I min 1 b ; 1 b +c A max(1b; 0) +c F (1p()) min (b; 1); (3.31) The next step is to optimize overb and in the right-hand side of (3.31) to find the limiting lower bound cost under linear staffing. Lemma 3.5.1. Define ~ B = 1 if lim x"1 U 0 I (x)c S and ~ B2 (0; 1) as the solution to U I (x)U 0 I (x)x =c S otherwise. Let ^ c I := min 2[;] c S U I ( ~ B) ~ B +c F (1p()); ^ I := argmin 2[;] c S U I ( ~ B) ~ B +c F (1p()): If ^ c I <c A , then (b;) = 1 ~ B^ I ; ^ I minimizes the right-hand side of (3.31). Furthermore, lim !1 B 1 ~ B^ I ; ^ I = ~ B: The following example shows different idleness value functionsU I lead to the servers having zero or nonzero idleness. 82 Example 3.5.2. SupposeU I (x) = k(1x) q , wherek < c S , 0 < q 1. Ifq = 1, ~ B = 1; otherwise, ~ B2 (0; 1). In fact, ^ c I is not only the lower bound for the limiting cost, but also a lower bound for the cost per customer in the prelimit system. The expected cost of serving one customer includes the salary cost and (c S U I (B()))= and the service failure costc F (1p()). The minimum of the sum is ^ c I , indicating the system cost is at least ^ c I . Theorem 3.5.2. (Lower Bound Cost When Servers Value Idle Time). 
The solution to (3.30) and therefore also the solution to (3.29), are lower bounded by ^ c I . As in Section 3.4.3, the next step is to find the staffing and compensation parametersP S andP F that lead to a cost very close to the limiting lower bound cost ^ c I in Lemma 3.5.1. Theorem 3.5.3. (Limiting First Best Solution When Servers Value Idle Time). In the decentralized control problem, let b = ^ b I := 1 ~ B^ I ; P S = ^ c I ; P F =c F : Under the staffing (3.14) and the assumption that the solution to (3.27) is unique, then lim !1 C(N ;P S ;P F ) = ^ c I ; where ~ B, ^ I and ^ c I are defined in Lemma 3.5.1. It follows from Theorem 3.5.3 that if ~ B2 (0; 1), then the limiting first best solution leads to a quality- driven regime, in which servers have idle time, and no customers abandon. If we do not know the uniqueness of the solution to (3.27), under the compensation proposed by Theorem 3.5.3, it can be seen from the proof of Theorem 3.5.3 that working at ^ I is the only symmetric equilibrium that satisfies the IR constraint. Hence we expect the servers will prefer this equilibrium to any other. However, we cannot guarantee the servers will not become stuck in some other equilibrium. 3.5.2 Letting Customers Abandon In this section we generalize the cost structure of the manager to include generalized service failure and abandonment costs. Also, we return to the original server utility function (3.3) that does not value idleness. 83 We show that sometimes the system manager will not staff enough to prevent customer abandonments. In particular, we derive conditions under which the system operates in an efficiency-driven regime. We denote byg A (x) the abandonment cost per abandoned customer, wherex is the percentage of cus- tomers that abandon, and we denote by g F (y) the cost per failed service, where y is the percentage of services classified as failures. We have shown in Section 3.4 that when the functionsg A andg F are con- stants (g A (x) = c A andg F (y) = c F for allx;y 0), the limiting first best solution produces a critically loaded regime, in which there are negligible abandonments. However, as we will see, the functiong A will determine whether the limiting first best solution has a positive abandonment rate. The compensation is the same as in Section 3.2, so the servers’ behavior is characterized by Theorem 3.3.1. Similar argument to that in Section 3.4.1 suggests the IR condition should be binding, so that the hourly salary the manager has to pay to each server isc S . Since there is noU I , we can focus onP R as in (3.17). The objective of the system manager is minimizeC(N;P R ) := c S N + ( E NB( E ))g A 1 E NB( E ) +(1p( E ))g F (1p( E )) E NB( E ) subject to:N2f0; 1;g;P R 0: (3.32) Similar to Section 3.4.2, the minimum cost to (3.32) is lower bounded by the following problem in which the system manager directly controls the service rate: minimizeC LB (N;) := c S N + (NB())g A 1 NB() +(1p())g F (1p())NB() subject to:N2f0; 1;g;2 [;]: (3.33) 84 Under the staffing policy (3.14), as in Proposition 3.4.2, we have lim !1 C LB (N ;) =c S b + max(1b; 0)g A (max(1b; 0)) + (1p())g F (1p()) min (b; 1): (3.34) The minimization overb and in the right-hand side of (3.34) produces a solution that has exactly the same form as that in Lemma 3.4.2, and is exactly the same wheng F is the constantc F . Lemma 3.5.2. Assumeg F is increasing convex, then min 2[;] c S + (1p())g F (1p()) has a unique solution, denoted by ^ A . Denote by ^ c A the minimal value. 
If ^ c A <g A (1); (3.35) then the (b;) that minimizes the right-hand side of (3.34) is b = ^ b A := 1a ^ A ; and = ^ A ; where a := argmin x2[0;1] (1x)^ c A +xg A (x): Assumeg A (x) is increasing inx. Ifg A (0)< ^ c A ,a> 0; otherwise,a = 0. The parametera in Lemma 3.5.2 is interpreted as the limiting percentage of customers that abandon. Lemma 3.5.3. ForP A that represents the probabilities of customer abandonment, P A !a; as!1: The condition (3.35) is parallel to (3.22) in Section 3.4, with ^ c A playing the role of ^ c LB andg A (1) that of c A . It ensures the system manager prefers to serve customers rather than to let all of them abandon. The assumption thatg A (g F ) is increasing is motivated by the idea that with larger abandonment (service failure) percentage, the abandonment (service failure) cost per customer, such as reputation or goodwill cost, increases. The last condition in Lemma 3.5.3 confirms the intuition that when the cost of abandonment is low compared to the cost of serving a customer (g A (0) < ^ c A ), the system manager will want to let some 85 customers abandon, and will not want to otherwise. We can also see from Lemma 3.5.2 that onlyg A affects whether the optimal regime will be efficiency-driven whileg F does not play a role. In fact, (1a)^ c A +ag A (a) is not only a lower bound for the limiting cost, but also a lower bound for the cost per customer in the prelimit system. The expected cost of serving one customer when the service rate is still has two parts: salaryc S = and service failure cost (1p())g F (1p()). The minimum sum is ^ c A . If the manager could exactly control the abandonment percentage, the minimum cost of handling a customer is (1a)^ c A +ag A (a). Theorem 3.5.4. (Lower Bound Cost with Generalized Abandonment Cost). The solution to (3.33), and therefore also the solution to (3.32), are lower bounded by((1a)^ c A +ag A (a)). In the decentralized control system, the manager wants to findP R to incentivize the servers to work at service rate ^ A , and the servers are almost always busy, which is the same as Section 3.4. It is also true that linear staffing achieves the lowest possible cost. The difference is that not all customers may be served. Theorem 3.5.5. (Limiting First Best Solution with Generalized Abandonment Cost). In the decentralized control problem, set b = ^ b A ;P F =g F (1p(^ A )) + (1p(^ A ))g 0 F (1p(^ A ));P S = min 2[;] c S +P F (1p()): so thatP R =P R;A := P F P S . Under the staffing (3.14), lim !1 C(N ;P R;A ) = (1a)^ c A +ag A (a): Note that wheng A andg F are constantsc A andc F respectively, we haveP R;A = c F ^ c LB , exactly the same as theP R in Theorem 3.4.2. It follows from Theorem 3.5.5 that the limiting first best solution leads to an efficiency-driven regime, in which the servers have no idle time and some customers abandon. 3.5.3 Simultaneous Customer Waiting and Server Idling Now we allow both the servers to value their idle time and to have generalized abandonment and service failure costs. Intuition suggests that the limiting first best solution may mix the two optimal regimes in the 86 above two subsections. Then, the routing rule may not be non-idling. In particular, the system can operate in a mixed regime, in which there are simultaneously customer abandonments and idle servers. We begin by considering the centralized system in which the manager directly controls the service rate . We assume a fixed routing rule, which may or may not be non-idling. 
The minimum achievable system cost solves minimizeC LB (N;) := (c S U I (B()))N + (NB())g A 1 NB() +(1p())g F (1p())NB() subject to:N2f0; 1;g;2 [;]; (3.36) where B() is the busy time percentage of the servers given the fixed routing rule, and g F is increasing convex, as in Lemma 3.5.2. As in the previous subsections, we consider a sequence of systems with increasing under the staffing policy (3.14). SinceB () is within a compact set [0; 1], it has a subsequence i on whichB i () converges to B2 [0; 1], and so lim i !1 C LB (N i ;) i = c S U I B b + 1b B g A 1b B + (1p())g F (1p())b B: (3.37) The ( B;b;) that minimizes the right-hand side of (3.37) are described in the following lemma. Lemma 3.5.4. Define ^ c M := min 2[;] c S U I ( ~ B) ~ B + (1p())g F (1p()); ^ M := argmin 2[;] c S U I ( ~ B) ~ B + (1p())g F (1p()); 87 where ~ B is defined in Lemma 3.5.1. If ^ c M <g A (1); (3.38) then ~ B; ^ b M := 1a M ~ B^ M ; ^ M minimizes the right-hand side of (3.37), where a M := argmin x2[0;1] (1x)^ c M +xg A (x): Assumeg A (x) is increasing inx. Ifg A (0)< ^ c M ,a> 0; otherwise,a = 0. It follows from Lemma 3.5.4 that when ~ B2 (0; 1) anda> 0, the system manager will allow for servers to idle even while customers wait and abandon. In fact, (1a M )^ c M +a M g A (a M ) is not only the lower bound for the limiting cost, but also a lower bound for the cost per customer in the prelimit system. Theorem 3.5.6. (Lower Bound Cost with Idleness Value and Generalized Abandonment Cost). The solution to (3.36) is lower bounded by((1a M )^ c M +a M g A (a M )). In contrast to the previous subsections, the next step is to design a routing rule that can allow for servers to idle and customers to abandon simultaneously. To do this, the arrival process should be thinned to(1 a M ). If turning away customers leads to the same cost structure as abandonment, then we can directly use an admission control policy that denies each customer entry with independent probabilitya M . Another option is to let all the customers wait for a fixed timeT such that the percentagea M of them abandon the queue. To find T , recalling that the abandonment parameter is , we must solve exp(T ) = 1a M , so that T = log(1a M ) . Then, after waiting for timeT , the customers who do not abandon the system are routed according to LISF, as in the previous sections. 88 The manager’s objective in the decentralized system is minimizeC(N;P S ;P F ) := (c S U I (B( E )))N + ( E NB( E ))g A 1 E NB( E ) +(1p( E ))g F (1p( E )) E NB( E ) subject to:P ( E )B( E ) +U I (B( E )) =c S ;N2f0; 1;g;P S > 0;P F 0: (3.39) Note we have to considerP S andP F instead ofP R due to the idleness value. Analogous to the previous subsections, we expect that staffingN = ^ b M +o() and finding theP R that incentivizes the servers to work at rate ^ M will solve the control problem for the decentralized system (3.39). Theorem 3.5.7. (Limiting First Best Solution with Idleness Value and Generalized Abandonment Cost). In the decentralized system with the staffing (3.14), setb = ^ b M ,T = log(1a M ) , P F =g F (1p(^ M )) + (1p(^ M ))g 0 F (1p(^ M )); and P S = min 2[;] c S U I ( ~ B) ~ B +P F (1p()): Under the assumption that the solution to (3.27) is unique, where ^ B( 1 ;) is redefined as + 1 [(b)=(1a M )1] + , we have lim !1 C(N ;P S ;P F ) = (1a M )^ c M +a M g A (a M ): It follows from Theorem 3.5.7 that the limiting first best solution can lead to a mixed regime, in which some customers abandon and the servers have some idle time when ~ B2 (0; 1). 
Note that wheng A andg F are constantsc A andc F respectively, the algebra shows that the expression forP S andP F are exactly those in Theorem 3.5.3. 89 3.5.4 The Proposed Staffing and Compensation The proposed staffing is N =dbe; whereb is as defined in Theorems 3.5.3, 3.5.5 or 3.5.7, depending on the assumed server utility function and system manager cost structure. The proposed compensation is to use theP S andP F defined in Theorems 3.5.3, 3.5.5, or 3.5.7, as appropriate. Although this only ensures the IR condition is satisfied in the limit, the system manager can still satisfy the IR condition in the prelimit by providing a vanishing small guaranteed hourly wage, in addition to the performance based compensation dictated byP S andP F . This is in contrast to Section 3.4, where Lemma 3.4.4 guarantees the IR condition is satisfied in the prelimit for the proposed compensation. The reason such a guarantee is more difficult is that we can no longer increaseP S andP F by the same proportion (as done in (3.24) when we divide byB( E )) without changing the equilibrium service rate. Still, the equivalent of Lemma 3.4.4 can be proved when the server utility function does not include a value for idleness, because the E is defined solely byP R;A in Theorem 3.5.5 and the same procedure as in Section 3.5.4 can be used to propose-dependent compensation parametersP S andP F . The server utility function and the system manager’s cost structure may not be known with certainty. Hence it is of interest to investigate the consequences of misspecifying these functions. To do this, we assume the correct server utility function is U i () =P ( i )B i () +U I (B i ()); whereU I (x) = 10 p 1x, and the correct system cost structure is C(N;) =c S N + (NB())g A 1 NB() +c F (1p())NB(); whereg A (x) = 20x. (The reason we assume the service failure cost is constantc F is that that does not affect the resulting economically optimal limit regime.) This limiting first best policy is as given in Section 3.5.3, and will result in a mixed regime in which there is simultaneous customer abandonments and idle servers. The question is: what happens if the system manager ignores the fact that servers value idleness and/or assume the unit abandonment cost is constant? Then the system manager will implement one of the follow- ing three policies, each resulting in a different operating regime. 90 If both the value of idleness and the generalized abandonment cost are ignored, then the policy imple- mented will be the one in Section 3.4.4, that operates the system in a critically loaded regime. If the value of idleness is recognized but the generalized abandonment cost is ignored, then the policy implemented will be the one in Section 3.5.1, that operates the system in a quality-driven regime. If the value of idleness is ignored but the generalized abandonment cost is recognized, then the policy implemented will be the one in Section 3.5.2, that operates the system in an efficiency-driven regime. Figure 3.5 shows the results of comparing the performance of the aforementioned four possible policies. We see that if the hourly salary cost c S is low, then ignoring the generalized abandonment cost is not problematic, because the cost under the policy in Section 3.5.1 is close to the limiting first best policy in Section 3.5.3. 
On the other hand, when $c_S$ is high, correctly estimating the value the servers place on idleness is not important, because the cost under the policy in Section 3.5.2 is close to the limiting first best policy in Section 3.5.3.

Figure 3.5: Performance comparison of the four policies. [Two panels plot the staffing level $b$ and the cost per customer against the hourly salary cost $c_S$, for the limiting first best policy (Section 3.5.3), the policy that ignores the generalized abandonment cost (Section 3.5.1), the policy that ignores the idleness value (Section 3.5.2), and the policy that ignores both (Section 3.4.4).] Parameters: $p(\mu) = \exp(-0.2\mu)$, $U_I(x) = 10\sqrt{1-x}$, $g_A(x) = 20x$, $c_F = 10$.

We end this section with a proposition verifying theoretically the numerical comparisons shown in Figure 3.5.

Proposition 3.5.1. Assume $g_F(x) = c_F$. Consider $\hat{c}_{LB}, b$ in Theorem 3.4.2, $\hat{c}_I, \hat{b}_I$ in Theorem 3.5.3, $a, \hat{c}_A, \hat{b}_A$ in Theorem 3.5.5, and $a_M, \hat{c}_M, \hat{b}_M$ in Theorem 3.5.7. We have
$$\hat{b}_A \le \min\{\hat{b}_M, b\} \le \max\{\hat{b}_M, b\} \le \hat{b}_I,$$
$$(1-a_M)\hat{c}_M + c_A a_M \le \min\{\hat{c}_I, (1-a)\hat{c}_A + c_A a\} \le \max\{\hat{c}_I, (1-a)\hat{c}_A + c_A a\} \le \hat{c}_{LB}.$$

3.6 Conclusions

The rate at which a server works is influenced by the compensation structure. That observation has important consequences for staffing decisions. This is because the rate at which the servers work is a first-order determinant of the staffing level required to handle work of a given arrival rate. This paper solves a joint staffing and compensation problem. The system manager must decide both how many servers to staff, and what to pay them. There are costs associated with staffing. The quality of service requirements are modeled through costs associated with customer abandonments and failed services (that is, services that trigger customer complaints). The system manager cannot control the service rates, and must use the compensation and staffing to influence the service rates. The problem is tricky because it is not possible to obtain an analytic characterization of the equilibrium service rate for a given compensation scheme. Fortunately, we are able to develop a simple approximation for the equilibrium service rate when the arrival rate to the system is large. This allows us to solve the system manager's problem, and to arrive at a staffing and compensation policy that is first best in the limit. We show that the limiting first best solution may be such that the system operates in the critically loaded regime, the quality-driven regime, the efficiency-driven regime, or a mixed regime in which servers idle while customers wait. The limiting first best solution is the most analytically tractable when the system operates in either the critically loaded or the efficiency-driven regime, because then the staffing and compensation problems decouple.

One interesting direction for future research is to incorporate time-varying arrival rates and/or arrival rate uncertainty. There is a large literature devoted to staffing such systems. However, all of the papers we are aware of assume a fixed service rate. The server payments could be used to leverage the staffing decisions, by inducing the servers to speed up or slow down, in order to achieve better system performance. In our model, we find that a compensation scheme which rewards successful services and penalizes failed services is enough to achieve the asymptotic first best.
In particular, when we assume the servers are risk neutral, even though the manager may have more information on service quality (e.g., a customer survey score scaling from 1 to 10), it is enough to have only two classifications of service quality. However, a more natural assumption is that the servers are risk averse, and then a more sophisticated compensation policy contingent on more quality information can decrease the payment variance and may be preferred. We would like to understand how such a change to the assumed server utility function in (3.3) affects the system manager’s staffing and compensation decisions. Finally, we have fixed the routing rule to be longest-idle-server-first, and shown that the server busy time under that policy is equivalent to random routing (which chooses randomly amongst the available servers when assigning an incoming customer to a server). It would be ideal to jointly optimize over the staffing, the compensation, and the routing. Potentially, the system manager can also use the routing to also influence the server speed by, for example, routing more customers to servers that are faster or have higher quality. 93 References Akerlof, G. 1970. The market for “lemons”: Quality uncertainty and the market mechanism. Quarterly Journal of Economics 84(3) 488–500. Aksin, Z., M. Armony, V . Mehrotra. 2007. The modern call-center: A multi-disciplinary perspective on operations management research. Production and Operations Management 16(6) 655–688. Alili, L., P. Patie, J. L. Pedersen. 2010. Respresentations of the first hitting time density of an Ornstein- Uhlenbeck process. Preprint. Alizamir, S., F. de Vericourt, P. Sun. 2013. Diagnostic accuracy under congestion. Management Sci. 59(1) 157–171. Allon, G., A. Bassamboo, E. Cil. 2014. Skill and capacity management in large-scale service marketplaces. Working Paper. Allon, G., I. Gurvich. 2010. Pricing and dimensioning competing large-scale service providers. Manufacur- ing Service Oper. Management 12(3) 449–469. Anand, K., M. Pac, S. Veeraraghavan. 2011. Quality-speed conundrum: Tradeoffs in customer-intensive services. Management Sci. 57(1) 40–56. Anantharam, V ., T. Konstantopoulos. 2011. Integral representaion of Skorokhod reflection. Proceedings of the American Mathematical Society 139(6) 2227–2237. Arlotto, A., S. E. Chick, N. Gans. 2014. Optimal hiring and retention policies for heterogeneous workers who learn. Armony, M. 2005. Dynamic routing in large-scale service systems with heterogeneous servers. Queueing Systems 51(3-4) 287–329. Armony, M., I. Gurvich. 2010. When promotions meet operations: Cross-selling and its effect on call center performance. Manufacturing Service Oper. Management 12(3) 470–488. 94 Armony, M., C. Maglaras. 2004. Contact centers with a call-back option and real-time delay information. Oper. Res. 52(4) 527–545. Armony, M., A. Ward. 2010. Fair dynamic routing in large-scale heterogeneous-server systems. Oper. Res. 58(3) 624–637. Atar, R. 2005. Scheduling control for queueing systems with many servers. The Annals of Applied Proba- bility 15(4) 2606–2650. Atar, R. 2008. Central limit theorem for a many-server queue with random service rates. The Annals of Applied Probability 18(4) 1548–1568. Atar, R., A. Mandelbaum, M.I. Reiman. 2004. Scheduling a multi class queue with many exponential servers: Asymptotic optimality in heavy traffic. The Annals of Applied Probability 14(3) 1084–1134. Atar, R., Y . Shaki, A. Shwartz. 2011. A blind policy for equalizing cumulative idleness. 
Queueing System 67(4) 275–293. Borst, S., A. Mandelbaum, M.I. Reiman. 2004. Dimensioning large call centers. Oper. Res. 52(1) 17–34. Brown, L., N. Gans, A. Mandelbaum, A. Sakov, H. Shen, S. Zeltyn, L. Zhao. 2005. Statistical analysis of a telephone call center: A queueing-science perspective. Journal of the American Statistical Association 100(469) 36–50. Buell, R.W., T. Kim, C.J. Tsay. 2014. Creating reciprocal value through operational transparency. Working Paper. Burdzy, K., W. Kang, K. Ramanan. 2009. The Skorokhod problem in a time-dependent interval. Stochastic processes and their applications 119 428–452. Bureau of Economic Analysis. 2014. GDP and the economy advance esti- mates for the second quarter of 2014. Retrieved September 27, 2014, http://www.bea.gov/scb/pdf/2014/09%20September/0914 gdp and the economy.pdf. Cachon, G.P., P.T Harker. 2002. Competition and outsourcing with scale economies. Management Sci. 48(10) 1314–1333. Cachon, G.P., F. Zhang. 2007. Obtaining fast service in a queueing system via performance-based allocation of demand. Management Sci. 53(3) 408–420. Chaleyat-Maurel, M., N. El Karoui, B. Marchal. 1980. Reflexion discontinue et systemes stochastiques. Ann. Probab. 8 1049–1067. 95 Chan, C.W., G. Yom-Tov, G. Escobar. 2014. When to use speedup: An examination of service systems with returns. Oper. Res. 62(2) 462–482. Cohen-Charash, Y ., P.E. Spector. 2001. The role of justice in organizations: A meta-analysis. Organ. Behav. Human Decision Processes 86(2) 278–321. Colquitt, J.A., D.E. Conlon, M.J. Wesson, C.O.L.H. Porter, K.Y . Ng. 2001. Justice at the millennium: A meta-analytic review of 25 years of organizational justice research. J. of Appl. Psych. 86(3) 425–445. Cox, J. T., U. Rosler. 1983. A duality relation for entrance and exit laws for Markov processes. Stochastic Processes and their Applications 16 141–156. de Vericourt, F., Y . Zhou. 2005. Managing response time in a call-routing problem with service failure. Oper. Res. 53(6) 968–981. Debo, L.G, L.B. Toktay, L.N. Van Wassenhove. 2008. Queuing for expert services. Management Sci. 54(8) 1497–1512. Dupuis, P., H. Ishii. 1991. On Lipschitz continuity of the solution mapping to the Skorokhod problem, with applications. Stochastics 35 31–62. Fralix, B.H. 2012. On the time-dependent moments of Markovian queues with reneging. Queueing Systems 1–20. Gans, N., G. Koole, A. Mandelbaum. 2003. Telephone call centers: Tutorial, review, and research prospects. Manufacuring Service Oper. Management 5(2) 79–141. Garnett, O., A. Mandelbaum, M. Reiman. 2002. Designing a call center with impatient customers. Manu- facuring Service Oper. Management 54(3) 208–227. Geng, X., W.T. Huh, M. Nagarajan. 2013. Strategic and fair routing policies in a decentralized service system. Working Paper. Gilbert, S. M., Z. K. Weng. 1998. Incentive effects favor nonconsolidating queues in a service system: The principal-agent perspective. Management Sci. 44(12) 1662–1669. Gopalakrishnan, R., S. Doroudi, A.R. Ward, A. Wierman. 2014. Routing and staffing when servers are strategic. Working Paper. Gurvich, I., W. Whitt. 2009a. Queue-and-idleness-ratio controls in many-server service systems. Math. Oper. Res. 34(2) 363–396. 96 Gurvich, I., W. Whitt. 2009b. Scheduling flexible servers with convex delay costs in many-server service systems. Manufacuring Service Oper. Management 11(2) 237–253. Halfin, S., W. Whitt. 1981. Heavy-traffic limits for queues with many exponential servers. Oper. Res. 29(3) 567–588. Harrison, J. M., M. Reiman. 1981. 
Reflected Brownian motion on an orthant. Ann. Probab. 9 302–308. Harrison, J.M., A. Zeevi. 2004. Dynamic scheduleing of a multiclass queue in the Halfin and Whitt heavy traffic regime. Oper. Res. 52(2) 243–257. Hart, M., B. Fichtner, E. Fjalsted, S. Langley. 2006. Contact centre performance: In pursuit of first call resolution. Management Dynamics 15(4) 17–28. Hassin, R., M. Haviv. 2003. To Queue or Not to Queue: Equilibrium Behavior in Queueing Systems. Kluwer Academic Publishers, Boston, MA. Holmstrom, B. 1979. Moral hazard and observability. The Bell Journal of Economics 10(1) 74–91. Holmstrom, B. 1982. Moral hazard in teams. The Bell Journal of Economics 13(2) 324–340. Hopp, W., S. Iravani, G. Yuen. 2007. Operations systems with discretionary task completion. Management Sci. 53(1) 61–77. Kahneman, D., J.L. Knetsch, R. Thaler. 1986. Fairness and the assumptions of economics. J. Bus. 59(4) 286–300. Kalai, E., M. I. Kamien, M. Rubinovitch. 1992. Optimal service speeds in a competitive environment. Management Sci. 38(8) 1154–1163. Kostami, V ., S. Rajagopalan. 2014. Speed-quality trade-offs in a dynamic model. Manufacuring Service Oper. Management 16(1) 104–118. Kruk, L., J. Lehoczky, K. Ramanan, S. Shreve. 2007. An explicit formula for the Skorokhod map on [0;a]. Ann. Probab. 35 1740–1768. Kruk, L., J. Lehoczky, K. Ramanan, S. Shreve. 2008. Double Skorokhod map and reneging real-time queues. Markov Processes and Related Topics: A Festschrift for Thomas G. Kurtz, vol. 4. IMS Collections, 169–193. Linetsky, V . 2004. Computing hitting time densities for ou and cir processes: Applications to mean-reverting models. Journal of Computational Finance 7 1–22. 97 Lovejoy, W.S., K. Sethuraman. 2000. Congestion and complexity costs in a plant with fixed resources that strives to make schedule. Manufacuring Service Oper. Management 2(3) 221–239. Lu, L., J. Van Mieghem, C. Savaskan. 2009. Incentives for quality through endogenous routing. Manufacur- ing Service Oper. Management 11(2) 254–273. Maglaras, C., A. Zeevi. 2003. Pricing and capacity sizing for systems with shared resources: Approximate solutions and scaling relations. Management Sci. 49(8) 1018–1038. Maglaras, C., A. Zeevi. 2005. Pricing and design of differentiated services: Approximate analysis and structural insights. Oper. Res. 53(2) 242–262. Mandelbaum, A., P. Momcilovic, Y . Tseytlin. 2012. On fair routing from emergency departments to hospital wards: QED queues with heterogeneous servers. To appear in Management Sci. Mandelbaum, A., S. Zeltyn. 2005. The Palm/Erlang-A queue, with applications to call centers. Working Paper. Mehrotra, V ., K. Ross, G. Ryder, Y . Zhou. 2012. Routing to manage resolution and waiting time in call centers with heterogeneous servers. Manufacuring Service Oper. Management 14(1) 66–81. Milkovich, G.T., J.M. Newman. 2004. Compensation (8th ed.). McGraw-Hill/Irwin, Boca Raton, FL. Mutikani, L. 2014. U.S. private job growth slows, but services sector bullish. Retrieved September 27, 2014, http://www.reuters.com/article/2014/09/04/us-usa-economy-jobless-idUSKBN0GZ1F820140904. Perry, O., W. Whitt. 2009. Responding to unexpected overloads in large-scale service systems. Management Sci. 55(8) 1353–1367. Perry, O., W. Whitt. 2011. A fluid approximation for service systems responding to unexpected overloads. Oper. Res. 59(5) 1159–1170. Pitman, J., M. Yor. 1981. Bessel processes and infinitely divisible laws. Lecture notes in Math., vol. 851. Springer, 285–370. Ramanan, K. 2006. 
Reflected diffusions via the extended Skorokhod map. Electronic Journal of Probability 36 934–992. Reed, J., Y . Shaki. 2014. A fair policy for the G/GI/N queue with multiple server pools. To appear in Math. Oper. Res. Ren, Z.J., Y .P. Zhou. 2008. Call center outsourcing: coordinating staffing level and service quality. Man- agement Sci. 54(2) 369–383. 98 Rothschild, M., J. Stiglitz. 1976. Equilibrium in competitive insurance markets: An essay on the economics of imperfect information. Quarterly Journal of Economics 90(4) 629–649. Shunko, M., J. Niederhoff, Y . Rosokha. 2014. Humans are not machines: Impact of queueing design on service time. Working Paper. Sigman, K., R. Ryan. 2000. Continuous-time monotone stochastic recursions and duality. Advances in Applied Probability 32(2) 426–445. Skorokhod, A. V . 1961. Stochastic equations for diffusions in a bounded region. Theor. Probab. Its Appl. 6 264–274. Song, H., A. Tucker, K. Murrell. 2014. The diseconomies of queue pooling: An empirical investigation of emergency department length of stay. Working Paper. Spence, M. 1973. Job market signaling. Quarterly Journal of Economics 87(3) 355–374. Srikant, R., W. Whitt. 1996. Simulation run lengths to estimate blocking probabilities. ACM Trans. Modeling Comput. Simulation 6 7–52. Tanaka, H. 1979. Stochastic differential equations with reflecting boundary conditions in convex regions. Hiroshima Math J. 9 163–177. Tezcan, T., J. Dai. 2010. Dynamic control of N-systems with many servers: asymptotic optimality of a static priority policy in heavy traffic. Oper. Res. 58(1) 94–110. Ward, A. R., P. W. Glynn. 2003a. A diffusion approximation for a Markovian queue with reneging. Queueing Systems 43 103–128. Ward, A. R., P. W. Glynn. 2003b. Properties of the reflected Ornstein-Uhlenbeck process. Queueing Systems 44 109–123. Ward, A. R., P. W. Glynn. 2005. A diffusion approximation for a GI=GI=1 queue with balking or reneging. Queueing Systems 50(4) 371–400. Ward, A.R., M. Armony. 2013. Blind fair routing in large-scale service systems with heterogeneous cus- tomers and servers. Oper. Res. 61(1) 228–243. Weerasinghe, A., A. Mandelbaum. 2013. Abandonment vs blocking in many-server queues: Asymptotic optimality in the QED regime. Queueing System 75(2-4) 279–337. Whitt, W. 1999. Improving service by informing customers about anticipated delays. Management Science 45(2) 192–207. 99 Whitt, W. 2004. Efficiency-driven heavy-traffic approximations for many-server queues with abandonments. Management Sci. 50(10) 1449–1461. Zhan, D., A.R. Ward. 2014. Routing to minimize waiting and callbacks in large call centers. Manufacuring Service Oper. Management 16(2) 220–237. Zhang, Tu-Sheng. 1994. On the strong solutions of one-dimensional stochastic differential equations with reflecting boundary. Stochastic Processes and their Applications 50(1) 135–147. 100 Appendix A Technical Appendix to Chapter 2 A.1 Further Details on the Simulation Study We simulate the RPT and QIR controls using C++, and in accordance with the common random numbers variance reduction technique. For each control, we have 10 fixed random seeds, and for each seed, we simulate 200,000 units of time (hours). The first 40,000 time units are the warm-up periods and we cal- culate the time-average queue-length and callback rate over the remaining 160,000 time units. Then, we report the average over the 10 runs. With the average queue-length and callback rate, we can calculate the average waiting time and call resolution. 
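As a concrete illustration of that last post-processing step, the following minimal C++ sketch converts the time-average queue length and callback rate of one run into an average waiting time and a call resolution estimate. The numerical values are arbitrary, and the two formulas (Little's law applied with the effective arrival rate, exogenous arrivals plus callbacks, and resolution measured as the fraction of handled calls that do not generate a callback) are assumptions made for illustration rather than the exact definitions used in Chapter 2.

#include <cstdio>

int main() {
    // Illustrative outputs of one simulation run, after discarding the warm-up period.
    double lambda = 200.0;   // exogenous arrival rate (calls per hour), assumed
    double Qbar   = 4.0;     // time-average queue length, assumed
    double R      = 15.0;    // time-average callback rate (callbacks per hour), assumed

    // Little's law applied to the queue with effective arrival rate lambda + R
    // (first attempts plus callbacks) gives the average waiting time.
    double W_hours = Qbar / (lambda + R);

    // One plausible resolution measure: the fraction of handled calls that do not
    // generate a callback.
    double resolution = 1.0 - R / (lambda + R);

    std::printf("average waiting time = %.2f minutes, call resolution = %.3f\n",
                60.0 * W_hours, resolution);
    return 0;
}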
To obtain common random numbers, for each fixed random seed we generate a priori a sequence of customer arrival times and a sequence of service times and indicator random variables for whether the call is resolved. Then, every time a call is routed to an agent, we use that agent's sequence of service times and indicators to determine how long that caller spends with that agent and whether the call is resolved. (Note that due to the callbacks, we cannot generate one sequence of customer arrival and service times, as would be possible for an inverted-V model without callbacks.)

The parameters for our simulation study are chosen in accordance with the empirical data in Mehrotra et al. (2012), which is summarized in Table A.1. Note that our simulation parameters are consistent in the sense that, within any one customer type, the agent service speeds and resolution probabilities used in our simulations are approximately within the range of service speeds and resolution probabilities shown in Table A.1 below. Finally, we provide additional simulation results to supplement Chapter 2.

Table A.1: Summary of Tables 1 and 2 in Appendix B in Mehrotra et al. (2012)

customer type           1      2      3      4
min. service speed      2.76   2.65   3.14   7.03
max. service speed      13.82  12.80  14.78  27.60
min. resolution prob.   0.62   0.24   0.50   0.73
max. resolution prob.   1.00   0.96   0.92   0.97

Simulations for a 2-pool System with Different System Loads: We also perform a simulation study that investigates the impact of system load on performance. We keep the system parameters of Figure 2.8(a) in Chapter 2, except that we change the system load. Figure A.1 shows the results of the comparison of the RPT control and the QIR control for 3 different system loads: 0.85, 0.90, and 0.95. We observe that when the system load is low (ρ = 0.85), the waiting times are smaller. On the other hand, when the system load is high (ρ = 0.95), the change in call resolution is smaller because the system is crowded, so there is not as much opportunity to choose between idle agents across pools when routing.

Figure A.1: The trade-off between waiting time and call resolution in a 2-pool system with different ρ. [Two panels plot average waiting time (min) against call resolution: the left panel compares ρ = 0.90 with ρ = 0.85, and the right panel compares ρ = 0.90 with ρ = 0.95.] Parameters: $\tilde{\mu} = (3, 6)$, $\tilde{p} = (0.99, 0.90)$, and $\tilde{N} = (25, 25)$.

Simulation of RPT and QIR with Gamma Service Times: To complement the simulation study that assumes lognormally distributed service times (Figure 2.8 in Chapter 2), we also perform a simulation study that assumes Gamma distributed service times. We simulate two systems with different coefficients of variation, but keep the mean service rates fixed at 3 for pool 1 and 6 for pool 2. In Figure A.2(a) we let the service times of pool 1 agents follow Gamma(1/4, 4/3), where 1/4 is the shape parameter and 4/3 is the scale parameter. The service times of pool 2 agents follow Gamma(1/4, 2/3). In Figure A.2(b) we change the service time distributions of pools 1 and 2 to Gamma(4, 1/12) and Gamma(4, 1/24), respectively. From the plots of the RPT and QIR controls, we again see that the RPT controls are on the efficient frontier, and that the variation in the average waiting time increases when the coefficient of variation increases.
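These Gamma parameters follow the usual shape-scale convention: with shape $1/\mathrm{CV}^2$ and scale $\mathrm{CV}^2/\mu$, the mean service time is $1/\mu$ and the coefficient of variation is as targeted. The short C++ sketch below checks this parameterization by sampling; the seed and sample size are arbitrary choices, not values used in our study.

#include <cmath>
#include <cstdio>
#include <random>

int main() {
    double mu = 3.0, cv = 2.0;           // pool-1 mean service rate and target coefficient of variation
    double shape = 1.0 / (cv * cv);      // = 1/4
    double scale = (cv * cv) / mu;       // = 4/3, so mean = shape * scale = 1/mu
    std::mt19937 gen(12345);             // fixed seed, in the spirit of the common-random-numbers setup
    std::gamma_distribution<double> service(shape, scale);

    double sum = 0.0, sumsq = 0.0;
    const int n = 1000000;
    for (int i = 0; i < n; ++i) {
        double s = service(gen);         // one Gamma(1/4, 4/3) service time
        sum += s;
        sumsq += s * s;
    }
    double mean = sum / n;
    double sd = std::sqrt(sumsq / n - mean * mean);
    std::printf("sample mean %.4f (target %.4f), sample CV %.2f (target %.1f)\n",
                mean, 1.0 / mu, sd / mean, cv);
    return 0;
}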
Figure A.2: Simulated comparison between RPT and QIR in a 2-pool system with Gamma service times 0.926 0.927 0.928 0.929 0.93 0.931 0.932 0.933 0.934 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 Call resolution Average waiting time (min) RPT QIR (a) Gamma service times with CV=2 0.926 0.927 0.928 0.929 0.93 0.931 0.932 0.933 0.934 0.65 0.7 0.75 0.8 0.85 0.9 Call resolution Average waiting time (min) RPT QIR (b) Gamma service times with CV=0.5 ~ p = (0:99; 0:90); ~ N = (25; 25); = 0:9. 103 A.2 Proofs Proof of Proposition 2.2.1: Under any non-idling control, the state-space for the continuous time Markov chain (CTMC)M := fQ(t);I 1 (t); ;I J (t);t 0g can be described as S = 8 > < > : q2f1; 2;g ifq> 0 (i 1 ; ;i J )2f0; 1; ;N 1 gf0; 1; ;N J g ifq = 0 : The transitions for the CTMCM occur when there is: 1) an arrival; 2)a service completion, with no callback; 3) a service completion, followed by a callback. The control decisions are made when case 1 or 3 occurs, and there are idle servers in more than one pool. We begin by ”splitting” the CTMCM into two ”sub” CTMC’s: 1.Q :=fQ(t);t 0g having state spacef0; 1;g and 2.I :=f(I 1 (t); ;I J (t));t 0g having state spacef0; 1; ;N 1 gf0; 1; ;N J g. We next observe thatQ evolves as a one-dimensional birth-and-death process with constant birth rate and constant death rate P j2J p j j N j , that are not affected by the control. When (2.2) holds,Q has a unique stationary distribution that we denote by Q . Next, we assume that the (possibly state-dependent) transition rates of the sub-CTMCI between any two states (i 1 ; ;i J ) when P j2J i j > 0 are exactly the same as for the CTMCM (and so, in contrast toQ, these rates are affected by the control). The state (0; ; 0) is a reflecting boundary state. Then,I is a finite-state, irreducible, aperiodic, J-dimensional birth-and- death process under any non-idling stationary Markovian control. Hence there exists a unique stationary distribution forI, that we denote by I . Define the candidate stationary distribution forM by (s) := 8 > < > : 1 Q (s) ifs =q> 0 2 I (s) ifs = (i 1 ; ;i J ) fori j 2f0; 1; ;N j g ; 104 where Q ; I 2 (0; 1) satisfy both the flow balance equation fromM Q Q (1) X j2J p j j N j = I I ((0; ; 0)) (A.1) and the normalization of the probabilities X s2S (s) = Q + I = 1: (A.2) Together, (A.1) and (A.2) uniquely define Q and I , which uniquely defines (s). Furthermore, is a probability distribution and satisfies all the balance equations for the CTMCM (since Q and I satisfy the balance equations in their respective parts of the state space). We conclude that is the unique stationary distribution forM. In the case that (2.2) is not satisfied,Q is not positive recurrent. Since the transition rates ofQ are the same regardless of the control, it follows that there does not exist a stationary distribution under any routing control. Proof of Theorem 2.3.1: It follows very similarly to the proof of Lemma 2 in Armony and Ward (2010) thatC(v ? ) = d. It follows very similarly to the proof of Lemma 4 in Armony and Ward (2010) thatC(v)C(v ? ), for any v2V. Hence we omit the details. Proof of Lemma 2.3.1: It is straightforward to verify the first two inequalities using the explicit expression for C in (2.16). We show the algebra for the last three inequalities. Recall that ^ h(x) = (x)=(x), and note that 0 (x) = (x); 0 (x) =x(x). 
It is helpful to first observe that forx(p 2 ) == p p 2 2 , @ [x(p 2 ) ^ h(x(p 2 ))] 1 @p 2 = @ h x ^ h(x) i 1 @x @x @p 2 = 1 2p 2 (x) 2 (x) 2 + (x) x(x) + x(x) (x) : 105 Furthermore, (x) 2 (x) 2 + x(x) (x) < 1; because ^ h(x) increasing inx implies 1 ^ h(x) 2 = (x) 2 (x) 2 is decreasing inx, so that (x) 2 (x) 2 < (0) 2 (0) 2 < 0:64 for all x> 0; the first order condition @(x(x)=(x)) @x = 0 and use of Mathematica for numeric calculation show that the function x(x) (x) has the unique maximum on [0;1) located atx = 0:8399, so that x(x) (x) 0:8399(0:8399) (0:8399) < 0:3. Then, @C @p 2 = 2 1 p 2 1 1 +p 2 2 2 2p 1 p 2 2 p 2 2 (p 2 2 p 1 1 ) 2 1 + (x) x(x) + (p 1 p 2 ) 2p 2 2 (p 2 2 p 1 1 ) (x) 2 (x) 2 + (x) x(x) + x(x) (x) < 2 1 p 2 1 1 +p 2 2 2 2p 1 p 2 2 p 2 2 (p 2 2 p 1 1 ) 2 1 + (x) x(x) + (p 1 p 2 ) 2p 2 2 (p 2 2 p 1 1 ) 1 + (x) x(x) = 2 1 2p 2 2 (p 1 p 2 ) (p 1 +p 2 )(p 2 2 p 1 1 ) 2p 2 2 (p 2 2 p 1 1 ) 2 1 + (x) x(x) < 0 Similarly, @C @ 2 = 2 1 p 1 p 2 p 2 p 2 (p 2 2 p 1 1 ) 2 1 + (x) x(x) + 1 2 2 (p 2 2 p 1 1 ) (x) 2 (x) 2 + (x) x(x) + x(x) (x) < 0: Finally, to see @C @ > 0, first note that forx(p 2 ) == p p 2 2 C = 1 (p 1 p 2 ) p 2 (p 2 2 p 1 1 ) p 2 2 x (x) (x) +x : It is sufficient to show @( (x) (x) +x) @x = (x) 2 x(x)(x)(x) 2 (x) 2 > 0: This follows because 1. @((x)x(x)) @x =x 2 (x) 0 so that (x)x(x) is nondecreasing inx; 106 2. (x) ((x)x(x)) is increasing in x and (x) is decreasing in x for x > 0, so that (x) 2 x(x)(x)(x) 2 > (0) 2 (0) 2 = 1 4 1 2 > 0. Proof of Theorem 2.3.2: For simplicity of notation, defineT (0; 1) :=1,T (K;K + 1) := 0, so thatT (0; 1)>T (1; 2)>T (2; 3)> >T (K 1;K)>T (K;K + 1). We begin under the assumption thatc is arbitrarily large. Suppose we can find functionsV 0 1 ; ;V 0 K1 , a constantd, and threshold valuesl 0 := 0 < l 1 < l 2 < < l K1 < l K :=1; that solve V 00 0 (x)V 0 0 (x) +cx =d; x 0 V 00 1 (x) ( +p 1 1 x)V 0 1 (x) + (1p 1 ) 1 x =d; l 1 x< 0 V 00 2 (x) ( +p 2 2 x)V 0 2 (x) + (1p 2 ) 2 x =d; l 2 x<l 1 . . . . . . V 00 K (x) ( +p K K x)V 0 K (x) + (1p K ) K x =d; x<l K1 (A.3) having V 0 0 (0) =V 0 1 (0); andV 0 j (l j ) =V 0 j+1 (l j ) =T (j;j + 1); j2f1; 2; ;K 1g; (A.4) and for whichV (x) = 8 > > > > > > > > > < > > > > > > > > > : V 0 (x) x 0 V 1 (x) l 1 x< 0 . . . . . . V K (x) x<l K1 satisfies jV (x)jb 1 x 2 +b 2 for allx2R; and someb 1 ;b 2 2R: (A.5) We first argue that such a (V;d) satisfies the conditions of Theorem 2.3.1 and has an associated optimal control that is of threshold structure, i.e.,v ? (x) =v ~ L (x) with ~ L = (l 1 ; ;l K ). This requires the following claim, which we verify at the end of the proof. Claim 1. IfV satisfies (A.3) and (A.4), thenV 0 is increasing. 107 Also, recall thatj ? defined in (2.19) asj ? (x) := min ( argmin j2J fV 0 (x)p j j (1p j ) j g ) gives the argmin in (2.12) in Theorem 2.3.1 that defines the optimal controlv ? . The conditions of Theorem 2.3.1 are satisfied because: For any x >L 1 , V 0 (x) > T (1; 2) = (1p 2 ) 2 (1p 1 ) 1 p 2 2 p 1 1 (because V 0 is increasing by Claim 1) implies V 0 (x)p 2 2 (1p 2 ) 2 > V 0 (x)p 1 1 (1p 1 ) 1 . Since T (1; 2) > T (i;i + 1) for i 2 f2; 3; ;K 1g, also V 0 (x) > T (i;i + 1), so that V 0 (x)p i+1 i+1 (1p i+1 ) i+1 > V 0 (x)p i i (1p i ) i . Hencej ? (x) = 1. For anyx2 (l i1 ;l i ),i2f2; 3; ;Kg, similar reasoning shows thatT (i 1;i) > V 0 (x) > T (i;i + 1) impliesV 0 (x)p j j (1p j ) j >V 0 (x)p i i (1p i ) i for allj6=i. Hencej ? (x) =i. The above two bullet points imply that (2.13) in Theorem 2.3.1 is equivalent to (A.3). 
The condition (A.4) impliesV 00 0 (0) =V 00 1 (0) andV 00 j+1 (l j ) =V 00 j (l j ); for allj2f1; 2; ;K 1g, so thatV is twice-continuously differentiable. In summary, to conclude from Theorem 2.3.1 that v ~ L with K 1 non-zero thresholds is optimal, it is sufficient to show that there existsV 0 0 ;V 0 1 ; ;V 0 K1 ;d andl 1 ; ;l K1 that solve (A.3) and satisfy (A.4)- (A.5). It is convenient to have the piecewise general solution to (A.3), which is straightforward to find because each ODE is linear. DefineH j (x) := ( p p j j x+ p p j j ) p p j j ( p p j j x+ p p j j ) ; j2K ? . Then, the general solution to (A.3) is V 0 0 (x) = c 2 + cxd ; x 0 V 0 1 (x) = ( d + 1p 1 p 1 )H 1 (x) + 1p 1 p 1 + 1 ( p p 1 1 x + p p 1 1 ) ; l 1 x< 0 V 0 2 (x) = ( d + 1p 2 p 2 )H 2 (x) + 1p 2 p 2 + 2 ( p p 2 2 x + p p 2 2 ) ; l 2 x<l 1 . . . . . . V 0 K (x) = ( d + 1p K p K )H K (x) + 1p K p K + K ( p p K K x + p p K K ) ; x<l K1 : In order that condition (A.5) holds, we must have K = 0. Then, lim x!1 V 0 K (x) = 1p K p K <T (K 1;K), which is consistent withV 0 K increasing up to somel K1 < 0. 108 The next step is to derive the 2K 1 equations, thatd;l 1 ; ;l K1 ; and 1 ; ; K1 must satisfy. In particular, it follows from the general solution to (A.3) and the condition (A.4) that we must solve the following equations: ( d + 1p 1 p 1 )H 1 (0) + 1p 1 p 1 + 1 ( p p 1 1 ) = c 2 d (A.6) ( d + 1p 1 p 1 )H 1 (l 1 ) + 1p 1 p 1 + 1 ( p p 1 1 l 1 + p p 1 1 ) =T (1; 2) (A.7) ( d + 1p 2 p 2 )H 2 (l 1 ) + 1p 2 p 2 + 2 ( p p 2 2 l 1 + p p 1 1 ) =T (1; 2) (A.8) . . . ( d + 1p K2 p K2 )H K2 (l K3 ) + 1p K2 p K2 + K2 ( p p K2 K2 l K3 + p p K2 K2 ) =T (K 3;K 2) (A.9) ( d + 1p K2 p K2 )H K2 (l K2 ) + 1p K2 p K2 + K2 ( p p K2 K2 l K2 + p p K2 K2 ) =T (K 2;K 1) (A.10) ( d + 1p K1 p K1 )H K1 (l K2 ) + 1p K1 p K1 + K1 ( p p K1 K1 l K2 + p p K1 K1 ) =T (K 2;K 1) (A.11) ( d + 1p K1 p K1 )H K1 (l K1 ) + 1p K1 p K1 + K1 ( p p K1 K1 l K1 + p p K1 K1 ) =T (K 1;K) (A.12) ( d + 1p K p K )H K (l K1 ) + 1p K p K =T (K 1;K) (A.13) Claim 2. There existsC K1 (that can be found by a one-dimensional search) such that for allc > C K1 , there existsd;l 1 ; ;l K1 ; and 1 ; ; K1 that solve (A.6)-(A.13). The threshold valuesl 1 ; ;l K1 can be found by a sequence of one-dimensional searches. Once c falls below C K1 , we set l 1 = 0. Then, to conclude from Theorem 2.3.1 that v ~ L with K 2 non-zero thresholds is optimal, it is sufficient to show that there exists V 0 0 ;V 0 2 ;V 0 3 ; ;V 0 K ;d; andL 2 ; ;L K1 , that solve (A.3) and satisfy (A.4) forj2f2; ;K 1g and also satisfy (A.5). Note that whenl 1 = 0,V 0 1 does not appear in the equations. This follows by repeating the same argument. Continued repetition of the argument evidencesC 1 < C 2 < < C K1 that can be 109 found by a sequence of one-dimensional searches, and thatv ~ L withK 1 non-zero thresholds is optimal whenc2 (C K1 ;C K ]. Proof of Claim 2: We first evaluate equations (A.11)-(A.13), which have the variables l K2 ;l K1 ; andd. We consider l K2 as fixed, and solve for K1 ;l K1 ; andd as functions of l K2 . More precisely, we find d as a function ofl K1 , K1 as a function ofd, andl K1 as a function ofl K2 . From (A.13), d(l K1 ) = (T (K 1;K) 1p K p K )=H K (l K1 ) 1p K p K (A.14) From (A.11), K1 (d(l K1 )) = p p K1 K1 l K2 + p p K1 K1 T (K 2;K 1) 1p K1 p K1 d(l K1 ) + 1p K1 p K1 H K1 (l K2 ) : (A.15) Next, we use (A.12) to show that there exists (finite)l K1 > l K2 so that (A.12) is satisfied whend and K1 are determined by (A.14) and (A.15). 
For that, we view the left-hand side of (A.12) as a function of l K1 f(l K1 ) := d(l K1 ) + 1p K1 p K1 H K1 (l K1 ) + 1p K1 p K1 + K1 (d(l K1 )) ( p p K1 K1 l K1 + p p K1 K1 ) Substituting in for K1 (d(l K1 )) in the above shows f(l K1 ) = 1p K1 p K1 (1 +H K1 (l K1 )) + g(l K1 ) p p K1 K1 l K1 + p p K1 K1 (A.16) for g(l K1 ) = 0 B B B B B B B B B B @ d(l K1 ) p p K1 K1 p p K1 K1 l K1 + p p K1 K1 p p K1 K1 l K2 + p p K1 K1 + p p K1 K1 l K2 + p p K1 K1 T (K 2;K 1) 1p K1 p K1 (1 +H K1 (l K2 )) 1 C C C C C C C C C C A : 110 Note that lim l K1 !1 H j (L K1 ) = 0;j2K ? and, from (A.14),d(l K1 )!1 asl K1 !1. Then, recalling thatl K2 is fixed, it follows form (A.16) that f(l K1 )!1 asl K1 !1: Since by construction f(l K1 )!T (K 2;K 1)>T (K 1;K) asl K1 #l K2 ; it follows that there existsl K1 >l K2 that satisfies (A.12). Now we find l K1 as a function of l K2 0, denoted by L K1 (l K2 ). The next step is to show L K1 (l K2 ) is increasing inl K2 . The argument is by contradiction. Considerl A K2 <l B K2 , and suppose we have used (A.11)-(A.13) to solve for the correspondingl A K1 =L A K1 (l A K2 ) andl B K1 =L B K1 (l B K2 ). Then, the functions V A 0 K1 and V B 0 K1 in the general solution to (A.3) are uniquely determined, and, by construction V A 0 K1 (l A K1 ) =V B 0 K1 (l B K1 ) =T (K1;K) andV A 0 K1 (l A K2 ) =V B 0 K1 (l B K2 ) =T (K2;K1)>T (K1;K): Supposel A K1 l B K1 . Then, sinceV A 0 K1 andV B 0 K1 are increasing by Claim 1, it follows that there must be an intersection point. The ODE V 00 K1 (x) = ( +p K1 K1 x)V 0 K1 (x) (1p K1 ) K1 x +d :=g(x;V 0 K1 ) hasg being Lipschitz continuous inV 0 K1 and continuous inx, so that the Picard-Lindelof theorem guar- antees a unique V 0 K1 in an interval around the intersection point. This is a contradiction, therefore, l A K1 <l B K1 , and we concludeL K1 (l K2 ) is increasing inl K2 . We next evaluate the equations (A.9) and (A.10), in addition to (A.11)-(A.13). We consider l K3 as fixed, and solve for K2 and l K2 that satisfy (A.9) and (A.10). Similar to our previous argument, we can conclude that there existsl K2 > l K3 so that (A.10) is satisfied. Furthermore, similar argument also shows thatl K2 =L K2 (l K3 ) is increasing inL K3 . 111 Continued iteration of the above argument shows that given l 1 0, we can find L 2 (l 1 ), L 3 (L 2 (l 1 )), , L K1 (L K2 ( (L 2 (l 1 )) )) and d 1 (l 1 ) := d(L K1 (L K2 ( (L 2 (l 1 )) ))). We need to show that there existsl 1 and 1 that solve (A.6) and (A.7). From (A.6), we can solve for 1 as a function ofd(l 1 ). Next, we view the left-hand side of (A.7) as a function ofl 1 f(l 1 ) := d 1 (l 1 ) + 1p 1 p 1 + 1p 1 p 1 + 1 (d 1 (l 1 )) ( p p 1 1 l 1 + p p 1 1 ) : As l 1 ! 1, l K1 ! 1, it follows from (A.14) that d(l 1 ) ! 1. Then, it follows from (A.6) that 1 (d(l 1 ))!1. Similar argument to that in the second paragraph of the proof of this claim shows that lim l 1 !1 d(l 1 ) H 1 (l 1 ) is finite. Hence, f(l 1 )!1 asl 1 !1. Furthermore, by construction, f(l 1 )! c d 1 (0) asl 1 # 0. Then, providedcC 1 :=T (1; 2) 2 +d 1 (0), it follows that c 2 d 1 (0) >T (1; 2) = V 0 1 (l 1 ), so that there existsl 1 and(l 1 ) satisfying (A.6) and (A.7). Proof of Claim 1: We know thatV 0 K (x) is monotone inx on (1;L K1 ] and lim x!1 V 0 K (x)! 1p K p K . IfL K1 > 0, thenV 0 K (l K1 ) = T (K 1;K) > 1p K p K , soV 0 K (x) is increasing inx; ifL K1 = 0, then we need to connectV 0 0 (0) andV 0 K (0). Then, the condition (A.6) with 1 = 0 implies d + 1p K p K = c 2 (1 +H K (0)) > 0: Therefore,V 0 K (x) is increasing inx. 
We next show thatV 0 K1 (x) is increasing inx on [L K1 ; 0]. To do this, we first show (step 1) that V 00 K1 (L K1 )> 0. We then show (step 2) that if there exists a stationary point it must be a local maximum. We finally show (step 3) that having a local maximum leads to a contradiction. Step 1: From the ODE’s, at the pointL K1 , V 00 K1 (L K1 ) (p K1 K1 L K1 )V 0 K1 (L K1 ) (1p K1 ) K1 L K1 =d V 00 K (L K1 ) (p K K L K1 )V 0 K (L K1 ) (1p K ) K L K1 =d 112 It follows from the above equations and the fact thatV 0 K1 (L K1 ) =V 0 K (L K1 ) =T (K 1;K) that V 00 K1 (L K1 ) =V 00 K (L K1 ) SinceV 0 K (x) is increasing inx,V 00 K (L K1 )> 0. Step 2: Suppose there exists a stationary pointx 0 >L K1 (We letx 0 to be the smallest stationary point if there is more than one), whereV 00 (x 0 ) = 0. Then, taking the derivative of the ODE shows that V 000 K1 (x) ( +p K1 K1 x)V 00 K1 (x) (1p K1 ) K1 = 0 At the pointx 0 , we have V 000 K1 (x 0 ) +p K1 K1 V 0 (x 0 ) (1p K1 ) K1 = 0 Since V 0 K1 (x 0 ) = V 0 K1 (L K1 ) = T (K 1;K), it follows that V 000 K1 (x 0 ) = (1p K1 ) K1 p K1 K1 V 0 K1 (x 0 )< (1p K1 ) K1 p K1 K1 T (K 1;K)< 0. Hencex 0 is a local maximum. Step 3: If there exists a local maximum, then there exists a horizontal line atH 0 2 (T (K1;K)V 0 (x 0 )) andL K1 <L c 1 <x 0 <L c 2 havingV 0 K1 (L c 1 ) =V 0 K1 (L c 2 ) =H 0 andV 00 K1 (L c 1 )> 0,V 00 K1 (L c 2 )< 0. From the ODE, V 00 K1 (L c 1 )V 00 K1 (L c 2 ) = (L c 2 L c 1 )((1p K1 ) K1 p K1 K1 H 0 ) This is a contradiction because V 00 K1 (L c 1 ) V 00 K1 (L c 2 ) > 0 but (L c 2 L c 1 )((1 p K1 ) K1 p K1 K1 H 0 ) < (L c 2 L c 1 )((1p K1 ) 2 p K1 K1 T (K 1;K)) < 0. We conclude that such a local maximum cannot exists and soV 0 K1 (x) is increasing inx on [L K1 ; 0]. In the same way, we can show thatV 0 j (x) is increasing inx on [L j ; 0],j2K ? fKg. Note thatV 0 0 (x) is increasing inx on [0;1). Therefore,V 0 (x) is increasing on (1;1). 113 Proof of Lemma 2.3.2: The proof requires use of the functions V 0 1 ;:::;V 0 K1 , the constant d, and the threshold values L 1 ;L 2 ;:::;L K1 that were shown in the proof of Theorem 2.3.2 to solve (A.3) and satisfy the condi- tions of Theorem 2.3.1. Recall that the threshold values can all be expressed as functions of the smallest non-zero threshold valuel k . Without loss of generality, we assumek = 1. We have already shown in the proof of Theorem 2.3.2 thatL j is increasing inL j1 forj2f2; 3;:::;K 1g. Therefore, it is sufficient to show that the thresholdL 1 increases asc increases. For this, we regardV 0 1 (0) that satisfies the left-hand side of (A.6) and connects the ODE solutionsV 0 0 andV 0 1 in (A.3) in a twice-continuously differentiable manner as a function ofl 1 . Suppose we can show that V 0 1 (0) is increasing in l 1 . From (A.6), we have that c = d + 2 V 0 1 (0). Furthermore, from the last paragraph of the proof of Claim 2 in the proof of Theorem 2.3.2, d = d(L K1 (L K2 ( (L 2 (l 1 )) ))), and so is increasing inl 1 . Therefore, c is increasing inl 1 , which also impliesl 1 is increasing inc. The argument to showV 0 1 (0) is increasing inl 1 is by contradiction. It is helpful to first note that since V 0 1 satisfies (A.3) V 0 1 (x) =V 0 1 (l 1 ) + Z x l 1 d + ( +p 1 1 y)V 0 1 (y) (1p 1 ) 1 ydy forx2 [l 1 ; 0]: (A.17) Let l A 1 < l B 1 ;V 0 1;A (0) := V 0 1 (0;l A 1 );V 0 1;B (0) := V 0 1 (0;l B 1 ), and d A 1 := d(L K1 (L K2 ( (L 2 (l A 1 )) )));d B 1 := d(L K1 (L K2 ( (L 2 (l B 1 )) ))). Suppose V 0 1;A (0) V 0 1;B (0). 
Then, since V 0 1;A (l 1 ) = V 0 1;B (l 1 ) = T (1; 2) by construction, there must exist at least one intersection point z on (l 1 ; 0] where V 0 1;A (z) = V 0 1;B (z). We let z be the smallest intersection point if there is more than one. Since from (A.17), V 0 1;A (z) = V 0 1;A (l 1 ) + Z z l 1 d + ( +p 1 1 y)V 0 1;A (y) (1p 1 ) 1 ydy V 0 1;B (z) = V 0 1;B (l 1 ) + Z z l 1 d + ( +p 1 1 y)V 0 1;B (y) (1p 1 ) 1 ydy; andV 0 1;A (y)<V 0 1;B (y) on [l 1 ;z), it follows thatV 0 1;A (z)<V 0 1;B (z). This is a contradiction. 114 Next, we show that whenc!1, thenl j !1,j2f1; 2; ;K 1g. For this, we extendV 0 K (x) to (1; 0) so that V 0 K (x) = ( d + 1p K p K )H K (x) + 1p K p K ; forx 0: (A.18) Here d is the same d that appears in the second paragraph of this proof, and ensures that V satisfies the conditions of Theorem 2.3.1. It is helpful to note that when c!1, then d!1. This follows from Theorem 2.3.1, since d =C(v ? ) =cE h ^ X(1;v ? ) + i X j2J (1p j ) j E h v ? j ^ X(1;v ? ) ^ X (1;v ? ) i : and it is straightforward to see that the second term on the right-hand side of the above expression is bounded andE h ^ X(1;v ? ) + i > 0. Then, we may assume in the remaining of this proof thatc andd are arbitrarily large. Finally, recall thatV 0 given in Theorem 2.3.2 hasV 0 (x) =V 0 K (x) forxL K1 . Suppose we can establish thatV 0 (x) V 0 K (x) for allx 0. SinceV 0 andV 0 K are both increasing by Theorem 2.3.2, they have unique inverse functionsV 0 1 andV 0 1 K . DefineL K j :=V 0 1 K (T (j;j + 1)), and recallL j :=V 0 1 (T (j;j + 1)),j2f1; 2; ;K 1g. (Note thatL K j is well-defined from (A.18) for large enoughd andL j is well-defined from the proof of Theorem 2.3.2 for large enoughc.) Then, since V 0 (x)V 0 K (x) , it follows that L j =V 0 1 (T (j;j + 1))V 0 1 K (T (j;j + 1)) =L K j : (A.19) From (A.18), L K j =H 1 K (T (j;j + 1) 1p K p K )=( d + 1p K p K ) : (A.20) Then,L K j !1 asd!1 sinceH K (x)# 0 asx!1 andH K (x) is monotone increasing inx. The fact thatL j !1 asc!1 follows from (A.19) and the fact thatd!1 asc!1. To complete the proof, we showV 0 (x) V 0 K (x) for allx 0. It is equivalent to showV 0 1 (y) V 0 1 K (y) whenever both inverse functions are defined. We know V 0 1 (y) = V 0 1 K (y) on ( 1p K p K ;T (K 1;K)]. Next consider the interval (T (K 1;K);T (K 2;K 1)]. First we show that there cannot exist z >T (K 1;K) such thatV 0 1 (z) =V 0 1 K (z), and second we use this fact to showV 0 1 (y)<V 0 1 K (y) fory2 (T (K 1;K);T (K 2;K 1)]. Suppose such az exists. Note that there can be at most one such z because at any such pointV 0 has a larger slop thenV 0 K . To see that, note from (A.3) andK 1 is the 115 index that minimizep j j V 0 (x) (1p j ) j x; j2K whenV 0 (x)2 [T (K 1;K);T (K 2;K 1)] from the proof of Theorem 2.3.2, then V 00 (z)V 00 K (z) = ((p K1 K1 (1p K1 ) K1 ) (p K1 K1 (1p K1 ) K1 ))V 0 1 (w)> 0 Furthermore, it follows thatV 0 K (x)V 0 (x) forx2 (L K1 ;L K2 ), so that V 0 1 K (y)V 0 1 (y) fory2 (T (K 1;K);T (K 2;K 1)): (A.21) We will derive a contradiction in order to conclude that no suchz exists. 
The first step is to take the derivative of the inverse function to find V 0 1 (z) =l K1 + Z z T (K1;K) d + ( +p K1 K1 V 0 1 (y))y (1p K1 ) K1 V 0 1 (y) 1 dy; V 0 1 K (z) =l K1 + Z z T (K1;K) d + ( +p K K V 0 1 (y))y (1p K ) K V 0 1 (y) 1 dy: For eachy2 (T (K 1;K);T (K 2;K 1)), d +y + (p K1 K1 y (1p K1 ) K1 )V 0 1 (y) 1 d +y + (p K K y (1p K ) K )V 0 1 (y) 1 > d +y + (p K K y (1p K ) K )V 0 1 K (y) 1 (A.22) The first inequality follows becauseK 1 is the index that minimizep j j V 0 (x) (1p j ) j x; j2K when V ( x)2 [T (K 1;K);T (K 2;K 1)] and V 0 1 (y) < 0. The second inequality follows by (A.21) andp K K y (1p K ) K > 0. We conclude from (A.22) thatV 0 1 (z) < V 0 1 K (z), which is a contradiction. Since no intersection pointz can exist, eitherV 0 1 (y)>V 0 1 K (y) orV 0 1 (y)<V 0 1 K (y) on (T (K1;K);T (K2;K1)). Similar argument shows thatV 0 1 (y)>V 0 1 K (y) leads to a contradiction. Finally, iterate this argument we can show V 0 1 (y) < V 0 1 K (y) on each [T (j 1;j);T (j;j + 1));j 2 f1; 2;:::;K 1g. That is, V 0 1 (y)V 0 1 K (y);y2 ( 1p K p K ;V 0 1 K (0)]: (A.23) 116 Proof of Corollary 2.3.3: First we show thatT (i;j)T (i;k)T (j;k). It is straightforward to verify that for anya;b;c;d satisfying a b c d ,b> 0 andd> 0, also a b a+c b+d c d . From the definition ofT (i;j) in (2.20),T (i;j)+1 = j i p j j p i i and T (j;k) + 1 = k j p k k p j j . Hence T (i;j) T (j;k) is equivalent to j i p j j p i i k j p k k p j j . Both denominators in the inequality are positive, so we can apply a b a+c b+d c d and get j i p j j p i i k i p k k p i i k j p k k p j j . The middle term is exactlyT (i;k) + 1. Therefore,T (i;j)T (i;k)T (j;k): It follows by assumption thatV andd satisfy (2.12) so that V 00 (x) +cx + V 0 (x) + min v2V(J 0 ) ( X n2J 0 V 0 (x)p n n (1p n ) n v n (x) ) x =d for allx2R: We next show that this sameV andd satisfy V 00 (x) +cx + V 0 (x) + min v2V(J ) ( X n2J V 0 (x)p n n (1p n ) n v n (x) ) x =d for allx2R: SinceT (i;j) T (i;k) T (j;k), from the text following (2.20) in Chapter 2, j cannot be the index to minimize the expressionV 0 (x)p j j (1p j ) j , and so min v2V(J 0 ) ( X n2J 0 V 0 (x)p n n (1p n ) n v n (x) ) = min v2V(J ) ( X n2J V 0 (x)p n n (1p n ) n v n (x) ) : We conclude that the same V and d that satisfy the conditions of Theorem 2.3.1 withV =V(J 0 ) also satisfy the conditions of Theorem 2.3.1 withV =V(J ). Therefore, (v J 0 1 ; ;v J 0 i1 ; 0;v J 0 i+1 ; ;v J 0 J1 ) is an optimal control whenV =V(J ). Proof of Corollary 2.3.4: First, reduce all poolsj for whichj >i (equivalently,p j j >p i i ) andp j p i for somei2J , and group the remaining pools in the setS. Then, theS =jSj pools in the setS can be re-labeled so that p 1 1 <p 2 2 <<p S S andp 1 >p 2 >>p S 117 Furthermore, p S = min j2J fp j g. Also, the pool re-labeled asS must be an element inK ? S. Hence p S =p K , forp K in the re-labeling (2.21). It follows from Theorem 2.3.2 that there existsV andd that satisfy the conditions of Theorem 2.3.1; specifically, that solve (2.12) withK ? replacingJ . Then, to complete the proof, it is sufficient to show that this sameV andd satisfy V 00 (x)+cx + V 0 (x)+ min v2V(J ) ( X n2J V 0 (x)p n n (1p n ) n v n (x) ) x =d for allx2R (A.24) and V 00 (x) +cx + V 0 (x) + min v2V(J 0 ) ( X n2J 0 V 0 (x)p n n (1p n ) n v n (x) ) x =d for allx2R: (A.25) Consider any poolk2J butk = 2S. Then, there existsn2S for whichp n n <p k k andp n p k . Since n2S,p n p K . Recall thatV 0 is increasing from Theorem 2.3.2 andV 0 (x)! 1p K p K asx!1 from the proof of Theorem 2.3.2. 
It follows that V 0 (x) 1p K p K 1p n p n 1p k p k : Furthermore, also noting the definition (2.20), 1p k p k T (n;k) = (p k p n ) n p k (p k k p n n ) 0: Hence V 0 (x)T (n;k) = (1p k ) k (1p n ) n p k k p n n ; which is equivalent top k k V 0 (x) (1p k ) k p n n V 0 (x) (1p n ) n . In particular, letting = K ? [fkg, min v2V() 8 < : X n2K ? [fkg V 0 (x)p n n (1p n ) n v n (x) 9 = ; = min v2V(K ? ) 8 < : X n2K ? [fkg V 0 (x)p n n (1p n ) n v n (x) 9 = ; (A.26) Repeating the above argument for each pool k 2 J but k = 2 S in turn implies (A.26) also holds with =K ? [JS Finally, note that any poolk2S butk = 2K ? must satisfy the conditions of Corollary 2.3.3. 118 Then, iteratively applying Corollary 2.3.3 shows that (A.26) also holds with any satisfyingK ? V, from which we conclude (A.24) and (A.25) is valid. Proof of Theorem 2.3.5: This theorem follows from Theorem 2.3.2, Corollary 2.3.3 and 2.3.4. 119 Appendix B Technical Appendix to Chapter 3 Proof of Proposition 3.3.1: First we introduce a useful lemma. Lemma B.0.1. B() +B 0 ()> 0; 2(B 0 ()) 2 B()B 00 ()> 0: The second order derivative of objectiveU() is U 00 () = (P ()B()) 00 =P 00 ()B() + 2P 0 ()B 0 () +P ()B 00 () =P 00 ()B() + 2B 0 () B() (P ()B()) 0 P () B() 2(B 0 ()) 2 B()B 00 () (B.1) SinceP 00 ()< 0 due to strict concavity, andB 0 ()< 0, from (B.1) and Lemma B.0.1, as long asU() 0 0, we haveU 00 () < 0. That means,U 0 () is decreasing in whenU 0 () 0, which implies thatU 0 () only has at most one zero point. Consider the value of U 0 () at and (to be more rigorous, the right derivative at and the left derivative at), we have 3 cases: a. U 0 () 0, the maximum is at. b. U 0 () 0, the maximum is at. c. U 0 ()> 0 andU 0 ()< 0, then there exists a unique zero point E 2 (;) achieving the maximum. The FOC (3.5) in Chapter 3 is P R =r() := B() +B 0 () (1p())(B() +B 0 ())p 0 ()B() : 120 The derivative regarding is r 0 () = p 0 () 2 2(B 0 ()) 2 B()B 00 () + (p 00 () + 2p 0 ())B()(B() +B 0 ()) ((1p())(B() +B 0 ())p 0 ()B()) 2 : Since p() is strictly concave on [; ], we have (p()) 00 = p 00 () + 2p 0 () < 0. Combining Lemma B.0.1, we have r 0 ()< 0 Therefore,r() is decreasing in. Ifr()>P R >r(), there exists a unique2 (;) such that FOC is satisfied. Note that if = 0,r(0) = 1 1p(0) is still well defined. Proof of Lemma 3.3.1: Similar to the proof of Theorem 8 in Gopalakrishnan et al. (2014), we can show that in M/M/N+M systems, LISF routing or random routing gives the same steady-state idleness distributions. Similar to Theorem 1 in Gopalakrishnan et al. (2014), we can prove this lemma through Markov chain flow charts. Proof of Proposition 3.3.2: We show the existence of a unique symmetric equilibrium satisfies (3.12) in two steps: First we show for any fixed, @ ^ U( 1 ;) @ 1 has at most 1 zero-point and it hit the zero-point, if exists, from above. Then we show that the zero-point of @ ^ U( 1 ;) @ 1 is decreasing in. Therefore, there is a unique fixed point: either the boundary point or ^ E 2 (;) satisfying @ ^ U( 1 ; ^ E ) @ 1 1 =^ E = 0: From the first step, we know that the fixed point is a maximum, therefore it is the unique symmetric equilibrium. 121 Ideally, we can show the concavity of ^ U( 1 ;). However, that may not be the case. Therefore, we lower our aspiration and show there is at most 1 zero-point. 
When<N, ^ B( 1 ;) = + (N= 1) 1 ; @ ^ B( 1 ;) @ 1 = (N= 1) ( + (N= 1) 1 ) 2 : The derivative of ^ U( 1 ;) on 1 is @ ^ U( 1 ;) @ 1 = 8 > < > : P 0 ( 1 ) ifN P 0 ( 1 ) ^ B( 1 ;) +P ( 1 ) @ ^ B( 1 ;) @ 1 ifN> @ 2 ^ U( 1 ;) @ 2 1 = 8 > > < > > : P 00 ( 1 ) ifN P 00 ( 1 ) ^ B( 1 ;) + 2 @ ^ U( 1 ;) @ 1 @ ^ B( 1 ;) @ 1 ^ B( 1 ;) ifN> The last equation is due to 2 @ ^ B( 1 ;) @ 1 ! 2 ^ B( 1 ;) @ 2 ^ B( 1 ;) @ 2 1 = 0: (B.2) Note that @ ^ U( 1 ;) @ 1 , @ 2 ^ U( 1 ;) @ 2 1 are both continuous in. SinceP 00 ( 1 )< 0 and @ ^ B( 1 ;) @ 1 < 0, when @ ^ U( 1 ;) @ 1 is positive, @ 2 ^ U( 1 ;) @ 2 1 is negative. Therefore, @ ^ U( 1 ;) @ 1 has at most 1 zero-point. The zero-point, if exists, is the maximum. Also note that each agent would not choose a service rate lager than ? , otherwise, deviating to ? will get largerP ( 1 ) value and no smallerB( 1 ;) value. Case 1:N ? . Since each agent would not choose a service rate larger than ? , we have ^ B( 1 ;) = 1. Now each agent chooses her service rate ? to maximizeP (). Therefore, ? is the unique symmetric equilibrium. Case 2: < N ? . Since no agent would choose a service rate larger than ? , ^ B( 1 ;) 1. We simplify the expression ^ B( 1 ;) = 1 1 +k() 1 ; wherek() =N= 1=. We have @ ^ U( 1 ;) @ 1 = P S P F (1p( 1 )) + 1 (1 +k 1 )P F p 0 ( 1 ) (1 +k 1 ) 2 : 122 Given, if we have FOC point, agent 1 chooses that 1 ; otherwise, agent 1 chooses corner points. Denote by the numerator asF ( 1 ;k). We have @F ( 1 ;k) @ 1 = (1 +k 1 )P F (p( 1 ) 1 ) 00 < 0; @F ( 1 ;k) @k = 2 1 P F p 0 ( 1 )< 0: Therefore, @ 1 @k = @F ( 1 ;k) @ 1 @F ( 1 ;k) @k < 0: (B.3) Given , if F ( 1 ;k()) > 0 on for 1 2 [; ? ], then the choice of server 1 is ? . Increasing would decreaseF ( 1 ;k()), and may generate a FOC point on [; ? ]. When server 1 chooses the FOC point 1 , from (B.3), combining thatk() is increasing in, the FOC point 1 is continuously decreasing in. If F ( 1 ;k())< 0 on for 1 2 [; ? ], then the choice is server 1 is. Increasing would further decrease F ( 1 ;k()), and server 1 keeps choosing. In summary, the choice of 1 is non-increasing in. Therefore, when we increase from to ? , the choice of agent 1 1 is non-increasing on [; ? ] and there will be a unique intersection 1 =, which is the unique symmetric equilibrium. Furthermore, if< N < ? , @ ^ U( 1 ; N ) @ 1 1 = N =P 0 N >P 0 ( ? ) 0; where the first inequality is due to the concavity of P (). Therefore, when others are working at speed N >, agent 1 would choose 1 > N . Using similar argument, the symmetric equilibrium is on N ; ? . Therefore, we have a unique ^ E 2 maxf N ;g; ? as a symmetric equilibrium. When N increases,k() = N 1 decreases, so the choice of 1 () decreases from (B.3) if it is not the corner point. Noting ^ E is the fixed point of 1 (), ^ E is non-increasing in N . 123 Proof of Proposition 3.3.3: First we show that when!1,B ( 1 ;)! ~ B( 1 ;). Then we show the uniform convergence. B ( 1 ;) = P N 1 i=0 i 1 i! + N 1 1 (N 1)! P 1 i=1 Q i k=1 (N 1)+ 1 +k P N 1 i=0 i 1 i! + 1 P N 1 i=0 i N i i! + N 1 1 (N 1)! P 1 i=1 Q i k=1 (N 1)+ 1 +k = B1 +QL B1 +I1 +QL ; whereB1 := P N 1 i=0 ( ) i 1 i! corresponds to the states when the tagged agent is busy and the system has no queue;I1 := 1 P N 1 i=0 ( ) iN i i! corresponds to the states when the tagged agent is idle and therefore the system has no queue;QL := N 1 1 (N 1)! P 1 k=1 Q k i=1 (N 1)+ 1 +i corresponds to the states when the system has a queue. Ifb < 1 , from Whitt (2004), when!1, we know that Pr(Number of busy server < N )! 0. Therefore, when!1.B ( 1 ;)! 1. 
Ifb 1 , Note that B1 = N 1 X i=0 Pr(Y =i) exp = Pr(Y <N ) exp ; I1 = 1 N 1 X i=0 i N i i! = 1 0 @ N N 1 X i=0 i 1 i! N 1 X i=1 i 1 (i 1)! 1 A = 1 0 @ N N 1 X i=0 i 1 i! + N 1 (N 1)! 1 A = 1 N 1 Pr(Y <N ) + 1 Pr(Y =N 1) exp ; whereY is a Poisson random variable with parameter . When is large, Y converges to a standard normally distribution in probability. Therefore, whenb> 1 , Pr(Y N )! 1; 124 whenb = 1 , Pr(Y N )! 1=2: By Stirling’s approximation, Pr(Y =N ) N exp 1 p 2N (N =e) N = N N exp N 1 p 2N = 1 o() N N exp (o()) 1 p 2N 1 p 2N : Sinceb> 1 , from Whitt (2004), as!1, we know thatP (Number of busy server<N )! 1, therefore QL=(B1 +I1)! 0. Therefore, when!1, B ( 1 ;) B1 B1 +I1 1 1 + 1 N 1 : In summary, as!1, B ( 1 ;)! 8 > < > : + 1 (b1) ifb> 1 1 ifb 1 : Next we show uniform convergence. Ifb > 1 , when is large enough, notingN = b +o(), we have (N 1)<. Therefore, 1 X i=1 i Y k=1 (N 1) + 1 +k 1 X i=1 i Y k=1 (N 1) = (N 1) : 125 B ( 1 ;) ~ B( 1 ;) B ( 1 ;) B1 B1 +I1 + B1 B1 +I1 ~ B( 1 ;) I1QL (B1 +I1) 2 + 1 Pr(Y =N 1) Pr(Y <N ) (1 + 1 (N = 1=)) 2 1 (N = 1=) + 1 Pr(Y =N 1) Pr(Y <N ) Pr(Y =N 1) (N 1) (1 + 1 (N = 1=)) 2 + 1 Pr(Y =N 1) Pr(Y <N ) (1 + 1 (N = 1=)) 2 Pr(Y =N 1) (b 1= + Pr(Y =N 1) Pr(Y <N ) ) 1 b1 + Pr(Y <N ) (1 + 1 (b 1=)) 2 : Note that as!1, Pr(Y < N )! 1 and Pr(Y = N 1)! 0. Therefore, the above bound does not depend on 1 and converges to 0. Ifb 1 ,B ( 1 ;)< ~ B( 1 ;) = 1. From the Proof of Lemma B.0.2, @B ( 1 ;) @ 1 < 0. Therefore, for any 1 2 [;], B ( 1 ;) 1 B (;) 1 ! 0: By now, we have shown the uniform convergence. Proof of Theorem 3.3.1: We show the existence of a unique symmetric equilibrium similar to Proof of Proposition 3.3.2 in two steps: First we show that when is large enough, for any fixed, @U( 1 ;) @ 1 has at most 1 zero-point. Then we show the continuity of @U( 1 ;) @ 1 on. From the first step, we know that the fixed point is a symmetric equilibrium. The FOC of server 1’s objectiveU( 1 ;) is @U( 1 ;) @ 1 =P 0 ( 1 )B( 1 ;) +P ( 1 ) @B( 1 ;) @ 1 = 0: 126 Similar to (B.1), the second order derivative is P 00 ( 1 )B( 1 ;)+ 2 B( 1 ;) @B( 1 ;) @ 1 @(P ( 1 )B( 1 ;)) @ 1 + P ( 1 ) B( 1 ;) B( 1 ;) @ 2 B( 1 ;) @ 2 1 2 @B( 1 ;) @ 1 2 ! : Similar to Lemma B.0.1, we have a lemma for the property ofB( 1 ;). Lemma B.0.2. When is large enough, B( 1 ;) @ 2 B( 1 ;) @ 2 1 2 @B( 1 ;) @ 1 2 < 0: (B.4) Therefore, @U( 1 ;) @ 1 is decreasing as long as it is nonnegative, implying it has no more than 1 zero point. The maximum ofU( 1 ;) is achieved either at the corner points and, or at the FOC point. Therefore, as is moving from to continuously, the maximum solution 1 moves continuously on [;]. Therefore, there must exist a E satisfying (3.4). Similar to Proof of Proposition 3.3.2, if ? <, we haveP 0 ( ? ) = 0, @U( 1 ; ? ) @ 1 1 = ? < 0, implying E < ? . Therefore, E 2 ; ? . Next we show E ! ^ E : As increase to1, if E has a subsequence i E converging to ~ 6= ^ E , then for any 1 2 [;], P ( i E )B i ( i E ; i E )P ( 1 )B i ( 1 ; i E ): (B.5) From Proposition 3.3.3,B i uniformly converges to ^ B. We have B i ( i E ; i E )! ^ B(~ ; ~ ); B i ( 1 ; i E )! ^ B( 1 ; ~ ): Taking i !1 on (B.5), we have P (~ ) ^ B(~ ; ~ )P ( 1 ) ^ B( 1 ; ~ ): Therefore, ^ U(~ ; ~ ) max 1 ^ U( 1 ; ~ ): 127 This contradicts to the uniqueness of ^ E from Proposition 3.3.2. Therefore, any convergent subsequence of E converge to ^ E . We have E ! 
^ E : Proof of Proposition 3.4.1: Since (N;) solves (3.19), fixingN, solves max 2[;] c A 1 c F c A (1p()) B(;)c A Server 1’s objective is max 1 2[;] P S (1P R (1p( 1 ))) 1 B( 1 ;); whereP R = c F c A . We have dB( 1 ;) = @B( 1 ;) @ 1 d 1 + @B( 1 ;) @ d: SinceB( 1 ;) is decreasing in both 1 and, @B( 1 ;) @ 1 < 0; @B( 1 ;) @ < 0: Therefore, @B( 1 ;) @ 1 1 = > @B(;) @ : At the maximum solving (3.19), the FOC is satisfied: (1P R (1p()p 0 ()))B(;) + (1P R (1p())) @B(;) @ = 0: Therefore, when others are working at, the first order derivative on server 1’s objective at 1 = is (1P R (1p( 1 )p 0 ( 1 ) 1 ))B( 1 ;) + (1P R (1p( 1 ))) 1 @B( 1 ;) @ 1 1 = > 0: 128 Therefore, server 1 would choose a service rate larger than when all the other servers are working at, indicating that the equilibrium service rate will be larger than. Proof of Lemma 3.4.1: SinceC LB (N ;)c S N , whenN =w(), lim inf !1 C LB (N ;) c s lim inf !1 N =1: When is large andN =o(),B ()! 1. lim inf !1 C LB (N ;) =c A + lim inf !1 N (c S c A +c F (1p())) =c A : Proof of Proposition 3.4.2: N =b +o(). IfN , orb 1, from Garnett et al. (2002), lim !1 B () = 1. lim !1 C LB (N ;) =c S b +c A (1b) +c F (1p())b: If<N , orb> 1, from Garnett et al. (2002), lim !1 B () = 1 b . lim !1 C LB (N ;) =c S b +c F (1p()): Proof of Lemma 3.4.2: Ifb 1, (3.21) isc S b +c F (1p()), which is increasing inb, the minimum is achieved whenb = 1 . If b 1, (3.21) isc S b +c A (1b) +c F (1p())b =c A + c S +c F (1p())c A b, if c A > c S +c F (1p()); (B.6) 129 the minimum is achieved whenb = 1, otherwise, the minimum is achieved atb = 0. Whenb = 1, (3.21) is simplified as c S +c F (1p()) = c S +c F (1p()) : If a b c d and b > 0;d > 0, we have a b a+c b+d c d . For any 1 ; 2 2 [;] and 2 (0; 1), if c S +c F (1p( 1 )) 1 1 c S +c F (1p( 2 )) 2 2 , we have (c S +c F (1p( 1 )) 1 ) 1 c S +c F (1p( 1 )) 1 + (1)c F (1p( 2 )) 2 1 + (1) 2 (1)(c S +c F (1p( 2 )) 2 ) (1) 2 : Notingp() is concave on [;], implying (1p()) is convex, therefore, c S +c F (1p( 1 + (1) 2 ))( 1 + (1) 2 ) 1 + (1) 2 c S +c F (1p( 1 )) 1 +c F (1)(1p( 2 )) 2 1 + (1) 2 : Therefore, c S +c F (1p()) is quasiconvex on [;]. It has a unique minimum solution ^ . And ^ b = 1 ^ . The condition (B.6) can be written as ^ c LB := c S ^ +c F (1p(^ ))<c A : Proof of Theorem 3.4.1 For any staffing and compensation policy, assume the equilibrium service rate is E . The number of aban- donments per unit time isAb := E NB( E ). The total system cost is c S N +c A ( E NB( E )) +c F (1p( E )) E NB( E ) =c S Ab E B( E ) +c F (1p( E ))(Ab) +c A Ab c S E +c F (1p( E )) (Ab) +c A Ab ^ c LB (Ab) + ^ c LB Ab = ^ c LB 130 Proof of Lemma 3.4.3: We know that ^ is the solution to minimize c S +c F (1p()) and ^ c LB is the minimum cost. Therefore, for any2 [;], we have ^ c LB c S +c F (1p()); or ^ c LB c F (1p())c S ; where the equality is only achieved when = ^ . It means that ^ is the solution to max 2[;] ^ c LB c F (1p()) = ^ c LB 1 c F ^ c LB (1p()) : Therefore, whenP R = c F ^ c LB , ? (P R ) = ^ is the rate to maximize (1P R (1p())). If ^ 2 (;), then ^ satisfies the first order condition: 1P R (1p()) +P R p 0 () = 0: Therefore,P R = 1 1p(^ )p 0 (^ )^ is unique. Note that the FOC of the system cost gives: c S 2 c F p 0 () = 0; implyingp 0 (^ )^ = c S c F ^ . Therefore, c F c S ^ c LB = 1 c S c F ^ + 1p(^ ) = 1 1p(^ )p 0 (^ )^ : We can see the two expressions are consistent. If ^ =, then anyP R c F ^ c LB would motivate the servers to work at; if ^ =, then any 0P R c F ^ c LB would motivate the servers to work at. 
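The constant ĉ_LB = min over the rate interval of {c_S/μ + c_F(1 − p(μ))} that drives the lower bound in Theorem 3.4.1, and the consistency claim of Lemma 3.4.3 that setting P_R = c_F/ĉ_LB makes μ̂ the maximizer of μ(1 − P_R(1 − p(μ))), can both be checked numerically. The sketch below (Python) does so on a grid; the resolution function p, the cost parameters c_S and c_F, and the rate interval are assumptions made purely for illustration.

```python
import numpy as np

# Assumed primitives, for illustration only.
c_S, c_F = 1.0, 2.0              # per-server staffing cost and failure cost (assumed)
p = lambda mu: 1.0 - 0.3 * mu    # assumed resolution probability, decreasing in mu
grid = np.linspace(0.9, 1.6, 20001)

# hat-c_LB = min over mu of c_S/mu + c_F*(1 - p(mu)), attained at hat-mu; fluid staffing ratio b = 1/hat-mu.
cost = c_S / grid + c_F * (1.0 - p(grid))
mu_hat = grid[np.argmin(cost)]
c_LB = cost.min()
print("hat-mu ~", round(mu_hat, 4), " hat-c_LB ~", round(c_LB, 4), " b = 1/hat-mu ~", round(1.0 / mu_hat, 4))

# Lemma 3.4.3: with P_R = c_F / hat-c_LB, hat-mu should also maximize mu * (1 - P_R * (1 - p(mu))).
P_R = c_F / c_LB
mu_star = grid[np.argmax(grid * (1.0 - P_R * (1.0 - p(grid))))]
print("argmax of mu(1 - P_R(1 - p(mu))) ~", round(mu_star, 4), " equals hat-mu:", abs(mu_star - mu_hat) < 1e-3)
```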
131 Proof of Theorem 3.4.2: From Lemma 3.4.3, when b = 1 ^ and P R = c F ^ c LB , ? = ^ . From Theorem 3.3.1 and Proposition 3.3.2, E ! ^ E = ^ . Therefore, lim !1 C(N ;) = c S ^ +c F (1p(^ )) = ^ c LB : Proof of Lemma 3.4.4: Given the proposed compensation and staffing scheme, N ^ = ? (P R ). From the proof Theorem 3.3.1, the servers choose service rate E 2 [; ? ]. P ( E )B( E ; E )P ( ? )B( ? ; E )P ( ? )B( ? ; ? ) = ^ c LB (1 (1p( ? ))) ? =c S The first inequality is because each server prefers E to ? ; the second inequality is becauseB( 1 ;) is decreasing in. When!1, we haveB(^ )! 1. Therefore, P S ! ^ c LB ; P F !c F : Proof of Proposition 3.4.3: ? c F ^ c LB = ^ = argmin 2[;] c S +c F (1p()). We have shown that c S +c F (1p()) is quasiconvex in Proof of Lemma 3.4.2. Therefore, ^ is either at the boundary or the FOC point of c S +c F (1p()). If ^ 2 (;), then ^ is an FOC point, we have c S ^ 2 c F p 0 (^ ) = 0: Increasingc S will decrease the derivative at ^ to negative, so the new minimum point is larger than ^ . When ^ =, c S ^ 2 c F p 0 (^ ) 0, increasingc S may get the FOC valid in an interior point, which is the minimum point larger than ^ . When ^ =, c S ^ 2 c F p 0 (^ ) 0, increasingc S keep the derivative negative 132 on [;], implying ^ staying at. Therefore, the minimum ^ is non-decreasing inc S and the monotonicity is strict when ^ 2 (;). Similarly, we can show that ^ is non-increasing inc F and the monotonicity is strict when ^ 2 (;).. Proof of Theorem 3.5.1: We first show the approximating utility ^ U( 1 ;) =P ( 1 ) ^ B( 1 ;) +U I ( ^ B( 1 ;)): When b 1 ? , we know that each agent working at ? is a symmetric equilibrium when is large. We acknowledge the possibility that in a symmetric equilibrium the agents are working at a speed larger than ? to get idle time. However, when we optimize the system performance, we need to consider the worst case. When b > 1 ? , if all the other servers choose 1 b < ? , then server 1 has incentive to choose a service rate larger than 1 b . Therefore, a symmetric equilibrium should be larger than 1 b . We have ^ B( 1 ;) = +(b1) 1 . The first order derivative of ^ U( 1 ;) is @ ^ U( 1 ;) @ 1 = P ( 1 ) +U 0 I ( ^ B( 1 ;)) @ ^ B( 1 ;) @ 1 +P 0 ( 1 ) ^ B( 1 ;): The second order derivative is @ 2 ^ U( 1 ;) @ 2 1 =P ( 1 ) @ 2 ^ B( 1 ;) @ 2 1 + 2P 0 ( 1 ) @ ^ B( 1 ;) @ 1 +P 00 () ^ B( 1 ;) +U 00 I ( ^ B( 1 ;)) @ ^ B( 1 ;) @ 1 ! 2 +U 0 I ( ^ B( 1 ;)) @ 2 ^ B( 1 ;) @ 2 1 =P 00 ( 1 ) ^ B( 1 ;) +U 00 I ( ^ B( 1 ;)) @ ^ B( 1 ;) @ 1 ! 2 + 2 @ ^ U( 1 ;) @ 1 @ ^ B( 1 ;) @ 1 ^ B( 1 ;) + P ( 1 ) +U 0 I ( ^ B( 1 ;)) ^ B( 1 ;) 0 @ ^ B( 1 ;) @ 2 ^ B( 1 ;) @ 2 1 2 @ ^ B( 1 ;) @ 1 ! 2 1 A =P 00 ( 1 ) ^ B( 1 ;) +U 00 I ( ^ B( 1 ;)) @ ^ B( 1 ;) @ 1 ! 2 + 2 @ ^ U( 1 ;) @ 1 @ ^ B( 1 ;) @ 1 ^ B( 1 ;) 133 The last equality is due to (B.2). SinceP () andU I () are both concave, as long as @ ^ U( 1 ;) @ 1 > 0, we have @ ^ U( 1 ;) @ 1 < 0. Therefore, there is at most 1 zero-point for the FOC. Move from maxf; 1 b g to, 1 will meet at some point, which is a symmetric equilibrium ^ E solving (3.27). Similarly, when we face theB( 1 ;) instead of ^ B( 1 ;), from Lemma B.0.2, when is large enough, @ 2 U( 1 ;) @ 2 1 <P 00 ( 1 )B( 1 ;) +U 00 I (B( 1 ;)) @B( 1 ;) @ 1 2 + 2 @U( 1 ;) @ 1 @B( 1 ;) @ 1 B( 1 ;) ; which is negative as long as @U( 1 ;) @ 1 > 0. Therefore, there is at most 1 zero-point for the FOC and there exist at least one symmetric equilibrium. 
If the solution ^ E is unique, similar to Proof of Theorem 3.3.1, we can show the existence of a symmetric equilibrium for the original systems and the convergence to ^ E when!1. Proof of Lemma 3.5.1: In the centralized system, the manager asks the agents to work at and wants to chooses the optimalb. We assumeb 1 , by choosingb the manager can adjust the utilizationx := 1 b . The IR condition should be satisfied so the manager only need pay each agentc S U I (x). The service failure cost is a function of, so it is fixed. The manager wants to minimize the fluid scaled staffing cost: min x2(0;1] (c S U I (x))b = (c S U I (x)) 1 x (B.7) The first order condition is U 0 I (x)xc S +U I (x) x 2 = 0; or U I (x)U 0 I (x)x =c S : The derivative of the left-hand side isU 00 I (x)x > 0. Whenx# 0, the left-hand side goes toU I (0) < c S . If lim x"1 U 0 I (x) <c S , the FOC has a unique solutionx 2 (0; 1), which is also the minimal solution to the objective; otherwise, x = 1. In other words, if lim x"1 U 0 I (x)c S , the manager should keep the utilization to be 1 and the problem reduces to the previous problem; if lim x"1 U 0 I (x)<c S , it is optimal to 134 keep the utilization strictly less than 1. In the second case, the manager wants to choose the optimal based onx . min c S U I (x ) x +c F (1p()) (B.8) Since c S U I (x ) x > 0, from Proof of Lemma 3.5.1, the above function is quasiconvex in. It has an unique minimal solution = ^ I . From (B.7), we know that c S U I (x ) x c S U I (1) 1 =c S , so c S U I (x ) x +c F (1p()) c S +c F (1p()): Therefore, c S U I (x ) x ^ I +c F (1p(^ I )) min c S +c F (1p()); (B.9) meaning quality-driven regime is better than critical staffing regime. Note the optimal staffing level is ^ b = 1 x ^ I . By fluid approximation of quality-driven regime we have lim !1 B ( ^ b I ; ^ I ) =x : Proof of Theorem 3.5.2 For any staffing and compensation policy, assume the equilibrium service rate is E . The number of aban- donments per unit time isAb := E NB( E ). The total system cost is (c S U I (B( E )))N +c A ( E NB( E )) +c F (1p( E )) E NB( E ) = (c S U I (B( E ))) Ab E B( E ) +c F (1p( E ))(Ab) +c A Ab c S U I (B( E )) E B( E ) +c F (1p( E )) (Ab) +c A Ab ^ c I (Ab) + ^ c I Ab = ^ c I : 135 Proof of Theorem 3.5.3: We need to show that when the utility function is ^ B and the staffing level is ^ b I := 1 ~ B^ I , settingP R;I := c F ^ c I could incentivize the servers to work at ^ I . Suppose other servers are working at while server 1 chooses 1 , ^ c I c S U I ( ~ B) 1 ~ B +c F (1p( 1 )) c S U I (B( 1 ;)) 1 B( 1 ;) +c F (1p( 1 )); (^ c I c F (1p( 1 ))) 1 B( 1 ;) +U I (B( 1 ;))c S : The equality holds if and only if 1 = = ^ I . Therefore, ^ I is the solution to max 1 2[;] (^ c I c F (1p( 1 ))) 1 B( 1 ;) +U I (B( 1 ;)): That means whenP F =c F ,P S = ^ c I , and all other servers are working at ^ I , server 1 will also choose ^ I . In fact, if ~ 6= ^ I is also a symmetric equilibrium, then the IR condition cannot be satisfied. When the solution ^ to (3.27) is unique, from Theorem 3.5.1, we know that the true equilibrium E ! ^ I , andB ( E )! ~ B, as!1. Therefore, lim !1 C(N ; ^ c I ;c F ) = (c S U I ( ~ B)) 1 ~ B^ I +c F (1p(^ I )) = ^ c I : Proof of Lemma 3.5.2: First we show that c S +(1p())g F (1p()) has a unique solution. Similar to the proof of Lemma 3.4.2, if a b c d and b > 0;d > 0, we have a b a+c b+d c d . 
For any 1 ; 2 2 [;] and 2 (0; 1), if c S +(1p( 1 ))g F (1p( 1 )) 1 1 c S +(1p( 2 ))g F (1p( 2 )) 2 2 , we have (c S + (1p( 1 ))g F (1p( 1 )) 1 ) 1 c S +(1p( 1 ))g F (1p( 1 )) 1 + (1)(1p( 2 ))g F (1p( 2 )) 2 1 + (1) 2 (1)(c S + (1p( 2 ))g F (1p( 2 )) 2 ) (1) 2 : ((1p())g F (1p())) 00 =g 00 F (1p())(p 0 ()) 2 (1p()) 2g 0 F (1p())p 0 () 1p()p 0 () g F (1p())(p()) 00 : 136 We know thatg F is increasing convex sog 00 F > 0 andg 0 F > 0, also thatp 0 < 0 andp() is concave so (p()) 00 < 0, therefore, (1p())g F (1p()) is convex, implying c S + (1p( 1 + (1) 2 ))g F (1p( 1 + (1) 2 ))( 1 + (1) 2 ) 1 + (1) 2 c S +(1p( 1 ))g F (1p( 1 )) 1 + (1)(1p( 2 ))g F (1p( 2 )) 2 1 + (1) 2 : Therefore, c S + (1p())g F (1p()) is quasiconvex on [;]. It has a unique minimum solution ^ A . Ifb 1, the fluid scaled abandonment is 0. The problem is the same with linear abandonment cost case. Ifb< 1, the fluid scaled abandonment isx = 1b. Fixinga, the manager wants to choose to min :c S 1x +xg A (x) + (1p())g F (1p())(1x) The optimal service rate does not depend onx. We have defined ^ A := argmin 2[;] c S +(1p())g F (1 p()) and ^ c A := c S ^ A + (1p(^ A ))g F (1p(^ A )). Next we need to find the optimalx: min x : (1x)^ c A +xg A (x): If ^ c A < g A (1), then x = 0 is better than x = 1, we rule out the zero staffing. The optimal solution is x = argmin x2[0;1] (1x)^ c A +xg A (x). We have assumed g A (x) is non-decreasing in x, g 0 A (x) 0. If g A (0)< ^ c A , the first order derivative atx = 0 is ^ c A +g A (x) +xg 0 A (x)j x=0 =^ c A +g A (0)< 0; therefore, the optimumx is in the interior (0; 1). Ifg A (0) ^ c A , the first order derivative is ^ c A +g A (x) +xg 0 A (x) 0 The optimumx = 0. 137 Proof of Lemma 3.5.3: The arrival rate of customers should be balanced by the rate of customer abandonment and customer ser- vices: =P A +N B ( ): As!1, P A = 1 N B ( ) ! 1b^ A 1a b^ A =a: Proof of Theorem 3.5.4 For any staffing and compensation policy, assume the equilibrium service rate is E . The number of aban- donments per unit time isAb := E NB( E ). The total system cost is c S N + (1p( E ))g F (1p( E )) E NB( E ) +Abg A (Ab=) =c S Ab E B( E ) + (Ab)(1p( E ))g F (1p( E )) +Abg A (Ab=) c S E + (1p( E ))g F (1p( E )) (Ab) +Abg A (Ab=) (^ c A (1Ab=) +Ab=g A (Ab=)): Since min x2[0;1] ^ c A (1x) +c A x = (1a)^ c A +c A a, the total cost has a lower bound ((1a)^ c A +c A a): Proof of Theorem 3.5.5: We have known ^ A is the optimal solution to min 2[;] c S + (1p())g F (1p()): 138 We have shown in Proof of Lemma 3.5.2 that the above function is quasiconvex in. The FOC is c S 2 p 0 () (1p())g 0 F (1p()) +g F (1p()) = 0: (B.10) Consider the quasiconvex function (we have shown the quasiconvexity in the Proof of Lemma 3.5.1): c S +K(1p()) (B.11) We want the solution to (B.11) also to be ^ M . The FOC is c S 2 Kp 0 () = 0: (B.12) To match the two FOCs at ^ A , K A = (1p(^ A ))g 0 F (1p(^ A )) +g F (1p(^ A )): From the quasiconvexity, if ^ A = , then the left-hand side of (B.10) is positive on (;], meaning the left-hand side of (B.12) is also positive, so the solution to (B.11) is; if ^ A =, then the left-hand side of (B.10) is negative on [;), meaning the left-hand side of (B.12) is also negative, so the solution to (B.11) is. Define ~ c A := min 2[;] c S +K A (1p()) c S 1 +K A (1p( 1 )) Therefore, (~ c A K A (1p( 1 ))) 1 c S : The equality holds if and only if 1 = ^ A . 
Therefore, whenP F = K A , P S = ~ c A , and other servers are working at ^ A so ^ B( 1 ;) = 1, server 1 would be incentivized to work at ^ A , given that ^ B( 1 ;) replaces B( 1 ;) in the utility function. Therefore, lim !1 C(N ; ~ c A =K A ) =c S 1a ^ A + (1p(^ A ))g F (1p(^ A ))(1a) +ag A (a) = (1a)^ c A +ag A (a): 139 Proof of Lemma 3.5.4: The manager wants to choose the fluid scaled abandonmenta and fluid scaled busy portionx to minimize the total cost. Fixinga, we first figure out the optimal service rate andx. The fluid scaled total cost is (c S U I (x)) 1a x +g A (a)+g F (1p())(1a) = (1a) c S U I (x) x + (1p())g F (1p()) +g A (a): We see that given a, the optimal x and do not depend on a. From Lemma 3.5.1, we know that if lim x"1 U 0 I (x)<c S ,x solves U I (x)U 0 I (x)x =c S ; otherwise,x = 1. ^ M = argmin 2[;] c S U I (x ) x + (1p())g F (1p()): Define ^ c M = c S U I (x ) x ^ M + (1p(^ M ))g F (1p(^ M )): We can find the optimala in a similar way to the Proof of Lemma 3.5.2. As long as ^ c M <g A (1), a M = c 0 A 1 (^ c M ): We can find the hold timeT from exponential abandonment with rate. 1 exp(T ) =a M ; T = log(1a M ) : The optimal staffing level b = 1a M x ^ M : 140 Proof of Theorem 3.5.6: For any staffing and compensation policy, assume the equilibrium service rate is E . The number of aban- donments per unit time isAb := E NB( E ). The total system cost is (c S U I (B( E )))N +Abg A Ab + (1p( E ))g F (1p( E )) E NB( E ) = (c S U I (B( E ))) Ab E B( E ) + (1p( E ))g F (1p( E ))(Ab) +Abg A Ab = c S U I (B( E ) E B( E ) + (1p( E ))g F (1p( E )) (Ab) +Abg A Ab ^ c M (1 Ab ) + Ab g A Ab : Since min x2[0;1] ^ c M (1x) +g A (x) = (1a M )^ c M +g A (a M ), the total cost has a lower bound ((1a M )^ c M +g A (a M )): Proof of Theorem 3.5.7: When the staffing level is ^ b M and fixed wait time isT = log(1a M ) , the effective arrival is Poisson with rate is(1a M ). Define ~ c S := c S U I ( ~ B) ~ B . We have known ^ M is the optimal solution to min 2[;] ~ c S + (1p())g F (1p()): We use similar steps as in the Proof of Theorem 3.5.5. We have shown in Proof of Lemma 3.5.2 that the above function is quasiconvex in. The FOC is ~ c S 2 p 0 () (1p())g 0 F (1p()) +g F (1p()) = 0: (B.13) Consider the quasiconvex function (we have shown the quasiconvexity in the Proof of Lemma 3.5.1): ~ c S +K(1p()) (B.14) 141 We want the solution to (B.14) also to be ^ M . The FOC is ~ c S 2 Kp 0 () = 0: (B.15) To match the two FOCs at ^ M , K M = (1p(^ M ))g 0 F (1p(^ M )) +g F (1p(^ M )): From the quasiconvexity, if ^ M = , then the left-hand side of (B.13) is positive on (;], meaning the left-hand side of (B.15) is also positive, so the solution to (B.14) is; if ^ M =, then the left-hand side of (B.13) is negative on [;), meaning the left-hand side of (B.15) is also negative, so the solution to (B.14) is. Define ~ c M := min 2[;] ~ c S +K M (1p()) ~ c S 1 +K M (1p( 1 )) ~ c S U I ( ^ B( 1 ; ^ M )) 1 ^ B( 1 ; ^ M ) +K M (1p( 1 )): Therefore, (~ c M K M (1p( 1 ))) 1 ^ B( 1 ; ^ M ) +U I ( ^ B( 1 ; ^ M ))c S : The equality holds if and only if 1 = ^ M . Therefore, whenP S = K M ,P F = ~ c M , and other servers are working at ^ M , server 1 would be incentivized to work at ^ M , given that ^ B( 1 ;) replacesB( 1 ;) in the utility function. Similarly to Proof of Theorem 3.5.3, we can show that when the solution to (3.27) is unique, the true equilibrium converges to ^ M . 
Therefore, lim !1 C(N ;K M ; ~ c M ) = (c S U I ( ~ B)) 1a M ~ B^ M + (1p(^ M ))g F (1p(^ M ))(1a M ) +a M g A (a M ) = (1a M )^ c M +a M g A (a M ): 142 Proof of Proposition 3.5.1: From the definitions, ^ I = ^ M = argmin 2[;] c S U I ( ~ B) ~ B +c F (1p()); ^ = ^ A = argmin 2[;] c S +c F (1p()) b = 1 ^ ; ^ b I = 1 ~ B^ I ; ^ b A = 1a ^ A ; ^ b M = 1a M ~ B^ M ; ^ c LB = c S ^ +c F (1p(^ )); ^ c I = ^ c S ^ I +c F (1p(^ I )); ^ c A = c S ^ A +c F (1p(^ A )); ^ c M = ^ c S ^ M +c F (1p(^ M )); where ~ c S := c S U I ( ~ B) ~ B . Since ~ B minimizes c S U I (x) x , we have ~ c S = c S U I ( ~ B) ~ B c S U I (1) 1 = c S . From Proposition 3.4.3, we know that ^ I ^ . Therefore, b ^ b I and ^ b M ^ b I . It is clear that ^ b A b and ^ b A ^ b M . Since min c S +c F (1p()) is the special case of min ;b c S U I (B()) B() +c F (1p()) (whenB() = 1), and the special case of min ;b;a (1a) c S +c F (1p()) +c A a (whena = 0), the minimum value of the objective function in Section 4.4, ^ c LB , cannot be larger than the ones in Sections 5.1 and 5.2, which are ^ c I , (1a)^ c A +c A a respectively. For the same reason that the objective functions of Section 5.1 and 5.2 are special cases of the objective function in Section 5.3, ^ c I , (1a)^ c A +c A a cannot be larger than (1a M )^ c M +c A a M . Proof of Lemma B.0.1: We first expressB(); B 0 (); B 00 () by these simple terms, andB 00 ()B()2(B 0 ()) 2 < 0 is equivalent to an expression with simple terms, that is, 2 1 X i=1 a i b i ! 2 1 X i=1 a i 1 X i=1 a i (b 2 i +c i )> 0: Then we find the lower bound of this expression: 1 X m=2 a mj(m) a j(m) m1 X j=1 2b mj b j b 2 j c j : 143 At last, we prove that for anym 2, m1 X j=1 2b mj b j b 2 j c j > 0; which completes the proof. Fori 1 we define a i := i1 Y k=0 +k ; B() = 1 1 1 + P 1 i=1 a i : The derivative ofa i regarding is: @a i @ = i P i1 k=0 Q k1 j=0 ( +j) Q i1 j=k+1 ( +j) Q i1 k=0 ( +k) 2 =a i i1 X k=0 1 +k =a i b i ; whereb i := P i1 k=0 1 +k . B 0 () = (1B()) 2 1 X i=1 @a i @ =(1B()) 2 1 X i=1 a i b i @b i @ = i1 X k=0 1 ( +k) 2 =c i wherec i := P i1 k=0 1 (+k) 2 . Note that @(a i b i ) @ =b i @a i @ +a i @b i @ =a i b 2 i a i c i =a i (b 2 i +c i ): We have B 00 () =2(1B()) 3 1 X i=1 a i b i ! 2 + (1B()) 2 1 X i=1 a i (b 2 i +c i ): B 00 ()B() 2(B 0 ()) 2 =(1B()) 3 0 @ 2 1 X i=1 a i b i ! 2 B() 1B() 1 X i=1 a i (b 2 i +c i ) 1 A =(1B()) 3 0 @ 2 1 X i=1 a i b i ! 2 1 X i=1 a i 1 X i=1 a i (b 2 i +c i ) 1 A 144 After simplifying the expression, to showB 00 ()B() 2(B 0 ()) 2 < 0, it is equivalent to show 2 1 X i=1 a i b i ! 2 1 X i=1 a i 1 X i=1 a i (b 2 i +c i )> 0: 2 1 X i=1 a i b i ! 2 1 X i=1 a i 1 X i=1 a i (b 2 i +c i ) = 2 1 X i=1 1 X j=1 a i b i a j b j 1 X i=1 1 X j=1 a i a j (b 2 j +c j ) = 1 X i=1 1 X j=1 a i a j 2b i b j b 2 j c j = 1 X m=2 m1 X j=1 a mj a j 2b mj b j b 2 j c j = 1 X m=2 8 > > > > > > > > > > < > > > > > > > > > > : P m=21 j=1 a mj a j 4b mj b j b 2 j b 2 mj c j c mj +a 2 m=2 b 2 m=2 c m=2 ; ifm is even P (m1)=2 j=1 a mj a j 4b mj b j b 2 j b 2 mj c j c mj ; ifm is odd Ideally, if we can show that 4b mj b j b 2 j b 2 mj c j c mj > 0, the proof is done. However, this is not true. We can only show that 4b mj b j b 2 j b 2 mj c j c mj > 0 whenj = m=2 (m is even) or j = (m 1)=2 (m is odd). And for somej, the expression could be negative. 
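Before the combinatorial argument that follows, the target inequality can be sanity-checked numerically. Reading the extraction-damaged definitions above as a_i = Π_{k=0}^{i−1} λ/(μ + kθ), b_i = Σ_{k=0}^{i−1} 1/(μ + kθ), and c_i = Σ_{k=0}^{i−1} 1/(μ + kθ)² (the reading consistent with ∂a_i/∂μ = −a_i b_i), the Python sketch below evaluates truncated versions of 2(Σ a_i b_i)² − Σ a_i · Σ a_i(b_i² + c_i) and of the per-m sums Σ_{j=1}^{m−1}(2 b_{m−j} b_j − b_j² − c_j); the values of λ, μ, θ and the truncation level are arbitrary test choices.

```python
import numpy as np

lam, mu, theta = 0.8, 1.0, 0.5        # arbitrary test values (assumed)
K = 400                               # truncation level for the infinite sums

k = np.arange(K)                      # k = 0, 1, ..., K-1
a = np.cumprod(lam / (mu + k * theta))        # a_i, i = 1..K
b = np.cumsum(1.0 / (mu + k * theta))         # b_i
c = np.cumsum(1.0 / (mu + k * theta) ** 2)    # c_i

lhs = 2.0 * np.dot(a, b) ** 2
rhs = np.sum(a) * np.dot(a, b ** 2 + c)
print("2*(sum a_i b_i)^2 - (sum a_i)*(sum a_i*(b_i^2 + c_i)) =", lhs - rhs, "(expected > 0)")

# Final step of the proof: the per-m sums sum_{j=1}^{m-1} (2 b_{m-j} b_j - b_j^2 - c_j).
# (With these definitions the m = 2 sum reduces to b_1^2 - c_1 = 0; it is strictly positive for larger m.)
for m in (3, 5, 10, 25, 50):
    j = np.arange(1, m)
    s = np.sum(2.0 * b[m - j - 1] * b[j - 1] - b[j - 1] ** 2 - c[j - 1])
    print("m =", m, " sum =", round(float(s), 6))
```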
Whenm is even, b 2 m=2 c m=2 = 2 m=21 X k=0 k1 X l=0 1 +k 1 +l > 0: Whenm is odd, 4b (m1)=2 b (m+1)=2 b 2 (m1)=2 b 2 (m+1)=2 c (m1)=2 c (m+1)=2 = 2b (m1)=2 b (m+1)=2 b (m1)=2 b (m+1)=2 2 c (m1)=2 c (m+1)=2 = 2b 2 (m1)=2 + 2b (m1)=2 1 +(m 1)=2 1 ( +(m 1)=2) 2 2c (m1)=2 1 ( +(m 1)=2) 2 = 2 b 2 (m1)=2 c (m1)=2 + 2 +(m 1)=2 b (m1)=2 1 +(m 1)=2 > 0 145 Next we show that 4b mj b j b 2 j b 2 mj c j c mj is increasing inj whenj < m1 2 . 4b mj b j b 2 j b 2 mj c j c mj 4b mj1 b j+1 b 2 j+1 b 2 mj1 c j+1 c mj1 = 4(b mj b j b mj1 b j+1 ) + (b 2 j+1 b 2 j ) (b 2 mj b 2 mj1 ) + (c j+1 c j ) (c mj c mj1 ) = 4 1 +j b mj1 + 1 + (mj 1) b j + 2 1 +j b j+1 1 ( +j) 2 2 1 + (mj 1) b mj 1 ( + (mj 1)) 2 + 1 ( +j) 2 1 ( + (mj 1)) 2 = 2 1 +j b mj1 + 1 + (mj 1) b j 2 +j (b mj1 b j+1 ) 2 + (mj 1) (b mj b j ) Since j < m1 2 , we have j + 1 mj 1. Therefore, 1 +j > 1 +(mj1) , b j+1 b mj1 and b j <b mj1 <b mj . The above is negative. For each m > 1, there exists an integer j(m) such that when the integer j satisfies j(m) j m=2, 4b mj b j b 2 j b 2 mj c j c mj 0; when 0<j <j(m), 4b mj b j b 2 j b 2 mj c j c mj < 0. The range ofj(m) is between 1 andm=2 1. We note that a j+1 a mj1 a j a mj = +j +(mj1) = + (mj 1) +j ; so when j < m1 2 , a j a mj is increasing in j. Therefore, a j a mj < a j(m) a mj(m) when j < j(m); a j a mj a j(m) a mj(m) when j(m) jbm=2c. With the monotone property, we can find a lower bound of P bm=2c j=1 a mj a j 4b mj b j b 2 j b 2 mj c j c mj . Whenj(m)jbm=2c, we substitute a j a m j by a j(m) a mj(m) and obtain a smaller value; when 1 j < j(m), we also substitute a j a m j by a j(m) a mj(m) and obtain a smaller value. 146 Whenm is odd, (m1)=2 X j=1 a mj a j 4b mj b j b 2 j b 2 mj c j c mj > j(m)1 X j=1 a mj(m) a j(m) 4b mj b j b 2 j b 2 mj c j c mj + (m1)=2 X j=j(m) a mj(m) a j(m) 4b mj b j b 2 j b 2 mj c j c mj =a mj(m) a j(m) (m1)=2 X j=1 4b mj b j b 2 j b 2 mj c j c mj =a mj(m) a j(m) m1 X j=1 2b mj b j b 2 j c j Whenm is even, m=21 X j=1 a mj a j 4b mj b j b 2 j b 2 mj c j c mj +a 2 m=2 b 2 m=2 c m=2 > j(m)1 X j=1 a mj(m) a j(m) 4b mj b j b 2 j b 2 mj c j c mj + m=21 X j=j(m) a mj(m) a j(m) 4b mj b j b 2 j b 2 mj c j c mj +a mj(m) a j(m) b 2 m=2 c m=2 =a mj(m) a j(m) 0 @ (m1)=2 X j=1 4b mj b j b 2 j b 2 mj c j c mj +b 2 m=2 c m=2 1 A =a mj(m) a j(m) m1 X j=1 2b mj b j b 2 j c j Therefore we find the common lower bound 2 1 X i=1 a i b i ! 2 1 X i=1 a i 1 X i=1 a i (b 2 i +c i )> 1 X m=2 a mj(m) a j(m) m1 X j=1 2b mj b j b 2 j c j : 147 Figure B.1: The equivalence of two summations j k l m-1 m-2 m-2 l=j-1 k=m-1-j (1,0,0) A C D O B From Figure B.1, we have m1 X j=1 b mj b j = m1 X j=1 mj1 X k=0 1 +l j1 X l=0 1 +k = m1 X j=1 j1 X l=0 mj1 X k=0 1 +k 1 +l = m1 X j=1 j1 X l=0 j1l X k=0 1 +k 1 +l = m1 X j=1 X k+l<j 1 +k 1 +l ; 148 Note that P m1 j=1 P j1 l=0 P mj1 k=0 1 +k 1 +l corresponds to triangular pyramid OABD; P m1 j=1 P j1 l=0 P j1l k=0 1 +k 1 +l corresponds to triangular pyramid OABC. OABD is equivalent to OABC when we map the points in the direction of j axis. Therefore, we have m1 X j=1 2b mj b j b 2 j c j = 2 m1 X j=1 X k+l<j 1 +k 1 +l m1 X j=1 j1 X k=0 1 +k ! 
2 m1 X j=1 j1 X k=0 1 ( +k) 2 = m1 X j=1 0 @ 2 X k+l<j 1 +k 1 +l j1 X k=0 j1 X l=0 1 +k 1 +l j1 X k=0 1 ( +k) 2 1 A = m1 X j=1 0 @ X k+l<j 1 +k 1 +l X k<j;l<j;k+lj 1 +k 1 +l j1 X k=0 1 ( +k) 2 1 A = m1 X j=1 0 @ 0 @ X k+lj2;k6=l 1 +k 1 +l + X k<j=2 1 ( +k) 2 + X k+l=j1 1 +k 1 +l 1 A 0 @ X k<j;l<j;k+lj;k6=l 1 +k 1 +l + X j=2k<j 1 ( +k) 2 1 A j1 X k=0 1 ( +k) 2 1 A = m1 X j=1 0 @ 0 @ X k+lj2;k6=l 1 +k 1 +l X k<j;l<j;k+lj;k6=l 1 +k 1 +l 1 A + X k+l=j1 1 +k 1 +l 2 X j=2k<j 1 ( +k) 2 1 A = m1 X j=1 0 @ X k+lj2;k6=l 1 +k 1 +l 1 + (j 1l) 1 + (j 1k) X j=2k<j 2 +k 1 + (j 1k) 1 +k + 2 + (j 1)=2 1 j is odd 1 A : In all the equationsk andl are nonnegative integers. Figure B.2 graphically show the comparison of the terms whenj = 6. After cancelation, the circle points are positive terms, the cross and plus points are neg- ative terms. The northwest triangle (excluding the anti-diagonal) dominates the southeast counterpart. The 149 Figure B.2: The graph of terms comparison k l 0 1 2 3 4 5 2 1 3 4 5 0 anit-diagonal dominates twice of the southeast half of the main diagonal. Therefore, the whole expression is positive. In details, whenk +lj 2, we havej 1l>k andj 1k>l, therefore, 1 +k 1 +l 1 + (j 1l) 1 + (j 1k) > 0: Whenkj=2,j 1k<k, we have 1 +(j1k) > 1 +k . We conclude 2 1 X i=1 a i b i ! 2 1 X i=1 a i 1 X i=1 a i (b 2 i +c i )> 0: Proof of Lemma B.0.2: Rewrite B( 1 ;) = C + P 1 i=1 a i C +D 1 + P 1 i=1 a i ; whereC := P N1 i=0 ( ) i 1 i! ( ) N1 1 (N1)! ,D := 1 P N1 i=0 ( ) iNi i! ( ) N1 1 (N1)! , and we redefinea i := Q i k=1 (N1)+ 1 +k . @a i @ 1 =a i i X k=1 1 (N 1) + 1 +k =a i b i ; 150 whereb i := P i k=1 1 (N1)+ 1 +k . @b i @ 1 = i X k=1 1 ((N 1) + 1 +k) 2 =c i ; wherec i := P i k=1 1 ((N1)+ 1 +k) 2 . We have @B( 1 ;) @ 1 = D(C + P 1 k=1 a k + 1 P 1 k=1 a k b k ) (C +D 1 + P 1 k=1 a k ) 2 @ 2 B( 1 ;) @ 2 1 =D 2(D P 1 k=1 a k b k )(C + P 1 k=1 a k + 1 P 1 k=1 a k b k ) + 1 (C +D 1 + P 1 k=1 a k ) P 1 k=1 a k (b 2 k +c k ) (C +D 1 + P 1 k=1 a k ) 3 B( 1 ;) @ 2 B( 1 ;) @ 2 1 2 @B( 1 ;) @ 1 2 = D (C +D 1 + P 1 k=1 a k ) 3 0 @ 2 1 X k=1 a k +C ! 1 X k=1 a k b k 2 1 1 X k=1 a k b k ! 2 + 1 1 X k=1 a k +C ! 1 X k=1 a k (b 2 k +c k ) 1 A From Lemma B.0.1, we know that 1 X k=1 a k (b 2 k +c k )< 2 ( P 1 k=1 a k b k ) 2 P 1 k=1 a k : Therefore, B( 1 ;) @ 2 B( 1 ;) @ 2 1 2 @B( 1 ;) @ 1 2 < D (C +D 1 + P 1 k=1 a k ) 3 0 @ 2 1 X k=1 a k +C ! 1 X k=1 a k b k 2 1 1 X k=1 a k b k ! 2 + 1 1 X k=1 a k +C ! 2 ( P 1 k=1 a k b k ) 2 P 1 k=1 a k 1 A = 2D P 1 k=1 a k b k (C +D 1 + P 1 k=1 a k ) 3 1 X k=1 a k +C ! + 1 C P 1 k=1 a k b k P 1 k=1 a k ! To show (B.4), it is sufficient to show 1 C 1 X k=1 a k b k C 1 X k=1 a k 1 X k=1 a k ! 2 < 0: 151 That is 1 C 1 X k=1 a k P 1 j=1 a j b k C 1 X k=1 a k < 0: (B.16) Define := (N1)+ 1 . If (N 1), we can show that 1 1 X i=1 a i b i 1 X i=1 a i < 0: (B.17) We use notationa i (),b i () to emphasize on their dependence on. We havea i (0) = i ,b i (0) = i=. When> 0, we havea i ()<a i (0), 1 X i=1 a i ()< 1 X i=1 a i (0) = 1 : Therefore, a 1 (0) P 1 i=1 a i (0) < a 1 () P 1 i=1 a i () . We can view the sequence a i () P 1 j=1 a j () as a probability distribution, which decays faster than the sequence a i (0) P 1 j=1 a j (0) (geometrically decay). 
Sinceb i () is an increasing sequence, we conclude that 1 X k=1 a k () P 1 j=1 a j () b k ()< 1 X k=1 a k (0) P 1 j=1 a j (0) b k () = 1 X k=1 (1) k1 b k (): 1 X k=1 a i () P 1 j=1 a j () b i () < 1 X k=1 (1) k1 b k ()< 1 X k=1 (1) k1 b k (0) = 1 X k=1 (1) k1 k = (1) : Therefore, if (N 1), 1 1 X i=1 a i ()b i () 1 X j=1 a j ()< 1 (N 1) + 1 1: That means (B.16) is valid when (N 1). 152 If> (N 1), we can show thatC is bounded. LetY be a Poisson random variable with parameter .C = Pr(Y N 1)= Pr(Y =N 1). For any 2jN, Pr(Y =Nj) Pr(Y =N 1) = Nj 1 (Nj)! N1 1 (N1)! = (N 1)! (N1) j1 (N 1) j1 (Nj)! < (N 1) j1 : Therefore, C = Pr(Y N 1) Pr(Y =N 1) N X j=1 (N 1) j1 < (N 1) : (B.18) Sinceb i ib 1 = i (N1)+ 1 + , it is sufficient to show (N 1) 1 b 1 1 X i=1 a i i 1 X i=1 a i ! 1 X i=1 a i ! 2 < 0: (B.19) From (4.3) in Mandelbaum and Zeltyn (2005), we have 1 X i=1 a i = 1 X i=1 i Y j=1 = ((N 1) + 1 )= +j =A (N 1) + 1 ; ; (B.20) whereA(x;y) := x exp(y) y x (x;y) 1, and (x;y) := R y 0 t x1 exp(t)dt is the lower incomplete Gamma function. @A(x;y) @y = 1 X i=1 iy i1 Q i j=1 (x +j) = x y + 1 x y (A(x;y) + 1): Therefore, 1 X i=1 ia i = ((N 1) + 1 ) A (N 1) + 1 ; + : For simplicity, denoteA (N1)+ 1 ; byA. Define := (N 1)2 (0;]. (B.19) is equivalent to 1 + 1 + 1 A + A A 2 < 0: If 1 , this above inequality is valid. We only need to consider the case > (N 1) + 1 : (B.21) 153 It is sufficient to show either 1 + 1 + 1 A + A< 0; giving or solve from the quadratic inequality, giving A> K ; ifK > 0; or A> p K 2 + 2LK L ; whereK = 1 +( 1 + +),L = 2( 1 + +)> 0. It is difficult to prove the inequalities for any , we can show it is valid for large enough . We are interested in a sequence of systems with increasing. Superscript the corresponding notation by the arrival rate. When!1, if ! 0, we have K !> 0: It is sufficient to show as is large, A> 1 : (B.22) If ! ^ 2 (0; 1), as!1, we have K ! ^ +(1 ^ ); L 2 ! 2 ^ (1 ^ )> 0: If ^ +(1 ^ )> 0, it is sufficient to show A> 1 ^ +(1 ^ ) : (B.23) If ^ +(1 ^ ) 0, noting whenK < 0, p K 2 + 2LK L < K L K K L = 2K L 1 K ; 154 we have lim !1 p K 2 + 2L K L = ^ (1 ^ ) ^ (1 ^ ) ; and it is sufficient to show A> ^ (1 ^ ) ^ (1 ^ ) : (B.24) Next we showA !1 as!1. From (B.20) and (B.21), we have A (N 1) + 1 ; >A ; : Lower incomplete Gamma function has a property: lim x!1 (x;x) (x) = 1 2 ; where (:) is the Gamma function. Whenx is large, exp(x)x x x (x;x) exp(x)x x x(x)=2 exp(x)x x p 2 2 x x+1=2 exp(x) = p 2x 2 : Therefore, A ; r 2 !1; as!1: As the right-hand side of (B.22), (B.23) and (B.24) are all constant, we have done the proof. 155
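The incomplete-Gamma identity invoked at the end of the proof, Σ_{i=1}^∞ Π_{j=1}^i y/(x + j) = A(x, y) with A(x, y) := (x e^y / y^x) γ(x, y) − 1 and γ the lower incomplete Gamma function (equation (4.3) of Mandelbaum and Zeltyn 2005), together with the Stirling-type growth A(x, x) ≈ √(πx/2) used in the final limit, can be confirmed numerically. The Python sketch below uses SciPy's regularized incomplete Gamma; the helper names A and A_series and the test values of x and y are arbitrary illustrative choices.

```python
import numpy as np
from scipy.special import gammainc, gammaln

def A(x, y):
    """A(x, y) = (x * e^y / y^x) * gamma_lower(x, y) - 1, computed on the log scale for stability."""
    # scipy.special.gammainc(x, y) is the regularized lower incomplete Gamma, gamma_lower(x, y) / Gamma(x).
    return np.exp(np.log(x) + y - x * np.log(y) + gammaln(x) + np.log(gammainc(x, y))) - 1.0

def A_series(x, y, K=2000):
    """Truncated series sum_{i=1}^{K} prod_{j=1}^{i} y / (x + j)."""
    return np.cumprod(y / (x + np.arange(1, K + 1))).sum()

x, y = 7.3, 5.1
print(A(x, y), A_series(x, y))          # the two values should agree to many digits

# Growth used in the final step: A(x, x) ~ sqrt(pi * x / 2) as x -> infinity.
for x in (50.0, 500.0, 5000.0):
    print(x, A(x, x), np.sqrt(np.pi * x / 2.0))
```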
Abstract
My dissertation applies queueing models to the analysis and design of service systems. It contains three chapters. The first chapter investigates the transient behavior of service systems, whereas the second and third chapters focus on steady state. Both the second and third chapters consider the speed-quality trade-off in large service systems. The second chapter studies the optimal routing policy for heterogeneous servers, whereas the third chapter searches for the jointly optimal compensation and staffing policy for homogeneous but strategic servers who respond to incentives.

The first chapter, "On the Generalized Drift Skorokhod Problem in One Dimension", is joint work with Professors Josh Reed and Amy Ward, published in the Journal of Applied Probability (2013). We show how to write the solution to the one-dimensional generalized drift Skorokhod problem in terms of the supremum of the solution of a tractable unrestricted integral equation (that is, an integral equation with no boundaries). As an application of our result, we equate the transient distribution of a reflected Ornstein-Uhlenbeck (O-U) process to the first hitting time distribution of an unreflected O-U process. Then, we use this relationship to approximate the transient distribution of the GI/GI/1+GI queue in conventional heavy traffic and the M/M/N/N queue in a many-server heavy-traffic regime. We validate the transient behavior approximation by simulation.

The second chapter, "Routing to Minimize Waiting and Callbacks in Large Call Centers", is joint work with Professor Amy Ward, published in Manufacturing & Service Operations Management (2014). The setting is an inverted-V queueing model with heterogeneous server pools, differentiated by their service speed and by quality measured by call resolution. We propose a threshold policy that routes customers to faster servers when the system is congested and otherwise routes customers to slower servers with better quality (lower callback probability). We show, both analytically and by simulation, that the policy is nearly Pareto optimal for large systems.