A Stochastic Employment Problem

by Teng Wu

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
in
INDUSTRIAL AND SYSTEMS ENGINEERING

May 2013

Copyright 2013, Teng Wu

Dedication

To my parents and Jing.

Acknowledgments

I would like to express my first and most earnest gratitude to my adviser and Committee Chair, Professor Sheldon Ross, for his invaluable guidance, supervision, encouragement, and constant support during my PhD study at the University of Southern California. His mentorship was paramount in shaping various aspects of my professional and personal life.

I am also sincerely thankful to Professor Qiang Huang and Professor Jianfeng Zhang for their advice and for serving on my Qualifying Exam and Dissertation Defense Committees, and to Professor Maged Dessouky and Professor Alejandro Toriello for serving on my Qualifying Exam Committee.

I must also thank the Industrial and Systems Engineering Department for providing me scholarships and wonderful research opportunities. Special thanks are conveyed to my colleagues in the OHE 340 office, Qian An, Yalda Khashe, Maryam Tabibzadeh, Shi Mu, Li Wang, Lijuan Xu, Kai Chen, Dongyuan Zhan, Yi Zhong, Pai Liu, Yasaman Dehghani, and my peers in the Electrical Engineering Department, Lin Zhang, Yang Yue, Xiaoxia Wu, Jingyuan Yang, Bo Zhang, Yunchu Li, Dongrui Wu, Yan Yan, and Hao Huang. Discussions with them gave me many inspirations, and they also made my days at USC more enjoyable and memorable.

My final and most heartfelt acknowledgment goes to my parents and my wife, Jing, for their love and encouragement. Without them, this work would never have come into existence.

Contents

Dedication
Acknowledgments
Abstract
Chapter 1. Introduction
  1.1. Dissertation Topic
  1.2. Dissertation Outline
    1.2.1. Literature review
    1.2.2. One complete set problem
    1.2.3. When ball vectors are exchangeable
    1.2.4. When ball vectors are independent
    1.2.5. An index policy
    1.2.6. A queuing model
    1.2.7. Future works
Chapter 2. Literature review
  2.1. Literature Review
Chapter 3. One complete set problem
  3.1. When each box needs a single ball
    3.1.1. If $S \ge_d S'$, then $N_{\pi^o}(S) \le^{(1)} N_{\pi^o}(S')$
    3.1.2. $N$ is stochastically minimized by $\pi^o$
    3.1.3. $M$ is stochastically maximized by $\pi^o$
    3.1.4. If $S \ge_d S'$, then $S_{\pi^o}(r) \le^{(1)} S'_{\pi^o}(r)$, $\forall r$
  3.2. Estimating $P(N>r)$ by efficient simulation
    3.2.1. A Lemma
    3.2.2. Variance reduction techniques
    3.2.3. A numerical example
  3.3. Estimating $E[N]$ with efficient simulation
    3.3.1. A numerical example
  3.4. Bounds of $P(N>r)$ and $E[N]$ when $X_i$ are independent
    3.4.1. A numerical example
Chapter 4. When ball vectors are exchangeable
  4.1. When $X_1,\ldots,X_n$ are exchangeable
    4.1.1. If $S \ge_d S'$, then $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$, $\forall r$
    4.1.2. $P(N_{\pi^o}(S) \le r) = \max_\pi P(N_\pi(S) \le r)$
    4.1.3. $P(N_{\pi^o}(S) \le r)$ is a Schur concave function
  4.2. Estimating $P(N>r)$ by efficient simulation
    4.2.1. Variance reduction techniques
    4.2.2. A numerical example
  4.3. Estimating $E[N]$ by efficient simulation
    4.3.1. A numerical example
Chapter 5. When ball vectors are independent
  5.1. When $X_1,\ldots,X_n$ are independent but not identically distributed
    5.1.1. A heuristic policy
    5.1.2. An on-line optimization policy
    5.1.3. A lower bound of $E[N]$
    5.1.4. A numerical example
  5.2. An optimal policy when there are 2 boxes
    5.2.1. The $(1,m)$ model
    5.2.2. The $(2,m)$ model
    5.2.3. The $(1,m)$ model with cost (1)
      5.2.3.1. An optimal policy
    5.2.4. The $(1,m)$ model with cost (2)
      5.2.4.1. An optimal policy
Chapter 6. An index policy
  6.1. When $X_1,\ldots,X_n$ are exchangeable
    6.1.1. An optimal index policy
    6.1.2. A Lemma
  6.2. Estimating $E[N]$ by efficient simulation
    6.2.1. Variance reduction techniques
    6.2.2. A numerical example
Chapter 7. A queuing model
  7.1. When each box has a life time
    7.1.1. When $\mu_1 = \cdots = \mu_n$
    7.1.2. When $\mu_1 \le \cdots \le \mu_n$ and $p_1 \le \cdots \le p_n$
    7.1.3. When both $p_1,\ldots,p_n$ and $\mu_1,\ldots,\mu_n$ are general
Chapter 8. Possible future work
  8.1. Possible Future Work
    8.1.1. The two-box scenario
    8.1.2. When $\mu_1 \le \cdots \le \mu_n$ and $p_1 \le \cdots \le p_n$
    8.1.3. Increasing failure rate of a box's life time
Bibliography
Appendix
  Algorithm for generating $I$ in Chapter 3.2
  Algorithm for generating $\mathbf{n}$ in Chapter 4.2
  NBBW algorithm for generating $N$ in Chapter 6
  Algebra involved in Chapter 5
  Algebra involved in Chapter 7
Abstract

This dissertation studies a stochastic assignment problem, called "A Stochastic Employment Problem" (SEP). There are $n$ boxes having quota $S=(s_1,\ldots,s_n)$; that is, box $i$ needs $s_i$ balls, $i=1,\ldots,n$. Balls arrive sequentially, each with a binary vector $X=(X_1,X_2,\ldots,X_n)$ attached, with the interpretation that if $X_i=1$ the ball is eligible to be put in box $i$, $i=1,\ldots,n$. When a ball arrives, its vector is revealed and the ball is put in one box for which it is eligible. Assuming the vectors are independent and identically distributed among the successive balls, the problem continues until there are at least $s_i$ balls in box $i$ for all $i$. The SEP can be applied to an organizational employment decision problem, with the interpretation that the boxes are the types of jobs and the balls are the job seekers, $s_i$ being the number of type $i$ jobs and $X$ indicating which jobs a seeker is qualified to take.

Variations of the Stochastic Employment Problem are considered, such as balls arriving according to a renewal process and each box having a lifetime under a specified distribution. The SEP can then be viewed as a stochastic control problem associated with a single-server queuing system, and so can be applied to channel/processor scheduling in the telecommunication/computer industry. For example, in a time-slotted network, $n$ users share one channel, with user $i$ having $s_i$ packets to transmit, $i=1,\ldots,n$, and $X$ indicating which users are connected and hence able to transmit. The SEP can also be applied to an organ transplant decision problem, with a box lifetime being a patient's lifetime and the ball vector indicating which patients an arriving organ fits. Beyond what has been mentioned, the Stochastic Employment Problem has a variety of variations and very broad applications (see, for instance, [2], [3], [4], [7], [9], [13], [17], [22], [23], [34], [40], [53], [54], [57]).

CHAPTER 1

Introduction

1.1. Dissertation Topic

The Stochastic Employment Problem (SEP) is a variation of the Stochastic Assignment Problem that analyzes the scenario in which one assigns balls into boxes. Balls arrive sequentially, each with a binary vector $X=(X_1,X_2,\ldots,X_n)$ attached, with the interpretation that if $X_i=1$ the ball is eligible to be put into box $i$, $i=1,\ldots,n$. The vector is revealed when a ball arrives and the ball is put in an alive box for which it is eligible; here "alive" means a box still needs more balls. Assuming the vectors are independent and identically distributed among the successive balls, following a specified joint distribution, the problem continues until there are at least $s_i$ balls in box $i$, $i=1,\ldots,n$. Being an assignment problem by nature, the SEP can be applied to an organizational employment decision problem, with the interpretation that boxes are the types of jobs and balls are the job seekers, $s_i$ being the number of type $i$ jobs and $X$ indicating which jobs a seeker is qualified to take.

Variations of the Stochastic Employment Problem are studied, such as balls arriving according to a renewal process; each alive box incurring a cost per unit time; and each box having a lifetime following a specified distribution. Thus the SEP can be considered a stochastic scheduling problem in a single-server queuing system. It has native applications in channel/processor scheduling problems in the communications/computer industry. The SEP can also be applied to organ transplant decision problems, with a box lifetime being a patient's lifetime and the ball vector indicating which patients the incoming organ fits.

1.2. Dissertation Outline

1.2.1. Literature review. This dissertation is organized as follows. Chapter 2 introduces background knowledge of the Stochastic Employment Problem.

1.2.2. One complete set problem. Chapter 3 considers a special scenario where each box needs one ball, that is, $s_i=1$, $i=1,\ldots,n$. Let $N$ denote the number of balls collected until the problem ends. We propose a policy $\pi^o$ for assigning the arriving balls and show that if $X_i$, $i=1,\ldots,n$, are independent, $N$ is stochastically minimized by following $\pi^o$. Let $M(k)$ be the number of filled boxes after $k$ balls have been collected. We show that if $X_i$, $i=1,\ldots,n$, are independent, $M(k)$ is stochastically maximized by following $\pi^o$, for all $k$. Without the condition of the $X_i$ being independent, $\pi^o$ is not optimal, which is shown by an example. Efficient simulation algorithms for $P_{\pi^o}(N>r)$ and $E_{\pi^o}[N]$ are presented. When the $X_i$ are independent, analytic bounds for $P_{\pi^o}(N>r)$ and $E_{\pi^o}[N]$ are provided.

1.2.3. When ball vectors are exchangeable. Chapter 4 analyzes another special case where the ball vectors $X_1,\ldots,X_n$ are exchangeable. Exchangeable means there exist non-negative constants $c_k$, $k=0,\ldots,n$, such that $P(\sum_{i=1}^n X_i=k)=c_k$, and given $\sum_{i=1}^n X_i=k$ the ball is equally likely to have any one of the $\binom{n}{k}$ possible vectors. We propose a policy $\pi^o$ for allocating balls and show that following $\pi^o$, $N$ is stochastically minimized. Efficient simulation estimators for $P_{\pi^o}(N>r)$ and $E_{\pi^o}[N]$ are presented.

1.2.4. When ball vectors are independent. Chapter 5 studies a scenario where $X_1,\ldots,X_n$ are independent. We are interested in the policy that minimizes $E[N]$. Dynamic programming is not a feasible option there because the needed computations grow exponentially in $n$. To overcome this difficulty, a heuristic policy is introduced and then the policy improvement algorithm of dynamic programming is employed. To benchmark the performance of the proposed policy, a lower bound on $E[N]$ under any policy is provided and numerical examples are presented. In the scenario with two boxes, we believe there exists a switching curve policy $(i,M_i)$ that minimizes $E[N]$, and we present $M_1$ and $M_2$. Under the same assumption, two linear "cost" problems are analyzed: (1) box 1 incurs $1 per unit time and box 2 incurs $c per unit time; (2) each unit of quota in box 1 incurs $1 per unit time and each unit of quota in box 2 incurs $c per unit time.

1.2.5. An index policy. Chapter 6 considers index policies. When $X_1,\ldots,X_n$ are exchangeable, we show that the policy assigning priority to the boxes decreasingly with respect to their quota is the unique index policy that minimizes $N$ stochastically. For $X_1,\ldots,X_n$ independent, efficient algorithms for estimating $E[N]$ are presented.

1.2.6. A queuing model. Chapter 7 assumes balls arrive according to a Poisson process and each box has an exponentially distributed lifetime. Special monotone cases with corresponding optimal policies are analyzed. When a stronger result is not obtainable, heuristic policies are introduced with policy improvement algorithms applied. Numerical examples are provided.

1.2.7. Future works. Chapter 8 proposes possible future work, with an emphasis on applications in health care and the communications/computer industry.
CHAPTER 2

Literature review

2.1. Literature Review

The Stochastic Employment Problem (SEP) can be considered a variation of the classical Stochastic Assignment Problem (SAP), which was first introduced by Derman-Lieberman-Ross (1972) and has since been extensively studied and successfully applied in a variety of areas, including organizational job hiring, network scheduling, asset selling, and organ transplant decisions. This chapter provides background knowledge of the classical SAP, some of its major extensions, and their applications.

The classical Stochastic Assignment Problem studies the sequential assignment of job seekers to jobs. There are $n$ jobs with values $0 \le x_1 \le \cdots \le x_n$, and these jobs must be matched with $n$ seekers who come in sequentially. Each seeker has a performance index ($pi$) which is independent and identically distributed with a specified distribution function $F$. A seeker's $pi$ is revealed when she arrives, and if a "$pi$" seeker is assigned to an "$x$" job, a reward of $pi \cdot x$ is earned. Each job can be assigned once, and the problem ends when all jobs have been assigned. Under this special reward function, the optimal assignment policy that maximizes the expected total reward is independent of the $x_i$. Namely, if there are $n$ jobs left, there exist $n-1$ numbers, $a_{1,n} \le a_{2,n} \le \cdots \le a_{n-1,n}$, dividing the real line into $n$ intervals, such that if the next seeker has a performance value falling into the $i$th interval, it is best to assign the job with value $x_i$ to this seeker. In another paper, Albright-Derman (1972) studied the limiting behavior of $a_{i,n}$ as $n \to \infty$.

Albright (1974) studied an extension of the SAP which includes the arrival times of job seekers as an explicit parameter. He assumed that if a seeker arriving at time $t$ is assigned to a job, then a reward is earned as the product of the seeker's $pi$, the job value, and $r(t)$, where $r(t)$ is a piecewise continuous, non-negative, non-increasing function of $t$ with $r(0)=1$. Supposing that seekers arrive according to a renewal or a non-homogeneous Poisson process, Albright showed that there exists a threshold policy which maximizes the expected total reward. Albright (1977) also studied the scenario where the distribution of resources is unknown.

Kennedy (1986) established the most general result for the classical SAP by removing the assumption of independence among the successive seekers. He proved that threshold-style policies are optimal for any problem of this type, although the thresholds that define the optimality could be random variables and difficult to compute.

Righter (1989) extended the SAP by allowing a variety of parameters to change according to independent Markov processes. In particular, she allowed the seeker arrival rate, the job values, and the deadline rates to be determined by independent Markov processes. Assuming seekers arrive according to a Poisson process and either there is a single random deadline for all jobs, which is the same as discounting the returns, or the jobs have independent deadlines, Righter studied the effects on the structure of the optimal policy of allowing those parameters to change. She gave conditions under which the expected total return is monotone in the states of the Markov processes and showed that the optimal expected total return is increasing and convex in job values, decreasing and convex in the deadline rates, and increasing if the variability of the distribution of the seeker's performance index is increasing.

Righter (1990) considered a different reward function under a SAP formulation, supposing that seekers' skills are independent and identically distributed random variables observed when they arrive; a seeker is either assigned a job or rejected, with no recalls allowed; it takes zero time to do a job; and either there is a single random deadline that is exponentially distributed with rate $\alpha$, or each job has an independent deadline that is exponentially distributed with rate $\alpha$. In particular, she interpreted $pi \cdot x$ as the probability that a "$pi$" seeker finishes an "$x$" job correctly. Jobs done correctly by the deadline are counted as successes; the objective is to maximize the number of successes. Righter showed that when there are independent deadlines, a simple threshold policy which is independent of the job values stochastically maximizes the objective. When there is a single deadline, there is no policy that stochastically maximizes the number of successes; however, there is a single threshold policy, independent of the job values, that maximizes the probability that all jobs are done correctly by the deadline.

The asset selling problem is a typical economic application of the SAP, under the assumption that one offer is received per period and no recall of past offers is allowed. Bruss-Ferguson (1997) studied a multidimensional generalization of the celebrated house selling problem, considering the selling of $n$ houses. Let the offers $X=(X_1,\ldots,X_n)$ be independent and identically distributed $n$-dimensional random vectors with a specified joint distribution. It costs $c>0$ per vector of observation. At each stage the decision maker may sell none, any one, any two, ..., or all of the houses. This problem continues until all houses are sold, with the payoff being the sum of the selling prices minus $c$ times the number of vectors observed. Bruss and Ferguson showed that to maximize the expected total payoff, the decision maker is to choose simultaneously from $n$ thresholds, $N_1,\ldots,N_n$, one for each component. Discount functions and recall of offers were also considered in their paper. David-Levi (2001) studied a continuous version of the multi-asset selling problem. They showed that under the assumption of fixed-rate holding costs and Poisson arrivals of i.i.d. offers with no recalls, the optimal decision policy is simply a multi-dimensional threshold policy.

Organ transplant allocation constitutes another important application of the SAP. Righter (1989) and David-Yechiali (1995) analyzed kidney transplant allocation problems via stylized SAP frameworks. Su-Zenios (2005) further studied the effect of patient choice on kidney allocation. They assumed that $n$ transplant patients are to be allocated kidneys which arrive sequentially; a patient and a kidney each have their own type; kidney types are random and revealed upon arrival; the reward from allocating a kidney to a patient depends on both their types; and patients may choose to accept or decline any kidney offer. Their objective is to determine a kidney allocation policy that maximizes the expected total reward subject to the constraint that patients will only accept offers that maximize their own expected reward. They showed that a partition policy, which divides the kidney types into different groups (each corresponding to one patient type) such that each kidney is assigned to one patient in its group, is asymptotically optimal when patients must accept all kidney offers. They then introduced patient choice by adding an incentive compatibility condition, which an allocation policy must satisfy in order that patients accept each kidney offer.

Besides the SAP formulation, the organ allocation process has also been studied via simulation. One of the first papers in that vein is by Ruth (1985). Shechter (2005) also analyzed a liver allocation decision process via a discrete event simulation model. Bertsimas-Farias-Trichak (2012) combined the SAP formulation with simulation: they used an assignment problem formulation for the training phase, then combined it with the simulation model developed by SRTR (see KPSAM (2008)). Rather than designing a fundamentally new allocation system with general near-optimal dynamic policies, Bertsimas-Farias-Trichak designed a dynamic learning process that can fit directly into the current decision-making process of U.S. policy makers.

Another major application of the SAP is processor/network scheduling, with the interpretation that the "resource" to be allocated is the set of available processors/servers in each time slot. Ganti-Modiano-Tsitsiklis (2007) studied optimal transmission scheduling in a symmetric network with randomly varying connectivity. They considered a time-slotted system with $N$ queues and $K$ identical transmitters, assuming i.i.d. Bernoulli arrivals at each queue during each slot; each queue has a channel that changes between states "on" and "off" according to i.i.d. Bernoulli processes; and each transmitter can transmit at most $C$ packets from a queue having an "on" channel in a slot. They showed that a policy that always assigns the transmitters to the longest queues whose channel is "on" minimizes the total queue size, as well as a broad class of other performance criteria. In that paper, the authors took a sample path approach (coupling argument) to solve the optimal priority allocation problems. In this dissertation, we also take a coupling argument to show the optimality of the proposed policies. In particular, in Chapter 4 we study a special case where the ball vectors are exchangeable, and we show that to stochastically minimize a busy period with no arrivals, we should put each ball in its eligible box having the highest quota. Although the assumption of exchangeability in this dissertation is a weaker condition than the i.i.d. one in Ganti-Modiano-Tsitsiklis, their proof is applicable in our formulation.

One limitation of the classical SAP is that a seeker's skill is the same for all jobs; it does not allow for the possibility that one is more talented at certain types of jobs. Ross-Ross (2011) relaxed this limitation by assuming that each seeker has a vector $(X_1,\ldots,X_n)$, with the interpretation that $X_i$ is earned if that seeker is assigned to job $i$. It is assumed that $(X_1,\ldots,X_n)$ has an arbitrary distribution, and Ross proposed a computational procedure to find the optimal policy when $n$ is of moderate size. By supposing that $(X_1,\ldots,X_n)$ is a vector of independent binary variables, Ross-Wu (2012) were able to obtain stronger results. In this dissertation, we assume each seeker has a binary vector indicating which jobs that seeker is qualified to take. The vectors are independent and identically distributed among the successive seekers, have a specified joint distribution, and are revealed upon a seeker's arrival. Therefore the SEP model in this dissertation could also be analyzed by maximum matching in a bipartite graph. However, there is a major difference in that we assume the exact distribution of the vectors is known.
We also assume instant decisions with no recalls. An online algorithm deals with instant decisions: it receives sequential requests and responds to each request when it is received, whereas an offline algorithm waits until all requests have been received before determining its responses. One benchmark of an online algorithm is its competitive ratio, the ratio between its performance and the best possible offline algorithm's performance on the same problem. Karp (1990) studied a randomized algorithm for the online bipartite matching problem that achieves a competitive ratio of $1-\frac{1}{e}$. Manshadi-Gharan-Saberi's algorithm (2012) further boosted the competitive ratio to 0.702. Karp and Manshadi-Gharan-Saberi made no assumptions on the vector's distribution function.

CHAPTER 3

One complete set problem

3.1. When each box needs a single ball

This chapter considers a scenario where there is a given set of $n$ boxes, numbered 1 through $n$, and each box needs a single ball, i.e. $s_i=1$, $i=1,\ldots,n$. Balls are collected sequentially, each ball having a binary vector $X=(X_1,\ldots,X_n)$ attached to it, with the interpretation that the ball is eligible to be put in box $i$ if $X_i=1$, $i=1,\ldots,n$. After a ball is collected, it is put in an alive box for which it is eligible; here "alive" means a box still needs balls. Let $p_i$ denote the probability that a ball is eligible for box $i$, that is, $p_i=P(X_i=1)$, and let $q_i=1-p_i$. Without loss of generality we assume $p_1 \le p_2 \le \cdots \le p_n$. The vectors are assumed independent and identically distributed among the successive balls. The primary interest in this chapter is to find a policy which stochastically minimizes the number of balls collected until the problem ends.

Let $N$ denote the number of balls collected until the problem ends. We propose the policy $\pi^o$ that puts each ball in box $i$, where $i=\min\{j: X_j=1 \text{ and box } j \text{ is alive}\}$; that is, each ball goes to its lowest-indexed (hardest-to-fill) eligible alive box. If $X_i$, $i=1,\ldots,n$, are independent, $N$ is stochastically minimized by following $\pi^o$. That is, for all $S \subseteq \{1,2,\ldots,n\}$ and all $r \ge 1$,
$$P(N_{\pi^o}(S) \le r) = \max_\pi P(N_\pi(S) \le r),$$
where $\pi$ denotes a policy for allocating balls.

Let $M(k)$ denote the number of matched boxes after observing $k$ vectors. If $X_i$, $i=1,\ldots,n$, are independent, $M(k)$ is stochastically maximized by following $\pi^o$, for all $k$. That is, for all $k$ and $r$,
$$P(M_{\pi^o}(k) \ge r) = \max_\pi P(M_\pi(k) \ge r).$$
Without the condition of the $X_i$ being independent, $\pi^o$ is not optimal, as shown by an example later.

To prove the optimality of $\pi^o$ under the condition of the $X_i$ being independent, let us define a relationship between certain sets of boxes. Specifically, for $S$ and $S'$ subsets of $\{1,\ldots,n\}$, say $S \ge_d S'$ if either $S \subset S'$ or there are distinct integers $\{i,j,i_1,i_2,\ldots,i_k\}$ with $i<j$ such that $S=(j,i_1,i_2,\ldots,i_k)$ and $S'=(i,i_1,i_2,\ldots,i_k)$. We have the following lemma.

3.1.1. If $S \ge_d S'$, then $N_{\pi^o}(S) \le^{(1)} N_{\pi^o}(S')$.

Assuming $X_i$, $i=1,\ldots,n$, are independent, we have the following lemma: if $S \ge_d S'$, then $N_{\pi^o}(S) \le^{(1)} N_{\pi^o}(S')$, where $\le^{(1)}$ denotes stochastically less in the first order. To be specific, it means $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$ for all $r \ge 1$.

Proof. The proof is by induction and by coupling the ball vectors. If $S \subset S'$, it is immediate that $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$ for all $r \ge 1$. Here the proof covers the case where $S$ and $S'$ differ in one component.

(1) $r=1$ implies $S=\{j\}$ and $S'=\{i\}$, and
$$P(N_{\pi^o}(S) \le 1) = P(X_j=1) = p_j, \qquad P(N_{\pi^o}(S') \le 1) = P(X_i=1) = p_i.$$
Because $j>i$, $p_j \ge p_i$.

(2) Assume that for $r=k>1$ and $S \ge_d S'$ it is true that $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$. We show the inequality holds for $r=k+1$. Let $S_1$ denote the state reached from $S$ after assigning the initial ball following $\pi^o$, and let $S'_1$ be the corresponding state reached from $S'$. Comparing $N(S)$ with $k+1$ is equivalent to comparing $N(S_1)$ with $k$, and if $S_1 \ge_d S'_1$, then by the induction hypothesis $P(N_{\pi^o}(S_1) \le k) \ge P(N_{\pi^o}(S'_1) \le k)$. So to show $P(N_{\pi^o}(S) \le k+1) \ge P(N_{\pi^o}(S') \le k+1)$, it is sufficient to show $S_1 \ge_d S'_1$ almost surely. Because $S=(j,i_1,\ldots,i_k)$ and $S'=(i,i_1,\ldots,i_k)$, the partial vector $X=(X_i,X_j,X_{i_1},\ldots,X_{i_k})$ is of our concern. Generate independent uniform random numbers $U, U_1,\ldots,U_k$ and set
$$Y_j = \begin{cases} 1 & \text{if } U<p_j \\ 0 & \text{otherwise} \end{cases} \qquad Y_{i_l} = \begin{cases} 1 & \text{if } U_l<p_{i_l} \\ 0 & \text{otherwise} \end{cases} \quad l=1,\ldots,k,$$
$$Z_i = \begin{cases} 1 & \text{if } U<p_i \\ 0 & \text{otherwise} \end{cases} \qquad Z_{i_l} = \begin{cases} 1 & \text{if } U_l<p_{i_l} \\ 0 & \text{otherwise} \end{cases} \quad l=1,\ldots,k.$$
Both $Y$ and $Z$ have the same distribution as $X$. Take $Y$ as the vector of the ball fed to $S$ and $Z$ as the vector fed to $S'$. Because $p_i \le p_j$, $Z_i=1$ implies $Y_j=1$, which guarantees that $S_1 \ge_d S'_1$ almost surely, a sufficient condition for $P(N_{\pi^o}(S_1) \le k) \ge P(N_{\pi^o}(S'_1) \le k)$. Therefore, by induction on $r$ and coupling of the ball vectors, we have shown: if $S \ge_d S'$, then $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$ for all $r \ge 1$. ∎

3.1.2. $N$ is stochastically minimized by $\pi^o$.

That is, for all $S \subseteq \{1,2,\ldots,n\}$ and all $r \ge 1$,
$$P(N_{\pi^o}(S) \le r) = \max_\pi P(N_\pi(S) \le r),$$
where $\pi$ denotes a policy for assigning balls.

Proof. The proof is by induction.

(1) When $r=1$ the result is immediate, because for all $\pi$,
$$P(N_\pi(S) \le 1) = \begin{cases} 0 & \text{if } |S|>1 \\ p_i & \text{if } S=\{i\} \end{cases} = P(N_{\pi^o}(S) \le 1).$$

(2) Assume that for $r=k>1$, $P(N_{\pi^o}(S) \le k-1) = \max_\pi P(N_\pi(S) \le k-1)$. We show the claim is true for $r=k$ by conditioning on the vector of the initial ball. If $\sum_{i \in S} X_i \le 1$, either the ball is not eligible for any box in $S$ and therefore is discarded, or it is eligible for one box in $S$ and therefore is put in that box; by the induction hypothesis, it is optimal to follow $\pi^o$ thereafter. If $\sum_{i \in S} X_i > 1$, denote by $S_1$ the resulting state from $S$ after assigning the initial ball following $\pi^o$, and by $S'_1$ the resulting state following another policy $\pi$. Immediately $S_1 \ge_d S'_1$. Hence,
$$P\Big(N_\pi(S) \le k \,\Big|\, \sum_{i \in S} X_i > 1\Big) = P(N_\pi(S'_1) \le k-1) \le P(N_{\pi^o}(S'_1) \le k-1) \le P(N_{\pi^o}(S_1) \le k-1) = P\Big(N_{\pi^o}(S) \le k \,\Big|\, \sum_{i \in S} X_i > 1\Big),$$
where the first inequality is by the induction hypothesis and the second by the preceding lemma. Therefore, for all $S \subseteq \{1,2,\ldots,n\}$ and all $r \ge 1$, $P(N_{\pi^o}(S) \le r) = \max_\pi P(N_\pi(S) \le r)$. ∎

Without the condition of the $X_i$ being independent, $\pi^o$ is not optimal. Here is an example: $S=(1,2,3)$ and the joint probability mass function of $X=(X_1,X_2,X_3)$ is
$$P(X_1=0,X_2=1,X_3=1)=0.5, \quad P(X_1=1,X_2=0,X_3=1)=\epsilon, \quad P(X_1=1,X_2=0,X_3=0)=0.5-\epsilon,$$
where $\epsilon$ is a sufficiently small positive number. We see $P(X_1=1)=0.5$, $P(X_2=1)=0.5$, and $P(X_3=1)=0.5+\epsilon$. Assume a ball has the vector $(1,0,1)$. Putting it in box 1, we need to collect $N_1$ additional balls until stopping, where
$$E[N_1] = 1 + \frac{1}{2}\cdot\frac{1}{1/2+\epsilon} + \epsilon\cdot\frac{1}{1/2} + (0.5-\epsilon)\,E[N_1];$$
solving gives $E[N_1] \approx 4$ for small $\epsilon$. Putting this ball in box 3 instead, we need to collect $N_2$ additional balls until the problem ends, where
$$E[N_2] = 1 + \frac{1}{1/2} = 3.$$
Clearly $E[N_1] > E[N_2]$.
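To make the policy concrete, here is a minimal Monte Carlo sketch (not from the dissertation) of the $\pi^o$ rule under independent eligibility probabilities; the function name and the raw estimators are illustrative only, and the quoted target values come from the numerical examples in Sections 3.2.3 and 3.3.1 below.

```python
import random

def simulate_N(p, rng=random.Random(0)):
    """One run of the single-ball-per-box problem under the policy pi^o.

    p: eligibility probabilities, assumed sorted so p[0] <= ... <= p[-1].
    Returns N, the number of balls collected until every box is filled.
    pi^o puts each ball in its lowest-indexed eligible alive box
    (the hardest-to-fill box first).
    """
    alive = set(range(len(p)))
    n_balls = 0
    while alive:
        n_balls += 1
        # Reveal this ball's eligibility vector (independent components).
        eligible = [i for i in alive if rng.random() < p[i]]
        if eligible:
            alive.remove(min(eligible))  # pi^o: lowest-indexed alive box
    return n_balls

# Raw estimates of E[N] and P(N > r) for one of the example vectors.
p = [0.1, 0.3, 0.5, 0.7, 0.9]
runs = [simulate_N(p) for _ in range(10000)]
print(sum(runs) / len(runs))                 # should be near 11.44
print(sum(n > 8 for n in runs) / len(runs))  # should be near 0.4847
```

These raw estimators are exactly the high-variance baseline ($est_{raw}$) that the conditioning and stratification techniques of Sections 3.2 and 3.3 improve upon.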
3.1.3. $M$ is stochastically maximized by $\pi^o$.

Let $M(k)$ denote the number of filled boxes after $k$ balls have been collected. If $X_i$, $i=1,\ldots,n$, are independent, $M(k)$ is stochastically maximized by $\pi^o$. That is, for all $k$ and $r$,
$$P(M_{\pi^o}(k) \ge r) = \max_\pi P(M_\pi(k) \ge r).$$
Let $S$ denote the state (the box set) when the process begins. To show $P(M_{\pi^o}(k) \ge r) = \max_\pi P(M_\pi(k) \ge r)$, it is equivalent to show that $S_{\pi^o}(r) \le^{(1)} S_\pi(r)$ for all $r$, where $S_\pi(r)$ denotes the number of balls collected until $r$ boxes are filled, starting from state $S$ and following $\pi$. To show this, we introduce the following lemma.

3.1.4. If $S \ge_d S'$, then $S_{\pi^o}(r) \le^{(1)} S'_{\pi^o}(r)$, $\forall r$.

Proof. The proof is by induction. When $r=1$ the result is immediate:
$$S(1) =_d \mathrm{Geo}\Big(1-\prod_{j \in S} q_j\Big), \qquad S'(1) =_d \mathrm{Geo}\Big(1-\prod_{i \in S'} q_i\Big).$$
Because $\prod_{j \in S} q_j \le \prod_{i \in S'} q_i$, we get $S_{\pi^o}(1) \le^{(1)} S'_{\pi^o}(1)$. Now assume $S_{\pi^o}(r-1) \le^{(1)} S'_{\pi^o}(r-1)$; we show $S_{\pi^o}(r) \le^{(1)} S'_{\pi^o}(r)$ for $r>1$. Coupling the ball vectors until one box in $S$ gets filled, denote the resulting states by $S_1$ and $S'_1$. It is straightforward to see that $S_1 \ge_d S'_1$. Combining this with the induction hypothesis completes the proof. ∎

Next we show: $P(S_{\pi^o}(r) \le k) = \max_\pi P(S_\pi(r) \le k)$.

Proof. Again, the proof is by induction. If $r=1$, then $S(1) =_d \mathrm{Geo}(1-\prod_{j \in S} q_j)$, so any policy is the same. Assuming $P(S_{\pi^o}(r-1) \le k)=\max_\pi P(S_\pi(r-1) \le k)$, we show the claim is true at $r$. Let $S_1$ be the state after assigning the initial ball following $\pi^o$, and let $S'_1$ be the corresponding state following another policy. It is straightforward to see that $S_1 \ge_d S'_1$. By the induction hypothesis, after the initial assignment it is optimal to follow $\pi^o$. Combining this with the preceding lemma completes the proof. ∎

3.2. Estimating $P(N>r)$ by efficient simulation

When the $X_i$ are not independent, $\pi^o$ remains a reasonably good policy even though it is not optimal. From here on we adhere to it, and we are interested in using simulation to efficiently estimate the distribution and the mean of $N$. Let $I=\{I_1,\ldots,I_n\}$ denote the sequence in which the boxes were filled, i.e. $I_j$ was the $j$th box that got filled. Let $A_j$ denote the additional number of balls collected after $j-1$ boxes were filled until $j$ boxes were filled.

3.2.1. A Lemma. Conditional on $I=\{I_1,\ldots,I_n\}$, $A_1,\ldots,A_n$ are independent geometric random variables with
$$E[A_j \mid I_1,\ldots,I_n] = \frac{1}{1-Q_j},$$
where
$$Q_j = P(\text{a ball does not fit any of the currently empty boxes } \{I_j,\ldots,I_n\}) = P\Big(\sum_{k=j}^n X_{I_k}=0\Big).$$

Proof. Given $I_1,\ldots,I_{j-1}$, to determine the value of $A_j$ we only need to know whether the next ball fits the alive box set. If yes, the ball is kept and $A_j$ is realized; if not, the ball is dropped and $A_j$ increases by 1. Balls are collected until one is kept. Because the vectors are i.i.d., conditional on $I_1,\ldots,I_{j-1}$, $A_j$ follows a geometric distribution with parameter $P_j=1-Q_j$, which is the probability that a ball is eligible for at least one alive box.

Next we show that, conditional on $I_1,\ldots,I_{j-1}$, $A_j$ and $I_j,\ldots,I_n$ are independent. For example, $A_j=3$ says that two balls were dropped and the third was kept; $A_j=7$ says that six balls were dropped and the seventh was kept. Because the vectors are i.i.d. and each ball faces the same alive boxes $I_j,\ldots,I_n$, the events $A_j=3$ and $A_j=7$ make no difference in inferring $I_j,\ldots,I_n$, which implies that given $I_1,\ldots,I_{j-1}$, $A_j$ and $I_j,\ldots,I_n$ are independent. ∎

When the $X_i$ are independent, we provide an algorithm to generate $I$ (see Appendix). Specifically,
$$N \mid I = \sum_{i=1}^n A_i \,\Big|\, I \;=_d\; \mathrm{Geo}\Big(1-\prod_{i=1}^n q_{I_i}\Big) * \mathrm{Geo}\Big(1-\prod_{i=2}^n q_{I_i}\Big) * \cdots * \mathrm{Geo}(1-q_{I_n}),$$
where $*$ denotes convolution of independent random variables.

A Lemma. Let $X_1,\ldots,X_n$ be independent geometric random variables, with $X_i$ having parameter $p_i$, $i=1,\ldots,n$. If the $p_i$ are distinct, then for $r \ge n$,
$$P\Big(\sum_{i=1}^n X_i > r\Big) = \sum_{i=1}^n (1-p_i)^r \prod_{j \ne i} \frac{p_j}{p_j-p_i}.$$

Proof. A proof is given in Ross (A First Course in Probability); it is omitted here. ∎
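The lemma above is easy to check numerically. The following sketch (not from the dissertation; names are illustrative) evaluates the closed-form tail and compares it against a direct Monte Carlo estimate:

```python
import random
from math import prod

def tail_sum_geometric(ps, r):
    """P(X_1 + ... + X_n > r) for independent geometrics with distinct
    success probabilities ps, via the closed form (valid for r >= n)."""
    return sum(
        (1 - pi) ** r * prod(pj / (pj - pi) for j, pj in enumerate(ps) if j != i)
        for i, pi in enumerate(ps)
    )

def mc_tail(ps, r, rounds=200_000, rng=random.Random(1)):
    """Monte Carlo check: sample each geometric as the number of trials
    up to and including the first success."""
    hits = 0
    for _ in range(rounds):
        total = 0
        for p in ps:
            k = 1
            while rng.random() >= p:
                k += 1
            total += k
        hits += total > r
    return hits / rounds

ps = [0.2, 0.5, 0.7]
print(tail_sum_geometric(ps, 10), mc_tail(ps, 10))  # the two should agree
```

Applied with $p_i := 1-Q_i$, this closed form is what turns a sampled filling sequence $I$ into an exact conditional tail probability in the next section.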
By the preceding lemmas,
$$P(N>r \mid I) = \sum_{i=1}^n C_i\, Q_i^{\,r}, \qquad \text{where } C_i = \prod_{j \ne i} \frac{1-Q_j}{Q_i-Q_j} \text{ and } Q_i = \prod_{j=i}^n q_{I_j}, \quad i=1,\ldots,n.$$
To estimate $P(N>r)$ by simulation, we generate $I$ and set
$$est_1 = P(N>r \mid I).$$

3.2.2. Variance reduction techniques.

Let $I^{(j)}=(j,I_1,\ldots,I_{n-1})$ denote a filling sequence in which box $j$ is the first box filled. Instead of generating $I$ for $K$ rounds, we generate $I^{(j)}$ for $K_j=K \cdot P(I_1=j)$ rounds, $j=1,\ldots,n$, and set
$$est_2 = \sum_{j=1}^n P(I_1=j)\, P(N>r \mid I^{(j)}).$$
By proportional stratification on the first box being filled, $est_2$ achieves a smaller variance. Here are some numerical results comparing the variances of $est_1$, $est_2$, and the raw estimator
$$est_{raw} = \mathbb{1}\{N>r \mid I\},$$
where $\mathbb{1}\{\cdot\}$ is an indicator function.

3.2.3. A numerical example. For $P_1=(0.1, 0.3, 0.5, 0.7, 0.9)$, $P_2=(0.1, 0.2, 0.3, 0.4, 0.5)$, $P_3=(0.40, 0.45, 0.50, 0.55, 0.60)$, and $r=5,\ldots,12$, find $P(N>r)$ with 1000 simulation rounds and compare the variances of $est_1$, $est_2$, and $est_{raw}$.

$P(N>r)$:

        r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
P_1    0.7749 0.6453 0.5556 0.4847 0.4259 0.3761 0.3334 0.2964
P_2    0.9497 0.8604 0.7577 0.6584 0.5697 0.4932 0.4281 0.3728
P_3    0.6495 0.3524 0.1786 0.0888 0.0443 0.0224 0.0115 0.0060

$V(Est_{P(N>r)})$, $P=P_1$:

         r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
est_raw 0.1721 0.2281 0.2469 0.2499 0.2449 0.2359 0.2242 0.2113
est_1   0.0706 0.1009 0.0997 0.0905 0.0794 0.0682 0.0579 0.0486
est_2   0.0566 0.0799 0.0791 0.0722 0.0637 0.0551 0.0471 0.0398

$P=P_2$:

         r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
est_raw 0.0497 0.1212 0.1833 0.2254 0.2455 0.2500 0.2447 0.2331
est_1   0.0024 0.0141 0.0317 0.0471 0.0559 0.0583 0.0563 0.0517
est_2   0.0018 0.0106 0.0242 0.0364 0.0439 0.0464 0.0454 0.0421

$P=P_3$:

         r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
est_raw 0.2273 0.2293 0.1484 0.0818 0.0415 0.0208 0.0098 0.0063
est_1   0.0032 0.0054 0.0040 0.0021 0.0009 0.0004 0.0001 0.0001
est_2   0.0027 0.0047 0.0035 0.0019 0.0008 0.0003 0.0001 0.0000

Remark: $est_2$ employs proportional stratified sampling, with the total number of simulation rounds equal to that of $est_{raw}$ and $est_1$. The variances listed in the table above are per simulation round. The variance of $est_2$ is about 98% lower than that of $est_{raw}$.

3.3. Estimating $E[N]$ with efficient simulation

By the preceding lemmas,
$$est_1 = E[N \mid I] = \frac{1}{1-\prod_{i=1}^n q_{I_i}} + \frac{1}{1-\prod_{i=2}^n q_{I_i}} + \cdots + \frac{1}{1-q_{I_n}},$$
where $I=(I_1,\ldots,I_n)$ is the generated sequence in which the boxes were filled.

Another estimator of $E[N]$. Let $T_1$ be the number of balls collected until box 1 is filled, and for $j>1$ let $T_j$ be the number of additional balls collected after boxes $1,\ldots,j-1$ have been filled until box $j$ is filled. (Thus, if box $j$ was filled before any one of boxes $1,\ldots,j-1$, then $T_j=0$.) We know $N=\sum_{j=1}^n T_j$. Following $\pi^o$,
$$E[N] = \frac{1}{p_1} + \sum_{j=2}^n P(I(b_j)=1)\,\frac{1}{p_j}, \qquad est_2 = \frac{1}{p_1} + \sum_{j=2}^n \frac{\mathbb{1}(b_j)}{p_j},$$
where $\mathbb{1}(b_j)$ is the indicator of the event that box $j$ is the last of boxes $1,\ldots,j$ to be filled, read from the generated $I$.

Additional variance reduction techniques. Because $est_1$ and $est_2$ are both unbiased estimators of $E[N]$, any linear combination of them remains an unbiased estimator of $E[N]$. Because they are both based on $I$, after obtaining one estimator there is little extra work to obtain the other. Take
$$est_3 = \alpha^* est_1 + (1-\alpha^*)\,est_2, \qquad \text{where } \alpha^* = \frac{V[est_2]-\mathrm{Cov}[est_1,est_2]}{V[est_1]+V[est_2]-2\,\mathrm{Cov}[est_1,est_2]}.$$
Remark: $\alpha^*$ guarantees that $\mathrm{Var}[est_3]$ is minimized, and it is obtained from the simulation data.
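As a small illustration (a sketch, not the dissertation's code), the optimal combination weight $\alpha^*$ can be estimated from the paired per-round values of the two estimators:

```python
from statistics import mean

def combine_unbiased(est1_samples, est2_samples):
    """Minimum-variance combination a*est1 + (1-a)*est2 of two unbiased
    estimators of the same quantity, with a estimated from the paired
    simulation data (here both estimators derive from the same I).
    Returns (a_star, combined_point_estimate)."""
    m1, m2 = mean(est1_samples), mean(est2_samples)
    v1 = mean((x - m1) ** 2 for x in est1_samples)
    v2 = mean((y - m2) ** 2 for y in est2_samples)
    cov = mean((x - m1) * (y - m2)
               for x, y in zip(est1_samples, est2_samples))
    a_star = (v2 - cov) / (v1 + v2 - 2 * cov)  # minimizes the variance
    return a_star, a_star * m1 + (1 - a_star) * m2
```

When the two estimators are negatively correlated, as the numerical example below suggests for $est_1$ and $est_2$, the combination can cut the variance far below either estimator alone.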
Further variance reduction by stratified sampling. Rather than $\alpha^* est_1 + (1-\alpha^*)est_2$, we prefer
$$est_4 = \sum_{j=1}^n P(I_1=j)\,\Big\{\alpha^{**} est_1 + (1-\alpha^{**})\,est_2 \,\Big|\, I^{(j)}\Big\} = \alpha^{**} est_1^* + (1-\alpha^{**})\,est_2^*,$$
where $I^{(j)}=(j,I_1,\ldots,I_{n-1})$, $est_1^*$ and $est_2^*$ are the estimators obtained by stratifying $est_1$ and $est_2$ on the first box being filled, and
$$\alpha^{**} = \frac{V[est_2^*]-\mathrm{Cov}[est_1^*,est_2^*]}{V[est_1^*]+V[est_2^*]-2\,\mathrm{Cov}[est_1^*,est_2^*]}.$$

3.3.1. A numerical example. For $P_1=(0.1, 0.3, 0.5, 0.7, 0.9)$, $P_2=(0.1, 0.2, 0.3, 0.4, 0.5)$, and $P_3=(0.40, 0.45, 0.50, 0.55, 0.60)$, find $E[N]$ with 10000 simulation rounds and compare the variances of the different estimators.

          P_1      P_2      P_3
E[N]   11.4378  12.9671   6.3572

$V[est_{E[N]}]$:

      est_raw   est_1    est_2    est_3   est_4
P_1   74.5407  17.6545   4.2079  0.5433  0.3923
P_2   64.7693  14.5467  10.4489  1.0131  0.7708
P_3    2.3526   0.0959   4.1189  0.0349  0.0255

Remark: That the linear combination of $est_1$ and $est_2$ has a much lower variance suggests a negative correlation between them. This is plausible since they are both derived from the same $I$; for example, while $I=(1,2,3,4,5)$ minimizes $est_1$, it maximizes $est_2$.

3.4. Bounds of $P(N>r)$ and $E[N]$ when $X_i$ are independent

When the $X_i$ are independent, $N \mid I$ has the same distribution as a convolution of $n$ independent geometric random variables. Because $I$ has $n!$ different values, $N$ has the same distribution as a weighted average of $n!$ convolutions. Among those convolutions, each term is stochastically larger than $N \mid I=(1,2,\ldots,n)$ and stochastically smaller than $N \mid I=(n,\ldots,2,1)$. Therefore
$$N \mid I=(1,2,\ldots,n) \;\le^{(1)}\; N \;\le^{(1)}\; N \mid I=(n,n-1,\ldots,1),$$
where $\le^{(1)}$ stands for stochastically less in the first order. We get one set of lower and upper bounds on $P(N>r)$:
$$\min_I P(N>r \mid I) = P(N>r \mid I=(1,2,\ldots,n)), \qquad \max_I P(N>r \mid I) = P(N>r \mid I=(n,n-1,\ldots,1)).$$
The preceding bounds can be strengthened by conditioning on $I_1$ (or, even better, on $I_1,\ldots,I_k$ for some $k>1$), because
$$P(N>r) = \sum_{i=1}^n P(N>r \mid I_1=i)\,P(I_1=i)$$
and
$$\min_I P(N>r \mid I_1=i) = P(N>r \mid I=(i,1,2,\ldots,i-1,i+1,\ldots,n)),$$
$$\max_I P(N>r \mid I_1=i) = P(N>r \mid I=(i,n,n-1,\ldots,i+1,i-1,\ldots,1)).$$
Here is an example showing the tightness of these bounds.

3.4.1. A numerical example. For $P_1=(0.1, 0.3, 0.5, 0.7, 0.9)$, $P_2=(0.1, 0.2, 0.3, 0.4, 0.5)$, $P_3=(0.40, 0.45, 0.50, 0.55, 0.60)$, and $r=5,\ldots,12$, compare the bounds of $P(N>r)$ with the value obtained by simulation.

Bounds of $P(N>r)$, $P=P_1$:

         r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
P(N>r)  0.7749 0.6453 0.5556 0.4847 0.4259 0.3761 0.3334 0.2964
L.B.    0.2045 0.0378 0.0084 0.0022 0.0007 0.0002 0.0001 0.0000
U.B.    0.9534 0.8838 0.8075 0.7322 0.6613 0.5960 0.5365 0.4826

$P=P_2$:

         r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
P(N>r)  0.9497 0.8604 0.7577 0.6584 0.5697 0.4932 0.4281 0.3728
L.B.    0.8401 0.6081 0.3981 0.2444 0.1441 0.0829 0.0470 0.0265
U.B.    0.9855 0.9526 0.9046 0.8469 0.7845 0.7209 0.6586 0.5991

$P=P_3$:

         r=5    r=6    r=7    r=8    r=9    r=10   r=11   r=12
P(N>r)  0.6495 0.3524 0.1786 0.0888 0.0443 0.0224 0.0115 0.0060
L.B.    0.5950 0.2854 0.1246 0.0522 0.0214 0.0087 0.0035 0.0014
U.B.    0.7658 0.5099 0.3175 0.1915 0.1137 0.0671 0.0395 0.0232

Recall
$$E[N] = \frac{1}{p_1} + \sum_{j=2}^n P\{I(b_j)=1\}\,\frac{1}{p_j},$$
where $I(b_j)$ is the indicator of the event that box $j$ is the last of boxes $1,\ldots,j$ to be filled, which means that among boxes $1,\ldots,j$, the first box filled is not box $j$, the second box filled is not box $j$, ..., and the $(j-1)$st box filled is not box $j$.

Now for fixed $j>1$, let $F_i$ denote the $i$th filled box among boxes $1,\ldots,j$, $i<j$, and write $Q_k=\prod_{i=1}^k q_i$ with $Q_0=1$. Then
$$P\{I(b_j)=1\} = P(F_1 \ne j, F_2 \ne j, \ldots, F_{j-1} \ne j) = P(F_1 \ne j)\,P(F_2 \ne j \mid F_1 \ne j)\cdots P(F_{j-1} \ne j \mid F_1 \ne j,\ldots,F_{j-2} \ne j).$$
We know
$$P(F_1 \ne j) = 1-P(F_1=j) = 1-\frac{p_j Q_{j-1}}{1-Q_j},$$
$$P(F_2 \ne j \mid F_1=i) = 1-\frac{p_j Q_{j-1}/q_i}{1-Q_j/q_i} = 1-\frac{p_j Q_{j-1}}{q_i-Q_j}, \quad \text{some } i<j,$$
$$P(F_3 \ne j \mid F_1=i,\,F_2=l) = 1-\frac{p_j Q_{j-1}/(q_iq_l)}{1-Q_j/(q_iq_l)} = 1-\frac{p_j Q_{j-1}}{q_iq_l-Q_j}, \quad \text{some } i \ne l < j.$$
Similarly, for distinct $0<i_1,\ldots,i_k<j$,
$$P(F_{k+1} \ne j \mid F_r=i_r,\ r=1,\ldots,k) = 1-\frac{p_j Q_{j-1}}{q_{i_1}\cdots q_{i_k}-Q_j}.$$
The above expression is maximized when $i_r=r$ and minimized when $i_r=j-r$. Because $P(F_{k+1} \ne j \mid F_1 \ne j, F_2 \ne j, \ldots, F_k \ne j)$ is a weighted average of the expression above, we see for $k=0,1,\ldots,j-2$ that
$$1-\frac{p_j Q_{j-1}}{Q_{j-1}/Q_{j-k-1}-Q_j} \;\le\; P(F_{k+1} \ne j \mid F_1 \ne j,\ldots,F_k \ne j) \;\le\; 1-\frac{p_j Q_{j-1}}{Q_k-Q_j}.$$
Putting those bounds together,
$$\prod_{k=0}^{j-2}\Big(1-\frac{p_j Q_{j-1}}{Q_{j-1}/Q_{j-k-1}-Q_j}\Big) \;\le\; P\{I(b_j)=1\} \;\le\; \prod_{k=0}^{j-2}\Big(1-\frac{p_j Q_{j-1}}{Q_k-Q_j}\Big).$$
Hence,
$$\frac{1}{p_1} + \sum_{j=2}^n \frac{1}{p_j}\prod_{k=0}^{j-2}\Big(1-\frac{p_j Q_{j-1}}{Q_{j-1}/Q_{j-k-1}-Q_j}\Big) \;\le\; E[N] \;\le\; \frac{1}{p_1} + \sum_{j=2}^n \frac{1}{p_j}\prod_{k=0}^{j-2}\Big(1-\frac{p_j Q_{j-1}}{Q_k-Q_j}\Big).$$

Example. For $P_1=(0.1, 0.3, 0.5, 0.7, 0.9)$, $P_2=(0.1, 0.2, 0.3, 0.4, 0.5)$, and $P_3=(0.40, 0.45, 0.50, 0.55, 0.60)$, compare the bounds of $E[N]$ with the value obtained by simulation.

Bounds of $E[N]$:

        P_1      P_2      P_3
E[N]  11.4378  12.9671  6.3572
L.B.  11.1960  12.5751  6.0283
U.B.  12.4333  13.8006  6.5356
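The $E[N]$ bounds above are cheap to evaluate directly. Here is a short sketch (not from the dissertation; the function name is illustrative) that reproduces the $P_1$ row of the table:

```python
from math import prod

def en_bounds(p):
    """Analytic lower/upper bounds on E[N] under pi^o for independent
    eligibility probabilities p (sorted increasingly), per Section 3.4.
    Q[k] = q_1 * ... * q_k with Q[0] = 1, where q_i = 1 - p_i."""
    n = len(p)
    q = [1 - pi for pi in p]
    Q = [1.0] * (n + 1)
    for k in range(1, n + 1):
        Q[k] = Q[k - 1] * q[k - 1]
    lo = hi = 1 / p[0]
    for j in range(2, n + 1):          # boxes are 1-indexed here
        pj, Qj, Qj1 = p[j - 1], Q[j], Q[j - 1]
        lo_prod = prod(1 - pj * Qj1 / (Qj1 / Q[j - k - 1] - Qj)
                       for k in range(j - 1))
        hi_prod = prod(1 - pj * Qj1 / (Q[k] - Qj) for k in range(j - 1))
        lo += lo_prod / pj
        hi += hi_prod / pj
    return lo, hi

print(en_bounds([0.1, 0.3, 0.5, 0.7, 0.9]))  # roughly (11.196, 12.433)
```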
CHAPTER 4

When ball vectors are exchangeable

4.1. When $X_1,\ldots,X_n$ are exchangeable

This chapter considers a special case where $X_1,\ldots,X_n$ are exchangeable. Exchangeability means that there exist non-negative $c_k$, $k=0,\ldots,n$, such that $P(\sum_{i=1}^n X_i=k)=c_k$, and conditional on $\sum_{i=1}^n X_i=k$, a ball is equally likely to have any one of the $\binom{n}{k}$ possible vectors. The exchangeability assumption allows us to take the state as the ordered vector $S=(s_1,\ldots,s_n)$ with $s_1 \ge s_2 \ge \cdots \ge s_n$, with the interpretation that the state is $S$ if $\{s_1,\ldots,s_n\}$ is the set of the remaining quotas of the boxes. We propose the policy $\pi^o$ that puts each arriving ball in its eligible box having the largest remaining quota, and we show that $\pi^o$ minimizes $N$ stochastically. That is,
$$P(N_{\pi^o}(S) \le r) = \max_\pi P(N_\pi(S) \le r), \quad \forall r \ge 1,$$
where $\pi$ is an arbitrary policy for assigning the sequentially arriving balls and $N_\pi(S)$ is the number of balls collected until the problem ends, starting from $S$ and following $\pi$.

To show the optimality of $\pi^o$, we introduce a lemma. Recall that $S$ submajorizes $S'$ ($S \ge_d S'$) if either $S \ge S'$ componentwise, or there exist distinct integers $i<j$ such that
$$S=(m_1,\ldots,m_i-1,\ldots,m_j,\ldots,m_n), \qquad S'=(m_1,\ldots,m_i,\ldots,m_j-1,\ldots,m_n).$$
We have the following lemma.

4.1.1. If $S \ge_d S'$, then $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$, $\forall r$.

Proof. It is trivial if $S \ge S'$ componentwise. Here the proof covers the case where $S$ and $S'$ differ in two components, and it is by induction. Because the components of a ball vector are exchangeable, we can, without loss of generality, re-interpret the vector $X_1,\ldots,X_n$ so that $X_i=1$ means that the ball is eligible to be put in the box with the $i$th largest remaining quota. Assume $P(N_{\pi^o}(S) \le r) \ge P(N_{\pi^o}(S') \le r)$; we show the inequality is true at $r+1$. Given $S \ge_d S'$, consider two scenarios: the first starts from state $S$ and the second from $S'$. Couple the initial vector and denote by $S_1$ the resulting state from $S$ after assigning the initial ball following $\pi^o$, and by $S'_1$ the resulting state from $S'$. By the induction hypothesis, to show $P(N_{\pi^o}(S) \le r+1) \ge P(N_{\pi^o}(S') \le r+1)$, it is sufficient to show $S_1 \ge_d S'_1$ almost surely, which is guaranteed by $\pi^o$. ∎

4.1.2. $P(N_{\pi^o}(S) \le r) = \max_\pi P(N_\pi(S) \le r)$, for all $S$ and all $r \ge 1$.

Proof. The proof is by an induction argument. If $r=1$ the result is immediate. So assume it is true at $r$; we show it is true at $r+1$. To show $P(N_{\pi^o}(S) \le r+1) = \max_\pi P(N_\pi(S) \le r+1)$, let $S_1$ denote the resulting state after assigning the initial ball following $\pi^o$, and let $S'_1$ denote the corresponding state following another policy. It is clear that $S_1 \ge_d S'_1$. By the induction hypothesis, after the initial choice it is optimal to follow $\pi^o$. Combining this with the preceding lemma completes the proof. ∎

Without the condition of $X_1,\ldots,X_n$ being exchangeable, $\pi^o$ is not optimal. Here is an example: $S=(2,1,1,1)$; $P(X_1=1)=p$; $P(X_2=X_3=X_4=1)=p$ and $P(X_2=X_3=X_4=0)=1-p$; and $X_1$ is independent of $(X_2,X_3,X_4)$. Although $P(X_1=1)=P(X_2=1)=P(X_3=1)=P(X_4=1)=p$, the $X_i$ are not exchangeable. Assume an incoming ball has vector $X=(1,1,1,1)$; following $\pi^o$, it would be put in box 1. However, because $X_2=X_3=X_4$, $N$ has the same distribution as $N'$ from another game, in which $S'=(2,3)$ and $X'=(X'_1,X'_2)$, where $X'_i$, $i=1,2$, are independent and identically distributed with $P(X'_i=1)=p$. Assuming the incoming ball has vector $X'=(1,1)$, following $\pi^o$ it is best to put the ball in box 2, which is against the choice from the first game following the same policy.

4.1.3. $P(N_{\pi^o}(S) \le r)$ is a Schur concave function.

If $X_1,\ldots,X_n$ are exchangeable, $P(N_{\pi^o}(S) \le r)$ is Schur concave in $S$.

Proof. To show that $P(N_{\pi^o}(S) \le r)$ is Schur concave in $S$, one must show: (1) it is symmetric in $S=(s_1,s_2,\ldots,s_n)$; (2) $S \ge^m S'$ implies $P(N_{\pi^o}(S) \le r) \le P(N_{\pi^o}(S') \le r)$. (1) follows from the exchangeability of $X_1,\ldots,X_n$. To prove (2), recall that $U$ majorizes $V$ ($U \ge^m V$) if
$$\sum_{i=1}^k u_i \ge \sum_{i=1}^k v_i, \quad k=1,2,\ldots,n-1, \qquad \text{and} \qquad \sum_{i=1}^n u_i = \sum_{i=1}^n v_i.$$
The structure of majorization implies: if $S \ge^m S'$, then there exist $S_1,\ldots,S_k$ such that $S'=S_1, S_2, \ldots, S_k=S$, where consecutive vectors differ by moving one unit of quota from a smaller component to a larger one, for instance $S_i=(s_1,\ldots,s_i,s_{i+1},\ldots,s_n)$ and $S_{i+1}=(s_1,\ldots,s_i-1,s_{i+1}+1,\ldots,s_n)$, and each such balancing step relates the two vectors by $\ge_d$. Combining this with the preceding lemma, we conclude: if $S \ge^m S'$, then $P(N_{\pi^o}(S) \le r) \le P(N_{\pi^o}(S') \le r)$. The Schur concavity of $P(N_{\pi^o}(S) \le r)$ is proved. ∎

By the Schur concavity of $P(N_{\pi^o}(S) \le r)$, among quota vectors with a fixed sum, the balanced vector $\bar S=(\bar s,\bar s,\ldots,\bar s)$ maximizes $P(N_{\pi^o}(S) \le r)$ and the concentrated vector $S^*=(\sum_{i=1}^n s_i,0,\ldots,0)$ minimizes $P(N_{\pi^o}(S) \le r)$.

4.2. Estimating $P(N>r)$ by efficient simulation

From here on we assume $\pi^o$ is used, and we are interested in using simulation to efficiently estimate the distribution of $N$. Let $n_j$, $j=0,1,\ldots,t-1$, denote the number of alive boxes after $j$ balls have been accepted, where $t=\sum_{i=1}^n s_i$. Let $A_j$ denote the additional number of balls collected after $j$ balls have been accepted until $j+1$ balls have been accepted. Conditional on $\mathbf{n}=\{n_0,\ldots,n_{t-1}\}$, the $A_j$ are independent geometric random variables with
$$E[A_j \mid n_0,\ldots,n_{t-1}] = \frac{1}{1-Q_{n_j}},$$
where $Q_{n_j} = P(\text{a ball is not eligible for any one of the } n_j \text{ alive boxes})$.

Proof. The proof is similar to the one in Chapter 3; it is omitted here. ∎

It follows that
$$N \mid n_0,\ldots,n_{t-1} = \sum_{i=0}^{t-1} A_i \,\Big|\, n_0,\ldots,n_{t-1} \;=_d\; \mathrm{Geo}(1-Q_{n_0}) * \mathrm{Geo}(1-Q_{n_1}) * \cdots * \mathrm{Geo}(1-Q_{n_{t-1}}).$$
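For concreteness, here is a minimal sketch (not from the dissertation; names are illustrative) of sampling an exchangeable vector from the $(c_0,\ldots,c_n)$ representation and applying the largest-remaining-quota rule $\pi^o$:

```python
import random

def draw_exchangeable_vector(n, c, rng):
    """Sample one exchangeable eligibility vector: first draw the number
    of eligible boxes K with P(K = k) = c[k], then pick a uniformly
    random subset of boxes of that size."""
    k = rng.choices(range(n + 1), weights=c)[0]
    x = [0] * n
    for i in rng.sample(range(n), k):
        x[i] = 1
    return x

def simulate_N_exchangeable(quota, c, rng=random.Random(0)):
    """One run under pi^o for exchangeable vectors: each ball goes to its
    eligible alive box with the largest remaining quota."""
    s = list(quota)
    n, n_balls = len(s), 0
    while any(s):
        n_balls += 1
        x = draw_exchangeable_vector(n, c, rng)
        eligible = [i for i in range(n) if x[i] == 1 and s[i] > 0]
        if eligible:
            s[max(eligible, key=lambda i: s[i])] -= 1
    return n_balls
```

The i.i.d. special case used in the numerical examples below corresponds to binomial weights $c_k=\binom{n}{k}p^k(1-p)^{n-k}$.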
As long as $X_1,\ldots,X_n$ are exchangeable, $\mathbf{n}=\{n_0,\ldots,n_{t-1}\}$ is a sufficient statistic for $N$. When $X_1,\ldots,X_n$ are independent and identically distributed, we provide an algorithm to generate $\mathbf{n}$ (see Appendix). Under some conditions there exists a closed-form expression for $P(N>r \mid \mathbf{n})$. If such an expression exists, we generate $\mathbf{n}$ and then do the calculation. If such an expression does not exist, we could estimate $P(N>r \mid \mathbf{n})$ with
$$est_{raw} = \mathbb{1}\Big\{\sum_i A_i > r\Big\},$$
where $\mathbb{1}\{\cdot\}$ is the indicator function. Actually, instead of generating $N=\sum_i A_i$ and taking $\mathbb{1}\{N>r\}$, we generate $N_1$, the number of balls collected until only one box remains alive following $\pi^o$, record the remaining quota $u$, and take
$$est_1 = \begin{cases} 1 & \text{if } N_1 \ge r \\ P(Z>r-N_1) & \text{if } N_1 < r \end{cases}$$
where $Z$ follows a negative binomial distribution (the number of further balls needed to place the remaining $u$ quota of the last alive box). Compared with $est_{raw}$, which is 0-or-1 valued, $est_1$ boosts the lower end up and hence is more accurate and has a smaller variance.

4.2.1. Variance reduction techniques.

To generate $\mathbf{n}$ we need to generate uniform random numbers $U$. A small value of $U$ corresponds to the event that a ball is put in a box with a high quota, which further implies a small $N$. Hence $U$ and $N$ are positively correlated, and therefore $\sum_i U_i$ and $P(N>r)$ are positively correlated. We could take $\sum_{i=1}^{t_u} U_i - \frac{t_u}{2}$ as a control variate for $est_1$; that is,
$$est_2 = est_1 + c^*\Big(\sum_{i=1}^{t_u} U_i - \frac{t_u}{2}\Big), \qquad \text{where } c^* = -\frac{\mathrm{Cov}(est_1, \sum_{i=1}^{t_u} U_i)}{\mathrm{Var}(\sum_{i=1}^{t_u} U_i)},$$
and $t_u$ is the number of uniforms generated. Because $est_1$ is a monotone increasing function of $N_1$ and $E[N_1]$ is calculable, instead of $\sum_{i=1}^{t_u} U_i - \frac{t_u}{2}$ one may prefer to use $N_1-E[N_1]$ as the control variate for $est_1$:
$$est_3 = est_1 + c^{**}(N_1-E[N_1]), \qquad \text{where } c^{**} = -\frac{\mathrm{Cov}(est_1,N_1)}{\mathrm{Var}(N_1)}.$$
Remark: $c^*$, $c^{**}$, $u$, and $E[N_1]$ are obtained from the simulation data.

4.2.2. A numerical example. For $P(X_i=1)=0.1$, $i=1,\ldots,5$, $S_1=(1,3,5,7,9)$, $S_2=(3,4,5,6,7)$, $S_3=(5,5,5,5,5)$, and $r=70,\ldots,110$, find $P(N>r)$ with 10000 simulation runs.

        E[N]      P(N>70) P(N>80) P(N>90) P(N>100) P(N>110)
S_1   102.4050    0.9427  0.8440  0.7024  0.5464   0.4039
S_2    89.0730    0.8340  0.6290  0.4380  0.2710   0.1610
S_3    85.1890    0.7610  0.5180  0.3220  0.1920   0.0850

For $P(X_i=1)=0.1$, $i=1,\ldots,5$, $S=(1,3,5,7,9)$, and $r=70,\ldots,250$, compare the variances in estimating $P(N>r)$.

$V[Est_{P(N>r)}]$:

         r=70   r=80   r=90   r=100  r=110
est_raw 0.0519 0.1304 0.2089 0.2480 0.2405
est_1   0.0266 0.0731 0.1197 0.1406 0.1322
est_2   0.0243 0.0606 0.0904 0.1003 0.0933
est_3   0.0201 0.0431 0.0579 0.0638 0.0625

         r=150  r=175  r=200  r=225  r=250
est_raw 0.0733 0.0223 0.0056 0.0012 0.0002
est_1   0.0260 0.0048 0.0006 0.0001 0.0000
est_2   0.0210 0.0042 0.0006 0.0001 0.0000
est_3   0.0188 0.0040 0.0005 0.0000 0.0000

Remark: The data in the above table verify the Schur concavity of $P(N(S)>r)$. For example, $S_1$ is more diverse than $S_2$, and $S_2$ is more diverse than $S_3$; correspondingly, $N(S_1)$ is stochastically larger than $N(S_2)$, and $N(S_2)$ is stochastically larger than $N(S_3)$. The variances listed in the table are per simulation round. $V[est_3]$ is about 70% lower than $V[est_{raw}]$.

4.3. Estimating $E[N]$ by efficient simulation

By the preceding lemmas,
$$est_1 = E[N \mid \mathbf{n}] = \frac{1}{1-Q_{n_0}} + \frac{1}{1-Q_{n_1}} + \cdots + \frac{1}{1-Q_{n_{t-1}}}.$$
Again, we take $\sum_{i=1}^t U_i - \frac{t}{2}$ as a control variate for $est_1$:
$$est_2 = \frac{1}{1-Q_{n_0}} + \frac{1}{1-Q_{n_1}} + \cdots + \frac{1}{1-Q_{n_{t-1}}} + c^*\Big(\sum_{i=1}^t U_i - \frac{t}{2}\Big),$$
where $c^*$ is obtained from the simulation data.
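The same adjustment recurs for each control variate above, so a generic helper suffices. This sketch (illustrative, not the dissertation's code) takes paired samples of the estimator and the control $W$, here $\sum_i U_i$ with known mean $t/2$, or $N_1$ with $E[N_1]$ estimated from the data:

```python
from statistics import mean

def control_variate_adjust(est_samples, w_samples, w_mean):
    """Control-variate adjustment est + c*(W - E[W]) with the
    variance-minimizing c* = -Cov(est, W) / Var(W), both moments
    estimated from the same simulation data."""
    m_est, m_w = mean(est_samples), mean(w_samples)
    var_w = mean((w - m_w) ** 2 for w in w_samples)
    cov = mean((e - m_est) * (w - m_w)
               for e, w in zip(est_samples, w_samples))
    c_star = -cov / var_w
    return [e + c_star * (w - w_mean)
            for e, w in zip(est_samples, w_samples)]
```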
Here is an example comparing the variances of the estimators of $E[N]$.

4.3.1. A numerical example. For $S_1=(1,3,5,7,9)$, $S_2=(3,4,5,6,7)$, and $S_3=(5,5,5,5,5)$, compare the variances in estimating $E[N]$ with 10000 simulation rounds.

$V(est_{E[N]})$, $P(X_i=1)=0.1$:

           S_1       S_2       S_3
est_raw  614.8416  408.0493  352.8506
est_1    158.2081   85.5281   48.9290
est_2     59.6370   34.5001   25.5171

$P(X_i=1)=0.5$:

           S_1     S_2     S_3
est_raw  4.6678  3.8047  3.7699
est_1    0.4392  0.1055  0.0856
est_2    0.3029  0.0884  0.0751

Remark: The variances listed in the above table are per simulation round. By conditioning on $\mathbf{n}$, $V[est_1]$ is about 80% lower than $V[est_{raw}]$. When combined with a control variate, the variance can be reduced further, by about 90% in total.

CHAPTER 5

When ball vectors are independent

5.1. When $X_1,\ldots,X_n$ are independent but not identically distributed

In this section we suppose that $X_1,\ldots,X_n$ are independent but not identically distributed, and we consider the problem of minimizing $E[N]$. The standard dynamic programming recursion cannot be effectively applied, because the needed computations grow exponentially in $n$.

5.1.1. A heuristic policy. Consequently, we introduce a heuristic policy $\pi_h$. Let $P_i=P(X_i=1)$, and let $R_i$ be the current quota of box $i$ when a ball arrives, $i=1,\ldots,n$. The heuristic policy $\pi_h$ puts a ball with vector $(X_1,\ldots,X_n)$ in the box $j$ such that
$$\frac{R_j}{P_j} = \max_{i:\,X_i=1}\Big\{\frac{R_i}{P_i}\Big\};$$
that is, box $j$ is the eligible box having the largest $R_i/P_i$.

5.1.2. An on-line optimization policy. Starting from policy $\pi_h$, we then employ the policy improvement algorithm of dynamic programming: given current quotas $R_1=r_1,\ldots,R_n=r_n$, it is best to put an arriving ball in the box $j$ for which
$$E[N_{\pi_h}(r_1,\ldots,r_j-1,\ldots,r_n)] = \min_{i:\,X_i=1}\{E[N_{\pi_h}(r_1,\ldots,r_i-1,\ldots,r_n)]\},$$
where $E[N_{\pi_h}(r_1,\ldots,r_i-1,\ldots,r_n)]$ denotes the expected number of balls to collect until the problem $N(r_1,\ldots,r_i-1,\ldots,r_n)$ ends when following $\pi_h$. With the help of simulation, the minimization can be effected at each stage of the problem. Call this new policy $\pi_h^*$; thus $\pi_h^*$ is an on-line optimization policy.

5.1.3. A lower bound of $E[N]$. To measure how "good" $\pi_h$ and $\pi_h^*$ are, we take a lower bound on $E[N]$ that applies to all policies as a benchmark. It is easy to see that
$$\min_\pi\{E[N_\pi(s_1,\ldots,s_n)]\} \ge \max_{i=1,\ldots,n}\Big\{\frac{s_i}{p_i}\Big\}.$$
To identify a second lower bound on $E[N]$, let $t=\sum_{i=1}^n s_i$, and let $B_i$, $i=0,\ldots,t-1$, be the set of alive boxes after $i$ balls have been accepted. Also, let $A_i$ denote the additional number of balls to collect after $i$ balls have been accepted until $i+1$ balls have been accepted. Conditional on $(B_0,\ldots,B_{t-1})$, the variables $A_0,\ldots,A_{t-1}$ are independent geometric random variables with
$$E[A_j \mid B_0,\ldots,B_{t-1}] = \frac{1}{P(B_j)}, \qquad \text{where } P(B_j)=P(\text{a ball is eligible for at least one box in } B_j).$$

Proof. The proof is similar to the argument in Chapter 3; it is omitted here. ∎

From the above lemma we know there exists a specific sequence $(B_0,\ldots,B_{t-1})$ conditioned on which $\sum_{i=0}^{t-1} A_i$ is minimized. Assuming, without loss of generality, that $p_1 \le p_2 \le \cdots \le p_n$, we have
$$\min_{(B_0,\ldots,B_{t-1})} \sum_{i=0}^{t-1} E[A_i] = \frac{t-n+1}{1-\prod_{i=1}^n(1-p_i)} + \sum_{j=2}^n \frac{1}{1-\prod_{i=j}^n(1-p_i)}.$$
The above bound is derived by assuming that all boxes remain alive until $t-n$ balls have been accepted; then the box which is hardest to fill gets filled, and so on. Therefore we take the lower bound of $E[N]$ as
$$\min_\pi\{E[N_\pi]\} \ge \max\Bigg\{\max_{i=1,\ldots,n}\frac{s_i}{p_i};\;\; \frac{t-n+1}{1-\prod_{i=1}^n(1-p_i)} + \sum_{j=2}^n \frac{1}{1-\prod_{i=j}^n(1-p_i)}\Bigg\}.$$
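Both the heuristic and the benchmark are a few lines of code. The following sketch (not from the dissertation; names are illustrative, and the bound is evaluated under the reconstruction above) reproduces the lower bound used in the example that follows:

```python
import random
from math import prod

def pi_h_run(p, quota, rng=random.Random(0)):
    """One run of the heuristic pi_h: each ball goes to the eligible alive
    box with the largest remaining-quota-to-probability ratio R_i / P_i."""
    R = list(quota)
    n_balls = 0
    while any(R):
        n_balls += 1
        eligible = [i for i, pi in enumerate(p)
                    if R[i] > 0 and rng.random() < pi]
        if eligible:
            R[max(eligible, key=lambda i: R[i] / p[i])] -= 1
    return n_balls

def lower_bound_EN(p, quota):
    """The two-part lower bound on E[N] from Section 5.1.3
    (p sorted increasingly)."""
    t, n = sum(quota), len(p)
    single = max(s / pi for s, pi in zip(quota, p))
    seq = (t - n + 1) / (1 - prod(1 - pi for pi in p)) + sum(
        1 / (1 - prod(1 - pi for pi in p[j:])) for j in range(1, n))
    return max(single, seq)

p = [0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5]
s = [5, 8, 10, 12, 15, 18, 20, 23, 26]
print(lower_bound_EN(p, s))  # about 143.5, per the example below
```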
Here is an example showing the performance of $\pi_h$ and $\pi_h^*$.

5.1.4. A numerical example. For $P=(0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5)$ and $S=(5, 8, 10, 12, 15, 18, 20, 23, 26)$, compare $E[N_{\pi_h}]$, $E[N_{\pi_h^*}]$, and the lower bound on $E[N]$:

E[N_{pi_h}]  variance   E[N_{pi_h*}]  variance   L.B.{E[N]}
 146.1140     7.9322      144.8000    12.6222     143.5288

The variances listed in the table above are per simulation round. We took 10 simulation rounds, and inside each round, upon making each decision, we took 100 simulation rounds. It is easy to see that the heuristic policy performs well, and its improved version has $E[N_{\pi_h^*}]$ very close to the lower bound of $E[N]$ under any policy.

Another example. When $n=2$, the optimal value function can be calculated numerically by the standard dynamic programming algorithm when $s_1$ and $s_2$ are not too large. The following compares the $E[N]$ obtained by applying the heuristic policy, the improved one, and the optimal. For $i,j=1,\ldots,200$, let $E[N_{\pi_h}(i,j)]$ denote the expected number of balls to collect until there are at least $i$ balls in box 1 and $j$ balls in box 2 following the heuristic policy; let $E[N_{\pi_h^*}(i,j)]$ denote the expected number following the improved heuristic policy; and let $V(i,j)$ denote the minimal expected number found by dynamic programming.

(p_1,p_2)      max_{(i,j)} E[N_{pi_h}(i,j)]/V(i,j)   max_{(i,j)} E[N_{pi_h*}(i,j)]/V(i,j)
(0.1,0.9)                 1.1033                                1.0055
(0.2,0.8)                 1.0710                                1.0094
(0.3,0.7)                 1.0333                                1.0066
(0.4,0.6)                 1.0139                                1.0021
(0.5,0.5)                 1                                     1
(0.005,0.5)               1.0410                                1
(0.0005,0.5)              1                                     1

Again we see that, by following the improved heuristic policy, $E[N_{\pi_h^*}]$ deviates less than 1% from the optimal.

5.2. An optimal policy when there are 2 boxes

5.2.1. The $(1,m)$ model. Assume $p_1<p_2$. The policy that minimizes $E[N]$ in state $(1,m)$ is to put a $(1,1)$-vectored ball in box 1 if and only if $m \le M_1$, where
$$M_1 = \frac{\log\big(\frac{p_1q_2}{p_2q_1}\big)}{\log\big(1-\frac{p_1}{1-q_1q_2}\big)}.$$

Proof. The proof is by a one-stage look-ahead argument. Assume the current state is $S=(1,m+1)$. Because we are making a decision only if the current ball has a $(1,1)$ vector, we assume so. Because once box 1 is filled there are no more decisions involved and the optimal "cost to go" is known, we can treat this model as a stopping-time problem. Because in a finite-stage monotone stopping problem a one-stage look-ahead policy is optimal, it is sufficient to show that the $(1,m)$ problem constitutes a finite-stage monotone stopping problem. Let $H(0,m+1)=\frac{m+1}{p_2}$ be the expected number of balls to collect if we put the current ball in box 1 (stopping now), and let $H(1,m)$ be the expected number if we put the current ball in box 2 and the next box-1-eligible ball in box 1 (going one stage more and then stopping). The one-stage look-ahead policy puts the current ball in box 1 if $H(0,m+1) \le H(1,m)$. Thus, we must show that $H(1,m)-H(0,m+1)$ decreases monotonically in $m$. To compute $H(1,m)$, we consider the balls that are eligible for at least one box. Conditioning on the first box-1-eligible ball among those, we see
$$H(1,m) = \sum_{r=1}^m \frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+\frac{m+1-r}{p_2}\Big] + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^m\Big[\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big]$$
$$= 1+\frac{m}{p_2}+\Big(\frac{q_1}{p_1}\Big)\Big(1-\frac{p_1}{1-q_1q_2}\Big)^m,$$
where the missing algebra in going from the first to the second equation is given in the appendix. Therefore
$$H(1,m)-H(0,m+1) = \Big(\frac{q_1}{p_1}\Big)\Big(1-\frac{p_1}{1-q_1q_2}\Big)^m - \frac{q_2}{p_2}.$$
Now it is straightforward to see that $H(1,m)-H(0,m+1)$ decreases monotonically in $m$, with limiting value $-\frac{q_2}{p_2}$ and initial value
$$H(1,0)-H(0,1) = \frac{q_1}{p_1}-\frac{q_2}{p_2}.$$
If $p_1<p_2$, then $H(1,0)-H(0,1)>0$.
5.2. An optimal policy when there are 2 boxes

5.2.1. The (1,m) model. Assume $p_1 < p_2$. The policy that minimizes $E[N]$ in state $(1,m)$ is to put a $(1,1)$ vector-ed ball in box 1 if and only if $m \le M_1$, where
\[ M_1 = \frac{\log\!\big(\frac{p_1 q_2}{p_2 q_1}\big)}{\log\!\big(1-\frac{p_1}{1-q_1q_2}\big)}. \]

Proof. The proof is by a one stage look ahead argument. Assume the current state is $S=(1,m+1)$. Because we are making a decision only if the current ball has a $(1,1)$ vector, we assume so. Because once box 1 is filled there are no more decisions involved and the optimal "cost to go" is known, we can treat this model as a stopping time problem. Because in a finite stage monotone stopping problem a one stage look ahead policy is optimal, it suffices to show that the $(1,m)$ problem constitutes a finite stage monotone stopping problem.

Let $H(0,m+1) = \frac{m+1}{p_2}$ be the expected number of balls to collect if we put the current ball in box 1 (stopping now), and let $H(1,m)$ be the expected number if we put the current ball in box 2 and the next box-1-eligible ball in box 1 (going one stage more and then stopping). The one stage look ahead policy puts the current ball in box 1 if $H(0,m+1) \le H(1,m)$. Thus, we must show that $H(1,m)-H(0,m+1)$ decreases monotonically in $m$. To compute $H(1,m)$, we consider the balls that are eligible for at least one box. Conditioning on the first box-1-eligible ball among those, we see
\[ H(1,m) = \sum_{r=1}^{m} \frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+\frac{m+1-r}{p_2}\Big] + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big] = 1+\frac{m}{p_2}+\frac{q_1}{p_1}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}, \]
where the missing algebra in going from the first to the second expression is given in the appendix. Therefore
\[ H(1,m)-H(0,m+1) = \frac{q_1}{p_1}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m} - \frac{q_2}{p_2}. \]
Now it is straightforward to see that $H(1,m)-H(0,m+1)$ decreases monotonically in $m$, with a final value $-\frac{q_2}{p_2}$ and an initial value
\[ H(1,0)-H(0,1) = \frac{q_1}{p_1}-\frac{q_2}{p_2}, \]
which is positive when $p_1 < p_2$. Therefore the $(1,m)$ model constitutes a finite stage monotone stopping problem, and the threshold policy $\pi_o(M_1)$ is optimal. □

Solving $H(1,m)-H(0,m+1)=0$ gives the critical value in state $(1,m)$:
\[ M_1 = \frac{\log\!\big(\frac{p_1 q_2}{p_2 q_1}\big)}{\log\!\big(1-\frac{p_1}{1-q_1q_2}\big)}. \]

5.2.2. The (2,m) model. Assume $p_1 < p_2$. There exists an $M_2$ such that the policy minimizing $E[N]$ in state $(2,m)$ puts a $(1,1)$ vector-ed ball in box 1 if and only if $m \le M_2$.

Proof. Again, the proof is by a one stage look ahead argument. Because we are making a decision only if the current ball has a $(1,1)$ vector, we assume so. Because once box 1 receives a ball the problem becomes a $(1,m)$ problem, where an optimal policy is known and therefore the optimal "cost to go" is calculable, we can treat the $(2,m)$ problem as a stopping time problem. As for the $(1,m)$ problem, we show the $(2,m)$ model constitutes a finite stage monotone stopping problem. Assume the current state is $S=(2,m+1)$. Let $V(1,m)$ be the minimal expected number of balls to collect until there is at least 1 ball in box 1 and $m$ balls in box 2, and let $H(2,m)$ be the minimal expected number if we put the current ball in box 2 and the next box-1-eligible ball in box 1. The one stage look ahead policy puts the current ball in box 1 if $V(1,m+1) \le H(2,m)$. Thus, we show that $H(2,m)-V(1,m+1)$ decreases monotonically in $m$. We already showed that in state $(1,m)$ there exists an $M_1$ such that it is best to put a $(1,1)$ vector-ed ball in box 1 if and only if $m \le M_1$.

Case 1: $m \le M_1$. Here
\[ H(2,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+V(1,m+1-r)\Big] + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{2}{p_1}\Big] = 2+\frac{m}{p_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m}\Big[\frac{mq_1}{1-q_1q_2}+\frac{2q_1}{p_1}\Big], \]
noting that $1-\frac{p_1}{1-q_1q_2}=\frac{q_1p_2}{1-q_1q_2}$. Therefore
\[ H(2,m)-V(1,m+1) = -\frac{q_2}{p_2} + \frac{q_1}{1-q_1q_2}\bigg( E[X;\,X>m] + \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m} \bigg), \]
where $X$ follows a geometric distribution with mean $\frac{1-q_1q_2}{p_1}$. It is straightforward to see that $H(2,m)-V(1,m+1)$ decreases monotonically in $m$, and it is easily verified that $H(2,M_1)-V(1,M_1+1) > 0$. Thus, given $m \le M_1$, the one stage look ahead policy in state $(2,m)$ puts a $(1,1)$ ball in box 1.

Case 2: $m = M_1+k$, $k > 0$. Consider the balls that are eligible for at least one box and condition on the first arrival having a $(1,0)$ vector among those; we get
\[ V(1,M_1+k) = \frac{M_1+k}{p_2} + \Big(\frac{p_2}{1-q_1q_2}\Big)^{k}\Big(c_1-\frac{M_1}{p_2}\Big), \]
\[ H(2,M_1+k) = 1+\frac{M_1+k}{p_2} + \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{k}\Big(\frac{M_1q_1q_2}{p_2(1-q_1q_2)}+c_2-1-\frac{c_1}{1-q_1q_2}\Big) + \Big(\frac{p_2}{1-q_1q_2}\Big)^{k}\Big(\frac{c_1}{1-q_1q_2}-\frac{M_1}{p_2(1-q_1q_2)}\Big), \]
where $c_1 = V(1,M_1)$ and $c_2 = H(2,M_1)$. Substituting and simplifying (the missing algebra is given in the appendix) shows that $H(2,M_1+k)-V(1,M_1+k+1)$ decreases monotonically in $k$, with a final value $-\frac{q_2}{p_2}$ and an initial value $H(2,M_1)-V(1,M_1+1) > 0$. We conclude that in state $(2,m)$ there exists an $M_2$ such that it is optimal to put a $(1,1)$ vector-ed ball in box 1 if and only if $m \le M_2$. □
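As a quick numeric illustration (a sketch, not from the dissertation; the function name is ours), the critical value $M_1$ of the $(1,m)$ model can be evaluated directly from the closed form:

    import math

    def M1(p1, p2):
        # Critical value for the (1,m) model of Section 5.2.1 (p1 < p2):
        # put a (1,1) vector-ed ball in box 1 iff m <= M1(p1, p2).
        q1, q2 = 1 - p1, 1 - p2
        return math.log(p1 * q2 / (p2 * q1)) / math.log(1 - p1 / (1 - q1 * q2))

For example, M1(0.3, 0.7) is approximately 3.55, so the $(1,1)$ ball goes to box 1 only while box 2's quota is at most 3. No closed form for $M_2$ is claimed; it is characterized implicitly by the monotone difference above.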
We believe there exist values $M_i$, $i \ge 1$, such that in state $(i,m)$ the optimal policy puts a $(1,1)$ vector-ed ball in box 1 if and only if the current quota of box 2 is less than or equal to $M_i$.

5.2.3. The (1,m) model with cost (1). Assume the current state is $S=(1,m)$; box 1 incurs a cost of 1 dollar per time unit and box 2 incurs a cost of $c$ dollars per time unit, as long as they are alive. The primary interest in this section is to find a policy that minimizes the expected total cost.

5.2.3.1. An optimal policy. Let $U$ be the expected total cost. There exists an $M_1$ such that the policy minimizing $U$ puts a $(1,1)$ vector-ed ball in box 1 if and only if $m \ge M_1$.

Proof. As in the preceding section, we show that the $(1,m+1)$ cost problem constitutes a finite stage monotone stopping problem. In particular, assuming the current ball has a $(1,1)$ vector, we show that $W(1,m)-U(0,m+1)$ is a monotone function of $m$, where $U(0,m+1)$ is the expected minimal cost if we put the current ball in box 1 (stopping now), and $W(1,m)$ is the expected minimal cost if we put the current ball in box 2 and the next box-1-eligible ball in box 1 (going one stage more, then stopping). Conditioning on the arrival of a ball that is eligible for box 1, we see
\[ W(1,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\Big\{\frac{r}{1-q_1q_2}+\frac{r}{1-q_1q_2}c+\frac{m+1-r}{p_2}c\Big\} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big\{\frac{m}{1-q_1q_2}c+\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big\} \]
\[ = \frac{1}{p_1}-\frac{q_2}{p_2}c+\frac{m+1}{p_2}c-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}c. \]
After some algebra we get
\[ W(1,m)-U(0,m+1) = \frac{1}{p_1}-\frac{q_2}{p_2}c-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}c. \]
It is straightforward to see that $W(1,m)-U(0,m+1)$ increases monotonically in $m$, with a final value $\frac{p_2-p_1q_2c}{p_1p_2}$ and an initial value $\frac{p_2-p_1c}{p_1p_2}$. □

(1) If $p_2-p_1q_2c \le 0$, it is best to put every $(1,1)$ ball in box 2; that is, $M_1 = \infty$.
(2) If $p_2-p_1c \ge 0$, it is best to put every $(1,1)$ ball in box 1; that is, $M_1 = 1$.
(3) If $p_2-p_1c < 0$ and $p_2-p_1q_2c > 0$, there exists an $M_1$ such that it is better to put a $(1,1)$ ball in box 1 if and only if $m \ge M_1$, where $M_1$ is the solution of
\[ \frac{1}{p_1}-\frac{q_2}{p_2}c-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}c = 0. \]
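A minimal sketch of the resulting threshold computation (not from the dissertation; the function name and the use of math.inf to encode "always box 2" are our conventions), covering the three cases above:

    import math

    def M1_cost(p1, p2, c):
        # Threshold for the flat-cost (1,m) model of Section 5.2.3: put a
        # (1,1) vector-ed ball in box 1 iff m >= the returned value.
        q1, q2 = 1 - p1, 1 - p2
        if p2 - p1 * q2 * c <= 0:
            return math.inf                 # case (1): always box 2
        if p2 - p1 * c >= 0:
            return 1                        # case (2): always box 1
        # case (3): solve 1/p1 - (q2/p2)*c = c*(1 - p1/(1-q1*q2))**m for m
        return math.log((1 / p1 - q2 * c / p2) / c) / math.log(1 - p1 / (1 - q1 * q2))

The case guards are exactly the sign conditions on the initial and final values of $W(1,m)-U(0,m+1)$ derived above.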
5.2.4. The (1,m) model with cost (2). Assume the current state is $S=(1,m)$ and $p_1 < p_2$; each unit of quota of box 1 incurs a cost of 1 dollar per time unit and each unit of quota of box 2 incurs a cost of $c$ dollars per time unit, as long as the boxes are alive. The primary interest in this section is to find a policy that minimizes the expected total cost.

5.2.4.1. An optimal policy. Let $U$ be the expected total cost. There exists an $M_1$ such that the policy minimizing $U$ puts a $(1,1)$ vector-ed ball in box 1 if and only if $m \le M_1$.

Proof. We show that the $(1,m+1)$ cost problem constitutes a finite stage monotone stopping problem. In particular, assuming the current ball has a $(1,1)$ vector, we show that $W(1,m)-U(0,m+1)$ is a monotone function of $m$, where $U(0,m+1)$ is the expected minimal cost if we put the current ball in box 1 (stopping now) and $W(1,m)$ is the expected minimal cost if we put the current ball in box 2 and the next box-1-eligible ball in box 1 (going one stage more, then stopping). Conditioning on the arrival of a ball that is eligible for box 1, we see
\[ W(1,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\bigg\{\frac{r}{1-q_1q_2}+\frac{c}{1-q_1q_2}\sum_{i=m+1-r}^{m} i+\frac{c}{p_2}\sum_{i=1}^{m+1-r} i\bigg\} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\bigg\{\frac{c}{1-q_1q_2}\sum_{i=1}^{m} i+\frac{m}{1-q_1q_2}+\frac{1}{p_1}\bigg\}. \]
After some algebra (given in the appendix) we get
\[ W(1,m)-U(0,m+1) = -mc\,\frac{q_2}{p_2} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m} c\,\frac{q_1p_2}{p_1} + \frac{1}{p_1} + \frac{c\,(q_1q_2+q_1q_2p_2-1)}{p_1p_2}. \]
It is straightforward to see that $W(1,m)-U(0,m+1)$ decreases monotonically in $m$, with a final value of $-\infty$ and an initial value
\[ W(1,0)-U(0,1) = c\,\frac{q_1p_2}{p_1} + \frac{1}{p_1} + \frac{c\,(q_1q_2+q_1q_2p_2-1)}{p_1p_2} = \frac{1}{p_1}-\frac{c}{p_2} = \frac{p_2-cp_1}{p_1p_2}, \]
which is positive whenever $cp_1 < p_2$ (in particular whenever $c \le 1$ and $p_1 \le p_2$). We have shown that the $(1,m)$ cost problem constitutes a monotone stopping problem; therefore the one stage look ahead threshold policy $\pi_o$ is optimal. □

Theorem. For any $p_1$ and $p_2$, define $M_1(c) = \min_m\{\, m : W(1,m,c)-U(0,m+1,c) < 0 \,\}$. If $c_1 < c_2$, then $M_1(c_1) > M_1(c_2)$.

Proof. $W(1,m)-U(0,m+1)$ is a linear, decreasing function of $c$, and it decreases monotonically in $m$. □

Theorem. For any $p_2$ and $c$, define $M_1(p_1) = \min_m\{\, m : W(1,m,p_1)-U(0,m+1,p_1) < 0 \,\}$. If $p_1' > p_1$, then $M_1(p_1') \le M_1(p_1)$.

Proof. Differentiation shows that $\frac{d}{dp_1}\big[W(1,m)-U(0,m+1)\big] < 0$ for every $m$. Because $W(1,m)-U(0,m+1)$ decreases monotonically in $p_1$, combining with the fact that it decreases monotonically in $m$, we conclude that for any $p_1 < p_1' < p_2$, $M_1(p_1') \le M_1(p_1)$. □
CHAPTER 6
An index policy

A priority index policy $j_1,\dots,j_n$ is a permutation of $1\dots n$ with the instruction that an arriving ball having vector values $(x_1,\dots,x_n)$ should be put in box $j_i$ if $i$ is the smallest integer for which box $j_i$ is alive and $x_{j_i}=1$. Let $S=(s_1,\dots,s_n)$ be the state, where $s_i$ is the initial quota of box $i$, $i=1\dots n$. If $s_1 \ge s_2 \ge \dots \ge s_n$, then, assuming that $X_1,\dots,X_n$ are exchangeable, we show that the only index policy that could be optimal is $\pi^* \equiv (1,2,\dots,n)$. When $X_1,\dots,X_n$ are independent, we give an efficient algorithm for estimating $E[N_{\pi^*}]$.

6.1. When $X_1,\dots,X_n$ are exchangeable

Let $N_\pi(S)$ denote the number of balls it takes to fill all the boxes starting from state $S$ when following a priority index policy $\pi$. We have the following theorem.

6.1.1. An optimal index policy. For all $S$ satisfying $s_1 \ge s_2 \ge \dots \ge s_n$, and for all $r \ge 1$,
\[ P(N_{\pi^*}(S) \le r) = \max_\pi P(N_\pi(S) \le r). \]

To show the optimality of $\pi^*$, let us define a relationship between certain pairs of quota vectors. Specifically, for $S$ and $S'$ being vectors of $i_1,\dots,i_n$, say that $S \succ_d S'$ if there are distinct integers $i_1,\dots,i_j,i_{j+1},\dots,i_n$ such that $i_j > i_{j+1}$, $S = (i_1,\dots,i_j,i_{j+1},\dots,i_n)$ and $S' = (i_1,\dots,i_{j+1},i_j,\dots,i_n)$. We introduce the following lemma.

6.1.2. A lemma. If $S \succ_d S'$ then $P(N_{\pi^*}(S') \le r) \le P(N_{\pi^*}(S) \le r)$ for all $r \ge 1$, with strict inequality for some $r$.

Proof. Given $S \succ_d S'$, consider two scenarios, the first starting from state $S$ and the second from $S'$. The exchangeability assumption allows us to generate a random vector $X=(X_1,\dots,X_j,X_{j+1},\dots,X_n)$ as the vector of the arriving ball in the first scenario and to take $Y=(X_1,\dots,X_{j+1},X_j,\dots,X_n)$ as the vector in the second scenario. Following $\pi^* \equiv (1,2,\dots,n)$: if $(X_j,X_{j+1})=(1,0)$ decreases $S(j)$ by one, it decreases $S'(j+1)$ by one; similarly, if $(X_j,X_{j+1})=(0,1)$ decreases $S(j+1)$ by one, it decreases $S'(j)$ by one; and if $(X_j,X_{j+1})=(1,1)$ decreases $S(j)$ by one, it decreases $S'(j)$ by one. Considering $(S(j),S(j+1))$ versus $(S'(j),S'(j+1))$, because the initial value of $S(j)=i_j$ is greater than the initial value of $S'(j)=i_{j+1}$, we have the following:
(1) either $S'(j)$ drops to 0 first, e.g. $(1,1)$ vs $(0,2)$;
(2) or $S(j+1)$ and $S'(j)$ drop to 0 simultaneously, e.g. $(1,0)$ vs $(0,1)$;
(3) or $S$ and $S'$ become equal at some stage, e.g. $(1,1)$ vs $(1,1)$.
If (3) happens we take $X = Y$. It is easy to see that if (1) happens, $P(N_{\pi^*}(S') \le r) < P(N_{\pi^*}(S) \le r)$, and if (2) or (3) happens, $P(N_{\pi^*}(S') \le r) = P(N_{\pi^*}(S) \le r)$. Because the probability that (1) happens is positive, the lemma follows. □

The preceding lemma shows that if the index policy $1\dots n$ is employed, then between any neighboring boxes $i$ and $i+1$ it is preferred that box $i$ have the higher initial quota. It follows that if $s_1 \ge s_2 \ge \dots \ge s_n$, then the only priority index policy that could be optimal is $\pi^* \equiv (1,2,\dots,n)$.

6.2. Estimating E[N] by efficient simulation

We are interested in using simulation to efficiently estimate $E[N_{\pi^*}]$. Let $B_i$, $i=0\dots t-1$, and $A_i$ carry the same notation as in Chapter 4. We see that
\[ \mathrm{est}_1 = E[N_{\pi^*} \mid B] = \sum_{j=0}^{t-1} \frac{1}{1-\prod_{i\in B_j} q_i} \]
serves as a natural estimator of $E[N_{\pi^*}]$. Besides that, we present another estimator of $E[N_{\pi^*}]$. Let $R_i$ denote the remaining quota of box $i$ at the moment boxes $1,\dots,i-1$ have all been filled. Let $T_1$ denote the number of balls collected until box 1 is filled and, for $i>1$, let $T_i$ denote the number of additional balls collected after boxes $1,\dots,i-1$ are filled until box $i$ is filled. (If box $i$ is filled before all of boxes $1,\dots,i-1$ are, then $T_i=0$.) $N_{\pi^*}$ can be interpreted as $N_{\pi^*} = \sum_{i=1}^n T_i$, where, conditional on $R_i$, $T_i$ follows a negative binomial distribution with parameters $(R_i,p_i)$. By this interpretation, we get another estimator,
\[ \mathrm{est}_2 = \sum_{i=1}^{n} E[T_i \mid R_i] = \frac{s_1}{p_1} + \sum_{i=2}^{n} \frac{R_i}{p_i}. \]

$R_i$ can be obtained by two methods. The first is to simulate $B$ and then read $R_i$ from the simulation result. The advantages of this method are: (1) because $\mathrm{est}_1$ and $\mathrm{est}_2$ can share one $B$, without much additional work we get two unbiased estimators of $E[N_{\pi^*}]$; (2) because $\mathrm{est}_1$ and $\mathrm{est}_2$ are strongly negatively correlated (shown with an example later), their linear combination has a lower variance. The other method is to obtain $R_i$ by direct simulation, as follows. The number of balls collected until there are $s_1$ balls in box 1 follows a negative binomial distribution with parameters $(s_1,p_1)$.
(1) Generate $Y_1 \sim \mathrm{NBino}(s_1,p_1)$. Among the balls not eligible for box 1, each is eligible for box 2 with probability $p_2$.
(2) Generate $n_2 \sim \mathrm{Bino}(Y_1-s_1,p_2)$, and update the remaining quota of box 2 as $R_2 = s_2-n_2$. Now box 2 has the highest priority, and the additional number of balls to collect until there are $s_2$ balls in it follows a negative binomial distribution with parameters $(R_2,p_2)$. Among the left-overs from (1) and (2), each ball is eligible for box 3 with probability $p_3$, and box 3 is the current box with the highest priority...
We continue to simulate recursively until $R_n$ is found. Call this method the Negative Binomial and Binomial Way (denoted NBBW). The advantage of NBBW is that it greatly speeds up the simulation rounds, especially when $t$ is large.

6.2.1. Variance reduction techniques. Let $\mathrm{est}_3$ denote the best linear combination of $\mathrm{est}_1$ and $\mathrm{est}_2$:
\[ \mathrm{est}_3 = \alpha\,\mathrm{est}_1 + (1-\alpha)\,\mathrm{est}_2, \]
where $\alpha$ is the constant that minimizes $\mathrm{Var}[\mathrm{est}_3]$.
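A minimal sketch of the first method (not from the dissertation; the function name, seed, and the Bernoulli eligibility draws are our illustrative choices): one simulation round under the index policy $(1,\dots,n)$ that returns both conditional-expectation estimators from the same acceptance sequence $B$:

    import random

    def one_round_est1_est2(s, p, rng=random.Random(2)):
        # est1 sums 1/(1 - prod of q_i over alive boxes) per acceptance;
        # est2 = s_1/p_1 + sum_{i>=2} R_i/p_i from the residual quotas.
        n = len(s)
        r = list(s)
        R = [s[0]] + [None] * (n - 1)
        est1 = 0.0
        while any(q > 0 for q in r):
            alive_prod = 1.0
            for i in range(n):
                if r[i] > 0:
                    alive_prod *= 1 - p[i]
            est1 += 1 / (1 - alive_prod)          # expected balls per acceptance
            while True:                           # draw balls until one is accepted
                x = [rng.random() < pi for pi in p]
                eligible = [i for i in range(n) if x[i] and r[i] > 0]
                if eligible:
                    r[min(eligible)] -= 1         # smallest alive eligible index
                    break
            for i in range(1, n):                 # record R_i when boxes 1..i-1 fill
                if R[i] is None and all(r[k] == 0 for k in range(i)):
                    R[i] = r[i]
        est2 = s[0] / p[0] + sum(R[i] / p[i] for i in range(1, n))
        return est1, est2

Both returned values are unbiased for $E[N_{\pi^*}]$, and the paired output is what the linear combination $\mathrm{est}_3$ is built from.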
6.2.2. A numerical example. Under the priority policy $(1,\dots,n)$, for $S_1=(9\ 7\ 5\ 3\ 1)$, $S_2=(1\ 3\ 5\ 7\ 9)$, $S_3=(7\ 6\ 5\ 4\ 3)$, $S_4=(3\ 4\ 5\ 6\ 7)$, and $S_5=(5\ 5\ 5\ 5\ 5)$, we compare the variances of the estimators of $E[N_{\pi^*}]$ for different $P$'s.

$P(X_i=1)=(0.1,\,0.1,\,0.1,\,0.1,\,0.1)$:
               S_1        S_2        S_3        S_4        S_5
  E[N]       103.2813   111.3130    89.7090    98.8910    88.4680
  V[est_raw] 598.0792   690.4121   431.9029   514.4947   394.3360
  V[est_1]   147.1061   156.2715    83.1205   107.0744    59.5013
  V[est_2]   294.0002   899.9211   390.6321   762.3001   579.2369
  V[est_3]    90.6068   117.8720    71.1399    86.1324    56.8689

$P(X_i=1)=(0.5,\,0.5,\,0.5,\,0.5,\,0.5)$:
               S_1        S_2        S_3        S_4        S_5
  E[N]        28.8134    34.0351    28.9209    32.1453    30.3200
  V[est_raw]   7.1180    17.8521     6.6786    13.5873     9.7520
  V[est_1]     1.0883     2.1895     0.5739     1.6612     1.0328
  V[est_2]    19.6746    23.6108    21.0089    23.0769    21.9435
  V[est_3]     1.1084     1.6422     0.5501     1.3620     0.9021

$P(X_i=1)=(0.9,\,0.9,\,0.9,\,0.9,\,0.9)$:
               S_1        S_2        S_3        S_4        S_5
  E[N]        25.1707    25.9952    25.3394    25.7628    25.5510
  V[est_raw]   0.1918     1.1205     0.3642     0.8410     0.6067
  V[est_1]     0.0039     0.0079     0.0038     0.0068     0.0057
  V[est_2]     1.4404     1.2864     1.5257     1.3603     1.3928
  V[est_3]     0.0042     0.0060     0.0033     0.0055     0.0050

$P(X_i=1)=(0.1,\,0.2,\,0.3,\,0.4,\,0.5)$:
               S_1        S_2        S_3        S_4        S_5
  E[N]        90.5983    35.0942    71.2664    39.2291    52.6486
  V[est_raw] 759.2083    22.7604   573.8432   121.8558   347.8189
  V[est_1]   264.5206     4.0569   220.0875    52.3749   157.3176
  V[est_2]     5.9863   111.0756    13.8373    90.5085    30.4907
  V[est_3]     4.6529     3.7305     8.9521    14.6182    13.6030

$P(X_i=1)=(0.1,\,0.3,\,0.5,\,0.7,\,0.9)$:
               S_1        S_2        S_3        S_4        S_5
  E[N]        89.8647    26.7684    69.9792    34.5302    50.6927
  V[est_raw] 814.9489    15.7503   617.0171   168.7088   414.0693
  V[est_1]   219.6741     5.0613   196.0857    80.8666   170.3756
  V[est_2]     0.3786    58.3685     1.4225    36.1752     6.6006
  V[est_3]     0.3502     2.9153     1.1970     9.8131     3.9783

The variances listed in the tables above are per simulation round, based on 1000 simulation rounds. We see that when the $X_i$ are exchangeable, $\mathrm{est}_1$ has a lower variance than $\mathrm{est}_2$, and when the $X_i$ are not exchangeable, $\mathrm{est}_1$ has a higher variance than $\mathrm{est}_2$. The linear combination of $\mathrm{est}_1$ and $\mathrm{est}_2$ has a much lower variance because they are negatively correlated. For example, given $S=(9\ 7\ 5\ 3\ 1)$, the sequence $B$ that maximizes the variance of $\mathrm{est}_1$ is $(9\ 7\ 5\ 3\ 1) \to (9\ 7\ 5\ 3\ 0) \to (9\ 7\ 5\ 0\ 0) \to (9\ 7\ 0\ 0\ 0) \to (9\ 0\ 0\ 0\ 0)$; this $B$ also minimizes the variance of $\mathrm{est}_2$.
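For completeness, a small sketch of how the weight in $\mathrm{est}_3$ can be estimated from paired simulation output (ours, not the dissertation's; it uses the standard formula $\alpha^* = \frac{\mathrm{Var}(Y)-\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)+\mathrm{Var}(Y)-2\,\mathrm{Cov}(X,Y)}$ for minimizing $\mathrm{Var}[\alpha X + (1-\alpha)Y]$):

    def best_alpha(x, y):
        # Variance-minimizing weight for est3 = a*est1 + (1-a)*est2,
        # estimated from paired samples (x_k, y_k) of (est1, est2).
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        vx = sum((u - mx) ** 2 for u in x) / (n - 1)
        vy = sum((v - my) ** 2 for v in y) / (n - 1)
        cxy = sum((u - mx) * (v - my) for u, v in zip(x, y)) / (n - 1)
        return (vy - cxy) / (vx + vy - 2 * cxy)

The strong negative correlation observed above is what makes the combined estimator so effective.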
CHAPTER 7
A queuing model

7.1. When each box has a life time

This chapter studies a variation of the Stochastic Employment Problem considered in the preceding chapters. In particular, we consider the problem in which each box has an exponentially distributed life time: box $i$ has a life time following an exponential distribution with parameter $\mu_i$, $i=1\dots n$. Balls arrive according to a Poisson process with parameter $\lambda$, and each ball has a binary vector $X = (X_1,\dots,X_n)$, with the interpretation that if $X_i=1$ the ball is eligible to be put in box $i$. Assuming the $X_i$ are independent, the primary interest of this chapter is to find a policy that stochastically maximizes the number of filled boxes when possible, or that maximizes the probability that all boxes are filled when a stronger result is not obtainable. To maximize the expected number of filled boxes we can always find a policy by dynamic programming, but this is inefficient because the needed computations grow exponentially in $n$. To deal with this shortcoming, we introduce a heuristic policy; starting from it, we then improve it by applying the policy improvement algorithm of dynamic programming.

7.1.1. When $\mu_1 = \dots = \mu_n$. Assume the life time of each box is independent and identically distributed with parameter $\mu$. Without loss of generality we assume $p_1 < \dots < p_n$, where $P(X_i=1)=p_i$. Let $N(S)$ be the number of boxes filled starting from state $S$. Let $\pi_o$ be the policy that always puts a ball in box $i$, where $i = \min\{\, j : X_j = 1 \text{ and box } j \text{ is alive} \,\}$; that is, box $i$ is the smallest-indexed alive eligible box when a ball arrives.

Claim: $N(S)$ is stochastically maximized by $\pi_o$. That is, for all $S$ and $r \le |S|$,
\[ P(N_{\pi_o}(S) \ge r) = \max_\pi P(N_\pi(S) \ge r). \]

To prove its optimality, let us define a relationship between certain sets of boxes. Specifically, for $S$ and $S'$ being subsets of $\{1,2,\dots,n\}$, say $S \succ_d S'$ if there are distinct integers $\{i,j,i_1,i_2,\dots,i_k\}$ such that $i<j$, $S=(j,i_1,i_2,\dots,i_k)$ and $S'=(i,i_1,i_2,\dots,i_k)$. We have the following lemma: if $S \succ_d S'$, then $N_{\pi_o}(S) >_{(1)} N_{\pi_o}(S')$, where $>_{(1)}$ denotes first-order stochastic dominance. The proof is by induction on $|S|$, coupling the ball vectors.

Proof. (1) When $|S|=1$, we know $S=\{j\}$ and $S'=\{i\}$, and
\[ N_{\pi_o}(S) = \begin{cases} 1 & \text{with probability } \frac{p_j\lambda}{p_j\lambda+\mu} \\ 0 & \text{with probability } \frac{\mu}{p_j\lambda+\mu} \end{cases} \qquad N_{\pi_o}(S') = \begin{cases} 1 & \text{with probability } \frac{p_i\lambda}{p_i\lambda+\mu} \\ 0 & \text{with probability } \frac{\mu}{p_i\lambda+\mu} \end{cases} \]
Because $p_j > p_i$, it is immediate that $N_{\pi_o}(S) >_{(1)} N_{\pi_o}(S')$.

(2) Assume that when $|S|=|S'|=n$ and $S \succ_d S'$, it is true that $N_{\pi_o}(S) >_{(1)} N_{\pi_o}(S')$. We show it holds when $|S|=|S'|=n+1$. Let $S_1$ be the state starting from $S$ when the first event happens (either a box dies or a ball arrives), and let $S_1'$ be the corresponding state starting from $S'$. By the induction hypothesis, to show $N_{\pi_o}(S) >_{(1)} N_{\pi_o}(S')$ it is sufficient to show that $S_1 \succeq_d S_1'$ with probability 1, which we show by coupling the ball vectors. First we generate a uniform $(0,1)$ random variable $I$.
(1) If $I \in \big(0, \frac{\mu}{\lambda+(n+1)\mu}\big]$, set box $j \in S$ and box $i \in S'$ to die. Then $S_1 = S_1'$, and therefore $N_{\pi_o}(S_1) =_d N_{\pi_o}(S_1')$.
(2) If $I \in \big(\frac{k\mu}{\lambda+(n+1)\mu}, \frac{(k+1)\mu}{\lambda+(n+1)\mu}\big]$, $k=1\dots n$, set box $i_k$ to die in both $S$ and $S'$. Then $S_1 \succ_d S_1'$.
(3) If $I \in \big(\frac{(n+1)\mu}{\lambda+(n+1)\mu}, 1\big]$, a ball arrives. By cross-coupling the coordinates $X_i$ and $X_j$ of the ball vectors (the same method employed in Chapter 3), whenever a ball is eligible for box $i$ it is also eligible for box $j$. This gives $S_1 \succeq_d S_1'$.
Therefore $N_{\pi_o}(S) >_{(1)} N_{\pi_o}(S')$ for all $S \succ_d S'$. □

To show that $N(S)$ is stochastically maximized by $\pi_o$, we employ an induction argument. First we notice that if $|S|=1$, all policies are the same. Then assume that when $|S|=n$, $N(S)$ is stochastically maximized by $\pi_o$; we show this holds when $|S|=n+1$. If the first event to happen is a box death, then by the induction hypothesis it is optimal to follow $\pi_o$ from then on. Assuming the first event is an arrival: if the ball is not eligible for any box in $S$, it is dropped and the problem remains unchanged, thanks to the memoryless property of the exponential life times; if the ball is eligible for at least one box in $S$, then, combining with the lemma, it is optimal to follow $\pi_o$. □

7.1.2. When $\mu_1 \ge \dots \ge \mu_n$ and $p_1 \le \dots \le p_n$. Assume each box has an independent exponential life time, with parameters $\mu_1 > \dots > \mu_n$ and $p_1 < \dots < p_n$. Let $N(S)$ be the number of boxes that are filled starting from state $S$. Let $\pi_o$ be the policy that always puts a ball in box $i$, where $i = \min\{\, j : X_j = 1 \text{ and box } j \text{ is alive} \,\}$; that is, box $i$ is the smallest-indexed alive eligible box when a ball arrives.

Claim: $P_{\pi_o}(N(S)=|S|) = \max_\pi P_\pi(N(S)=|S|)$ for all $S$.

To prove its optimality, let us define a relationship between certain sets of boxes.
Specifically, for $S$ and $S'$ being subsets of $\{1,2,\dots,n\}$, say $S \succ_d S'$ if there are distinct integers $\{i,j,i_1,i_2,\dots,i_k\}$ such that $i<j$, $S=(j,i_1,i_2,\dots,i_k)$ and $S'=(i,i_1,i_2,\dots,i_k)$. We have the following lemma: if $S \succ_d S'$ and $|S|=|S'|=k$, then $P_{\pi_o}(N(S)=k) > P_{\pi_o}(N(S')=k)$. The proof is by induction on $|S|$, coupling the ball vectors.

Proof. (1) When $|S|=1$, we know $S=\{j\}$ and $S'=\{i\}$, and
\[ N_{\pi_o}(S) = \begin{cases} 1 & \text{with probability } \frac{p_j\lambda}{p_j\lambda+\mu_j} \\ 0 & \text{with probability } \frac{\mu_j}{p_j\lambda+\mu_j} \end{cases} \qquad N_{\pi_o}(S') = \begin{cases} 1 & \text{with probability } \frac{p_i\lambda}{p_i\lambda+\mu_i} \\ 0 & \text{with probability } \frac{\mu_i}{p_i\lambda+\mu_i} \end{cases} \]
Because $\mu_j < \mu_i$ and $p_j > p_i$, it is immediate that $P_{\pi_o}(N(S)=1) > P_{\pi_o}(N(S')=1)$.

(2) Assume that when $|S|=|S'|=k$, $P_{\pi_o}(N(S)=k) > P_{\pi_o}(N(S')=k)$. We show this holds when $|S|=|S'|=k+1$. Let $S_1$ denote the state starting from $S$ when the first event happens (either a box dies or a ball arrives), and $S_1'$ the corresponding state from $S'$. Combining with the induction hypothesis, it is sufficient to show that $S_1 \succeq_d S_1'$ with probability 1. Let $A$ denote the event that a ball arrives before any box dies. First we notice that
\[ P_{\pi_o}(N(S)=k+1) = P_{\pi_o}(N(S)=k+1 \mid A)\,P(A) = \frac{\lambda}{\lambda+\sum_{i\in S}\mu_i}\, P_{\pi_o}(N(S)=k+1 \mid \text{a ball arrives}), \]
and similarly
\[ P_{\pi_o}(N(S')=k+1) = \frac{\lambda}{\lambda+\sum_{i\in S'}\mu_i}\, P_{\pi_o}(N(S')=k+1 \mid \text{a ball arrives}). \]
Because $S \succ_d S'$, $\frac{\lambda}{\lambda+\sum_{i\in S}\mu_i} > \frac{\lambda}{\lambda+\sum_{i\in S'}\mu_i}$. It is therefore sufficient to show that $P_{\pi_o}(N(S)=k+1 \mid \text{a ball arrives}) \ge P_{\pi_o}(N(S')=k+1 \mid \text{a ball arrives})$, which we show by cross-coupling the coordinates $X_i$ and $X_j$ of the arriving ball: because $p_i < p_j$, whenever a ball is eligible for box $i$ it is also eligible for box $j$. This gives $S_1 \succeq_d S_1'$ almost surely. Combining with the induction hypothesis, the lemma is proved. □

To show $P_{\pi_o}(N(S)=|S|) = \max_\pi P_\pi(N(S)=|S|)$, we again employ an induction argument. First we notice that if $|S|=1$, all policies are the same. Then assume that when $|S|=n$, $P_\pi(N(S)=|S|)$ is maximized by $\pi_o$; we show this holds when $|S|=n+1$. Consider the case where the first event is an arrival. If the ball is not eligible for any box in $S$, it is dropped and the problem remains unchanged, thanks to the memoryless property of the exponential life times. If the ball is eligible for at least one box in $S$, not following $\pi_o$ on the initial assignment leads to a dominated state; combining with the lemma, we should always follow $\pi_o$. □

7.1.3. When both $p_1,\dots,p_n$ and $\mu_1,\dots,\mu_n$ are general. This section considers the scenario in which $X_1,\dots,X_n$ are independent but not identically distributed and $\mu_1,\dots,\mu_n$ are general. For the problem of maximizing $E[N]$, the standard dynamic programming recursion cannot be applied effectively, because the needed number of computations grows exponentially in $n$. To overcome the difficulty, we introduce a heuristic policy $\pi_h$, which puts a ball with vector $(X_1,\dots,X_n)$ in the box $j$ such that
\[ \frac{p_j}{\mu_j} = \min_{i:\,X_i=1} \frac{p_i}{\mu_i}, \]
that is, box $j$ is the eligible box having the smallest $p_i/\mu_i$. Starting with the policy $\pi_h$, we then employ the policy improvement algorithm of dynamic programming: among the currently alive set of boxes, it is best to put an arriving ball into the box $j$ for which
\[ E[N_{\pi_h}(S_1,\dots,S_j-1,\dots,S_n)] = \max_{i:\,X_i=1} E[N_{\pi_h}(S_1,\dots,S_i-1,\dots,S_n)], \]
where $E[N_{\pi_h}(S_1,\dots,S_i-1,\dots,S_n)]$ denotes the expected number of boxes saved when following $\pi_h$ after the arriving ball has been put in box $i$. With the help of simulation, the maximization can be effected at each stage of the problem. We call this new policy $\pi_{h^*}$; thus, $\pi_{h^*}$ is an on-line optimization policy.

To measure how "good" $\pi_h$ and $\pi_{h^*}$ are, we compare $E[N_{\pi_h}(S)]$ and $E[N_{\pi_{h^*}}(S)]$ with $V(S)$, where $V(S)$ is the maximum expected number of boxes that can be saved before they die. When $|S|$ is not large, $V(S)$ can be found by dynamic programming.

Example. For $|S|=5$, $\lambda=1$, $P=(0.5\ 0.6\ 0.7\ 0.8\ 0.9)$ and $\mu=(0.1\ 0.2\ 0.3\ 0.4\ 0.5)$, we compare $E[N_{\pi_h}(S)]$ and $E[N_{\pi_{h^*}}(S)]$ with $V(S)$:

  E[N_{pi_h}]    sigma^2    E[N_{pi_h*}]    sigma^2    V(S)
  2.8700         1.2456     2.9292          1.0927     2.9388

The variances listed in the table above are per simulation round. In total we took 100 simulation rounds; inside each round, upon making each decision we took 1000 simulation rounds. It is easy to see that the heuristic policy performs well, and that its improved version has $E[N_{\pi_{h^*}}]$ very close to the optimal value.
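A minimal simulation sketch of the lifetime model under $\pi_h$ (not from the dissertation; the function name, seed, and default arrival rate are our illustrative choices). Because all clocks are exponential, it suffices to simulate the embedded jump chain: the next event is an arrival with probability $\lambda/(\lambda+\sum_{i\,\text{alive}}\mu_i)$, else some alive box dies:

    import random

    def simulate_queue_pi_h(p, mu, lam=1.0, rng=random.Random(3)):
        # One round under pi_h: an eligible arriving ball goes to the alive
        # eligible box with the smallest p[i]/mu[i]; returns boxes filled.
        alive = set(range(len(p)))
        filled = 0
        while alive:
            total = lam + sum(mu[i] for i in alive)
            u = rng.random() * total
            if u < lam:                            # next event: an arrival
                eligible = [i for i in alive if rng.random() < p[i]]
                if eligible:
                    j = min(eligible, key=lambda i: p[i] / mu[i])
                    alive.remove(j)
                    filled += 1
            else:                                  # next event: a box dies
                u -= lam
                for i in sorted(alive):
                    if u < mu[i]:
                        alive.remove(i)
                        break
                    u -= mu[i]
        return filled

    p  = (0.5, 0.6, 0.7, 0.8, 0.9)
    mu = (0.1, 0.2, 0.3, 0.4, 0.5)
    est = sum(simulate_queue_pi_h(p, mu) for _ in range(10000)) / 10000

Averaged over many rounds, this reproduces the kind of $E[N_{\pi_h}]$ estimate reported in the example above.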
CHAPTER 8
Possible future work

8.1. Possible future work

Some possible future research directions are proposed in this chapter.

8.1.1. The two-box scenario. Section 5.2 studied the scenario in which there are two boxes. Assuming the ball vectors are independent and not identically distributed, the primary interest is to find a policy that minimizes the expected number of balls to collect until the problem ends. We believe that there exists a switching-curve style policy $(i, M_i)$, such that it is best to put a $(1,1)$ vector-ed ball in box 1 if and only if the quota of box 2 is less than or equal to $M_i$, given that box 1 has quota $i$. Although we proved the cases $i=1,2$, we do not have a proof for general $i$. We believe this is worth further work.

8.1.2. When $\mu_1 \ge \dots \ge \mu_n$ and $p_1 \le \dots \le p_n$. Section 7.1.2 considered the scenario in which each box has an independent exponential life time, with parameters $\mu_1 > \dots > \mu_n$ and $p_1 < \dots < p_n$. We showed that the policy that always puts a ball in box $i$, where $i = \min\{\, j : X_j = 1 \text{ and box } j \text{ is alive} \,\}$, maximizes the probability that all boxes are saved. We believe this policy actually maximizes the number of filled boxes stochastically, although we do not have a proof of that at this moment. We believe this variation of the SEP warrants more work.

8.1.3. Increasing failure rate of a box's life time. For the application of the SEP to an organ transplant decision problem, one possible limitation is the assumption that boxes have exponentially distributed life times. It is an interesting and valuable topic to study the optimal assignment policy, or the structure of the optimal assignment policy, under the assumption that each box has a life time with increasing failure rate. A possible research methodology is to combine simulation with dynamic programming, that is, to apply the policy improvement algorithm of dynamic programming to a good heuristic policy.
Appendix

Algorithm for generating I in Chapter 3.2.

1. i = 0
2. q(j) = q_j, j = 1,...,n
3. Q(1) = q(1)
4. Q(j) = Q(j-1) * q(j), j = 2,...,n
5. C = 1 - Q(n)
6. i = i + 1
7. S = 1 - Q(1)
8. Generate U
9. U = U * C
10. J = 1
11. If U <= S go to 15
12. J = J + 1
13. S = 1 - Q(J)
14. Go to 11
15. I(i) = J
16. If i = n stop
17. q(J) = 1
18. Go to 3

The preceding algorithm generates $I$ by considering balls that are eligible for at least one of the alive boxes, where $I_i$ is the index of the $i$th box to be filled. In particular, $I_1$ has the probability mass function
\[ P\{I_1=j\} = \frac{\big(\prod_{i=1}^{j-1} q_i\big)\, p_j}{C}, \quad j=1,\dots,n, \qquad \sum_{j=1}^{n} P\{I_1=j\} = 1. \]
In order to generate $I_1$, we generate a random number $U$ uniformly distributed over $(0,1)$ and set $I(1)=j$ if $\frac{1-Q_{j-1}}{C} \le U < \frac{1-Q_j}{C}$ (with $Q_0=1$, so that $I(1)=1$ when $U < \frac{1-Q_1}{C}$ and $I(1)=n$ when $\frac{1-Q_{n-1}}{C} \le U < 1$). After $I_1$ is found, by setting $q_{I_1}=1$ the algorithm re-normalizes the probability that a ball is eligible for at least one of the alive boxes, after which $I_2$ is found... The algorithm continues to simulate recursively until $I_n$ is found.

Algorithm for generating n in Chapter 4.2.

For $X_i$ being i.i.d., this algorithm generates $n$, where $n(j)$, $j=0\dots t-1$, is the number of alive boxes after $j$ balls have been accepted.

1. S = (s_1, s_2, ..., s_n)
2. q = q_1
3. t = sum_{i=1}^{n} s_i
4. j = 0
5. Y = sort(S) descending
6. If Y(i) > 0 then I(i) = 1 else I(i) = 0, i = 1,...,n
7. n(j) = sum_{i=1}^{n} I(i)
8. Q(1) = q
9. Q(l) = Q(l-1) * q, l = 2,...,n(j)
10. C = 1 - Q(n(j))
11. p = 1 - Q(1)
12. Generate U
13. U = U * C
14. J = 1
15. If U <= p go to 19
16. J = J + 1
17. p = 1 - Q(J)
18. Go to 15
19. S(j) = J
20. Y(J) = Y(J) - 1
21. S = Y
22. j = j + 1
23. If j = t stop
24. Go to 5

NBBW algorithm for generating N in Chapter 6.

1. S = (s_1, s_2, ..., s_n)
2. p = (p_1, p_2, ..., p_n)
3. Y = (0, 0, ..., 0)
4. Generate Y_1 ~ negative binomial(s_1, p_1)
5. Y_1 = Y_1 - s_1
6. Generate n_2 ~ binomial(Y_1, p_2)
7. If s_2 > n_2: r_2 = s_2 - n_2; generate Y_2 ~ negative binomial(r_2, p_2); n_12 = Y_1 - n_2. Else: n_12 = Y_1 - s_2; r_2 = 0; Y_2 = 0
8. n_23 = Y_2 + n_12
9. Go to 6
10. Return (Y_1, ..., Y_n) and (r_1, ..., r_n)
11. N = sum_{i=1}^{n} (Y_i + r_i)

The algorithm above generates $N$ via the Negative Binomial and Binomial Way. Under the index policy $1\dots n$, the number of balls collected until there are $s_1$ balls in box 1 has a negative binomial distribution with parameters $(s_1,p_1)$.
(1) Generate $Y_1 \sim \mathrm{NBino}(s_1,p_1)$. Among those not eligible for box 1, each ball is eligible for box 2 with probability $p_2$.
(2) Generate $n_2 \sim \mathrm{Bino}(Y_1-s_1,p_2)$, and update the remaining quota of box 2 as $R_2 = s_2-n_2$. Now box 2 has the highest priority, and the additional number of balls to collect until there are $s_2$ balls in it follows a negative binomial distribution with parameters $(R_2,p_2)$. Among the left-overs from (1) and (2), each ball is eligible for box 3 with probability $p_3$, and box 3 is the current box with the highest priority... We continue to simulate recursively until $R_n$ is found.
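A runnable rendering of the NBBW recursion (a sketch; the function and variable names are ours, and the negative binomial is drawn by a simple Bernoulli loop rather than a library call):

    import random

    def nbbw(s, p, rng=random.Random(4)):
        # Phase by phase under the index policy 1..n, draw the residual
        # quota R_i and the phase lengths without simulating ball vectors.
        def neg_binomial(k, q):                 # trials until the k-th success
            trials, succ = 0, 0
            while succ < k:
                trials += 1
                succ += rng.random() < q
            return trials

        N = 0                                   # total balls collected
        leftovers = 0                           # earlier balls no prior box took
        R = []
        for i in range(len(s)):
            hits = sum(rng.random() < p[i] for _ in range(leftovers))
            r_i = max(s[i] - hits, 0)           # residual quota of box i
            leftovers -= min(hits, s[i])        # balls absorbed by box i
            R.append(r_i)
            if r_i > 0:
                y = neg_binomial(r_i, p[i])     # phase-i balls until box i fills
                N += y
                leftovers += y - r_i            # phase-i balls box i rejected
        return N, R

Because each phase is generated as a single negative binomial length and a single binomial thinning (here unrolled as Bernoulli draws for self-containment), the round cost does not grow with the total quota $t$ in the way that per-ball simulation does.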
Algebra involved in Chapter 5.

Chapter 5.2.1, the (1,m) model:
\[ H(1,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+\frac{m+1-r}{p_2}\Big] + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big] \]
\[ = \Big(\frac{1}{1-q_1q_2}-\frac{1}{p_2}\Big)\sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1} r + \frac{m+1}{p_2}\sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big] \]
\[ = \Big(\frac{1}{1-q_1q_2}-\frac{1}{p_2}\Big)\bigg[\frac{1-q_1q_2}{p_1}-\Big(m+\frac{1-q_1q_2}{p_1}\Big)\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\bigg] + \frac{m+1}{p_2}\bigg(1-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\bigg) + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big] \]
\[ = \frac{m+1}{p_2}-\frac{q_2}{p_2}+\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big(\frac{1}{p_1}-1\Big) = 1+\frac{m}{p_2}+\frac{q_1}{p_1}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}. \]
Therefore
\[ H(1,m)-H(0,m+1) = \frac{q_1}{p_1}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}-\frac{q_2}{p_2}. \]

Chapter 5.2.2, the (2,m) model. Note throughout that $1-\frac{p_1}{1-q_1q_2}=\frac{q_1p_2}{1-q_1q_2}$.

Case 1: $m \le M_1$.
\[ H(2,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+V(1,m+1-r)\Big] + \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{2}{p_1}\Big] \]
\[ = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{r-1}\bigg[\frac{r}{1-q_1q_2}+1+\frac{m+1-r}{p_2}+\frac{q_1}{p_1}\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m+1-r}\bigg] + \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m}\Big[\frac{m}{1-q_1q_2}+\frac{2}{p_1}\Big] \]
\[ = 2+\frac{m}{p_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m}\Big[\frac{mq_1}{1-q_1q_2}+\frac{2q_1}{p_1}\Big]. \]
Therefore
\[ H(2,m)-V(1,m+1) = -\frac{q_2}{p_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m}\Big[\frac{mq_1}{1-q_1q_2}+\frac{q_1}{p_1}+\frac{q_1}{1-q_1q_2}\Big] = -\frac{q_2}{p_2}+\frac{q_1}{1-q_1q_2}\bigg(E[X;\,X>m]+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{m}\bigg), \]
where $X$ follows a geometric distribution with mean $\frac{1-q_1q_2}{p_1}$. It is straightforward that $H(2,m)-V(1,m+1)$ decreases monotonically in $m$, and at $m=M_1$ (using $(\frac{q_1p_2}{1-q_1q_2})^{M_1}=\frac{p_1q_2}{p_2q_1}$) it equals $\frac{p_1q_2(M_1+1)}{p_2(1-q_1q_2)} > 0$.

Case 2: $m = M_1+k$, $k>0$. Considering the balls that are eligible for at least one box and conditioning on the arrival having a $(1,0)$ vector, we get
\[ V(1,M_1+k) = \sum_{r=1}^{k}\frac{p_1q_2}{1-q_1q_2}\Big(\frac{p_2}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+H(0,M_1+k+1-r)\Big] + \Big(\frac{p_2}{1-q_1q_2}\Big)^{k}\Big[\frac{k}{1-q_1q_2}+V(1,M_1)\Big] = \frac{M_1+k}{p_2}+\Big(\frac{p_2}{1-q_1q_2}\Big)^{k}\Big(c_1-\frac{M_1}{p_2}\Big), \]
where $c_1 = V(1,M_1)$. By one stage look ahead,
\[ H(2,M_1+k) = \sum_{r=1}^{k}\frac{p_1}{1-q_1q_2}\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{r-1}\Big[\frac{r}{1-q_1q_2}+V(1,M_1+k+1-r)\Big] + \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{k}\Big[\frac{k}{1-q_1q_2}+H(2,M_1)\Big] \]
\[ = 1+\frac{M_1+k}{p_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{k}\Big(\frac{M_1q_1q_2}{p_2(1-q_1q_2)}+c_2-1-\frac{c_1}{1-q_1q_2}\Big)+\Big(\frac{p_2}{1-q_1q_2}\Big)^{k}\Big(\frac{c_1}{1-q_1q_2}-\frac{M_1}{p_2(1-q_1q_2)}\Big), \]
where $c_2 = H(2,M_1)$. Therefore
\[ H(2,M_1+k)-V(1,M_1+k+1) = -\frac{q_2}{p_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{k}\Big(\frac{M_1q_1q_2}{p_2(1-q_1q_2)}+c_2-1-\frac{c_1}{1-q_1q_2}\Big)+\frac{q_2}{1-q_1q_2}\Big(c_1-\frac{M_1}{p_2}\Big)\Big(\frac{p_2}{1-q_1q_2}\Big)^{k}, \]
and substituting $c_1$ and $c_2$,
\[ H(2,M_1+k)-V(1,M_1+k+1) = \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{k}\bigg[\frac{q_2}{q_1^k(1-q_1q_2)}-\frac{q_1q_2}{1-q_1q_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1}\Big(\frac{M_1q_1}{1-q_1q_2}+\frac{2q_1}{p_1}-\frac{q_1}{p_1(1-q_1q_2)}+\frac{q_1q_2}{q_1^k p_1(1-q_1q_2)}\Big)\bigg]-\frac{q_2}{p_2}. \]
To show that $H(2,M_1+k)-V(1,M_1+k+1)$ decreases monotonically in $k$, it is sufficient to show that
\[ f(k) = \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{k}\bigg[\frac{q_2}{q_1^k(1-q_1q_2)}-\frac{q_1q_2}{1-q_1q_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1}\Big(\frac{M_1q_1}{1-q_1q_2}+\frac{2q_1}{p_1}-\frac{q_1}{p_1(1-q_1q_2)}+\frac{q_1q_2}{q_1^k p_1(1-q_1q_2)}\Big)\bigg] \]
decreases monotonically in $k$. To show this, first we check that the bracketed factor is positive:
\[ \frac{q_2}{q_1^k(1-q_1q_2)}-\frac{q_1q_2}{1-q_1q_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1}\Big(\frac{M_1q_1}{1-q_1q_2}+\frac{2q_1}{p_1}-\frac{q_1}{p_1(1-q_1q_2)}+\frac{q_1q_2}{q_1^k p_1(1-q_1q_2)}\Big) \]
\[ \ge \frac{q_2}{1-q_1q_2}-\frac{q_1q_2}{1-q_1q_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1}\Big(\frac{M_1q_1}{1-q_1q_2}+\frac{2q_1}{p_1}-\frac{q_1}{p_1(1-q_1q_2)}+\frac{q_1q_2}{p_1(1-q_1q_2)}\Big) = \frac{p_1q_2}{1-q_1q_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1}\Big(\frac{M_1q_1}{1-q_1q_2}+\frac{q_1}{p_1}+\frac{q_1q_2}{1-q_1q_2}\Big) > 0. \]
Next we show $f(k)/f(k+1) > 1$, which is equivalent to showing
\[ \frac{q_2^2}{q_1^k(1-q_1q_2)}-\frac{q_1q_2}{1-q_1q_2}+\Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1}\Big(\frac{M_1q_1}{1-q_1q_2}+\frac{2q_1}{p_1}-\frac{q_1}{p_1(1-q_1q_2)}+\frac{q_1q_2^2}{q_1^k p_1(1-q_1q_2)}\Big) > 0. \]
We know that
\[ \Big(\frac{q_1p_2}{1-q_1q_2}\Big)^{M_1} = \frac{p_1q_2}{p_2q_1}. \]
Plugging this in, we see
\[ \frac{q_2^2}{p_2 q_1^k(1-q_1q_2)}+\frac{M_1p_1q_2+p_1q_2-q_1q_2^2}{p_2(1-q_1q_2)} \ge \frac{q_2^2}{p_2(1-q_1q_2)}+\frac{M_1p_1q_2+p_1q_2-q_1q_2^2}{p_2(1-q_1q_2)} = \frac{M_1p_1q_2+p_1q_2+p_1q_2^2}{p_2(1-q_1q_2)} > 0. \]
This finishes the proof that $H(2,M_1+k)-V(1,M_1+k+1)$ decreases monotonically in $k$.

Chapter 5.2.3, the (1,m) model with cost (1):
\[ W(1,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\Big\{\frac{r}{1-q_1q_2}+\frac{r}{1-q_1q_2}c+\frac{m+1-r}{p_2}c\Big\} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\Big\{\frac{m}{1-q_1q_2}c+\frac{m}{1-q_1q_2}+\frac{1}{p_1}\Big\} \]
\[ = \frac{1}{p_1}-\frac{q_2}{p_2}c+\frac{m+1}{p_2}c-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}c. \]
After some algebra we get
\[ W(1,m)-U(0,m+1) = \frac{1}{p_1}-\frac{q_2}{p_2}c-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}c. \]
It is straightforward to see that $W(1,m)-U(0,m+1)$ increases monotonically in $m$, with a final value $\frac{p_2-p_1q_2c}{p_1p_2}$ and an initial value $\frac{p_2-p_1c}{p_1p_2}$.
(1) If $p_2-p_1q_2c \le 0$, it is optimal to always put a $(1,1)$ labeled ball in box 2, i.e. $M_1 = \infty$.
(2) If $p_2-p_1c \ge 0$, it is optimal to always put a $(1,1)$ labeled ball in box 1, i.e. $M_1 = 1$.
(3) If $p_2-p_1c < 0$ and $p_2-p_1q_2c > 0$, there exists an $M_1$ such that it is best to put a $(1,1)$ vector-ed ball in box 1 if and only if $m \ge M_1$, where $M_1$ is the solution of
\[ W(1,m)-U(0,m+1) = \frac{1}{p_1}-\frac{q_2}{p_2}c-\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}c = 0. \]
Remark: if the costs of box 1 and box 2 are $(c_1,c_2)$ respectively, the problem is the same as the model above with $c = c_2/c_1$.
Chapter 5.2.4, the (1,m) model with cost (2):
\[ W(1,m) = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\bigg\{\frac{r}{1-q_1q_2}+\frac{c}{1-q_1q_2}\sum_{i=m+1-r}^{m} i+\frac{c}{p_2}\sum_{i=1}^{m+1-r} i\bigg\} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\bigg\{\frac{c}{1-q_1q_2}\sum_{i=1}^{m} i+\frac{m}{1-q_1q_2}+\frac{1}{p_1}\bigg\} \]
\[ = \sum_{r=1}^{m}\frac{p_1}{1-q_1q_2}\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{r-1}\bigg\{ r\Big[\frac{1}{1-q_1q_2}+\frac{c(2m+1)}{2(1-q_1q_2)}-\frac{c(2m+3)}{2p_2}\Big] + r^2\Big[\frac{c}{2p_2}-\frac{c}{2(1-q_1q_2)}\Big] + \frac{c(m+1)(m+2)}{2p_2} \bigg\} + \Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m}\bigg\{\frac{c}{1-q_1q_2}\frac{m(m+1)}{2}+\frac{m}{1-q_1q_2}+\frac{1}{p_1}\bigg\}, \]
using $\sum_{i=m+1-r}^{m} i = \frac{r(2m+1-r)}{2}$ and $\sum_{i=1}^{m+1-r} i = \frac{(m+1-r)(m+2-r)}{2}$. Evaluating the truncated geometric first and second moments and subtracting $U(0,m+1) = \frac{c(m+1)(m+2)}{2p_2}$, some algebra gives
\[ W(1,m)-U(0,m+1) = -mc\,\frac{q_2}{p_2}+\Big(1-\frac{p_1}{1-q_1q_2}\Big)^{m} c\,\frac{q_1p_2}{p_1}+\frac{1}{p_1}+\frac{c\,(q_1q_2+q_1q_2p_2-1)}{p_1p_2}. \]
Now it is straightforward to see that $W(1,m)-U(0,m+1)$ decreases monotonically in $m$, with a final value of $-\infty$ and an initial value
\[ W(1,0)-U(0,1) = c\,\frac{q_1p_2}{p_1}+\frac{1}{p_1}+\frac{c\,(q_1q_2+q_1q_2p_2-1)}{p_1p_2} = \frac{1}{p_1}-\frac{c}{p_2} = \frac{p_2-cp_1}{p_1p_2} > 0 \text{ whenever } cp_1 < p_2. \]

Algebra involved in Chapter 7.

Assume balls arrive according to a Poisson process with parameter $\lambda$. With probability $p$, an arrival is eligible for an empty box, whose life time has an exponential distribution with parameter $\mu$. The problem ends either when the box is filled or when it disappears. The probability that the box is filled by time $t$ is
\[ \Big(1-e^{-(p\lambda+\mu)t}\Big)\frac{p\lambda}{p\lambda+\mu}. \]
An explanation of this formula is that by time $t$ an event happens, and that event is an eligible arrival rather than the box's death. The formula can also be obtained by solving the equation
\[ P(t) = \int_0^t \lambda e^{-\lambda s}\, p\, e^{-\mu s}\, ds + \int_0^t \lambda e^{-\lambda s}\,(1-p)\, e^{-\mu s}\, P(t-s)\, ds, \]
which is obtained by conditioning on the time at which a ball arrives. The first part of the equation describes the scenario in which a ball arrives before $t$, the box is alive, and the ball is eligible for the box; the second part describes the scenario in which a ball arrives before $t$ and the box is alive, but the ball is not eligible for the box.
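The closed form is easy to sanity-check by Monte Carlo (a sketch of ours, not from the dissertation; names and the seed are illustrative). Since eligible arrivals form a thinned Poisson process of rate $p\lambda$, the box is filled by $t$ exactly when the first eligible arrival beats both the Exp($\mu$) death time and the deadline $t$:

    import math, random

    def p_filled_by(t, lam, p, mu, rounds=100000, rng=random.Random(5)):
        # Empirical probability that a single box is filled by time t.
        hits = 0
        for _ in range(rounds):
            first_fit = rng.expovariate(p * lam)   # first eligible arrival
            death = rng.expovariate(mu)            # Exp(mu) lifetime
            hits += first_fit < min(death, t)
        return hits / rounds

    # compare with the closed form:
    # (p * lam / (p * lam + mu)) * (1 - math.exp(-(p * lam + mu) * t))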
Abstract
This dissertation studied a stochastic assignment problem, called "A Stochastic Employment Problem" (SEP). There are n boxes having quota S = (S₁,...,Sₙ); that is, box i needs Sᵢ balls, i = 1...n. Balls arrive sequentially, with each one having a binary vector X = (X₁, X₂,...,Xₙ) attached, with the interpretation being that if Xᵢ = 1 this ball is eligible to be put in box i, i = 1...n. When a ball arrives, its vector is revealed and the ball is put in one box for which it is eligible. Assuming the vectors are independent and identically distributed among the successive balls, this problem continues until there are at least Sᵢ balls in box i, for all i. The SEP can be applied to an organizational employment decision problem, with the interpretation being that the boxes are the types of jobs and the balls are the job seekers, with Sᵢ giving the number of type-i jobs and X indicating which jobs a seeker is qualified to take.

Variations of the Stochastic Employment Problem were considered, such as balls arriving according to a renewal process and each box having a lifetime under a specified distribution. The SEP can thus be considered a stochastic control problem associated with a single-server queuing system, and it can therefore be applied to channel/processor scheduling in the telecommunication and computer industries. For example, in a time-slotted network, n users share one channel, with user i having Sᵢ packets to transmit, i = 1...n, and X indicating which users are connected and hence able to transmit. The SEP can also be applied to an organ transplant decision problem, with the box lifetime being a patient's lifetime and the ball vector indicating which patients an arriving organ fits.