Efficient Delivery of Augmented Information Services over Distributed Computing Networks by Yang Cai A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING) December 2022 Copyright 2022 Yang Cai Dedication To my beloved parents: Shaohong Yang and Zhongming Cai. ii Acknowledgements First, I would like to express the deepest appreciation to my advisor, Prof. Andreas F. Molisch. With broad views and rich experiences in the related research fields, he always provides insightful comments and sug- gestions in the technical discussions, which significantly accelerates my research progress. In addition, I also benefit from his research principle of doing cutting-edge and high-impact works, as well as staying associated with practical use cases. Indeed, an important reason for me to stay motivated and productive is his encouragement to try out promising research ideas and advanced technologies. Other than the tech- nical aspect, in terms of collaboration, he is patient and responsive, keeping the communication between us simple and efficient, and we get along perfectly well working together. I am also thankful to my project collaborators, Prof. Jaime Llorca and Prof. Antonia M. Tulino at New York University. Their expertise in advanced cloud network technologies is a perfect complement to our knowledge in network control, together pushing the project forward. They both participate in every step of my research, from topic selection, system build-up, to the presentation of our works. They provided me dedicated guidance, and I am grateful for every detailed discussion with them. I sincerely appreciate the time and effort of all the members at my qualifying and dissertation com- mittees (Prof. Michael Neely, Prof. Barath Raghavan, Prof. Leana Golubchik, Prof. Ramesh Govindan, and Prof. 
Konstantinos Psounis), listening to the presentations, evaluating my research works, and offering iii constructive suggestions. Talking with them help me think outside the box: they pointed out some im- portant aspects in the studied problem, from which I gained a more comprehensive understanding of the actual system. I also want to say thank you to the WiDeS alumnus, Hao Feng, Ming-Chun Lee, and Minseok Choi, for their continued interest in my research progress, as well as helpful discussions, which inspired many interesting research ideas. Hao and Ming-Chun gave me a lot of help when I started working on computing network control and caching network design, respectively. They also shared with me valuable experiences through their Ph.D. trajectories, such as academic writing and time management. iv TableofContents Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1.1 Augmented Information Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1.2 Distributed Cloud Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.1 Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . 5 1.2.2 Requirements of Emerging AgI Services . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2.2.1 Delay-constrained Services . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.2.2 Mixed-cast Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.2.3 Data-intensive Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.3 Research Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3 Literature Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.3.1 Service Function Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.3.2 Dynamic Cloud Network Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.3.3 Throughput-optimal Packet Routing . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.3.4 Multi-access Edge Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.3.5 Machine Learning for Network Control . . . . . . . . . . . . . . . . . . . . . . . . 10 1.4 Organization of Later Chapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Chapter 2: Efficient Delivery of Delay-constrained Services . . . . . . . . . . . . . . . . . . . . . . 15 2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.3.1 Cloud Layered Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.3.2 Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.3.3 Arrival Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.3.4 Queuing System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
20 v 2.4 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.1 Admissible Policy Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.2 General Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.4.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.5 The Average Capacity Constrained Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.5.1 The Virtual Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.5.1.1 Virtual Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.5.1.2 Physical Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.5.2 Connections BetweenP 1 andP 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 2.6 Solution to Average-Constrained Network . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.6.1 Optimal Virtual Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.6.2 Flow Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.6.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 2.6.3.1 Virtual Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 2.6.3.2 Performance of Algorithm 1 . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.7 Solution to Peak-Constrained Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.7.1 Request Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.7.2 Capacity Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.8 Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.8.1 Illustrative Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 2.8.1.1 Effects of Parameter V . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . 47 2.8.1.2 Effects of Lifetime and Arrival Model . . . . . . . . . . . . . . . . . . . . 49 2.8.2 Practical Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 2.8.2.1 Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 2.8.2.2 Throughput and Cost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 2.8.2.3 Performance of RCNC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 2.9 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 2.9.1 Mixed Deadline-Constrained and Unconstrained Users . . . . . . . . . . . . . . . . 57 2.9.2 Time-Varying Slot Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 2.10 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Chapter 3: Efficient Delivery of Mixed-cast Services . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.3 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.3.1 Cloud Layered Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.3.2 Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.3.3 Arrival Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.3.4 In-network Packet Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.3.4.1 Replication Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.3.4.2 Packet Replication and Coverage Constraint . . . . . . . . . . . . . . . . 64 3.3.4.3 Conditions on Replication Operation . . . . . . . . . . . . . . . 
. . . . . 65 3.4 Policy Spaces and Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.4.1 Policy Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.4.1.1 Decision Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.4.1.2 Admissible Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.4.1.3 Cost Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.4.2 Multicast Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 vi 3.5 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 3.5.1 Queuing System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 3.5.1.1 Queuing Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 3.5.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 3.6 Generalized Distributed Cloud Network Control Algorithm . . . . . . . . . . . . . . . . . . 72 3.6.1 Lyapunov Drift-plus-Penalty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 3.6.2 Generalized Distributed Cloud Network Control . . . . . . . . . . . . . . . . . . . 73 3.6.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.6.3.1 Delay-Cost Tradeoff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.6.3.2 Complexity Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.7 GDCNC-R with Reduced Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.7.1 Duplication Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.7.2 Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.7.3 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.8 Extensions . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.8.1 Multicast AgI Service Delivery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.8.1.1 Cloud Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.8.1.2 AgI Service Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.8.1.3 Processing Decision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 3.8.2 MEC Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 3.8.2.1 Wireless Transmission Model . . . . . . . . . . . . . . . . . . . . . . . . 79 3.8.2.2 Wireless Transmission Decision . . . . . . . . . . . . . . . . . . . . . . . 79 3.8.3 EGDCNC with Enhanced Delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 3.9 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 3.9.1 Network Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 3.9.2 Uniform Resource Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 3.9.3 Optimal Resource Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 3.9.3.1 Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 3.9.3.2 Delay and Cost Performance . . . . . . . . . . . . . . . . . . . . . . . . . 85 3.9.3.3 Effects of Destination Set Size . . . . . . . . . . . . . . . . . . . . . . . . 86 3.10 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 Chapter 4: Efficient Delivery of Data-intensive Services . . . . . . . . . . . . . . . . . . . . . . . . 88 4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.2.1 Caching and Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
90 4.2.2 Joint 3C Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 4.3 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.3.1 Cache-Enabled MEC Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.3.2 Data-Intensive AgI Service Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 4.3.3 Client Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.3.3.1 Live Packet Arrival . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.3.3.2 Static Packet Provisioning . . . . . . . . . . . . . . . . . . . . . . . . . . 96 4.3.4 Queuing System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 4.4 Policy Space and Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 4.4.1 Augmented Layered Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.4.1.1 Topology of the ALG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.4.1.2 Flow in the ALG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 vii 4.4.2 Policy Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.4.2.1 Decision Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 4.4.2.2 Efficient Policy Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 4.4.3 Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 4.5 Multi-Pipeline Flow Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 4.5.1 Virtual System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.5.1.1 Precedence Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.5.1.2 Virtual Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.5.2 Optimal Virtual Network Decisions . . . . . . . . . 
. . . . . . . . . . . . . . . . . . 108 4.5.2.1 Lyapunov Drift Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 4.5.2.2 Route Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.5.3 Optimal Actual Network Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 4.5.4 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 4.5.4.1 Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 4.5.4.2 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 4.5.4.3 Discussions on Delay Performance . . . . . . . . . . . . . . . . . . . . . 114 4.6 Max-Throughput Database Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 4.6.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 4.6.1.1 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 4.6.1.2 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 4.6.1.3 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 4.6.2 Proposed Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 4.6.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 4.7 Database Replacement Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 4.7.1 Low-Rate Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 4.7.2 Rate-Based Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 4.7.3 Score-Based Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 4.8 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 4.8.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
124 4.8.2 Multi-Pipeline Flow Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 4.8.2.1 Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 4.8.2.2 Resource Occupation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 4.8.3 Joint 3C Resource Orchestration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 4.8.3.1 Fixed Placement Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 4.8.3.2 Replacement Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 4.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 Chapter 5: Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 Appendix A: Proofs in Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 A.1 Proof for Proposition 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 A.2 Stability region ofP 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 A.2.1 Necessity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 A.2.2 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 A.2.2.1 Randomized Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 A.2.2.2 Validate the Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 viii A.2.2.3 Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 A.3 Stability region ofP 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 A.3.1 Necessity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 A.3.2 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . 149 A.3.2.1 Randomized Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 A.3.2.2 Validate the Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 A.3.2.3 Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 A.4 Distribution ofx ij (t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 A.5 Proof for Proposition 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 A.5.1 Cost Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 A.5.2 ε-Convergence Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 A.6 Impacts of Estimation Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 A.7 Hybrid Queuing System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 A.8 The Multi-Commodity AgI Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 A.8.1 AgI Service Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 A.8.2 Constructing the Layered Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 A.8.2.1 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 A.8.3 Relevant Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 A.8.4 Modifications of Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 Appendix B: Proofs in Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 B.1 Conditions on Replication Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 B.2 Multicast Network Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 B.2.1 Proof of Theorem 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 B.2.1.1 Necessity . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . 168 B.2.1.2 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 B.2.2 Proof of Proposition 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 B.2.2.1 Unicast Stability Region . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 B.2.2.2 Proof forΛ 0 ⊂ Λ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 B.2.2.3 Proof forΛ ⊂ DΛ 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 B.3 Derivation of LDP Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 B.4 Proof of Theorem 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 B.5 Average Queuing Delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 B.6 Transformation of AgI Service Delivery into Packet Routing . . . . . . . . . . . . . . . . . 176 B.6.1 Constructing the Layered Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 B.6.2 Relevant Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 B.7 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 B.7.1 GDCNC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 B.7.2 GDCNC-R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 B.8 Notes on Generalized Flow Conservation Constraint . . . . . . . . . . . . . . . . . . . . . . 180 Appendix C: Proofs in Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 C.1 Necessity of Theorem 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 C.2 Throughput-optimality of DI-DCNC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 C.2.1 Stability of Virtual Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 C.2.1.1 I.I.D. Arrival . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 186 C.2.1.2 Markov-Modulated Arrival . . . . . . . . . . . . . . . . . . . . . . . . . . 187 C.2.2 Stability of Actual Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 ix C.2.2.1 An Equivalent Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 C.2.2.2 Stability ofR(t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 C.3 Flow-based Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 C.3.1 Necessity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 C.3.1.1 Capacity Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 C.3.1.2 Chaining Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 C.3.2 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 C.3.2.1 Path-Finding for Stagem Live Flow . . . . . . . . . . . . . . . . . . . . . 196 C.3.2.2 Composition of Individual Paths . . . . . . . . . . . . . . . . . . . . . . . 197 C.4 Stability Region with Dynamic Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . 198 C.4.1 Necessity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 C.4.2 Sufficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 C.4.2.1 Initial Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 C.4.2.2 Low-Rate Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 C.5 Equivalence of the MILP Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 x ListofTables 3.1 Network Resources and Operational Costs of the Studied System . . . . . . . . . . . . . . . 81 4.1 Clients and Service Function Specs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 xi ListofFigures 1.1 Examples of AgI services. . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . 1 1.2 Service DAG of the face recognition application [54]. . . . . . . . . . . . . . . . . . . . . . 2 1.3 Architectures of the traditional processing network and distributed computing network. . 3 1.4 Delivery of social VR service over the distributed cloud network. . . . . . . . . . . . . . . 5 1.5 Elements in the studied problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.1 Interaction between lifetime queues. Red and blue colors denote packet states and actions during transmitting and receiving phases, respectively. . . . . . . . . . . . . . . . . . . . . 20 2.2 A one-hop example network. Packets of lifetimeL = 1 arrive at the source according to two arrival processes of equal mean arrival rateλ =1. . . . . . . . . . . . . . . . . . . . . 24 2.3 Illustration of the devised virtual system. The source node supplies packets of lifetime2, which arrive as lifetime1 packets to the actual queue of the considered node. In the virtual system, the considered node is allowed to supply packets of any lifetime to the destination by borrowing them from the reservoir and building up in the corresponding virtual queue. The virtual queue of lifetime 1 is stable, since the received lifetime 1 packets from the source node can compensate the borrowed packets; while the virtual queue of lifetime2 builds up, pushing the node to stop sending out more lifetime 2 packets. The decision derived from the stability of the virtual system is aligned with the desired operation of the actual network, as only lifetime1 packets are available for transmission at the considered node. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 2.4 The studied networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.5 The effect of V on the achieved operational cost, and the flow assignment for peak- constrained network underV =5. . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . 46 2.6 The effect of V on the ε-convergence time (with ε = .01), and the achieved reliability level over time under various settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 xii 2.7 The effect of maximum lifetime L and the arrival model on the achieved operational cost and stability region. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 2.8 Stability regions of Algorithm 1 and RCNC (for average- and peak-constrained cases, respectively), under different lifetimes and time slot lengths. . . . . . . . . . . . . . . . . . 53 2.9 Throughput and cost achieved by DCNC (with LIFO) [33], worst-case delay [57] and RCNC. 55 2.10 Performances attained by RCNC under different configurations. . . . . . . . . . . . . . . . 56 3.1 Illustration of joint packet forwarding and duplication operations (solid, dashed, and dotted lines represent packets selected for operation, transmitted copies, and reloaded copies, respectively) and incoming/outgoing flow variables associated with the status q queue of nodei (while only explicitly indicated for the red flow, note that all arrows with the same color are associated with the same flow variable). . . . . . . . . . . . . . . . . . . 68 3.2 Different multicast routes (i.e., routing trees) can share the same duplication choices (i.e., duplication tree). For example, the red and blue routes used to deliver the status(1,1,1) packet from source node 0 to destination nodes{1,2,3} are associated with the same duplication tree. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.3 Network stability regions attained by the benchmark and proposed algorithms. . . . . . . 83 3.4 Delay-cost tradeoffs attained by the three algorithms under different V parameters and destination set sizesD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
85 4.1 Network and service models studied in this chapter and related works. Data-intensive AgI [61] (this chapter): distributed cloud network, service DAG (see Section 4.3.2) with both live and static data. DECO [46]: distributed cloud network, one-step processing with static data. MEC [24]: single server, one-step processing with live data. SFC [33, 81]: distributed cloud network, SFC with live data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 4.2 The studied data-intensive service model composed of multiple functions (denoted by F), with each function requiring one live data input and one static data input. We depict an AR application with a single processing step (F1) as a special case, as well as extensions to multiple live and static inputs (using FM ϕ as an example). . . . . . . . . . . . . . . . . . . 93 4.3 Illustration of the paired-packet queuing system. Different shapes denote packets associated with different requests, blue and red colors the live and static packets, and solid and dashed lines the current and subsequent time slot, respectively. . . . . . . . . . . . . . 97 4.4 Illustration of the ALG model for the delivery of AR service. . . . . . . . . . . . . . . . . . 98 4.5 Illustration of the weight components in Eq. (4.23). . . . . . . . . . . . . . . . . . . . . . . 111 4.6 The min-ERs (denoted by red, blue, and green arrows) for the delivery of an AR service over the network, assuming nodes 1, 2, and 3 are selected to provision the static packet, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 xiii 4.7 The studied edge cloud network, including 9 edge servers (node 1 to 9) and a cloud datacenter (node10). Arrows of the same color indicate the source-destination pairs of each client. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 4.8 Performance of DI-DCNC (under a given database placement). . . . . . . . . . . . . . . . . 
126
4.9 Performance of max-throughput database placement policy. 129
4.10 Performance of rate- and score-based replacement policies. 130
A.1 An example of delivering a packet, requiring one processing step, over a 4-node network and its associated layered graph. The stage 1 packet arrives at the source node A and it is transmitted to node B (along the red path), where it also gets processed. The produced stage 2 packet is then transmitted to the destination node D (along the pink path). We use red and pink bars to represent queues for packets at different stages, and depict the packet trajectory over the queuing system using green arrows. 162
C.1 Composition of live and static packets’ paths into efficient routes (ERs). The segment p_i + q_j denotes the ER that combines paths p_i and q_j for live and static packets, and its length represents the associated probability. 197

Abstract

We are entering a rapidly unfolding future driven by stream-processing services/applications, such as industrial automation and Metaverse experiences, collectively referred to as augmented information (AgI) services. A large number of AgI services are characterized by intensive resource consumption and real-time interactive requirements, which accelerate the need for distributed compute platforms with unprecedented computation and communication requirements, as well as advanced networking solutions with an increasing level of computation-communication integration.

The delay-critical, mixed-cast, and data-intensive characteristics exhibited by AgI services are receiving significant recent attention.
To be specific, delay-critical services require delivering processed streams ahead of corresponding deadlines on a per-packet basis, mixed-cast services require processed streams to be shared and simultaneously consumed by multiple users/devices, and data-intensive services require merging data streams from multiple pipelines when processing the service functions. However, existing technologies lack efficient mechanisms to deal with these challenging requirements, leading to inefficient service delivery that will compromise the quality of experience (QoE). To this end, in this thesis, we present three research works targeting these critical requirements, summarized as follows:

Chapter 2 focuses on delay-critical services, in which we design a novel queuing system able to track data packets’ lifetimes and formalize the delay-constrained least-cost dynamic network control problem. To address this challenging problem, we first study the setting with average capacity (or resource budget) constraints, for which we characterize the delay-constrained stability region and design a throughput-optimal control policy leveraging Lyapunov optimization theory on an equivalent virtual network. Guided by the same principle, we tackle the peak capacity constrained scenario by developing the reliable cloud network control (RCNC) algorithm, which employs a two-way optimization method to make actual and virtual network flow solutions converge in an iterative manner.

Chapter 3 deals with mixed-cast services, in which we establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (compared to its unicast counterpart).
Then, we design a queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov optimization theory to develop the generalized dynamic cloud network control (GDCNC) algorithm, which is the first fully decentralized, throughput- and cost-optimal algorithm for multicast cloud network flow control, as well as practical variants of reduced complexity (GDCNC-R) and enhanced delay (EGDCNC).

Chapter 4 is centered around control policy design for the joint orchestration of compute, caching, and communication (3C) resources in next-generation distributed cloud networks for the efficient delivery of data-intensive services. We describe such applications via directed acyclic graphs (DAGs) able to model the combination of real-time stream-processing and content distribution pipelines. We establish a queuing system that holds paired live-static data packets awaiting processing, and leverage Lyapunov optimization theory to design the first throughput-optimal control policy, data-intensive dynamic cloud network control (DI-DCNC), which coordinates joint decisions on (i) routing paths and processing locations for live data streams, together with (ii) cache selection and distribution paths for associated data objects. In addition, we extend the proposed solution to include a database placement policy and two efficient replacement policies targeting throughput maximization.

Chapter 1
Introduction

1.1 Background

1.1.1 Augmented Information Services

A rapidly unfolding future will be driven by the class of augmented information (AgI) services, which refers to a wide range of services and applications designed to deliver information of real-time relevance that results from the online aggregation, processing, and distribution of multiple data streams. AgI services will transform the way we live, work, and interact with the physical world.
The so-called automation era or fourth industrial revolution will be driven by the proliferation of AgI services, with use cases including manufacturing (e.g., machine control), transportation (e.g., self-driving cars), farming, and supply chain [54, 31, 73]. AgI applications will also shape the future of consumer experiences, such as the Metaverse (augmented reality (AR), virtual reality (VR), etc.), immersive video, and multi-player gaming [13, 26, 63].

Figure 1.1: Examples of AgI services: (a) augmented reality; (b) autonomous driving; (c) telepresence; (d) face recognition.

Figure 1.2: Service DAG of the face recognition application [54] (comprising pre-processing, face detection, feature extraction, and classification functions operating on camera images, with a feature vector database providing training samples to the classifier).

In all these applications, real-time aggregation, analysis, and distribution of the information will be essential: designing truly smart factories, cities, farms, and supply chain systems requires the handling of data streams collected by wireless sensors – whose wide deployment is fueled by the Internet of things (IoT) – attached to physical systems as soon as they are generated, in order to understand systems’ health, predict future performance, and deliver actions back to actuators on those same physical systems; similarly, consumer-experience AgI applications all require real-time aggregation of multiple source streams, in-network media production, and distribution of highly personalized streams to individual users.

AgI services are usually characterized by intensive resource consumption and real-time interactive requirements. In addition to the communication resources needed to deliver data streams to corresponding destinations, AgI services also require a significant amount of computation resources for the real-time processing of source and intermediate data streams.
In general, an AgI service, which requires processing multiple functions on data streams from multiple sources to produce consumable content, can be described by a directed acyclic graph (DAG), referred to as the associated service DAG. A special case of the service DAG that has important practical relevance is the service function chain (SFC), in which the data stream from a single source is processed consecutively by a sequence of functions. We present illustrative examples of the service DAG and service chain models in Fig. 1.2 (where the sub-graph consisting of the first three functions forms a service chain).

1.1.2 Distributed Cloud Networks

Figure 1.3: Architectures of the traditional processing network and distributed computing network: (a) the traditional processing network, built on an IT infrastructure connected through an IP network; (b) the distributed computing network, spanning UEs, APs, edge servers, and cloud datacenters hosting virtual functions (VF1, VF2).

Nowadays, user equipment (UE) devices are becoming increasingly small and lightweight, and the associated limitations in power and computing capabilities have been pushing the need to offload many computation-intensive tasks to the cloud. However, service delivery over traditional processing-centric networks (as shown in Fig. 1.3a) can suffer from excessive access delay, since it handles (i) data processing at centralized data centers and (ii) data transmission between remote data centers and end users as separate procedures. This fuels the evolution of network architectures toward widespread deployments of distributed computation resources in proximity to end users, and we refer to the overall wide-area distributed computation network that results from the convergence of telco networks and cloud/edge/UE resources as a distributed cloud/computing network, such as fog and multi-access edge computing (MEC) [73, 53, 18].
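The service DAG and SFC models introduced in Section 1.1.1 lend themselves to a simple concrete representation. The following sketch (function names are hypothetical, loosely following the face recognition pipeline of Fig. 1.2) encodes a service as a map from each function to the functions whose outputs it consumes, and computes a valid processing order via Kahn's algorithm; an SFC is simply the special case where this graph is a chain:

```python
from collections import deque

# A service DAG maps each function to the functions whose outputs it consumes.
# An SFC is the special case where the graph degenerates into a chain.
SERVICE_DAG = {
    "pre_processing":     [],                     # consumes the source stream
    "face_detection":     ["pre_processing"],
    "feature_extraction": ["face_detection"],
    "classification":     ["feature_extraction"],
}

def processing_order(dag):
    """Return a valid execution order (Kahn's algorithm); raise on a cycle."""
    indegree = {f: len(deps) for f, deps in dag.items()}
    consumers = {f: [] for f in dag}
    for f, deps in dag.items():
        for d in deps:
            consumers[d].append(f)
    ready = deque(f for f, k in indegree.items() if k == 0)
    order = []
    while ready:
        f = ready.popleft()
        order.append(f)
        for g in consumers[f]:
            indegree[g] -= 1
            if indegree[g] == 0:
                ready.append(g)
    if len(order) != len(dag):
        raise ValueError("service graph contains a cycle; not a valid DAG")
    return order
```

A chain yields its unique order, while a general DAG may admit several valid orders, which is precisely the placement and scheduling flexibility the control policies in this thesis exploit.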
Indeed, distributed cloud networks are tightly integrated compute-communication systems, which will essentially blur the space- and time-scale separation between data sensing, information processing, and knowledge delivery. The ubiquitous availability of computation resources enables reduced access delays and energy consumption, thus providing better support for delay-sensitive, compute-intensive AgI services. Together with continued advances in network programmability (e.g., software-defined networking (SDN)) and virtualization (e.g., network function virtualization (NFV)) technologies, in distributed cloud networks, disaggregated AgI services can be deployed as a sequence of software functions (or SFCs), which can be dynamically instantiated, flexibly interconnected, and elastically executed at different network locations.

Last but not least, distributed cloud networks are heterogeneous and dynamic systems, with non-uniform resource distribution in both space and time dimensions. To be specific, the system can include a wide range of network elements, spanning UEs, access points (APs), edge servers, and cloud data centers, with large differences in the amounts of available communication (e.g., transmission power budget) and computation (e.g., operating frequency of the processing unit) resources. In addition, there also exist uncontrollable system states (e.g., channel states of wireless links), which can lead to time-varying computation and communication capacities.

1.2 Motivation

The full breadth of AgI services/applications, involving dispersed users, multiple service functions, and remote destinations, imposes unprecedented requirements for communication and computation resources [13], which will demand a more powerful supporting infrastructure.
We envision the solution to be a new universal compute platform driven by the efficient integration of communication and computation technologies into distributed cloud networks consisting of geographically separate devices, along with new tools and methods for the end-to-end optimization and dynamic control of the system, as shown in Fig. 1.4.

Figure 1.4: Delivery of the social VR service over the distributed cloud network (block diagram of the social VR application: Function 1, e.g., tracking, and Function 2, e.g., rendering, combine a live stream, e.g., video, with a static object, e.g., a scene object, to produce the final stream, e.g., social VR; functions are hosted across access points, edge servers, and cloud datacenters).

1.2.1 Performance Metrics

Delay, throughput, and cost are three essential metrics when evaluating the performance of a distributed cloud network supporting AgI service delivery. From the perspective of consumers, the delay for service delivery can significantly impact the quality of experience (QoE), especially for real-time interactive AgI applications, such as industrial automation and Metaverse experiences. From the perspective of network operators, the goal is to make use of the available network resources in order to fulfill users’ requests, or in other words, to optimize the throughput, defined as the rate of service delivery. Furthermore, when there are multiple solutions to meet the service demands, reducing the overall resource (e.g., computation, communication) consumption, or operational cost, becomes the new goal, especially in heterogeneous networks including energy-limited devices (e.g., smartphones).

1.2.2 Requirements of Emerging AgI Services

Notably, emerging AgI services also demand new dimensions of performance requirements, as elaborated in the following.
1.2.2.1 Delay-constrained Services

AgI services, such as industrial automation and Metaverse experiences, can require stringent timely delivery of processed data streams under strict per-packet latency constraints. To be specific, each service request is associated with a strict deadline imposed by the application, by which it must be delivered in order to be effective (since the information carried in a packet might become irrelevant and/or break application interactivity after the corresponding deadline). In this context, timely throughput, which measures the rate of effective packet delivery (i.e., the within-deadline packet delivery rate), becomes the appropriate performance metric [48, 23, 68]. This timely delivery has to be reliable, i.e., a prescribed percentage of packets has to fulfill the delay constraints.

1.2.2.2 Mixed-cast Services

The increasing amount of real-time multi-user interaction present in AgI services, e.g., telepresence, multiplayer gaming [63], and social television (TV) [26], where media streams can be shared and simultaneously consumed by multiple users/devices, is driving a rapid growth of multicast traffic. In general, next-generation cloud network traffic will be a mixture of the four basic flow types, i.e., unicast (packets intended for a unique destination), multicast (packets intended for multiple destinations), broadcast (packets intended for all destinations), and anycast (packets intended for any node in a given group). This mixed-cast nature of the network traffic must be handled, which requires delivering the associated packets to the set of designated destination node(s) for each service.

1.2.2.3 Data-intensive Services

In addition to their interaction- and compute-intensive nature, an increasingly relevant feature of AgI services, especially Metaverse applications, is their intensive data requirements.
In the Metaverse, user experiences result from the composition of multiple live media streams and pre-stored digital assets. Social VR [34], as illustrated in Fig. 1.4, is a clear example, which enriches source video streams with scene objects to generate enhanced experiences that can be consumed by end users [69]. Face/object recognition, as shown in Fig. 1.2, is another example, where access to a dictionary of training samples is required to identify/classify the images recorded by user devices (e.g., smart glasses) [74].

Figure 1.5: Elements in the studied problem: distributed computing networks (distributed, dynamic, heterogeneous) support the delivery of emerging AgI services (delay-critical, mixed-cast, data-intensive), giving rise to the related problems of service placement, route selection, task scheduling, and resource allocation.

1.2.3 Research Problems

The efficient delivery of AgI services requires the end-to-end optimization of compute, caching, and communication (3C) decisions, as well as the joint orchestration of associated resources. Indeed, the delay, throughput, and cost performance will ultimately be dictated by the choice of cloud/edge locations at which to execute the various AgI service functions, the network paths over which to route the service data streams, and the corresponding allocation of computation and communication resources.
In order to maximize the efficiency of AgI service delivery, two fundamental flow control problems need to be jointly addressed:
• (Packet processing) where to instantiate and execute the required service functions in order to process associated data packets, and how much computation resource to allocate at each node;
• (Packet forwarding) how to route data packets to their corresponding processing locations and eventually through the entire sequence of service functions to arrive at the destination, and how much communication resource to allocate at each link.
In addition, the above placement, processing, routing, and resource allocation problems must be addressed in an online manner, in response to stochastic network conditions and service demands. Related research problems are presented in Fig. 1.5.

1.3 Literature Survey

The distributed cloud network control problem has received significant attention in the recent literature, especially for AgI services that can be modeled by SFCs.

1.3.1 Service Function Chaining

With the advent of SDN and NFV, AgI services can be deployed as a sequence of software functions or SFCs that can be flexibly interconnected and elastically executed at distributed cloud locations [73]. A number of studies have investigated the problem of joint SFC placement and service embedding (i.e., flow routing) over multi-hop networks, in order to optimize a network-wide objective, e.g., maximizing accepted service requests [41, 76, 80, 62] or minimizing overall operational cost [9, 10, 2, 8, 61, 79].
While useful for long-timescale end-to-end service optimization, these solutions exhibit two main limitations: first, the problem is formulated as a static optimization problem without considering the dynamic nature of service demands, a critical aspect in next-generation AgI services; second, due to the combinatorial nature of the problem, the corresponding formulations typically take the form of (NP-hard) mixed integer programs (MIPs), and either heuristic solutions or loose approximation algorithms are developed, compromising the quality of the resulting solution.

1.3.2 Dynamic Cloud Network Control

More recently, another line of work has addressed the SFC optimization problem in dynamic scenarios, where one needs to make online packet processing, routing, and scheduling decisions in response to stochastic system states (e.g., service demands and resource capacities). Among existing techniques, Lyapunov drift control has proven to be a powerful tool for the design of throughput-optimal cloud network communication and computation control policies, such as DCNC [33], DWCNC [32], and UCNC [81], by dynamically exploring processing and routing diversity. The works in [33, 32] employ a generalized cloud network flow model that allows joint control of processing and transmission flows. The work in [81] shows that the cloud network flow control problem (involving processing and transmission decisions) can be reduced to a packet routing problem on a cloud layered graph that includes extra edges to characterize the computation operations (i.e., data streams pushed through these edges are interpreted as being processed by a service function). By this transformation, control policies designed for packet routing can be extended to address cloud network control problems.

1.3.3 Throughput-optimal Packet Routing

Packet routing is a long-explored problem. A class of computational problems, known as network flow problems, deals with packet routing over static networks.
In particular, the maximum flow problem aims to compute the greatest rate at which packets can be pushed through the network for a designated source-destination pair while respecting the capacity constraints, and many celebrated solutions have been developed, such as the Ford–Fulkerson method and the Edmonds–Karp algorithm [27]. In the dynamic setting, a number of existing algorithms have been developed to maximize network throughput with bounded average delay. Among them, the back-pressure (BP) algorithm [71] is a well-known approach for throughput-optimal routing that leverages Lyapunov drift control theory to steer data packets based on the pressure difference (differential queue backlog) between neighboring nodes. In addition, the Lyapunov drift-plus-penalty (LDP) control approach [58] extends the BP algorithm to also minimize network operational cost (e.g., energy expenditure), while preserving throughput-optimality. Despite the remarkable advantages of achieving optimal throughput performance via simple local policies without requiring any knowledge of network topology and traffic demands, both BP and LDP approaches can suffer from poor average delay performance, especially in low-congestion scenarios, where packets can take unnecessarily long, and sometimes even cyclic, paths [12]. Average delay reductions were then shown to be obtained in [78] by combining BP and hop-distance-based shortest-path routing, using a more complex Markov decision process (MDP) formulation in [59], or via the use of source routing to dynamically select acyclic routes for incoming packets, albeit requiring global network information, in [67].

1.3.4 Multi-access Edge Computing

The class of services including only one function is a special case of the SFC [18], and the resulting task offloading problem [24] has been intensively studied in the MEC literature over the past decade.
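To make the BP and LDP rules discussed in Section 1.3.3 concrete, a toy single-commodity scheduling slot can be sketched as follows (illustrative only; the cited algorithms are multi-commodity and jointly schedule processing and transmission). Each link is served at full capacity whenever its backlog differential, minus V times the link cost, is positive:

```python
def backpressure_slot(queues, links, V=0.0, costs=None):
    """One scheduling slot of a toy single-commodity back-pressure policy.

    queues: dict node -> backlog (packets); links: dict (i, j) -> capacity.
    With V = 0 this is the pure BP rule; with V > 0 and per-link costs it
    becomes the drift-plus-penalty (LDP) variant, trading operational cost
    against queue backlog.
    """
    costs = costs or {}
    rates = {}
    for (i, j), cap in links.items():
        weight = queues[i] - queues[j] - V * costs.get((i, j), 0.0)
        # Transmit only under positive pressure; never send more than is queued.
        rates[(i, j)] = min(cap, queues[i]) if weight > 0 else 0
    return rates
```

Note that a complete implementation must also ensure a node does not serve more packets than its backlog across all of its outgoing links; this sketch omits that bookkeeping for brevity.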
Many studies [24, 52, 72] have been conducted to design control policies for MEC networks, spanning aspects from user-server association and task offloading to resource allocation, with the objective of optimizing the end-to-end latency and energy consumption performance. To address the aforementioned problems, various techniques are leveraged to devise approximation algorithms for the resulting (NP-hard) MIPs, e.g., iterative optimization [72], game theory [25], and reinforcement learning [82]. However, these studies usually adopt a one-shot formulation of the problem, lacking consideration for network stability and long-term optimality; besides, they rely heavily on assumptions of separate or single-layer servers and individual tasks, and thus are not general enough to model next-generation networks and services.

1.3.5 Machine Learning for Network Control

Recent advances in the area of machine learning (ML), especially deep learning (DL) and reinforcement learning (RL), are accelerating the development of data-driven network control algorithms. One line of work combines existing network control policies with ML techniques providing high-accuracy estimations of the network states required for decision making. For example, workload modeling and prediction is a crucial aspect for reliable service provisioning, and various ML methods are employed to accomplish this goal, such as neural networks [77], Bayes classifiers [29], and support-vector machines [39], which aid the subsequent decision making procedure to improve the overall performance. Another important line of work leverages RL techniques, such as Q-learning [70], to design control policies for systems with state evolution modeled by an MDP. In the context of network control, the variant known as multi-agent reinforcement learning (MARL) usually incurs high complexity, since the state-action space grows exponentially with the network size.
Function approximation (e.g., using deep neural networks) is introduced for complexity reduction, and the resulting algorithms, such as deep Q-network (DQN) [55] and deep deterministic policy gradient (DDPG) [49], are applied to address the problems of 3C integration [38], traffic engineering [75], etc. Compared to LDP approaches (Section 1.3.3), RL methods are able to optimize more complicated performance metrics; however, they require expensive training, which can be inefficient and unstable under the multi-agent setting [83].

1.4 Organization of Later Chapters

The rest of the thesis consists of three parts, tackling the three challenging requirements introduced in Section 1.2.2, respectively.

Chapter 2 targets the reliable delivery of delay-critical AgI services, with strict deadline constraints on a per-packet basis. To this end, we design a novel queuing system able to track data packets’ lifetimes and formalize the delay-constrained least-cost dynamic network control problem. The problem is addressed in two steps. First, we study the setting with average capacity (or resource budget) constraints, for which we characterize the delay-constrained stability region and design a throughput-optimal control policy leveraging Lyapunov optimization theory. Then, we follow the same principle to tackle the peak capacity constrained scenario by developing the reliable cloud network control (RCNC) algorithm, which employs a two-way optimization method to make actual and virtual network flow solutions converge (i.e., flow matching) in an iterative manner. Related publications on this topic are as follows:

[21] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Ultra-reliable distributed cloud network control with end-to-end latency constraints”. In: IEEE/ACM Trans. Netw. 1.99 (2022), pp. 1–16.

[19] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Optimal cloud network control with strict latency constraints”. In: Proc. IEEE Int. Conf.
Commun. Montreal, Canada, June 2021, pp. 1–6.

Chapter 3 deals with multicast AgI services, which require processed streams to be shared and simultaneously consumed by multiple users/devices. We establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (with respect to its unicast counterpart). Then, we design a queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov optimization theory to develop the generalized dynamic cloud network control (GDCNC) algorithm, which is the first fully decentralized, throughput- and cost-optimal algorithm for multicast cloud network flow control, as well as practical variants of reduced complexity (GDCNC-R) and enhanced delay (EGDCNC). Related publications on this topic are as follows:

[14] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Decentralized control of distributed cloud networks with generalized network flows”. Submitted to: IEEE Trans. Commun. (2022).

[20] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Optimal multicast service chain control: Packet processing, routing, and duplication”. In: Proc. IEEE Int. Conf. Commun. Montreal, Canada, June 2021, pp. 1–7.

Chapter 4 is centered around control policy design for the joint orchestration of 3C resources in next-generation distributed cloud networks for the efficient delivery of data-intensive services. We describe such applications via directed acyclic graphs (DAGs) able to model the combination of real-time stream-processing and content distribution pipelines.
We establish a queuing system that holds paired live-static data packets awaiting processing, and leverage Lyapunov optimization theory to design the first throughput-optimal control policy, data-intensive dynamic cloud network control (DI-DCNC), which coordinates joint decisions on (i) routing paths and processing locations for live data streams, together with (ii) cache selection and distribution paths for associated data objects. In addition, we extend the proposed solution to include a database placement policy and two efficient replacement policies targeting throughput maximization. Related publications on this topic are as follows:

[17] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Joint compute-caching-communication control for online data-intensive service delivery”. Submitted to: IEEE Trans. Mobile Comput. (2022).

[16] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Dynamic control of data-intensive services over edge computing networks”. In: Proc. IEEE Global. Telecomm. Conf. Rio de Janeiro, Brazil, Dec. 2022, pp. 1–6.

Other works published during my PhD, which lay important foundations for the developed techniques (although they do not target the specific objectives of these problems), include:

[22] Yang Cai and Andreas F. Molisch. “On the multi-activation oriented design of D2D-aided caching networks”. In: Proc. IEEE Global. Telecomm. Conf. Waikoloa, HI, USA, Dec. 2019, pp. 1–6.

[18] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Mobile edge computing network control: Tradeoff between delay and cost”. In: Proc. IEEE Global. Telecomm. Conf. Taipei, Taiwan, Dec. 2020, pp. 1–6.

[13] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. “Compute- and data-intensive networks: The key to the Metaverse”. In: 2022 1st International Conference on 6G Networking (6GNet). Paris, France, July 2022, pp. 1–8.
In [22], we propose a joint caching (i.e., caching distribution) and transmission policy (i.e., link scheduling and power control) design for wireless content distribution networks, allowing multiple users to communicate simultaneously while avoiding request clashes; in [18], we leverage the LDP approach to develop a fully distributed, throughput- and cost-optimal control policy for the delivery of AgI services over multi-hop MEC networks, which are extensions of the typical setting (i.e., individual tasks and separate servers) studied in the literature; in [13], we describe the requirements of emerging Metaverse applications and the promising supporting infrastructure, and outline a comprehensive cloud network flow mathematical framework designed for the end-to-end optimization and dynamic control of such systems.

Chapter 2
Efficient Delivery of Delay-constrained Services

2.1 Overview

In this chapter, we investigate the problem of multi-hop cloud network control with the goal of delivering AgI services with strict per-packet deadline constraints, while minimizing overall operational cost. More concretely, we focus on reliable service delivery, which requires the timely throughput of each service, i.e., the rate of packets delivered by their deadlines, to surpass a given level in order to meet a desired QoE. We study the problem in dynamic scenarios, i.e., assuming the service requests are unknown and time-varying.

There are two main challenges that prohibit the use of existing cloud network control methods (e.g., [33]) and associated queuing systems for reliable service delivery. In particular, existing queuing systems: (i) do not take packet deadlines into account and cannot track associated packet lifetimes; (ii) do not allow packet drops, which becomes critical in delay-constrained routing, since dropping outdated packets can benefit cost performance without impacting timely throughput.
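One way to close both gaps is a queuing structure that indexes packets by residual lifetime and discards them on expiry. The toy single-node sketch below (illustrative only, not the exact construction developed in this chapter) maintains one FIFO queue per residual lifetime, shifts all queues down by one lifetime each slot, and drops whatever has expired:

```python
class LifetimeQueues:
    """Per-lifetime FIFO queues: a packet with residual lifetime k slots
    sits in queue k. Each slot, lifetimes decrement and expired packets
    are dropped instead of consuming further network resources."""

    def __init__(self, max_lifetime):
        self.queues = {k: [] for k in range(1, max_lifetime + 1)}

    def admit(self, packet_id, lifetime):
        """Enqueue a packet whose deadline is `lifetime` slots away."""
        self.queues[lifetime].append(packet_id)

    def advance_slot(self):
        """Shift every queue down one lifetime; return the dropped packets."""
        dropped = self.queues[1]                  # lifetime expired this slot
        for k in range(1, max(self.queues)):
            self.queues[k] = self.queues[k + 1]
        self.queues[max(self.queues)] = []
        return dropped
```

Delivered packets would be removed from their lifetime queue before `advance_slot()` runs; anything still in queue 1 at the end of a slot can no longer meet its deadline, so dropping it saves the resources a standard (drop-free) queue would waste on it.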
To overcome these drawbacks, we construct a novel queuing system with separate queues for different deadline-driven packet lifetimes, and allow packet drops upon lifetime expiry. In contrast to standard queuing systems, where packets are transmitted to reduce network congestion and keep physical queues stable [58], the new queuing model is fundamentally different: stability of physical queues becomes irrelevant (due to packet drops), and packet transmission is driven by the requirement to deliver packets on time (reliable service delivery).

The proposed solution is presented in two stages. First, we study a relaxed average-constrained network control problem and derive an exact solution via a flow matching technique, where flow scheduling decisions are driven by an LDP solution to an equivalent virtual network control problem. Then, we address the original peak-constrained problem, with the additional challenge of non-equivalent actual and virtual network formulations, and devise an algorithm that adapts the LDP plus flow matching technique via an iterative procedure. Our contributions can be summarized as follows:

1. We develop a novel queuing model that allows tracking data packet lifetimes and dropping outdated packets, and formalize the delay-constrained least-cost dynamic network control problem P0.

2. We derive a relaxed problem P1 targeting the same objective in an average-capacity-constrained network, and characterize its delay-constrained stability region based on a lifetime-driven flow conservation law.

3. We design a fully distributed near-optimal (see Proposition 3 for throughput and cost guarantees) control policy for P1 by (i) deriving an equivalent virtual network control problem P2 that admits an efficient LDP-based solution, (ii) proving that P1 and P2 have identical stability region, flow space, and optimal objective value, and (iii) designing a randomized policy for P1 guided by matching the virtual flow solution to P2.

4.
We leverage the flow matching technique to develop an algorithm for $\mathcal{P}_0$, referred to as reliable cloud network control (RCNC), whose solution results from the convergence of actual (for $\mathcal{P}_0$) and virtual (for $\mathcal{P}_2$) flows via an iterative optimization procedure.

The rest of the chapter is organized as follows. In Section 2.2, we review the existing works related to this topic. In Section 2.3, we introduce the network model and associated queuing system. In Section 2.4, we define the policy space and formulate the original problem $\mathcal{P}_0$. In Section 2.5, we study the relaxed problem $\mathcal{P}_1$ and derive an equivalent LDP-amenable formulation $\mathcal{P}_2$. Section 2.6 presents the algorithm for solving $\mathcal{P}_1$ as well as its performance analysis, which is extended to develop an iterative algorithm for $\mathcal{P}_0$ in Section 2.7. Numerical results are shown in Section 2.8, and possible extensions are discussed in Section 2.9. Finally, we summarize the main conclusions in Section 2.10.

2.2 Related Work

Per-packet delay performance analysis for the packet routing problem, even under the assumption of static arrivals, is a challenging problem. In particular, the restricted shortest path (RSP) problem, which aims to find the min-cost path for a given source-destination pair subject to an end-to-end delay (or path length) constraint, is known to be NP-hard [35]. Considering dynamic arrivals becomes a further obstacle that requires additional attention. An opportunistic scheduling policy is proposed in [57] that trades off worst-case delay and timely throughput, which preserves the delay guarantee when applied to hop-count-limited transmissions.
However, it requires a link selection procedure (to meet the hop-count requirement) that weakens its performance in general networks (e.g., mesh topologies); besides, the timely throughput is defined with respect to the worst-case delay, rather than the deadline imposed by the application, leading to either sub-optimal throughput under stringent deadline constraints, or looser guarantees on the worst-case delay; finally, it treats packet scheduling on different links separately, lacking an end-to-end optimization of the overall delay. In [66], the authors formulate the problem of timely throughput maximization as an exponential-size constrained MDP (CMDP), and derive an approximate solution based on solving the optimal single-packet transportation problem for each packet; in addition, [65] addresses the more complex set-up of wireless networks with link interference. While this approach reduces the complexity from exponential (of a general solution that makes joint packet decisions) to polynomial, it requires solving a dynamic programming problem for each packet at every time slot, which can still become computationally expensive. Furthermore, none of these works takes operational cost minimization into account, an important aspect in modern elastic cloud environments.

2.3 System Model

2.3.1 Cloud Layered Graph

The ultimate goal of this work is to design control policies for distributed cloud networks to reliably support multiple delay-sensitive AgI services, where the network is equipped with computation resources (cloud servers, edge/fog computing nodes, etc.) able to host service functions and execute corresponding computation tasks.
While in traditional packet routing problems, each node treats its neighbor nodes as outgoing interfaces over which packets can be scheduled for transmission, a key step to address the AgI service control problem is to treat the co-located computing resources as an additional outgoing interface over which packets can be scheduled for processing [33]. Indeed, as illustrated in [81], the AgI service control problem, involving both packet routing and processing, can be reduced to a packet routing problem on a layered graph where cross-layer edges represent computation resources. Motivated by such a connection, and for ease of exposition, we illustrate the developed approach focusing, without loss of generality (w.l.o.g.), on the single-commodity delay-constrained min-cost packet routing problem. We remark that (i) this problem is still open even in traditional communication networks, and (ii) the extension to distributed cloud networks hosting AgI services is presented in Appendix A.8.

2.3.2 Network Model

The considered packet routing network is modeled via a directed graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, where edge $(i,j)\in\mathcal{E}$ represents a network link supporting data transmission from node $i\in\mathcal{V}$ to $j\in\mathcal{V}$, and where $\delta_i^-$ and $\delta_i^+$ denote the incoming and outgoing neighbor sets of node $i$, respectively. Time is divided into equal-sized slots, and the available transmission resources and associated costs at each network link are quantified as:

• $C_{ij}$: the transmission capacity, i.e., the maximum number of data units (e.g., packets) that can be transmitted in one time slot, on link $(i,j)$;

• $e_{ij}$: the unit transmission cost, i.e., the cost of transmitting one unit of data in one time slot, on link $(i,j)$.
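The network model above can be captured by a small adjacency structure; the sketch below is illustrative (class and method names are our own, not taken from any implementation accompanying this work) and stores the capacities $C_{ij}$, unit costs $e_{ij}$, and neighbor sets $\delta_i^\pm$:

```python
from collections import defaultdict

class Network:
    """Directed graph G = (V, E); each link (i, j) carries a transmission
    capacity C[i, j] (packets per slot) and a unit transmission cost e[i, j]."""

    def __init__(self):
        self.C = {}                  # (i, j) -> capacity C_ij
        self.e = {}                  # (i, j) -> unit cost e_ij
        self.out = defaultdict(set)  # i -> delta_i^+ (outgoing neighbors)
        self.inc = defaultdict(set)  # i -> delta_i^- (incoming neighbors)

    def add_link(self, i, j, capacity, cost):
        self.C[(i, j)] = capacity
        self.e[(i, j)] = cost
        self.out[i].add(j)
        self.inc[j].add(i)

    def link_cost(self, flow):
        """Total cost <e, x> of a flow assignment {(i, j): x_ij}."""
        return sum(self.e[link] * x for link, x in flow.items())

net = Network()
net.add_link("s", "r", capacity=2, cost=1.0)
net.add_link("r", "d", capacity=1, cost=2.0)
```

The same structure also covers the layered graph of Section 2.3.1, where a cross-layer (processing) edge is simply another entry in `C` and `e`.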
We emphasize that in the layered graph, cross-layer edges represent data processing, i.e., data streams pushed through these edges are interpreted as being processed by corresponding service functions, and the capacity and cost of these edges represent the processing capacity and processing cost of the associated computation resources (e.g., cloud/edge servers).

2.3.3 Arrival Model

In this work, we focus on a delay-sensitive application, assuming that each packet has a strict deadline by which it must be delivered to the destination $d\in\mathcal{V}$. In other words, each packet must be delivered within its lifetime, defined as the number of time slots between the current time and its deadline. A packet is called effective if its remaining lifetime $l$ is positive, and outdated otherwise. In addition, we define timely throughput as the rate of effective packet delivery. We assume that input packets can originate at any source node of the application, and in general, we assume that the set of source nodes can be any network node except the destination, $\mathcal{V}\setminus\{d\}$. The packet's initial lifetime $l\in\mathcal{L}\triangleq\{1,\cdots,L\}$ is determined by the application (based on the sensitivity of the contained information to delay), which can vary from packet to packet, with $L$ denoting the maximum possible lifetime. Denote by $a_i^{(l)}(t)$ the number of exogenous packets (i.e., packets generated externally) of lifetime $l$ arriving at node $i$. We assume that the arrival process is i.i.d. over time, with mean arrival rate $\lambda_i^{(l)}\triangleq\mathbb{E}\{a_i^{(l)}(t)\}$ and an upper bound of $A_{\max}$; besides, we define the corresponding vectors $a(t)=\{a_i^{(l)}(t):\forall i\in\mathcal{V},\, l\in\mathcal{L}\}$ and $\lambda=\mathbb{E}\{a(t)\}$.

[Figure 2.1: Interaction between lifetime queues. Red and blue colors denote packet states and actions during transmitting and receiving phases, respectively.]
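As an illustration of the arrival model, a per-slot sampler for i.i.d. exogenous arrivals might look as follows; the Poisson choice is one convenient assumption (cf. Remark 3 in Section 2.6.2), not mandated by the model, and the sampler itself is a hypothetical helper:

```python
import random

_rng = random.Random(0)

def _poisson(mean):
    """Knuth's method for Poisson sampling (adequate for small means)."""
    limit, k, p = pow(2.718281828459045, -mean), 0, 1.0
    while True:
        p *= _rng.random()
        if p <= limit:
            return k
        k += 1

def sample_arrivals(lam):
    """One slot of i.i.d. arrivals: lam[(i, l)] is the mean rate
    lambda_i^(l); returns a[(i, l)], the number of new lifetime-l
    packets appearing at node i this slot."""
    return {il: _poisson(mean) for il, mean in lam.items()}
```

Over many slots, the empirical average of `sample_arrivals` recovers the mean arrival vector $\lambda$, which is all the i.i.d. assumption requires.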
2.3.4 Queuing System

Since each packet has its own delivery deadline, keeping track of data packets' lifetimes is essential. A key step is to construct a queuing system with distinct queues for packets of different current lifetimes $l\in\mathcal{L}$. In particular, we denote by $Q_i^{(l)}(t)$ the queue backlog of lifetime $l$ packets at node $i$ at time slot $t$, and define $Q(t)=\{Q_i^{(l)}(t):\forall i\in\mathcal{V},\, l\in\mathcal{L}\}$. Let $x_{ij}^{(l)}(t)$ be the actual number of lifetime $l$ packets transmitted from node $i$ to $j$ at time $t$ (which is different from a widely used assigned flow model, as explained in Remark 2).

Each time slot is divided into two phases, as illustrated in Fig. 2.1. In the transmitting phase, each node makes and executes transmission decisions based on observed queuing states. The number of lifetime $l+1$ packets at the end of this phase is given by

$\breve{Q}_i^{(l+1)}(t) = Q_i^{(l+1)}(t) - x_{i\to}^{(l+1)}(t)$ (2.1)

where $x_{i\to}^{(l+1)}(t) \triangleq \sum_{j\in\delta_i^+} x_{ij}^{(l+1)}(t)$ denotes the number of outgoing packets. In the receiving phase, the incoming packets, including those from neighbor nodes $x_{\to i}^{(l+1)}(t) \triangleq \sum_{j\in\delta_i^-} x_{ji}^{(l+1)}(t)$ as well as exogenously arriving packets $a_i^{(l)}(t)$, are loaded into the queuing system, and the queuing states are updated as:

$Q_i^{(l)}(t+1) = \big[\breve{Q}_i^{(l+1)}(t) + x_{\to i}^{(l+1)}(t)\big] + a_i^{(l)}(t)$ (2.2)

where lifetime $l+1$ packets, including those still in the queue as well as those arriving from incoming neighbors during the transmitting phase of time slot $t$ (i.e., the terms in the square bracket), turn into lifetime $l$ packets during the receiving phase of time slot $t$. In addition, lifetime $l$ exogenous packets, $a_i^{(l)}(t)$, also enter the lifetime $l$ queue during the receiving phase of slot $t$. All such arriving packets become ready for transmission at the transmitting phase of slot $t+1$. To sum up, the queuing dynamics are given by

$Q_i^{(l)}(t+1) = Q_i^{(l+1)}(t) - x_{i\to}^{(l+1)}(t) + x_{\to i}^{(l+1)}(t) + a_i^{(l)}(t)$ (2.3)

for $\forall i\in\mathcal{V},\, l\in\mathcal{L}$.
In addition, we assume: 1) as the information contained in outdated packets is useless, i.e., outdated packets do not contribute to timely throughput, they are immediately dropped to avoid inefficient use of network resources:

$Q_i^{(0)}(t) = 0, \quad \forall i\in\mathcal{V},$ (2.4)

and 2) for the destination node $d$, every effective packet is consumed as soon as it arrives, and therefore

$Q_d^{(l)}(t) = 0, \quad \forall l\in\mathcal{L}.$ (2.5)

Considering the lifetime reduction over time slots, in general, we do not send packets of lifetime $1$, i.e., $x_{ij}^{(1)}(t)=0$, since such packets turn outdated at node $j$ at the next time slot. The only exception occurs when $j=d$: we assume that packets of lifetime $l=1$ are consumed as soon as the destination node receives them, while they are still effective.

2.4 Problem Formulation

In this section, we introduce the admissible policy space, the reliability constraint, and the formalized delay-constrained least-cost dynamic network control problem.

2.4.1 Admissible Policy Space

The control policies of interest make packet routing and scheduling decisions at each time slot, which are dictated by the flow variables $x(t)=\{x_{ij}^{(l)}(t):\forall (i,j)\in\mathcal{E},\, l\in\mathcal{L}\}$. In particular, we focus on the space of admissible control policies with decision flow variables satisfying:

1. non-negativity constraint, i.e.,

$x_{ij}^{(l)}(t) \ge 0$ for $\forall (i,j)\in\mathcal{E}$, or $x(t)\succeq 0$; (2.6)

2. peak link capacity constraint, i.e.,

$x_{ij}(t) \triangleq \sum_{l\in\mathcal{L}} x_{ij}^{(l)}(t) \le C_{ij}, \quad \forall (i,j)\in\mathcal{E};$ (2.7)

3. availability constraint, i.e.,

$x_{i\to}^{(l)}(t) \le Q_i^{(l)}(t), \quad \forall i\in\mathcal{V},\, l\in\mathcal{L}.$ (2.8)

The availability constraint (2.8) requires the total number of (scheduled) outgoing packets to not exceed those in the current queuing system, since we define $x(t)$ as the actual flow (see Remark 2 for a detailed explanation). As will be shown throughout the chapter, it plays an equivalent role to flow conservation in traditional packet routing formulations.
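The two-phase lifetime-queue update of (2.1) – (2.5) can be sketched in a few lines; this is a minimal illustration under the stated assumptions (class and variable names are hypothetical, and the transmission decisions `x` are supplied by the caller, who must respect availability (2.8)):

```python
from collections import defaultdict

class LifetimeQueues:
    """Per-node lifetime queues Q[i][l], following (2.3): packets age by one
    lifetime unit per slot, outdated packets are dropped per (2.4), and the
    destination consumes effective packets on arrival per (2.5)."""

    def __init__(self, nodes, dest, L):
        self.dest, self.L = dest, L
        self.Q = {i: defaultdict(int) for i in nodes}
        self.timely_delivered = 0

    def step(self, x, arrivals):
        # x[(i, j, l)]: lifetime-l packets sent on link (i, j) this slot;
        # arrivals[(i, l)]: exogenous lifetime-l packets a_i^(l)(t).
        newQ = {i: defaultdict(int) for i in self.Q}
        sent_out = defaultdict(int)
        for (i, j, l), n in x.items():          # transmitting phase
            sent_out[(i, l)] += n
            if j == self.dest:
                self.timely_delivered += n      # consumed on arrival, (2.5)
            else:
                newQ[j][l - 1] += n             # lifetime decreases, (2.3)
        for i in self.Q:                         # receiving phase: aging
            for l in range(1, self.L + 1):
                kept = self.Q[i][l] - sent_out[(i, l)]
                assert kept >= 0, "availability constraint (2.8) violated"
                if l > 1 and i != self.dest:
                    newQ[i][l - 1] += kept
                # kept lifetime-1 packets become outdated -> dropped, (2.4)
        for (i, l), n in arrivals.items():       # exogenous arrivals
            if i != self.dest:
                newQ[i][l] += n
        self.Q = newQ
```

Stepping the system shows the intended behavior: packets that miss the deadline silently disappear instead of congesting the network, while deliveries to `dest` are counted toward the timely throughput.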
2.4.2 General Network Stability Region

In addition to the above admissibility constraints, we require the timely throughput achieved by the designed control policy to surpass a given level specified by the application, i.e.,

$\{\mathbb{E}\{x_{\to d}(t)\}\} \ge \gamma\,\|\lambda\|_1$ (2.9)

where $\gamma$ denotes the reliability level, $\|\lambda\|_1$ is the total arrival rate, and $\{z(t)\} \triangleq \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} z(t)$ denotes the long-term average of the random process $\{z(t):t\ge 0\}$.

[Figure 2.2: A one-hop example network: (a) single-hop topology with transmission capacity $1$; (b) arrival processes $a_1(t)$ (high-dynamic) and $a_2(t)$ (constant). Packets of lifetime $L=1$ arrive at the source according to two arrival processes of equal mean arrival rate $\lambda=1$.]

The reliability constraint (2.9) imposes the requirement on the routing policy to provide reliable (delay-constrained) packet delivery. It forces packets to be routed efficiently, avoiding excessive in-network packet drops due to lifetime expiry. The reliability level $\gamma$ characterizes the robustness of the considered service to missing information, i.e., a percentage of up to $(1-\gamma)$ of the packets can be dropped without causing a significant performance loss. The reliability constraint plays an equivalent role to network stability in traditional packet routing formulations.

Definition 1. For a given capacitated network $\mathcal{G}$, we define the delay-constrained stability region as the set of $(f_a,\gamma)$ pairs that can be supported by an admissible policy, i.e., the pairs $(f_a,\gamma)$ such that there exists an admissible policy that satisfies (2.9) under an arrival process with probability density function (pdf) $f_a$.

Note that via the complete information of the pdf $f_a$, the mean arrival vector $\lambda$ in (2.9) can be derived, which is employed to characterize the stability region in many existing works (e.g., [58, 33]).
However, such a first-order characterization is not sufficient for the studied problem, as illustrated in Remark 1, showing the necessity to include the entire pdf information.

Remark 1. Consider the example shown in Fig. 2.2, where the initial lifetime of every packet is equal to $1$. The achievable reliability level is $\gamma_1=50\%$ under the high-dynamic arrival $a_1(t)$, and $\gamma_2=100\%$ under the constant arrival $a_2(t)$, while the two arrival processes have the same rate of $1$. This example shows that, in addition to the arrival rate, arrival dynamics can also impact the performance in the studied problem.

Remark 2. In the existing literature on stochastic network optimization (e.g., [33, 58, 57, 71]), a key element that has gained widespread adoption to improve tractability is the use of the assigned flow, which is different from the actual flow in that it does not need to satisfy the availability constraint (2.8). Dummy packets are created when there are not sufficient packets in the queue to support the scheduling decision, making the decision variables not constrained by the queuing process. Such a formulation, however, is not suitable for delay-constrained routing, where reliable packet delivery is imposed on the actual packets received by the destination (via constraint (2.9)).

2.4.3 Problem Formulation

The goal is to develop an admissible control policy that guarantees reliable packet delivery, while minimizing overall network operational cost. Formally, we aim to find the policy with decisions $\{x(t):t\ge 0\}$ satisfying

$\mathcal{P}_0:\ \min_{x(t)\succeq 0}\ \{\mathbb{E}\{h(x(t))\}\}$ (2.10a)

s.t. $\{\mathbb{E}\{x_{\to d}(t)\}\} \ge \gamma\,\|\lambda\|_1$ (2.10b)

$x_{ij}(t)\le C_{ij},\ \forall (i,j)\in\mathcal{E}$ (2.10c)

$x_{i\to}^{(l)}(t)\le Q_i^{(l)}(t),\ \forall i\in\mathcal{V},\, l\in\mathcal{L}$ (2.10d)

$Q(t)$ evolves by (2.3) – (2.5) (2.10e)

where the instantaneous cost of the decision $x(t)$ is given by

$h(t)=h(x(t))=\sum_{(i,j)\in\mathcal{E}} e_{ij}\,x_{ij}(t)=\langle e, x(t)\rangle$ (2.11)

with $\langle\cdot,\cdot\rangle$ denoting the inner product of two vectors.
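Returning to the one-hop example of Fig. 2.2 and Remark 1, the reliability gap between the two equal-rate arrival processes can be reproduced in a few lines (a sketch; the alternating process below is one stand-in for the "high-dynamic" arrivals of Fig. 2.2(b)):

```python
def timely_throughput(arrivals, capacity=1):
    """Simulate the one-hop network of Fig. 2.2: lifetime-1 packets arrive
    at the source, at most `capacity` packets cross the link per slot, and
    any surplus expires before the next slot. Returns the delivered
    fraction of all arriving packets (the achievable reliability level)."""
    delivered = total = 0
    for a in arrivals:
        total += a
        delivered += min(a, capacity)  # surplus is outdated next slot
    return delivered / total

T = 1000
a1 = [2 if t % 2 == 0 else 0 for t in range(T)]  # high-dynamic, mean rate 1
a2 = [1] * T                                     # constant, mean rate 1
g1 = timely_throughput(a1)  # -> 0.5  (gamma_1 = 50%)
g2 = timely_throughput(a2)  # -> 1.0  (gamma_2 = 100%)
```

Both processes have mean rate $1$, yet only the constant one sustains $\gamma=100\%$, confirming that the mean arrival vector alone cannot characterize the delay-constrained stability region of Definition 1.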
The above problem belongs to the category of CMDPs, by defining the queuing vector $Q(t)$ as the state and the flow variable $x(t)$ as the action. However, note that the dimension of the state-action space grows exponentially with the network dimension, which prohibits the application of the standard solution to this problem [5]. Even if we leave out the operational cost minimization aspect, it is still challenging to find an exact efficient solution to the remaining problem of timely throughput maximization, as studied in [66]. On the other hand, note that (2.10) deals with a queuing process, together with a long-term average objective and constraints, which is within the scope of Lyapunov drift control [58]. However, it cannot be directly applied to solve (2.10) because: (i) the related queuing process (2.10e) is not of standard form;* (ii) the decision variables are actual flows and depend on the queuing states (2.10d), which is different from a widely used assigned flow model (see Remark 2).

2.5 The Average Capacity Constrained Problem

The goal of this work is to derive an efficient approximate solution to $\mathcal{P}_0$. To this end, we start out with a less restricted setup in which only the average flow is constrained to be below capacity, leading to the following relaxed control problem:

$\mathcal{P}_1:\ \min_{x(t)\succeq 0}\ \{\mathbb{E}\{h(x(t))\}\}$ (2.12a)

s.t. $\{\mathbb{E}\{x_{\to d}(t)\}\} \ge \gamma\,\|\lambda\|_1$ (2.12b)

$\{\mathbb{E}\{x_{ij}(t)\}\} \le C_{ij}$ (2.12c)

$x_{i\to}^{(l)}(t) \le Q_i^{(l)}(t)$ (2.12d)

$Q(t)$ evolves by (2.3) – (2.5) (2.12e)

*In the designed lifetime-based queuing system, a packet can traverse queues of decreasing lifetimes and eventually get dropped when entering the lifetime $0$ queue. On the other hand, in traditional queuing systems [58, 33], a packet stays in the same queue until selected for operation; in addition, since there are no packet drops, queue build-up contributes to network congestion and creates pressure driving packet transmission [58].
which relaxes the peak capacity constraint (2.10c) into the corresponding average capacity constraint (2.12c).†

Mathematically, $\mathcal{P}_1$ is still a CMDP problem, making it challenging to solve. Instead of tackling it directly, in the following, we derive a tractable problem $\mathcal{P}_2$ corresponding to a virtual network, and establish the connection between them by showing that they have identical flow spaces, which allows us to address $\mathcal{P}_1$ using the solution to $\mathcal{P}_2$ as a stepping-stone.

2.5.1 The Virtual Network

The virtual network control problem is cast as

$\mathcal{P}_2:\ \min_{x(t)\succeq 0}\ \{\mathbb{E}\{h(x(t))\}\}$ (2.13a)

s.t. $\{\mathbb{E}\{x_{\to d}(t)\}\} \ge \gamma\,\|\lambda\|_1$ (2.13b)

$x_{ij}(t)\le C_{ij}$ (2.13c)

$\bar{x}_{i\to}^{(\ge l)} \le \bar{x}_{\to i}^{(\ge l+1)} + \lambda_i^{(\ge l)}$ (2.13d)

where $\bar{x}_{i\to}^{(\ge l)} = \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}\mathbb{E}\{x_{i\to}^{(\ge l)}(t)\}$ denotes the average transmission rate of packets with lifetime $\ge l$, with $x_{i\to}^{(\ge l)}(t)=\sum_{\ell=l}^{L} x_{i\to}^{(\ell)}(t)$ (similarly for $\bar{x}_{\to i}^{(\ge l+1)}$). A crucial difference in the derivation of $\mathcal{P}_2$ is to replace the availability constraint (2.12d) by (2.13d), which states the fact that the lifetime of the packets must decrease (by at least $1$) as they traverse any node $i$, and thus is called the causality constraint. As a consequence, we eliminate the unconventional queuing process (2.12e) and the dependency of $x(t)$ on $Q(t)$, i.e., the two factors preventing the use of the LDP approach to address $\mathcal{P}_1$. In particular, we will use $\nu(t)$ (instead of $x(t)$) to represent the decisions determined in $\mathcal{P}_2$, referred to as the virtual flow.

†We note that such an average-constrained setting may find interesting applications of its own in next-generation virtual networks that allow elastic scaling of network resources [73].

2.5.1.1 Virtual Queue

Although there is no explicit queuing system in $\mathcal{P}_2$, it consists of a long-term average objective and constraints, which can be addressed via the LDP control of a virtual queuing system [58].
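For readers unfamiliar with the LDP machinery invoked here, the generic virtual-queue device of [58] enforces a time-average constraint $\{\mathbb{E}\{g(t)\}\}\le 0$ by keeping the queue $U(t+1)=\max(U(t)+g(t),\,0)$ stable; a minimal sketch (generic, not the specific queues of this chapter):

```python
def virtual_queue_trace(g_samples):
    """Textbook virtual-queue update U(t+1) = max(U(t) + g(t), 0) used in
    Lyapunov drift-plus-penalty control: if the queue stays stable
    (U(t)/t -> 0), the time average of g(t) is at most 0."""
    U, trace = 0.0, []
    for g in g_samples:
        U = max(U + g, 0.0)
        trace.append(U)
    return trace

# g(t) = 0.9 - s(t): encodes the requirement "average service s(t) >= 0.9".
g = [0.9 - s for s in [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]]  # one missed slot
trace = virtual_queue_trace(g)  # U spikes at the miss, then drains
```

The backlog $U(t)$ thus acts as a running deficit: it grows whenever the constraint is under-served and shrinks when service exceeds the target, which is exactly the role the chapter's virtual queues play for (2.13b) and (2.13d).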
More concretely, the virtual queuing system $U(t)=\{U_d(t)\}\cup\{U_i^{(l)}(t): i\in\mathcal{V}\setminus\{d\},\, l\in\mathcal{L}\}$ must be stabilized to ensure (2.13b) and (2.13d), defined as

$U_d(t+1)=\max\big(U_d(t)+\gamma A(t)-\nu_{\to d}(t),\, 0\big),$ (2.14a)

$U_i^{(l)}(t+1)=\max\big(U_i^{(l)}(t)+\nu_{i\to}^{(\ge l)}(t)-\nu_{\to i}^{(\ge l+1)}(t)-a_i^{(\ge l)}(t),\, 0\big),$ (2.14b)

where $A(t)=\sum_{i\in\mathcal{V},\,l\in\mathcal{L}} a_i^{(l)}(t)$ is the total amount of packets arriving in the network at time slot $t$.‡ We refer to (2.14a) and (2.14b) as the virtual queues associated with nodes $d$ and $i$, respectively. To sum up, it is equivalent to cast $\mathcal{P}_2$ as

$\widetilde{\mathcal{P}}_2:\ \min_{\nu(t)}\ \{\mathbb{E}\{h(\nu(t))\}\}$ (2.15a)

s.t. stabilize $U(t)$ evolving by (2.14) (2.15b)

$0\le \nu_{ij}(t)\le C_{ij},\ \forall (i,j)\in\mathcal{E}.$ (2.15c)

‡Here we use $A(t)$ instead of $\|\lambda\|_1$ since the latter information is usually not available in practice; furthermore, if the arrival information cannot be obtained immediately, delayed information, i.e., $A(t-\tau)$ with $\tau>0$, can be used as an alternative, which does not impact the resulting time average.

2.5.1.2 Physical Interpretation

When deriving the virtual network control problem $\mathcal{P}_2$, we relax the precedence constraint that imposes that a packet cannot be transmitted from a node before it arrives at that node. Instead, we assume that each node in the virtual network is a data-reservoir with access to abundant (virtual) packets of any lifetime. At every time slot, each node checks the packet requests from its outgoing neighbors and supplies such needs using virtual packets borrowed from the reservoir, which are compensated when the node receives packets of the same lifetime (either from incoming neighbors or exogenous arrivals). The virtual queues can be interpreted as the data deficits (difference between outgoing and incoming packets) of the corresponding data-reservoirs. Specially, in (2.14a), the destination reservoir sends out $\gamma A(t)$ packets to the end user (to meet the reliability requirement), while receiving $\nu_{\to d}(t)$ in return. When (2.13b) and (2.13d) are satisfied, or the virtual queues are stabilized, the network nodes do not need to embezzle virtual packets from the reservoirs; since the achieved network flow can be attained by the actual packets, it can serve as guidance for packet routing in the actual network (see Fig. 2.3 for illustration).

[Figure 2.3: Illustration of the devised virtual system. The source node supplies packets of lifetime 2, which arrive as lifetime 1 packets to the actual queue of the considered node. In the virtual system, the considered node is allowed to supply packets of any lifetime to the destination by borrowing them from the reservoir, building up the corresponding virtual queue. The virtual queue of lifetime 1 is stable, since the received lifetime 1 packets from the source node can compensate for the borrowed packets, while the virtual queue of lifetime 2 builds up, pushing the node to stop sending out more lifetime 2 packets. The decision derived from the stability of the virtual system is aligned with the desired operation of the actual network, as only lifetime 1 packets are available for transmission at the considered node.]

2.5.2 Connections Between $\mathcal{P}_1$ and $\mathcal{P}_2$

We now describe key connections between the actual and virtual network control problems.

Definition 2 (Feasible Policy). For problem $\mathcal{P}_\iota$ ($\iota=1,2$), a policy $p$ is called feasible if it makes decisions satisfying (2.12b) – (2.12e) ($\iota=1$) or (2.13b) – (2.13d) ($\iota=2$). The set of feasible policies is called the feasible policy space $\mathcal{F}_\iota$.

Definition 3 (Flow Assignment). Given a feasible policy $p$ for problem $\mathcal{P}_\iota$ ($\iota=1,2$) (with decisions $\{x_p(t): t\ge 0\}$), the achieved flow assignment is defined as $x_p=\{\mathbb{E}\{x_p(t)\}\}$, i.e., the vector of transmission rates for packets with different lifetimes on all network links.
Furthermore, the flow space is defined as the set of all achievable flow assignments, i.e., $\Gamma_\iota = \{x_p : p\in\mathcal{F}_\iota\}$.

Definition 4 (Stability Region). For problem $\mathcal{P}_\iota$ ($\iota=1,2$), the stability region $\Lambda_\iota$ is defined as the set of $(\lambda,\gamma)$ pairs under which the feasible policy space $\mathcal{F}_\iota$ is non-empty.

We make the following clarifications about the above definitions: (i) we will prove (in Theorem 1) that the stability region of $\mathcal{P}_1$ only depends on the mean arrival rate $\lambda$, in contrast to the general Definition 1 which involves the arrival pdf $f_a$; the same holds for that of $\mathcal{P}_2$, which is clear from its definition (2.13); (ii) the feasible policy space and the flow space are associated with a certain point $(\lambda,\gamma)$ in the stability region; (iii) since the networks considered in $\mathcal{P}_1$ and $\mathcal{P}_2$ have the same topology, the flow assignment vectors are of the same dimension. We then reveal the intimate relationship between the two problems via the following three results.

Proposition 1. The availability constraint (2.12d) implies the causality constraint (2.13d).

Proof. See Appendix A.1.

Theorem 1. For a given network, the stability regions of $\mathcal{P}_1$ and $\mathcal{P}_2$ are identical, i.e., $\Lambda_1=\Lambda_2$. In addition, a pair $(\lambda,\gamma)$ is interior to the stability region $\Lambda_\iota$ ($\iota=1,2$) if and only if there exist flow variables $x=\{x_{ij}^{(l)}\ge 0 : \forall (i,j)\in\mathcal{E},\, l\in\mathcal{L}\}$ such that

$x_{\to d} \ge \gamma\,\|\lambda\|_1$ (2.16a)

$x_{ij} \le C_{ij}, \quad \forall (i,j)\in\mathcal{E}$ (2.16b)

$x_{\to i}^{(\ge l+1)} + \lambda_i^{(\ge l)} \ge x_{i\to}^{(\ge l)}, \quad \forall i\in\mathcal{V},\, l\in\mathcal{L}$ (2.16c)

$x_{ij}^{(0)} = x_{dk}^{(l)} = 0, \quad \forall k\in\delta_d^+,\, (i,j)\in\mathcal{E},\, l\in\mathcal{L}.$ (2.16d)

Furthermore, $\forall (\lambda,\gamma)\in\Lambda_\iota$, there exists a feasible randomized policy that achieves the optimal cost.

Proof. See Appendices A.2 and A.3.

Proposition 2. For $\forall (\lambda,\gamma)\in\Lambda_1=\Lambda_2$, the two problems have identical flow spaces, i.e., $\Gamma_1=\Gamma_2$.

Proof. By Theorem 1, $\mathcal{P}_1$ and $\mathcal{P}_2$ have the same stability region, i.e., $\Lambda_1=\Lambda_2$. Consider a point in the stability region $\Lambda_1=\Lambda_2$.
For any flow assignment $x\in\Gamma_1$, there exists a feasible policy $p_1\in\mathcal{F}_1$, with decision variables $\{x_1(t):t\ge 0\}$, that attains flow assignment $x$, i.e., $\{\mathbb{E}\{x_1(t)\}\}=x$. Therefore, $\{x_1(t):t\ge 0\}$ satisfies (2.12b) – (2.12e), which implies that $x$ satisfies all the conditions in (2.16) (by Proposition 1). Using the method provided in Appendix A.3.2.1, we can construct a feasible randomized policy $p_2\in\mathcal{F}_2$, with decision variables $\{x_2(t):t\ge 0\}$, that achieves the same flow assignment for $\mathcal{P}_2$, i.e., $\{\mathbb{E}\{x_2(t)\}\}=x$. Therefore, $x\in\Gamma_2$, and thus $\Gamma_1\subset\Gamma_2$. The reverse direction $\Gamma_2\subset\Gamma_1$ can be shown via the same argument. Hence, $\Gamma_1=\Gamma_2$.

The above propositions are explained as follows: by Proposition 1 and the fact that (2.13c) implies (2.12c), the feasible policy spaces satisfy $\mathcal{F}_1\not\subseteq\mathcal{F}_2$ and $\mathcal{F}_2\not\subseteq\mathcal{F}_1$; Theorem 1 nevertheless shows that they lead to the same stability region, by presenting an explicit, identical characterization (2.16) (where (2.16c) is the generalized lifetime-driven flow conservation law), which is in the form of a linear programming (LP) problem with $L|\mathcal{E}|$ variables (and thus of pseudo-polynomial complexity); Proposition 2 further shows that $\mathcal{P}_1$ and $\mathcal{P}_2$ share the same flow space (for any point in the stability region), which is a crucial property since the two metrics of interest, i.e., timely throughput (2.9) and operational cost (2.11), are both linear functions of the flow assignment.

Corollary 1. $\mathcal{P}_1$ and $\mathcal{P}_2$ have the same optimal value.

Proof. Consider a feasible policy $p_1\in\mathcal{F}_1$, whose decisions $x_1(t)$ attain flow assignment $x$. According to Proposition 2, there exists a feasible policy $p_2\in\mathcal{F}_2$ attaining the same flow assignment $x$ by making decisions $x_2(t)$. The operational cost satisfies

$\{\mathbb{E}\{h(x_1(t))\}\} = \{\mathbb{E}\{\langle e, x_1(t)\rangle\}\} = \langle e, \{\mathbb{E}\{x_1(t)\}\}\rangle = \langle e, x\rangle = \langle e, \{\mathbb{E}\{x_2(t)\}\}\rangle = \{\mathbb{E}\{\langle e, x_2(t)\rangle\}\} = \{\mathbb{E}\{h(x_2(t))\}\}.$ (2.17)

The reverse direction can be shown by the same argument.
As a result, the two problems have the same range of achievable costs (when treating the cost as a function of the policy), and thus the same optimal value.

Corollary 2. Given a feasible policy for $\mathcal{P}_2$, we can construct a feasible randomized policy for $\mathcal{P}_1$ that achieves the same flow assignment.

Proof. Suppose $\{\nu(t):t\ge 0\}\in\mathcal{F}_2$. The associated flow assignment $\nu=\{\mathbb{E}\{\nu(t)\}\}$ satisfies (2.16), and we can construct a feasible randomized policy for $\mathcal{P}_1$ as follows (see Appendix A.2.2.1 for details): at each time slot, for every packet of lifetime $l\in\mathcal{L}$ in the queuing system, node $i\in\mathcal{V}$ selects the outgoing neighbor $j\in\delta_i^+$ for it according to the pdf

$\alpha_i^{(l)}(j) = \nu_{ij}^{(l)} \big/ \big(\nu_{\to i}^{(\ge l+1)} + \lambda_i^{(\ge l)} - \nu_{i\to}^{(\ge l+1)}\big)$ (2.18)

and otherwise the packet stays at node $i$. It is shown in Appendix A.2 that this policy achieves flow assignment $\nu$.

2.6 Solution to the Average-Constrained Network

In this section, we take advantage of the LDP approach to address $\widetilde{\mathcal{P}}_2$ and guide the design of a fully distributed, near-optimal randomized algorithm for $\mathcal{P}_1$ (by Corollary 2).

2.6.1 Optimal Virtual Flow

We first present the LDP-based algorithm to solve $\widetilde{\mathcal{P}}_2$ given by (2.15). Define the Lyapunov function as $L(t)=\|U(t)\|_2^2/2$, and the Lyapunov drift as $\Delta(U(t))=L(t+1)-L(t)$. The LDP approach aims to minimize a linear combination of an upper bound on the Lyapunov drift (which can be derived by some standard manipulation [58]) and the objective function weighted by a tunable parameter $V$, i.e.,

$\Delta(U(t)) + V h(\nu(t)) \le B - \langle \tilde{a}, U(t)\rangle - \langle w(t), \nu(t)\rangle$ (2.19)

where $B$ is a constant, $\tilde{a}=\{-\gamma A(t)\}\cup\{a_i^{(\ge l)}(t):\forall i\in\mathcal{V}\setminus\{d\},\, l\in\mathcal{L}\}$, and the weights $w(t)$ are given by

$w_{ij}^{(l)}(t) = -V e_{ij} - U_i^{(\le l)}(t) + \begin{cases} U_d(t) & j=d \\ U_j^{(\le l-1)}(t) & j\ne d \end{cases}$ (2.20)

where the superscript $(\le l)$ refers to the operation $\sum_{\ell=1}^{l}$. To sum up, at every time slot, the algorithm decides the virtual flow $\nu(t)$ by addressing the following problem

$\max_{\nu(t)}\ \langle w(t), \nu(t)\rangle, \quad \text{s.t.}\ 0\le \nu_{ij}(t)\le C_{ij},\ \forall (i,j)\in\mathcal{E}$ (2.21)
More concretely, for each link(i,j), we first find the best lifetime l ⋆ with the largest weight, and devote all the transmission resource to serve packets of this lifetime if the weight is positive. Therefore, the optimal virtual flow assignment is ν (l) ij (t)=C ij I l =l ⋆ ,w (l ⋆ ) ij (t)>0 (2.22) wherel ⋆ =argmax l∈L w (l) ij (t),I{·} is the indicator function. To implement the above algorithm, at each time slot, a considered node exchanges the virtual queue information with its neighbor nodes (to calculate the weight of each lifetime by (2.20)), and decides the virtual flow according to (2.22), which can be completed in a fully distributed manner; the computational complexity at nodei is given byO(L|δ + i |). 2.6.2 FlowMatching The algorithm developed above can provide a near-optimal (will be proved in next subsection) solution {ν (t) : t ≥ 0} toP 2 , from which we will design a feasible, near-optimal policy forP 1 in this section. The decided (actual) flow is denoted by µ (t), to distinguish it from the virtual flow ν (t). We will design an admissible policy forP 1 (i.e., satisfying (2.12c) – (2.12d)) to pursue the goal of flow matching, i.e., {µ (t)}={ν (t)}. 
(2.23)

The reason to set the above goal is two-fold: (i) it ensures that the designed policy attains the same throughput and cost performance (recall that both metrics are linear functions of the flow assignment) as the virtual flow, which is feasible (satisfying the reliability constraint) and achieves near-optimal cost performance; (ii) the existence of the policy is guaranteed (as a result of identical flow spaces); actually, given the feasible solution $\{\nu(t)\}$, Corollary 2 presents a construction procedure of a feasible policy for $\mathcal{P}_1$ to realize the goal.

Algorithm 1 Randomized Flow-Matching Algorithm
1: for $t\ge 0$ and $i\in\mathcal{V}$ do
2:   Solve the virtual flow $\nu(t)$ from (2.21);
3:   Update the empirical averages $\bar{\nu}(t)$ and $\hat{\lambda}(t)$ by (2.24);
4:   Update the probability values $\{\hat{\alpha}_i^{(l)}(j):j\in\delta_i^+\}_{l\in\mathcal{L}}$ by (2.18) (using the above empirical averages);
5:   for $l\in\mathcal{L}$ do
6:     For each packet in $Q_i^{(l)}(t)$, decide its outgoing link according to the pdf $\{\hat{\alpha}_i^{(l)}(j):j\in\delta_i^+\}$;
7:   end for
8: end for

Corollary 2 requires the exact values of $\{\nu(t)\}$ and $\lambda$ as input, which are not available in practice. As an alternative, we employ the corresponding empirical values, i.e., the finite-horizon averages of the virtual flow and the arrival rate

$\bar{\nu}(t)=\frac{1}{t}\sum_{\tau=0}^{t-1}\nu(\tau), \quad \hat{\lambda}(t)=\frac{1}{t}\sum_{\tau=0}^{t-1}a(\tau)$ (2.24)

to calculate the probability values in (2.18), by which we decide the outgoing flow at time slot $t$. Since the above empirical values are updated at every time slot, this leads to a time-varying randomized policy; as $\bar{\nu}(t)\to\{\nu(t)\}$ and $\hat{\lambda}(t)\to\lambda$ asymptotically, the policy gradually converges.§

The proposed control policy is summarized in Algorithm 1, and we emphasize that (i) at a given time slot, the policy in Corollary 2 makes i.i.d. decisions for packets with the same lifetime (i.e., for a fixed lifetime $l$, the pdf $\{\hat{\alpha}_i^{(l)}(j):j\in\delta_i^+\}$ determining the routing decision is the same for each packet).
It is therefore equivalent to make flow-level decisions based on the packets' lifetimes, by generating multinomial random variables with parameter $Q_i^{(l)}(t)$ and the common pdf; (ii) in addition to deciding the virtual flow, the developed randomized policy requires each node to update the empirical averages (2.24), calculate the pdf $\hat{\alpha}$, and make the decisions, at a complexity of $O(L|\delta_i^+|)$.

§It is possible for $\bar{\nu}(t)$ to violate (2.16c) at some time slot, in which case it is not qualified to construct a valid randomized policy. However, as $t\to\infty$, $\bar{\nu}(t)$ converges to $\{\nu(t)\}$, which satisfies the constraints. With this asymptotic guarantee, when such a violation occurs, we can choose not to update the control policy at that time slot.

Remark 3. For the studied packet routing problem (where flow scaling is not relevant), under the widely used assumption of Poisson arrivals, we can show that the instantaneous flow size $x_{ij}(t)$, $\forall (i,j), t$, follows a Poisson distribution, which enjoys good concentration bounds (see Appendix A.4).

Remark 4. In [66], a subproblem of $\mathcal{P}_1$ is studied, which involves constraints on average capacity and timely throughput, while leaving out the aspect of operational cost. The formulated CMDP problem is solved by a dynamic programming algorithm; it can also be addressed following the same procedure presented in this section, at a lower complexity.

2.6.3 Performance Analysis

In this section, we first prove that the LDP-based algorithm (for $\mathcal{P}_2$) stabilizes the virtual queues (and consequently, the timely throughput satisfies the reliability constraint (2.13b)) and attains near-optimal cost performance; then we show that the flow matching-based randomized policy (for $\mathcal{P}_1$) achieves the same throughput and cost performance as the previous algorithm. The effects of the parameter $V$ are also analyzed.

2.6.3.1 Virtual Network

In addition to proving that the algorithm stabilizes the virtual queues, we analyze the effect of $V$ on the $\varepsilon$-convergence time defined as follows.
Definition 5 (ε-Convergence Time). The ε-convergence time t_ε is the running time for the average solution to achieve a reliability within a margin of ε from the desired value, i.e.,

t_ε ≜ min{ τ : sup_{s≥τ} [ γ‖λ‖₁ − (1/s) Σ_{t=0}^{s−1} E{ν_{→d}(t)} ] ≤ ε }.   (2.25)

The existence of t_ε (under the proposed algorithm) is shown in Appendix A.5.2 for any ε > 0.

Proposition 3. For any point in the interior of the stability region, the virtual queues are mean rate stable under the proposed algorithm with a convergence time t_ε ∼ O(V) for any ε > 0, and the achieved cost performance satisfies

{E{h(ν(t))}} ≤ h⋆₂(λ, γ) + B/V   (2.26)

where h⋆₂(λ, γ) denotes the optimal cost performance that can be achieved under (λ, γ) in P2.

Proof. See Appendix A.5.

We make the following clarifications about the above proposition: (i) for a finite horizon, the reliability (2.13b) and causality (2.13d) constraints might not be satisfied; (ii) the virtual queues are stabilized, implying that the two constraints hold asymptotically; (iii) by pushing the parameter V → ∞, the achieved cost performance approaches the optimal cost (since the gap B/V vanishes), with a tradeoff in convergence time.

2.6.3.2 Performance of Algorithm 1

Proposition 4. For any point in the interior of the stability region, Algorithm 1 is feasible for P1, while achieving the near-optimal cost performance of h({ν(t)}).

Proof. Algorithm 1 makes decisions for the packets in the queuing system, and thus satisfies the constraints (2.12d). Besides, as lim_{t→∞} ν̄(t) = {ν(t)} and lim_{t→∞} λ̂(t) = λ, the instantaneous policy converges to a fixed policy constructed from {ν(t)} and λ, which achieves the same flow assignment {μ(t)} = {ν(t)} as proved in Corollary 2, leading to identical throughput and cost performance.

Remark 5.
We note that Algorithm 1 relies on the knowledge of the arrival rate (via (2.18) in step 4), and the empirical estimate (2.24) we use for implementation may be subject to estimation errors that can impact the attained cost performance. As shown in Appendix A.6, in some extreme cases, the estimation error can lead to a considerable performance loss, driven by the Lagrangian multiplier (or shadow price) associated with the constraints involving λ. However, under i.i.d. arrivals, the estimated rate converges to the true value, and Algorithm 1 is guaranteed to achieve near-optimal asymptotic performance.

2.7 Solution to Peak-Constrained Network

In this section, we aim to address the original problem P0 (with peak-capacity constraint), leveraging the flow matching technique we developed in the previous section. There are two problems we need to address: (i) the actual flow (decided by the randomized policy) can violate the peak capacity constraint (2.10c); (ii) the actual and virtual flow spaces are not identical, i.e., Γ0 ⊂ Γ2 (while in the average-constrained case, Γ1 = Γ2).

To address problem (i), we propose a request queue stability approach in order to constrain instantaneous transmission rates. For problem (ii), we introduce an auxiliary variable ϵ_ij, ∀(i,j), to represent the gap in flow spaces, leading to the following optimization problem over {ν(t), μ(t), ϵ}:

P3:  min  {E{h(ν(t))}}   (2.27a)
     s.t. ν_ij(t) ≤ C_ij − ϵ_ij,  (11b), (11d), (11e),   (2.27b)
          {E{μ_ij(t)}} = {E{ν_ij(t)}},  (8c) – (8f),   (2.27c)
          0 ⪯ ϵ ≜ {ϵ_ij} ⪯ {C_ij}.
(2.27d)

While solving P3 in a joint manner is difficult, we propose an iterative optimization approach:

i) fix ϵ and μ(t): find ν(t) by LDP control (19), and derive the virtual flow assignment with optimal operational cost;
ii) fix ϵ and ν(t): find μ(t) with the goal of flow matching (i.e., by stabilizing the request queues);
iii) fix ν(t) and μ(t): update ϵ based on the gap between the optimal and achievable rates, i.e., {E{ν(t)}} − {E{μ(t)}}, which is non-zero if (25c) is violated.

Due to the randomness of network states, we introduce a time frame structure: steps i) and ii) are executed on a per-slot basis, while step iii) on a per-frame basis (to obtain better rate estimates). The developed algorithm, referred to as reliable cloud network control (RCNC), is described in Algorithm 2.

2.7.1 Request Queue

We propose to achieve (2.23) by making admissible flow decisions (i.e., satisfying (2.10c) – (2.10e)) to stabilize the request queues R(t) = {R_ij^(l)(t) : ∀(i,j) ∈ E, l ∈ L}, defined as

R_ij^(l)(t+1) = R_ij^(l)(t) + ν̄_ij^(l)(t) − μ_ij^(l)(t)   (2.28)

where ν̄(t) is given by (2.24), and we still adopt the notation μ(t) to refer to the actual flow decided in P0, without causing ambiguity (P1 is not relevant in this section).

We consider an n-slot look-ahead scheme, under which the current decision is made together with n − 1 (anticipated) future decisions. Such a scheme is employed since it creates flexibility for a packet to change its lifetime by delaying transmission, in favor of relieving the burden of the request queue with the heaviest backlog, as well as balancing the transmission load. Formally, we make decisions for n time slots (starting from the current time slot) to optimize the multi-slot Lyapunov drift, defined as

Δ_n(R(t)) ≜ ( ‖R(t+n−1)‖₂² − ‖R(t)‖₂² ) / 2.   (2.29)

An upper bound for the multi-slot drift is derived in the following.
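As a minimal illustration of the bookkeeping behind the recursion (2.28) (the dict-of-links layout and function name here are our own, not from the thesis):

```python
import numpy as np

def update_request_queues(R, nu_bar, mu):
    """One slot of the request-queue recursion (2.28):
    R_ij^(l)(t+1) = R_ij^(l)(t) + nu_bar_ij^(l)(t) - mu_ij^(l)(t).

    R, nu_bar, mu: dicts mapping a link (i, j) to a length-L vector indexed
    by lifetime l; nu_bar is the empirical virtual flow (2.24), and mu is the
    actual flow decided in the peak-constrained network. The recursion is
    applied exactly as written in (2.28); a practical implementation might
    additionally clip the backlog at zero.
    """
    return {link: R[link] + nu_bar[link] - mu[link] for link in R}
```

Persistent growth of any entry of R is the sign of unsuccessful flow matching that triggers the capacity iteration of Section 2.7.2.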
We apply a telescoping sum to the queuing dynamics (2.28) over the period t, …, t+(n−1), which leads to

R_ij^(l)(t+n−1) = R_ij^(l)(t) + Σ_{τ=t}^{t+n−1} ( ν̄_ij^(l)(τ) − μ_ij^(l)(τ) ).   (2.30)

Following the same procedure as in Section 2.5.1, we obtain

Δ_n(R(t)) ≤ B′(t) − Σ_{(i,j)∈E} ⟨R_ij(t), M_ij 1⟩   (2.31)

where M_ij is an L × n matrix associated with link (i,j), with column τ (0 ≤ τ < n) representing the vector μ_ij(t+τ), and B′(t) gathers all the uncontrollable terms. The goal is to minimize this bound on the multi-slot drift by making admissible flow decisions for the n time slots.

By applying the queuing dynamics (2.3) recursively, the availability constraint (2.10d) at the (t+τ)-th time slot can be cast as

Σ_{j∈δ_i^+} g_{τ+1}(M_ij) − D Σ_{j∈δ_i^−} g_τ(M_ji) ≤ g_{τ+1}(A_i),  ∀i   (2.32)

where A_i = [Q_i(t), a_i(t), …, a_i(t+n−1)] is the arrival matrix, with columns a_i(t+τ) = {a_i^(l)(t+τ) : l ∈ L} representing the exogenous arrivals in the future; g denotes the delay function, given by

g_{τ+1}(X) = Σ_{s=0}^{τ} D^{τ−s} X[:, s]   (2.33)

in which D is the delay matrix of order n, and X[:, τ] is the τ-th column of matrix X.

We clarify that the a_i(t+τ) are random vectors, leading to a complex stochastic optimization problem. We simplify the problem by replacing the random vectors with their estimated averages, i.e., the empirical arrival rates λ̂_i. This reduces A_i to the deterministic matrix

A_i^emp = [Q_i(t), λ̂_i, …, λ̂_i],   (2.34)

and the problem to a common LP that can be addressed by standard solvers.

To sum up, at each slot, the proposed RCNC algorithm solves the following LP to determine the transmission flow:

max_M  Σ_{(i,j)∈E} ⟨R_ij(t), M_ij 1⟩   (2.35a)
s.t.   1ᵀ M_ij ⪯ C_ij,  ∀(i,j) ∈ E   (2.35b)
       (2.32) with A_i = A_i^emp,  ∀i, 0 ≤ τ < n   (2.35c)
       M_ij ⪰ 0,  ∀(i,j) ∈ E   (2.35d)

where M = ∪_{i∈V} M_i ≜ {M_ij : j ∈ δ_i^+}, and (2.35b) is the peak capacity constraint.
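The delay operator (2.33) admits a compact implementation. The sketch below assumes D is the L × L lifetime-shift matrix (each elapsed slot decrements the remaining lifetime by one, and expired flow drops out); this structure is our reading of the "delay matrix" and should be checked against the thesis's formal definition:

```python
import numpy as np

L, n = 4, 3

# Assumed lifetime-shift matrix: multiplying a lifetime-indexed vector by D
# moves the flow held at lifetime l to lifetime l-1; flow at the smallest
# lifetime expires and drops out.
D = np.eye(L, k=1)

def g(tau_plus_1, X):
    """Delay function (2.33): g_{tau+1}(X) = sum_{s=0}^{tau} D^(tau-s) X[:, s],
    i.e., each column of X (the decision or arrival at slot t+s) is aged by
    the number of slots remaining until t+tau."""
    tau = tau_plus_1 - 1
    return sum(np.linalg.matrix_power(D, tau - s) @ X[:, s]
               for s in range(tau + 1))
```

Applied to the arrival matrix A_i, this yields the (lifetime-adjusted) packets still available at slot t+τ, which is how the availability constraint (2.32) couples decisions across the look-ahead window.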
Note that the above problem involves all the flow variables of the entire network (nL|E| in total); due to (2.35c), the decisions of the nodes depend on each other, and are thus determined in a centralized manner. After the optimal solution M⋆_ij is obtained, its first column μ⋆_ij(t) = M⋆_ij[:, 0] is used as the decided flow for the current time slot. The rest of its columns are discarded, and the procedure repeats at the next time slot based on the updated information.

Remark 6 (Choice of n). An intuitive choice is n = L, since packets of the largest lifetime L will be outdated after L time slots, and we ignore the effects of the current decision on more distant time slots in the future. Another choice is n = 1, which simplifies the formulation by considering only the current time slot and optimizes the LDP greedily; the solution does not involve any (estimated) future information, and can be implemented in a distributed manner.

Remark 7 (Distributed RCNC). To develop a distributed algorithm, we assume that future arrivals from neighbor nodes μ_ji^(l)(t+τ) are estimated by their empirical averages û_ji^(l). This leads to an LP formulation that is the same as (2.35), except that the coupling constraint (2.35c) is replaced by Σ_{j∈δ_i^+} g_{τ+1}(M_ij) ≤ g_{τ+1}(Ã_i^emp), with Ã_i^emp = [Q_i(t), λ̂_i + û_{→i}, …, λ̂_i + û_{→i}]. However, numerical results suggest that this formulation does not outperform the simple algorithm using n = 1 (see Section 2.8.2.3).

Algorithm 2 RCNC
1: for each frame k ≥ 0 do
2:   for t = 0 : K − 1 do
3:     Solve the virtual flow ν(t) from (2.21);
4:     Solve the actual flow μ(t) from (2.35);
5:     Update the request queue R(t) by (2.28) (using the virtual and actual flows derived above);
6:   end for
7:   Update the transmission capacities of the links in the virtual network by (2.36) – (2.38);
8: end for

Remark 8 (Complexity).
At every time slot, the centralized algorithm requires solving an LP problem with nL|V| + n|E| constraints and nL|E| variables, and the time complexity is O(n²L²|E|²) (at the centralized controller). For the distributed algorithm, the complexity reduces to O(n²L²|δ_i^+|²) at node i. Intuitively, the centralized algorithm with n = L can achieve better performance; however, its complexity is O(L⁴|E|²), which can become prohibitive in practice. By selecting n = 1, we obtain the most efficient algorithm (with some performance loss), at a complexity of O(L²|δ_i^+|²).

Remark 9. In practice, we can select L as tens of time slots (e.g., L = 10), based on the following considerations. On one hand, this is on the order of the network diameter and sufficient to support packet transmission within the network (note that the network diameter – representing the hop-distance of the longest path – of a hierarchical edge computing network is ∼ O(log(|V|))). On the other hand, it falls into the regime in which RCNC can run efficiently. Accordingly, we can select appropriate time slot lengths based on the delay requirement of the supported applications,¶ to achieve a value of L as marked above. For delay-sensitive applications, such as VR (7–20 ms) [28] and real-time gaming (50 ms) [7], a choice of L of tens of time slots would result in a time slot length of around 1 ms. A larger slot length can be considered for applications with higher delay budgets, e.g., 15 ms for live streaming (150 ms) [7].

¶ Note that network slicing allows customizing (virtualized) networks for applications with similar delay requirements.

2.7.2 Capacity Iteration

The flow matching technique proposed in the previous section assumes that the virtual flow assignment is achievable. This is not necessarily true, since there is no guarantee of the equivalence of the flow spaces, as opposed to the average-constrained case.
In fact, when deciding the flow assignment in the virtual network, the network controller prefers to transmit packets along the low-cost routes, leading to a considerable number of bottleneck links (i.e., links for which the assigned rate equals the transmission capacity), especially in the high-congestion regime. However, the achieved rate on a bottleneck link is sensitive to the dynamic input, and is usually strictly lower than the link capacity due to truncation (recall the Example in Section 2.5). This motivates us to reduce the assigned virtual flow on these links, which can be realized by decreasing the corresponding link capacity in the virtual network.

In practice, a sign of unsuccessful flow matching is the instability of the request queues (i.e., R_ij^(l)(t) grows linearly). Besides overestimation of the transmission capacity of a bottleneck link, in a multi-hop network, the request queue of link (i,j) can also exhibit unstable behavior when its source i receives insufficient packets from its neighbors compared to the virtual flow assignment. Both factors are considered when updating the parameters (i.e., link capacities) of the virtual network.

To sum up, the parameters of the virtual network are updated on a larger timescale unit, referred to as frames. Each frame k consists of K time slots, during which the algorithm developed in the previous subsection is performed in an attempt to stabilize the request queues. At the end of the frame, the increasing rate of the request queue for each link (i,j) ∈ E is calculated by

r_ij = Σ_{l∈L} r_ij^(l),  with r_ij^(l) ≜ max{ 0, R_ij^(l)(K) / K }   (2.36)

and its link capacity is updated by

C_ij(k+1) = [ (1 − κ)( C_ij(k) − ϵ_ij^(k) ) + κ C_ij ]_0^{C_ij}   (2.37)

in which

ϵ_ij^(k) = r_ij − r_{→i} ( ν̄_ij / ν̄_{i→} )   (2.38)

where κ ∈ (0,1) is a constant, and [z]_0^{C_ij} ≜ min{max{0, z}, C_ij}. The update rule is explained as follows. First, the second term in (2.38) results from insufficient input, where r_{→i} is the total amount of insufficient input to node i, and ν̄_ij / ν̄_{i→} is the fraction that link (i,j) takes up among all outgoing interfaces. Second, note that even if the request queue is stabilized, it is possible for ϵ_ij^(k) to be positive due to random arrivals, leading to an overly conservative flow assignment; therefore, we add κ C_ij in (2.37) to avoid such a situation, which explores the possibility of increasing the assigned flow rate on the considered link (i,j).

Figure 2.4: The studied networks. (a) Illustrative network. (b) Mesh cloud network. (c) Hierarchical cloud network (devices of the same type have the same configuration), with parameters:
Processing (capacity [GHz], cost [per GHz]): U: (0.5, 5); B: (1, 2); S: (5, 1).
Transmission (capacity [Gbps], cost [per Gbps]), by link type (U, B), (B, B), (B, S), (S, S): (1, 1), (2, 1), (5, 1).

Figure 2.5: (a) The effect of V on the achieved operational cost; (b) capacity iteration of link (1,2) for the peak-constrained network under V = 5.

2.8 Numerical Experiments

In this section, we carry out numerical experiments to evaluate the performance of the proposed design. We start with an illustrative example (Fig. 2.4a), in which we explain the related concepts and show some intermediate results. After that, a more realistic edge computing scenario (Fig. 2.4b) is studied.
Both average- and peak-constrained networks are considered, and the term "link capacity" should be interpreted accordingly in each case. We set n = L and K = 2×10³ as the default setting for the proposed RCNC algorithm. Some key observations are listed as follows: 1) the analytical results (e.g., Proposition 3) are validated; 2) there is a performance gap between the average- and peak-constrained problems, which vanishes as the arrival dynamics are reduced; 3) the throughput and cost performance improve with longer admissible lifetimes; 4) the distributed algorithm with n = 1 can achieve comparable performance (especially in low-congestion regimes) with much lower complexity.

2.8.1 Illustrative Example

We study the packet routing problem based on the illustrative network in Fig. 2.4a, which consists of 4 nodes and 4 undirected links. The links have a homogeneous transmission capacity of C_ij = 5 for all (i,j) ∈ E, with different costs given by e12 = e24 = 1 and e13 = e34 = 5. A single commodity is considered, where the packets of interest emerge at node 1 (the source node) and are desired by node 4 (the destination node). Each packet has a maximum lifetime of L = 2 at birth, which implies that it cannot be delayed even a single time slot and still be effective. The packet arrival process follows a Poisson distribution with parameter λ = 6, and a reliability level of γ = 90% is demanded by the application.

2.8.1.1 Effects of Parameter V

In this experiment, we study the tradeoff between the convergence time and the operational cost, controlled by the parameter V. We implement and run the control algorithm for various values V ∈ {0, 1, …, 10}. For each V value, we carry out 100 experiments and observe the system for T = 1×10⁶ time slots. The results are depicted in Figs. 2.5 and 2.6, and we make the following observations.

First, for the average-constrained network, the operational cost reduces with V (Fig. 2.5a), which is in accordance with the analytical results presented in Propositions 3 and 4 (it also implies that flow matching is achieved). Intuitively, in this two-route example, the cheap route 1 → 2 → 4 (with cost e12 + e24 = 2) is preferable for transmitting the packets and should be exploited to the largest extent, while to satisfy the throughput constraint, some packets still need to be pushed through the expensive route 1 → 3 → 4 (with cost e13 + e34 = 10). More concretely, the flow assignment of the entire network is x12 = x24 = min{C12, C24} = 5 (the corresponding link capacity) and x13 = x34 = γλ − x12 = 5.4 − 5 = 0.4, leading to a cost performance of h⋆₁ = 5 × 2 + 0.4 × 10 = 14. As we can observe in Fig. 2.5a, the blue curve converges to the value of 14 as V increases, which agrees with the above result.

Figure 2.6: (a) The effect of V on the ε-convergence time (with ε = .01); (b) the achieved reliability level over time under various settings.

On the other hand, for the peak-constrained network, the principle of prioritizing the cheap route also applies. However, due to the dynamics of the arrival process a(t) and the truncation, the amount of packets that can be scheduled for this route is x12(t) = min{a(t), C12} at every time slot. Under the assumption of i.i.d. Poisson arrivals, it can be calculated that {x12(t)} ≈ 4.47. Therefore, the optimal flow assignment for the peak-constrained network is x12 = x24 = 4.47 and x13 = x34 = γλ − x12 = 0.93, leading to a cost of h⋆₂ = 4.47 × 2 + 0.93 × 10 = 18.24.
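The quoted rate {x12(t)} is the mean of the truncated arrival, E{min(a(t), C12)}. A quick check for a ∼ Pois(6) and C12 = 5 (exact evaluation gives ≈ 4.48, i.e., the ≈ 4.47 above up to rounding, with a correspondingly close cost of ≈ 18.1):

```python
import math

lam, C, gamma = 6.0, 5, 0.9

# Exact truncated mean E[min(a, C)] for a ~ Pois(lam):
# sum_{k < C} k * p(k) + C * P(a >= C).
p = lambda k: math.exp(-lam) * lam**k / math.factorial(k)
tail = 1.0 - sum(p(k) for k in range(C))
x12 = sum(k * p(k) for k in range(C)) + C * tail   # cheap-route rate, ≈ 4.48

x13 = gamma * lam - x12        # remainder pushed through 1 -> 3 -> 4
cost = 2 * x12 + 10 * x13      # route costs e12 + e24 = 2 and e13 + e34 = 10
```

Note that x12 is strictly below the capacity C12 = 5 even though λ = 6 > 5, which is exactly the truncation effect that capacity iteration must account for.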
The proposed RCNC algorithm finds the flow assignment by trial and exploration, as shown in Fig. 2.5b (with V = 5): we use the link capacity C12 = 5 as the initial guess for the achievable flow rate, which overestimates the transmission capacity of the link; its capacity in the virtual network is then reduced (on a frame basis), which steers the corresponding virtual flow; finally, flow matching is achieved when the link capacity ≈ achieved rate ≈ 4.47.

Finally, we emphasize that increasing the value of V prolongs the time to converge to the desired reliability level. Fig. 2.6b shows the gap between the achieved and desired reliability levels over time, for two particular values V = 1 and V = 5. We find that (i) all the gap curves decrease over time, implying convergence to the desired value; (ii) it takes longer for the peak-constrained network to converge than the corresponding average-constrained case (under the same V), due to the additional procedure of capacity iteration to find a feasible flow assignment; and (iii) the gap at a fixed time point grows with V for both average- and peak-constrained networks, i.e., a larger V results in slower convergence. In particular, we study the ε-convergence time with ε = .01, and the result is plotted in Fig. 2.6a. For the average-constrained network, the convergence time grows linearly with V, which supports the analytical result of O(V) in Proposition 3; a similar result is observed for the peak-constrained network.

Figure 2.7: The effect of the maximum lifetime L and the arrival model on (a) the achieved operational cost (converging to h = 14) and (b) the stability region (converging to λ = 11.1).

2.8.1.2 Effects of Lifetime and Arrival Model

Next, we study the effects of the maximum lifetime L and the statistics of the arrival process.
The considered lifetime L ranges from 2 to 10, and we try different models for the arrival process, including uniform U([0, 2λ]), Poisson Pois(λ), binomial B(2λ, 1/2), as well as the constant arrival a(t) = λ. The four distributions have the same mean value λ but decreasing dynamics (the corresponding variances are λ²/3 > λ > λ/2 > 0 if we assume λ > 3). The reliability level is set as γ = 90%, and V = 10 is chosen to optimize the operational cost.

Two performance metrics are studied for each setting. One is the achieved operational cost, and the other is the stability region.∥ In the first part of the experiment, we assume λ = 6 as in the previous experiments. The results for the peak-constrained networks are shown in Fig. 2.7. As we can observe, for any arrival model, as the maximum lifetime L grows, the operational cost attained by RCNC reduces, while the stability region enlarges. The result agrees with our intuition that, as the initial lifetime grows, the packets are more likely to arrive at the destination while effective, and furthermore, through the cheap route (when possible). In this example, node 1 can withhold packets in its queuing system at bursting time slots, leaving them for future transmission through 1 → 2 → 4 to optimize the cost. As L increases, the problem reduces to the traditional packet routing problem, where packet lifetime is not relevant, and the attained operational cost and stability region converge to the corresponding optimal results.∗∗ We also find that under constant arrival, the maximum lifetime does not impact the performance, which approximately equals the optimal values of the traditional problem.†† Finally, we explain the effect of the arrival model.
By fixing the maximum lifetime (e.g., L = 2), we compare the different arrival models, and find that higher dynamics of the arrival process increase the operational cost while shrinking the stability region, both due to the truncation effect of the peak-constrained links. With a given mean rate λ, a high-dynamic arrival process is more likely to exceed the transmission capacity, so that more packets must be delayed for transmission, which can lead to packet outdatedness and thus reduce the achievable output rate. This is true for any link in general, and in particular for the links lying on the cheap routes; as a result, a worsened stability region and operational cost can be expected.‡‡

∥ We recall that the stability region of P0 is defined w.r.t. the pdf of the arrival process f_a. In the experiment, as the model of the arrival process is fixed, we only need to specify λ to determine the pdf; in other words, the maximum arrival rate λ can represent the stability region under each model.

∗∗ However, we stress that RCNC does not guarantee convergence to the optimal performance in all cases. Comparing the uniform arrival with the other models, there is a gap in terms of both metrics, which is probably due to the sub-optimality of the flow matching technique in this case.

†† In addition, for average-constrained networks, the various settings (including maximum lifetime and arrival model) do not impact the performance either, which is the same as the plotted results for constant arrival.

2.8.2 Practical Scenarios

In this section, we demonstrate the performance of the proposed RCNC algorithm in two representative network scenarios:

• mesh: a mesh edge computing network including 25 servers that are randomly placed in a 1 km × 1 km square area, with links established between any pair of servers within a distance of 250 m, as shown in Fig. 2.4b, which is representative of a generic unstructured scenario.
• hierarchical: a hierarchical edge computing network [46] composed of core, edge, access, and user layers, as shown in Fig. 2.4c, which represents envisioned practical MEC systems.

In the mesh network, each link has a transmission capacity of C_ij = 1 Gbps with a cost of e_ij = 1/Gbps; each server has a processing capacity of C_i = 2 GHz with a random cost e_i ∈ {5, 7.5, 10}/GHz, which accounts for the heterogeneity of the computing devices. The parameters of the hierarchical network are summarized in the table in Fig. 2.4c. The default time slot length is 1 ms in both networks.

We adopt the AgI service model used in [33]. The AgI service ϕ is modeled by a sequence of ordered functions, through which incoming packets must be processed to produce consumable results. The service functions can be executed at different network locations, and we assume that each network location can host all the service functions. Each function (say the m-th function of service ϕ) is specified by two parameters: (i) ξ_ϕ^(m): the scaling factor, i.e., the output flow units per input flow unit; (ii) r_ϕ^(m): the workload, i.e., the required computation resource per input flow unit.

‡‡ The results suggest that when 1) the packet lifetime is abundant for transmission, or 2) the arrival process is of low dynamics, P1 makes a good approximation of the original problem P0. However, the gap can be large in some extreme cases, e.g., uniform arrival with L = 2 in the experiment.

In this experiment, we consider two AgI services, each including 2 functions, with parameters given by (the workload r_ϕ^(m) is in GHz/Mbps):

Service 1: ξ₁^(1) = 1, ξ₁^(2) = 2;  r₁^(1) = 1/300, r₁^(2) = 1/400,
Service 2: ξ₂^(1) = 1/3, ξ₂^(2) = 1/2;  r₂^(1) = 1/200, r₂^(2) = 1/100.

Each service has an i.i.d. Poisson arrival process (for packets with maximum lifetime), with λ₁ = λ₂ = λ Mbps, and requires a reliability level of γ₁ = γ₂ = 90%.
The source-destination pair of each service is selected at random (with the shortest distance between them denoted by σ). The maximum lifetime is then chosen as L = σ + 2 + ΔL, where σ + 2 is the least lifetime for packet delivery ("2" accounts for two processing slots), and ΔL ≥ 0 denotes some allowable relaxation slots.

2.8.2.1 Stability Region

In this section, we present the network stability regions achieved by the proposed algorithms, under different lifetime constraints and time slot lengths. We use a slot length of 1 ms when conducting the experiments on lifetime (Fig. 2.8a and 2.8c), and fix the delay constraint as L = 50 ms when studying the effect of the slot length (Fig. 2.8b and 2.8d).

Fig. 2.8a and 2.8c depict the effect of lifetime, and we make the following observations. First, the stability region enlarges with more available lifetime, since packets can explore more network locations for additional computation resources; in particular, Fig. 2.8c saturates at ΔL = 4 because the bottleneck links are constrained by the transmission limits. Second, the gap between the stability regions of the average- and peak-constrained networks is not significant (around 7% for mesh and 5% for hierarchical).¶¶ Finally, by comparing n = 1 and n = L, we find that including more look-ahead slots can benefit the throughput performance (by around 10%), while resulting in a higher complexity of O(n²L²) = O(L⁴).

Next, we tune the time slot length to study its impact on the stability region, with the results shown in Fig. 2.8b and 2.8d. As we increase the slot length, the attained throughput in mesh starts to degrade when it exceeds 3 ms, since the resulting maximum lifetime L (≤ 12) is not sufficient to support the delivery of some services; in hierarchical, by contrast, the throughput remains unchanged up to a slot length of 8 ms due to its simpler topology. On the other hand, a larger slot length can accelerate the algorithm: for n = 1, as we increase the slot length from 1 to 10 ms, the running time for decision making reduces as 52.8, 13.5, 6.6, 4.1, 2.7, 2.2, 1.8, 1.5, 1.2, 0.94 in mesh, and 27.2, 7.3, 3.7, 2.4, 1.6, 1.4, 1.2, 0.99, 0.83, 0.68 in hierarchical (in milliseconds).

§§ The service chain can expand or compress the size of the input flow, and we calculate the throughput on the basis of the input flow size. See Appendix A.8 for a detailed explanation.

¶¶ We emphasize that RCNC does NOT guarantee achieving the entire stability region in the peak-constrained case. In other words, the exact stability region in the peak-constrained case lies between the blue and red curves.

Figure 2.8: Stability regions of Algorithm 1 and RCNC (for average- and peak-constrained cases, respectively), under different lifetimes and time slot lengths: (a) effect of lifetime (mesh); (b) effect of time slot length (mesh); (c) effect of lifetime (hierarchical); (d) effect of time slot length (hierarchical).
2.8.2.2 Throughput and Cost

In this experiment, we compare the timely throughput and operational cost attained by RCNC with two benchmark algorithms:

• DCNC [33], which is shown to achieve optimal throughput and (near-optimal) cost performance, combined with last-in-first-out (LIFO) scheduling [40];

• an opportunistic scheduling algorithm [57] that provides worst-case delay guarantees for hop-count-limited transmissions, using the following parameters (notation in line with [57]): g_m(x) = x, β = ν_m = 1, A_m^max = 1.25λ, D_n^(m),max = max{ϵ, 1_n^(m) A_m^max + μ_n^max,in}, with ϵ found by grid search to optimize the timely throughput.

We assume λ = 500 Mbps for mesh and 200 Mbps for hierarchical, and Fig. 2.9 depicts the achieved throughput and cost under different lifetimes ΔL and V values.

First, we focus on the reliability (or timely throughput) attained by the algorithms. The proposed RCNC algorithm can achieve a reliability level of 90% under any ΔL and V that meets the requirement of the services (and the reliability constraint (2.10b) holds with equality). For DCNC, although it is provably throughput optimal, the attained timely throughput is much lower; it increases with ΔL, since more packets are counted as effective under a more relaxed lifetime constraint.††† The worst-case delay algorithm [57] behaves slightly better than DCNC; however, there is still a considerable gap between the attained reliability and the imposed requirement (always guaranteed by the proposed RCNC algorithm), especially when the deadline constraint is stringent.

∗∗∗ Results are obtained using MATLAB 2021a running on a 3.2 GHz computer, which leaves room for improvement, e.g., using a commercial solver and/or a faster processor.

††† As ΔL → ∞, timely throughput converges to throughput, and DCNC can achieve a reliability level of 100% since it is throughput optimal.
Figure 2.9: Throughput and cost achieved by DCNC (with LIFO) [33], worst-case delay [57], and RCNC: (a) vs. V (mesh, with ΔL = 2); (b) vs. ΔL (mesh, with V = 1×10⁸); (c) vs. V (hierarchical, with ΔL = 2); (d) vs. ΔL (hierarchical, with V = 1×10⁸).

Next, we compare the operational cost of RCNC and DCNC.‡‡‡ As shown in Fig. 2.9a and 2.9c, the operational costs of both algorithms reduce as V increases. When V is small, the cost of DCNC is significantly higher, since it might deliver packets through cyclic routes, while RCNC can reduce the number of extra transmissions due to the deadline constraint. Second, in Fig. 2.9b and 2.9d, as we relax the deadline constraint (or increase ΔL), RCNC can achieve better cost performance, since packets can "detour" to cheaper network locations for processing; packet lifetime is not relevant in DCNC, whose cost performance stays constant (and is near-optimal). Last but not least, we note that the operational cost of RCNC is lower than that of DCNC, because DCNC delivers all the packets to the destination, while RCNC only delivers effective packets to meet the reliability requirement.

‡‡‡ The worst-case delay algorithm [57] is excluded from the comparison, because (i) it does not optimize the operational cost (the attained costs are around 70 for mesh and 30 for hierarchical), and (ii) in contrast to the other two algorithms, the parameter V has an essentially different interpretation.
Figure 2.10: Performance attained by RCNC under different configurations. Panels: (a) low-congestion regime (mesh); (b) high-congestion regime (mesh); (c) low-congestion regime (hierarchical); (d) high-congestion regime (hierarchical); each panel plots reliability and cost versus V for the distributed (n = 1), distributed (n = L), coordinated (n = L), and genie-aided (n = L) implementations. 2.8.2.3 Performance of RCNC Finally, we compare the performance of RCNC under different implementations: using n = 1 or L look-ahead slots (see Remark 6), and centralized or distributed decision making (see Remark 7). A genie-aided algorithm serves as the benchmark, which can (by assumption) use accurate future arrival information to calculate A_i (2.32) for n = L look-ahead slots. We assume ∆L = 2. As observed in Fig. 2.10, in the low-congestion regime (λ = 20% of the stability region, Figs. 2.10a and 2.10c), the four implementations achieve comparable performance in both the mesh and hierarchical scenarios. However, when the network traffic becomes heavier (λ = 80% of the stability region, Figs. 2.10b and 2.10d), the two distributed algorithms (using n = 1 or L) achieve sub-optimal cost performance (while still satisfying the reliability constraint); in contrast, the centralized algorithm remains robust, and the difference between its cost performance and that of the genie-aided algorithm is negligible.
The result shows the importance of coordinated decision making among the nodes in the high-congestion regime to preserve the optimality of the solution; yet, it also motivates the use of the simplified algorithm (n = 1) in practical systems, especially for networks with simpler topologies (such as hierarchical), which can achieve sub-optimal performance with greatly reduced computational complexity. 2.9 Extensions In this section, we briefly discuss flexible extensions of the proposed approach to handle scenarios of practical relevance. 2.9.1 Mixed Deadline-Constrained and Unconstrained Users The proposed approach can be flexibly combined with existing queuing techniques [58] to treat hybrid scenarios that include both deadline-constrained and unconstrained users [59]. To be specific, we can establish a queuing system that is a hybrid of the proposed lifetime queues for the constrained users and standard queues (i.e., without lifetime structure) for unconstrained users. As shown in Appendix A.7, the decisions for the two groups of users are loosely coupled: unconstrained users follow the max-weight rule [58] for scheduling, and interact with constrained users via one additional variable for each link and look-ahead slot that represents the entire group, regardless of the number of unconstrained users. 2.9.2 Time-Varying Slot Length While the technique developed in this chapter assumes a fixed slot length, it is also possible to accommodate a varying slot length via "lifetime mapping". In principle, when the slot length changes, we can construct a new queuing system, assign packets to queues of corresponding lifetimes, and map the decisions produced by the original policy to suit the new slot length. For example, if the slot length changes from 1 ms to 2 ms, we can add up the decisions (i.e., transmitted flows) for lifetime 2k−1 and 2k packets to obtain the decision for lifetime k packets based on the new slot length.
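The lifetime-mapping example above can be sketched as follows. This is a hypothetical illustration (the function name and the list encoding of per-lifetime decisions are our own), assuming the new slot is an integer factor longer than the original:

```python
def remap_flows(flows, factor=2):
    """Map per-lifetime flow decisions to a slot length `factor` times longer.

    flows[k] holds the transmitted flow decided for packets of lifetime k+1
    (in original slots). Under the new slot length, a group of `factor`
    consecutive original lifetimes collapses to one new lifetime, so their
    flow decisions are summed. Hypothetical sketch of "lifetime mapping".
    """
    n_new = (len(flows) + factor - 1) // factor  # number of new lifetimes
    return [sum(flows[k * factor:(k + 1) * factor]) for k in range(n_new)]

# Slot length 1 ms -> 2 ms: lifetimes (1, 2) map to 1, and (3, 4) map to 2.
print(remap_flows([5, 3, 2, 4]))  # [8, 6]
```

For a non-integer change of slot length, the decisions would additionally have to be split across new lifetimes, which is part of the mapping-function design question raised below.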
The design of the mapping functions and the associated performance loss analysis are topics worth further investigation. 2.10 Conclusions In this chapter, we investigated the delay-constrained least-cost dynamic network control problem. We established a new queuing system to keep track of data packets' lifetimes, based on which we formalized the problem P0. To find an efficient approximate solution to this challenging problem, we first derived a relaxed problem P1 with average capacity constraints and designed a fully distributed, near-optimal solution that matches the LDP-assigned flow on an equivalent virtual network P2. The methodology was then extended to solve P0, where we proposed a two-way optimization approach that uses the flow assigned in P2 to guide the flow solution to P0. Extensive numerical results were presented to validate the analytical results, illustrate the performance gain, and guide the system configuration. Chapter 3 Efficient Delivery of Mixed-cast Services 3.1 Overview In this chapter, we aim to develop decentralized control policies for distributed cloud network flow problems dealing with mixed-cast AgI services. We establish a new multicast control framework driven by a key in-network packet replication operation that guides the creation of copies of data packets as they travel toward their corresponding destinations. Compared to the state-of-the-art unicast approach [33], which "creates one copy for each destination of a multicast packet upon arrival, and treats them as individual unicast packets", the proposed policy employs a joint forwarding and replication strategy that (i) eliminates redundant transmissions along network links common to multiple copies' routes, and (ii) reduces computation operations by driving computation before replication when beneficial. Efficient in-network packet replication is hence critical to improving the throughput and operational cost of multicast cloud network control policies.
Two main limitations prohibit the use of existing queuing systems (e.g., [58, 33]) for the design of control policies involving in-network packet replication: (i) they lack an effective mechanism to handle packets with identical content but different destinations, and (ii) they cannot capture in-network packet replication, which is known to violate the flow conservation law [81]. To overcome these limitations, we propose a novel multicast queuing system that allows formalizing the packet replication operation, which defines "where to create copies" and "how to assign destinations to the resulting copies", as a network control primitive. Under the proposed queuing system, each packet is labeled and queued according to its replication status, which keeps track of its current destination set. The packet replication operation is then specified by the partition of the destination set of a given packet into the destination sets of the resulting copies. We finally devise a fully decentralized packet processing, forwarding, and replication policy that attains optimal throughput and cost performance (see Theorem 3 for details), as well as a variant achieving sub-optimal performance with polynomial complexity. Our contributions can be summarized as follows: 1. We establish a novel queuing system that accommodates packets according to their replication status, supporting packet processing, routing, and replication operations as necessary cloud network control primitives for the delivery of mixed-cast AgI services. 2. We characterize the enlarged multicast network stability region obtained by including packet replication as an additional control primitive, and quantify the resulting gain with respect to (w.r.t.) its unicast counterpart. 3. We devise GDCNC, the first fully decentralized, throughput- and cost-optimal algorithm for distributed cloud network control with mixed-cast network flows. 4.
We design GDCNC-R, a computationally efficient policy achieving sub-optimal performance with polynomial complexity by focusing on a subset of effective replication operations. 5. We conduct extensive numerical experiments that support the analytical results and demonstrate the performance benefits of the proposed design for the delivery of mixed-cast AgI services. The rest of the chapter is organized as follows. In Section 3.2, we review existing work related to this topic. In Section 3.3, we introduce the model for the "multicast packet routing" problem. In Section 3.4, we define the policy space and present a characterization of the multicast network stability region. Section 3.5 describes the multicast queuing system and defines the problem formulation. In Section 3.6, we devise the GDCNC control policy and analyze its performance, which further motivates the design of GDCNC-R in Section 3.7. Extensions of the proposed design are discussed in Section 3.8. Section 3.9 presents the numerical results, and conclusions are drawn in Section 3.10. 3.2 Related Work The challenge in designing optimal routing policies for multicast traffic stems from the need for packet replication, which violates the flow conservation law [81]. Despite the large body of existing work on this topic [43], throughput-optimal least-cost multicast packet routing remains an open problem. Even under static arrivals, the Steiner tree problem, which aims to find the multicast tree (a tree covering all destinations) with minimum weight, is known to be NP-hard [47]. Many heuristic approaches have been developed to address this problem, such as the Extended Dijkstra's Shortest Path Algorithm (EDSPA) [6], which delivers multicast packets along a tree formed by the union of the shortest paths from the source to all destinations.
However, in addition to their heuristic nature, these policies deliver packets along fixed paths, lacking dynamic exploration of route and processing diversity. Dynamic arrivals pose a further obstacle that requires additional attention. A centralized dynamic packet routing and scheduling algorithm, UMW, was proposed in [67] and shown to achieve optimal throughput with mixed-cast network flows. Nonetheless, this design exhibits two limitations: (i) it makes centralized decisions based on global queuing information, incurring additional communication overhead, and (ii) it leaves out operational cost minimization, which is an important aspect in modern elastic network environments. 3.3 System Model 3.3.1 Cloud Layered Graph The ultimate goal of this work is to design decentralized control policies for distributed cloud networks equipped with computation resources (e.g., cloud servers, edge/fog computing nodes) able to host AgI service functions and execute the corresponding computation tasks. While in traditional packet routing problems each node treats its neighbor nodes as outgoing interfaces over which packets can be scheduled for transmission, a key step in addressing the AgI service delivery problem, which involves both packet processing and routing decisions, is to treat co-located computation resources (i.e., computing devices) as an additional interface over which packets can be scheduled for processing. As illustrated in [81], the AgI service control problem can be transformed into a packet routing problem on a layered graph where cross-layer edges represent computation resources. Motivated by this connection and for ease of exposition, in this chapter, without loss of generality, we illustrate the developed approach focusing on the single-commodity, least-cost multicast packet routing problem.
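A minimal sketch of the layered-graph construction from [81] may help fix ideas. The function name, the dict-based edge encoding, and the M-stage parameterization below are our own assumptions: the network is replicated once per service stage, and cross-layer edges at compute-enabled nodes carry the processing capacity and cost.

```python
def build_layered_graph(nodes, links, M, proc_cap, proc_cost):
    """Layered graph for an M-stage service: node (i, m) is node i at stage m.

    links: {(i, j): (capacity, unit_cost)} transmission edges of G.
    proc_cap / proc_cost: per-node processing capacity and unit cost
    (defined only for compute-enabled nodes). A cross-layer edge
    (i, m) -> (i, m+1) represents processing by the (m+1)-th function at i.
    """
    edges = {}
    for m in range(M):                     # intra-layer transmission edges
        for (i, j), cap_cost in links.items():
            edges[((i, m), (j, m))] = cap_cost
    for m in range(M - 1):                 # cross-layer processing edges
        for i in nodes:
            if i in proc_cap:
                edges[((i, m), (i, m + 1))] = (proc_cap[i], proc_cost[i])
    return edges

g = build_layered_graph(['a', 'b'], {('a', 'b'): (10, 1)}, M=2,
                        proc_cap={'b': 5}, proc_cost={'b': 2})
# 2 transmission edges (one per stage) + 1 processing edge at node b.
```

Running a routing algorithm on this layered graph then jointly decides where packets are transmitted and where they are processed.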
We remark that (i) the optimal decentralized multicast control problem is still open even in traditional communication networks, and (ii) the extension to distributed cloud network control is presented in Section 3.8.1. 3.3.2 Network Model The considered packet routing network is modeled by a directed graph G = (V, E), where each edge (i,j) ∈ E represents a network link supporting data transmission from node i to node j, and δ_i^− and δ_i^+ denote the incoming and outgoing neighbor sets of node i, respectively. Time is slotted. For each link (i,j) ∈ E, we define: (i) the transmission capacity C_ij, i.e., the maximum number of data units (e.g., packets) that can be transmitted in one time slot, and (ii) the unit transmission cost e_ij, i.e., the cost to transmit one unit of data in one time slot. We emphasize that in the layered graph, cross-layer edges represent data processing, i.e., data streams pushed through these edges are interpreted as being processed by the corresponding service functions, and the capacity and cost of these edges represent the processing capacity and unit processing cost of the associated computation resources (e.g., cloud/edge servers). 3.3.3 Arrival Model We focus on a multicast application (the extension to multiple applications is straightforward and derived in Appendix B.6) where each incoming packet is associated with a destination set D = {d_1, ..., d_D} ⊂ V, with d_k denoting the k-th destination, and D = |D| the cardinality of the destination set D, i.e., the destination set size. At least one copy of each incoming packet must be delivered to every destination in D. Importantly, we assume that delivering multiple copies of the same packet (containing the same content) to a given destination does not increase network throughput. Multicast packets originate at the application source nodes S ⊂ V \ D. We denote by a_i(t) the number of exogenous packets arriving at node i in time slot t, with a_i(t) = 0 for i ∉ S. We assume that the arrival process is i.i.d.
over time, with mean arrival rate λ_i ≜ E{a_i(t)} and bounded maximum arrivals; the corresponding vectors are denoted by a(t) = (a_i(t) : i ∈ V) and λ = E{a(t)}. Remark 10. By properly defining the destination set D, the above model can capture all four network flow types: (i) unicast and broadcast flows are two special cases of a multicast flow, obtained by setting D = {d} and D = V, respectively; (ii) an anycast flow can be transformed into a unicast flow by creating a super destination node connected to all the candidate destinations [67]. Therefore, it suffices to focus on the multicast flow to derive a general solution for the mixed-cast flow control problem. 3.3.4 In-network Packet Replication We now formalize the most important operation for multicast packet routing, namely in-network packet replication. 3.3.4.1 Replication Status We assume that each packet is associated with a label that indicates its current destination set, i.e., the set of destinations to which a copy of the packet must still be delivered, formally defined as follows. Definition 6. For a given packet, the replication status q = [q_1, ..., q_D] is a D-dimensional binary vector where the k-th entry (k = 1, ..., D) is set to q_k = 1 if destination d_k ∈ D belongs to its current destination set, and to q_k = 0 otherwise. Here, {0,1}^D denotes the set of all D-dimensional binary vectors, 1 the all-ones vector, b_k the vector with only the k-th entry equal to 1 (and 0's elsewhere), and 0 the zero vector. Three important cases of the replication status q ∈ {0,1}^D follow: (i) q = 1, which maps to the entire destination set D, is assigned to each newly arriving packet prior to any replication operation; (ii) q = b_k describes a packet with the single destination d_k, which behaves like a unicast packet; (iii) q = 0 describes a packet without any destination, which is removed from the system immediately.
3.3.4.2 Packet Replication and Coverage Constraint A replication operation creates copies of a packet and assigns to each resulting copy a new destination set, which must be a subset of the original packet's destination set. Let q ∈ {0,1}^D denote the replication status of the original packet; then, the set of all possible replication statuses of the copies is given by its power set 2^q ≜ {s : s_k = q_k u_k, u ∈ {0,1}^D}. To ensure the delivery of a packet to all of its destinations, we impose the Coverage constraint on the replication operation: each destination node of the original packet must be present in the destination set of at least one of the resulting copies. 3.3.4.3 Conditions on the Replication Operation In addition to the Coverage constraint, we require the replication operation to satisfy the following Conditions: a) Joint forwarding and replication: replication is performed only on packets to be transmitted. b) Efficient replication: the destination sets of the created copies do not overlap. c) Duplication: only two copies are created by one replication operation. d) Non-consolidation: co-located packets of identical content are not combined. It can be shown that these Conditions neither reduce the achievable throughput nor increase the minimum attainable cost. Condition a) avoids replicating packets that are not scheduled for transmission, which would only increase network congestion. Condition b) is motivated by the fact that receiving multiple packets of identical content at the same destination does not increase network throughput; replication should thus be performed efficiently, i.e., without producing copies with overlapping destinations, to alleviate network traffic and the associated resource consumption. Conditions c) and d) are justified in Appendix B.1.
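To make the status and power-set machinery concrete, the sketch below (our own encoding: statuses as 0/1 tuples) enumerates all pairs (q, s) with s in 2^q. For each destination k, the pair (q_k, s_k) takes one of three values, (0,0), (1,0), or (1,1), so there are 3^D such pairs in total; and under the efficient-replication condition the second copy's status is exactly q − s.

```python
from itertools import product

def duplication_choices(D):
    """All pairs (q, s) with s in the power set 2^q, i.e., s_k <= q_k
    componentwise; statuses are encoded as 0/1 tuples of length D."""
    return [(q, s)
            for q in product((0, 1), repeat=D)
            for s in product((0, 1), repeat=D)
            if all(sk <= qk for sk, qk in zip(s, q))]

omega = duplication_choices(3)
print(len(omega))  # 27 = 3**3: (q_k, s_k) is (0,0), (1,0), or (1,1) per k

# Efficient replication: the two copies partition q, so the second copy
# r = q - s covers exactly the destinations of q that s does not.
q, s = (1, 1, 0), (1, 0, 0)
r = tuple(qk - sk for qk, sk in zip(q, s))
assert tuple(sk + rk for sk, rk in zip(s, r)) == q
```

This 3^D count is what later makes the per-link search space exponential in the destination set size.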
Combining the Coverage constraint and the above Conditions leads to the following important property: each destination node of a packet undergoing duplication (we use "duplication" instead of "replication" in the rest of the chapter, e.g., duplication status) must be present in the destination set of exactly one of the two resulting copies. Formally, let q, s, and r denote the duplication status of the original packet and of the two copies resulting from the duplication operation, respectively; then q = s + r. As illustrated in Fig. 3.1, the duplication operation works as follows. Let q denote the duplication status of a packet selected for operation. Upon duplication, one copy is transmitted to the corresponding neighbor node (referred to as the transmitted copy, of status s), and the other copy stays at the node awaiting future operation (referred to as the reloaded copy, of status r = q − s). We refer to the pair (q, s) as the duplication choice and denote by Ω = {(q, s) : q ∈ {0,1}^D, s ∈ 2^q} the set of all duplication choices. Remark 11. Note that the duplication choice (q, q) describes the special case in which a packet is transmitted without duplication. In particular, for a status b_k packet, i.e., a (unicast) packet with one destination, (b_k, b_k) is the only duplication choice. Remark 12. Duplication (q, b_k) automatically takes place when a status q packet with d_k in its current destination set (i.e., q_k = 1) arrives at destination d_k, in which case the status b_k copy departs the network immediately, and the status q − b_k copy stays at node d_k. 3.4 Policy Spaces and Network Stability Region In this section, we introduce the policy space for multicast packet delivery, based on which we characterize the multicast network stability region.
3.4.1 Policy Space 3.4.1.1 Decision Variables We consider a general policy space for multicast packet delivery, encompassing joint packet forwarding and duplication operations, with decision variables given by µ(t) = (µ_ij^(q,s)(t) : (q,s) ∈ Ω, (i,j) ∈ E), (3.1) where µ_ij^(q,s)(t) denotes the amount of status q packets selected for (forwarding and duplication) operation, with duplication choice (q,s) and forwarding choice (i,j), at time t. That is, µ_ij^(q,s)(t) status q packets are duplicated, resulting in µ_ij^(q,s)(t) status s packets transmitted over link (i,j) and µ_ij^(q,s)(t) status q − s packets reloaded to node i, as illustrated in Fig. 3.1. 3.4.1.2 Admissible Policies A control policy is admissible if the flow variables satisfy: a) non-negativity: µ(t) ⪰ 0, i.e., µ_ij^(q,s)(t) ≥ 0 for all (i,j) ∈ E, (q,s) ∈ Ω; b) link capacity constraint: ∑_{(q,s)∈Ω} µ_ij^(q,s)(t) ≤ C_ij for all (i,j) ∈ E; c) generalized flow conservation: for each intermediate node i ∈ V \ D and q ∈ {0,1}^D, ∑_{s∈2^q̄} ∑_{j∈δ_i^−} µ_ji^(q+s,q)(t) + ∑_{s∈2^q̄} ∑_{j∈δ_i^+} µ_ij^(q+s,s)(t) + λ_i^(q) = ∑_{s∈2^q} ∑_{j∈δ_i^+} µ_ij^(q,s)(t), (3.2) where q̄ = 1 − q denotes the complement of vector q, and λ_i^(q) is the mean rate of the exogenously arriving packets a_i^(q)(t) = a_i(t) I{q = 1}, with I{A} denoting the indicator function (equal to 1 if event A is true, and 0 otherwise); d) boundary conditions: µ_{d_k j}^(q,s)(t) = 0 for all d_k ∈ D, j ∈ δ_{d_k}^+, and q with q_k = 1, k ∈ {1, ..., D}. The generalized flow conservation constraint c) can be described as follows. (i) In contrast to the instantaneous constraints a), b), and d), which must hold at each time slot, c) imposes an equality constraint on the average flow rates of incoming/outgoing status q packets to/from node i. (ii) As illustrated in Fig.
3.1, for each node i and status q queue, the incoming flow of status q packets has three components: packets received from each neighbor node j ∈ δ_i^− after undergoing duplication (q+s, q) (which creates transmitted copies of status q), i.e., µ_ji^(q+s,q)(t); packets that stay at node i after undergoing local duplication (q+s, s) (which creates reloaded copies of status q = (q+s) − s), i.e., µ_ij^(q+s,s)(t); and exogenously arriving status q packets, i.e., a_i^(q)(t). The outgoing flow includes all status q packets selected for operation with duplication choice (q,s), i.e., µ_ij^(q,s)(t). Figure 3.1: Illustration of joint packet forwarding and duplication operations (solid, dashed, and dotted lines represent packets selected for operation, transmitted copies, and reloaded copies, respectively) and incoming/outgoing flow variables associated with the status q queue of node i (while only explicitly indicated for the red flow, note that all arrows with the same color are associated with the same flow variable). 3.4.1.3 Cost Performance The instantaneous overall resource operational cost of an admissible policy is given by h(t) = h(µ(t)) = ∑_{(i,j)∈E} e_ij ∑_{(q,s)∈Ω} µ_ij^(q,s)(t) = ⟨e, µ(t)⟩, (3.3) where ⟨e, µ(t)⟩ denotes the inner product of the vectors e and µ(t). We employ the expected time average cost {E{h(t)}} ≜ lim_{T→∞} (1/T) ∑_{t=0}^{T−1} E{h(t)} to characterize the cost performance, and denote by h⋆(λ) the minimum attainable cost under arrival vector λ.
3.4.2 Multicast Network Stability Region In this section, we characterize the multicast network stability region Λ, which measures the throughput performance of the system and is defined as the set of arrival vectors λ = {λ_i : i ∈ V} under which there exists an admissible policy satisfying the constraints in Section 3.4.1.2. Theorem 2. An arrival vector λ is interior to the stability region Λ if and only if there exist flow variables f = (f_ij^(q,s) ≥ 0) and probability values β = (β_ij^(q,s) ≥ 0 : ∑_{(q,s)∈Ω} β_ij^(q,s) ≤ 1) such that: ∑_{s∈2^q̄} ∑_{j∈δ_i^−} f_ji^(q+s,q) + ∑_{s∈2^q̄} ∑_{j∈δ_i^+} f_ij^(q+s,s) + λ_i^(q) ≤ ∑_{s∈2^q} ∑_{j∈δ_i^+} f_ij^(q,s), ∀ i ∈ V, q ∈ {0,1}^D, (3.4a) f_ij^(q,s) ≤ β_ij^(q,s) C_ij, ∀ (i,j) ∈ E, (q,s) ∈ Ω, (3.4b) f_{d_k j}^(q,s) = 0, ∀ d_k ∈ D, j ∈ δ_{d_k}^+, q : q_k = 1, k ∈ {1, ..., D}. (3.4c) In addition, there exists a stationary randomized policy specified by the probability values β (i.e., at each time slot t, each link (i,j) selects C_ij status q packets for forwarding and duplication operation with duplication choice (q,s) with probability β_ij^(q,s)) that attains the optimal cost h⋆(λ). Proof. See Appendix B.2. Next, we analyze the benefit of the multicast framework, which exploits in-network packet duplication to enlarge the network stability region, compared to its unicast counterpart Λ_0. Proposition 5. The multicast network stability region Λ satisfies Λ_0 ⊆ Λ ⊆ DΛ_0, where DΛ_0 ≜ {Dλ : λ ∈ Λ_0}. Furthermore, for any arrival vector λ ∈ Λ_0, the minimum attainable cost under the unicast framework, denoted by h⋆_0(λ), satisfies h⋆_0(λ) ≥ h⋆(λ). Proof. See Appendix B.2. We interpret the above results as follows. Compared to the unicast approach, the multicast framework can increase the achievable throughput and reduce the resource cost by leveraging in-network packet duplication to eliminate redundant transmissions along network links common to multiple unicast routes. The gain factor in achievable throughput is bounded by the number of destinations D.
3.5 Problem Formulation 3.5.1 Queuing System Since forwarding and duplication decisions are driven by each packet's destination set, keeping track of data packets' destination sets is essential. A key step is to construct a queuing system with distinct queues for packets of different duplication status q. In particular, we denote by Q_i^(q)(t) the backlog of the queue holding status q packets at node i at time t, define Q_i^(0)(t) = 0, and let Q(t) = (Q_i^(q)(t) : i ∈ V, q ∈ {0,1}^D). Each time slot is divided into two phases. In the transmitting phase, each node makes and executes forwarding and duplication decisions based on the observed queuing states. In the receiving phase, the incoming packets, including those received from neighbor nodes, reloaded copies generated by duplication, and exogenously arriving packets, are loaded into the queuing system, and the queuing states are updated. 3.5.1.1 Queuing Dynamics The queuing dynamics are derived for two classes of network nodes. (i) For an intermediate node i ∈ V \ D, the queuing dynamics are given by Q_i^(q)(t+1) ≤ max[0, Q_i^(q)(t) − µ_{i→}^(q)(t)] + µ_{→i}^(q)(t) + a_i^(q)(t), ∀ q ∈ {0,1}^D, (3.5) where µ_{i→}^(q)(t) and µ_{→i}^(q)(t) are the outgoing and (controllable) incoming network flows of status q packets from/to node i, given by (as illustrated in Fig. 3.1) µ_{i→}^(q)(t) = ∑_{j∈δ_i^+} ∑_{s∈2^q} µ_ij^(q,s)(t), µ_{→i}^(q)(t) = ∑_{j∈δ_i^−} ∑_{s∈2^q̄} µ_ji^(q+s,q)(t) + ∑_{j∈δ_i^+} ∑_{s∈2^q̄} µ_ij^(q+s,s)(t). (3.6) (ii) For a destination node i = d_k (k = 1, ..., D), the queuing dynamics are given by Q_i^(q)(t+1) ≤ [I + µ_{→i}^(q+b_k)(t)] I{q_k = 0}, ∀ q ∈ {0,1}^D, (3.7) where I represents the right-hand side of (3.5). To wit, at destination node d_k: (i) all status q queues with q_k = 1 are always empty, and (ii) all other queues have an additional incoming flow corresponding to the reloaded copies resulting from the automatic duplication of status q + b_k packets arriving at the destination (see also Remark 12), i.e., µ_{→i}^(q+b_k)(t).
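The intermediate-node dynamics (3.5) amount to a one-line update per queue; the sketch below (function name ours) evaluates the right-hand side of (3.5), which holds with equality when dummy packets fill any shortfall:

```python
def queue_update(Q, mu_out, mu_in, a):
    """One-slot backlog bound per (3.5):
    Q(t+1) = max(0, Q(t) - mu_out) + mu_in + a(t).
    The ramp max(0, .) handles the case where the scheduled outgoing flow
    mu_out exceeds the available backlog (dummy packets are sent instead).
    Illustrative sketch of the right-hand side of (3.5).
    """
    return max(0, Q - mu_out) + mu_in + a

print(queue_update(Q=5, mu_out=3, mu_in=2, a=1))  # 5: (5 - 3) + 2 + 1
print(queue_update(Q=1, mu_out=4, mu_in=0, a=2))  # 2: ramp clips 1 - 4 to 0
```

The second call shows why the ramp is needed: without it, an over-allocated outgoing flow would drive the backlog negative.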
Remark 13. Similar to many existing control policies, e.g., [33, 71, 58, 18], we note that the flow variable µ(t) can lead to a total outgoing flow µ_{i→}^(q)(t) exceeding the available packets in queue Q_i^(q)(t); we include the ramp function max[0, ·] in (3.5) to avoid negative queue lengths. Dummy packets are created when there are not sufficient packets in the queue to support the scheduling decision, an operation known not to affect the network stability region [58]. 3.5.2 Problem Formulation The goal is to develop an admissible control policy that stabilizes the queuing system while minimizing the overall operational cost. Formally, we aim to find a control policy with decisions {µ(t) : µ_ij^(q,s)(t) ≥ 0, t ≥ 0} satisfying min_{µ(t)} {E{h(t)}} (3.8a) s.t. {E{∥Q(t)∥_1}} < ∞, i.e., stabilizing Q(t) governed by (3.5) – (3.7), (3.8b) ∑_{(q,s)∈Ω} µ_ij^(q,s)(t) ≤ C_ij, ∀ (i,j) ∈ E. (3.8c) In addition, in Appendix B.5, we show that the average delay is linear in the queue backlog {E{∥Q(t)∥_1}}, and thus (3.8b) is equivalent to guaranteeing finite average delay. 3.6 Generalized Distributed Cloud Network Control Algorithm In this section, we leverage the LDP theory [58] to address (3.8), which guides the design of the proposed GDCNC algorithm. 3.6.1 Lyapunov Drift-plus-Penalty Define the Lyapunov function L(t) = ∥Q(t)∥_2^2 / 2 and the Lyapunov drift ∆(Q(t)) = L(t+1) − L(t). The LDP approach aims to minimize a linear combination of an upper bound on the Lyapunov drift (derived in Appendix B.3) and the objective function weighted by a tunable parameter V, i.e., ∆(Q(t)) + V h(t) ≤ |V|B + ⟨a(t), Q(t)⟩ − ⟨w(t), µ(t)⟩, (3.9) where B is a constant, and the duplication utility weights w(t) = {w_ij^(q,s)(t) : (i,j) ∈ E, (q,s) ∈ Ω} are given by w_ij^(q,s)(t) = Q_i^(q)(t) − Q_i^(q−s)(t) − Q_j^(s)(t) − V e_ij.
(3.10) Equivalently, the proposed algorithm selects the flow variable µ(t) that maximizes ⟨w(t), µ(t)⟩ at each time slot, which decomposes into separate problems for each link (i,j): max_{µ_ij^(q,s)(t) : (q,s)∈Ω} ∑_{(q,s)∈Ω} w_ij^(q,s)(t) µ_ij^(q,s)(t), s.t. ∑_{(q,s)∈Ω} µ_ij^(q,s)(t) ≤ C_ij, µ_ij^(q,s)(t) ≥ 0. (3.11) The resulting max-weight solution is described in the next section. Algorithm 3 GDCNC 1: for t ≥ 0 and (i,j) ∈ E do 2: Calculate the duplication utility weight w_ij^(q,s)(t) for all duplication choices (q,s) ∈ Ω by (3.10). 3: Find the (q,s) pair with the largest weight: (q⋆, s⋆) = argmax_{(q,s)∈Ω} w_ij^(q,s)(t). 4: Assign transmission flow: µ_ij^(q,s)(t) = C_ij I{w_ij^(q⋆,s⋆)(t) > 0, (q,s) = (q⋆, s⋆)}. 5: end for 3.6.2 Generalized Distributed Cloud Network Control The developed algorithm, referred to as generalized distributed cloud network control (GDCNC), is described in Algorithm 3 and exhibits two salient features. (i) Decentralized: it requires only local information exchange (i.e., neighbor nodes' queuing states) and decision making, and can thus be implemented in a fully distributed manner. (ii) Sparse: for each link (i,j), it selects one duplication choice (q⋆, s⋆) at each time slot, which affects the states of the status q⋆ and q⋆ − s⋆ queues at node i and the status s⋆ queue at node j. Therefore, for each node i, the number of queues with changing states in one time slot is 2|δ_i^+| + |δ_i^−| ∼ O(|δ_i|), where |δ_i| denotes the degree of node i. Remark 14. Note that when dealing with a unicast flow, where all packets have the same single destination, the only valid duplication status is q = 1 and the only valid duplication choice is (q,s) = (1,1), in which case GDCNC reduces to DCNC [33]. 3.6.3 Performance Analysis In this section, we analyze the delay and cost performance of GDCNC, as well as its complexity along both the communication and computation dimensions.
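Steps 2-4 of Algorithm 3 can be sketched per link as follows. The dict-based queue lookup and the function name are our own assumptions; statuses are 0/1 tuples, and missing statuses default to zero backlog, consistent with the convention Q_i^(0)(t) = 0:

```python
def gdcnc_link_decision(Q_i, Q_j, choices, C_ij, e_ij, V):
    """One GDCNC step for link (i, j): max-weight over duplication choices.

    Q_i, Q_j: {status tuple: backlog} at nodes i and j.
    choices: duplication choices (q, s) with s <= q componentwise.
    Returns (chosen (q, s) or None, allocated flow): the full capacity C_ij
    goes to the single choice with the largest positive utility weight
    w = Q_i[q] - Q_i[q - s] - Q_j[s] - V * e_ij; otherwise nothing is sent.
    """
    def weight(q, s):
        r = tuple(qk - sk for qk, sk in zip(q, s))  # reloaded copy's status
        return Q_i.get(q, 0) - Q_i.get(r, 0) - Q_j.get(s, 0) - V * e_ij

    best = max(choices, key=lambda qs: weight(*qs))
    return (best, C_ij) if weight(*best) > 0 else (None, 0)

Q_i = {(1, 1): 10, (0, 1): 2}
Q_j = {(1, 0): 1}
choices = [((1, 1), (1, 0)), ((1, 1), (1, 1))]
print(gdcnc_link_decision(Q_i, Q_j, choices, C_ij=4, e_ij=1, V=2))
# (((1, 1), (1, 1)), 4): forwarding without duplication has weight 8 > 5
```

Note how V scales the cost term against the backlog differentials, which is the knob behind the delay-cost tradeoff analyzed next.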
3.6.3.1 Delay-Cost Tradeoff In the following theorem, we employ the minimum attainable cost h⋆(λ) as the benchmark to evaluate the performance of GDCNC. Theorem 3. For any arrival vector λ interior to the multicast stability region Λ, the average queue backlog and the operational cost achieved by GDCNC satisfy {E{∥Q(t)∥_1}} ≤ |V|B/ϵ + [(h⋆(λ + ϵ1) − h⋆(λ))/ϵ] V, {E{h(t)}} ≤ h⋆(λ) + |V|B/V, (3.12) for any ϵ > 0 such that λ + ϵ1 ∈ Λ. Proof. See Appendix B.4. The above theorem is interpreted as follows. For any arrival vector interior to the stability region, GDCNC (using any fixed V ≥ 0) can stabilize the queuing system and is thus throughput-optimal. In addition, GDCNC achieves an [O(V), O(1/V)] tradeoff between delay (which is linear in the queue backlog) and cost: by pushing V → ∞, the attained cost can be made arbitrarily close to the minimum h⋆(λ), at the expense of network delay. 3.6.3.2 Complexity Issues Next, we analyze the complexity of GDCNC. Communication overhead: At each time slot, GDCNC requires local exchange of queue backlog information. Instead of exchanging the states of all queues (of size |{0,1}^D| ∼ O(2^D)), we can leverage sparsity (see Section 3.6.2) to reduce the communication overhead: nodes only exchange information about the queues with changing states, reducing the overhead to O(|δ_i|). Computational complexity: At each time slot, GDCNC calculates the utility weight of each duplication choice (q,s), with computational complexity proportional to |Ω| ∼ O(3^D) (see Appendix B.7), i.e., exponential in the destination set size. This is due to the combinatorial nature of the multicast routing problem. Indeed, the state-of-the-art centralized solution to the multicast flow control problem [81] requires solving the NP-complete Steiner tree problem for route selection at each time slot.
Figure 3.2: Different multicast routes (i.e., routing trees) can share the same duplication choices (i.e., duplication tree). For example, the red and blue routes used to deliver the status (1,1,1) packet from source node 0 to destination nodes {1,2,3} are associated with the same duplication tree. 3.7 GDCNC-R with Reduced Complexity In this section, we develop GDCNC-R, a variant of GDCNC achieving sub-optimal performance with Reduced complexity. 3.7.1 Duplication Tree To illustrate the complexity reduction of GDCNC-R, we first define the duplication tree as a useful representation of the duplication operations performed on a given multicast packet. Definition 7. A duplication tree T is a binary tree in which each node is associated with a duplication status: the root node represents the status of the initial multicast packet, 1; the D leaf nodes represent the status of the copies delivered to each destination, b_k (k ∈ {1, ..., D}); and each internal node q splits into two child nodes s and r, associated with duplications (q, s) and (q, r), respectively. As illustrated in Fig. 3.2, the duplication choices of a given multicast route are described by a duplication tree. Note, however, that different routes can share the same duplication choices, hence the same duplication tree. GDCNC achieves optimal performance by evaluating all possible duplication choices at each node, which is equivalent to considering all routes in all duplication trees; this is also the reason for its high computational complexity. 3.7.2 Proposed Approach In contrast, the developed GDCNC-R algorithm narrows the focus to the multicast routes included in a subset of effective duplication trees and the associated duplication choices.
The goal in selecting effective duplication trees is to minimize the resulting (throughput and cost) performance loss, and the proposed approach is referred to as destination clustering. Importantly, the main benefit of the multicast framework is to save redundant operations by exploiting in-network packet duplication. To maximize this gain, a duplication should be performed only on packets with "distant" destinations in their current destination set (under the distance metrics listed below). To this end, we propose to construct the duplication tree as follows: starting from the root node representing the entire destination set D, the two child nodes of each node are obtained by dividing the associated destination set into two disjoint clusters according to the given distance metric. While many clustering methods exist, we use the widely adopted k-means clustering [4].

Some effective choices for the inter-node distance are as follows. To optimize the throughput performance, we can use the reciprocal transmission capacity as the inter-node distance, while the unit transmission cost is a proper metric for the cost performance. In the wireless scenario, a straightforward metric is geographic distance, whose effectiveness is validated by numerical experiments (see Fig. 3.3a and 3.3b).

3.7.3 Complexity Analysis

As shown in Appendix B.7, each duplication tree includes 2D − 1 nodes, with D − 1 internal nodes (including the root node) and D leaf nodes. There are 3 possible duplication choices associated with each internal node q (with child nodes s and r), i.e., (q, q), (q, s), and (q, r), while only one duplication choice (b_k, b_k) is associated with each leaf node b_k. Therefore, every duplication tree includes 4D − 3 ∼ O(D) possible duplication choices.

GDCNC-R uses K duplication trees, with K chosen to strike a good balance between performance optimality and computational complexity.
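The destination-clustering construction above can be sketched directly: recursively bisect the destination set with 2-means until singletons remain. A minimal sketch under the geographic-distance metric, with hypothetical destination coordinates and a plain 2-means in place of a library implementation:

```python
import random

def two_means(points, iters=20, seed=0):
    """2-means split of (id, (x, y)) points by geographic distance."""
    rng = random.Random(seed)
    cents = [p for _, p in rng.sample(points, 2)]
    clusters = ([], [])
    for _ in range(iters):
        clusters = ([], [])
        for item in points:
            _, (x, y) = item
            d2 = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in cents]
            clusters[d2.index(min(d2))].append(item)
        if not clusters[0] or not clusters[1]:
            break  # degenerate split; the caller falls back
        cents = [(sum(x for _, (x, _) in c) / len(c),
                  sum(y for _, (_, y) in c) / len(c)) for c in clusters]
    return clusters

def build_duplication_tree(dests):
    """Recursively bisect the destination set into a binary duplication
    tree; each tree node is (destination-id set, left, right)."""
    if len(dests) == 1:
        return (frozenset(k for k, _ in dests), None, None)
    left, right = two_means(dests)
    if not left or not right:          # fall back to an arbitrary halving
        mid = len(dests) // 2
        left, right = dests[:mid], dests[mid:]
    return (frozenset(k for k, _ in dests),
            build_duplication_tree(left), build_duplication_tree(right))

# Hypothetical UE coordinates (destination id, (x, y)) in meters.
dests = [(1, (0.0, 0.0)), (2, (10.0, 5.0)), (3, (95.0, 90.0))]
root = build_duplication_tree(dests)
print(root[0])
```

Since each split produces two nonempty subsets, the resulting binary tree over D destinations has exactly 2D − 1 nodes, matching the count in Section 3.7.3.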
Specifically, GDCNC-R has O(KD) complexity, i.e., polynomial in the number of duplication trees and the destination set size.

3.8 Extensions

In this section, we present extensions to the GDCNC algorithms, including: (i) a cloud network control policy for multicast AgI service delivery, (ii) a modified algorithm for the MEC scenario (with wireless links), and (iii) a variant of GDCNC with Enhanced delay performance, EGDCNC.

3.8.1 Multicast AgI Service Delivery

In line with [33, 81], the additional packet processing decisions involved in the distributed cloud network setting can be handled as follows.

3.8.1.1 Cloud Network Model

Consider a cloud network composed of compute-enabled nodes that can process data packets via corresponding service functions, with the available processing resources and associated costs defined as: (i) processing capacity C_i, i.e., the computation resource units (e.g., computing cycles per time slot) at node i, and (ii) unit processing cost e_i, i.e., the cost to run one unit of computation resource for one time slot at node i.

3.8.1.2 AgI Service Model

Consider an AgI service modeled as an SFC, i.e., a chain of M − 1 functions, through which source packets must be processed to produce consumable results, resulting in the end-to-end data stream being divided into M stages. Each processing step can take place at different network locations hosting the required service functions, and each function m ∈ {1, ..., M − 1} in the service chain is described by two parameters: (i) scaling factor ξ^(m), i.e., the number of output packets per input packet, and (ii) workload r^(m), i.e., the amount of computation resource required to process one input packet.

3.8.1.3 Processing Decision

We create different queues to hold packets of different processing stages and duplication statuses.
Let Q_i^{(m,q)}(t) denote the backlog of the stage-m, status-q queue, and μ_i^{(m,q,s)}(t) the scheduled processing flow, i.e., the amount of stage-m, status-q packets selected for the (processing and duplication) operation with duplication choice (q, s) at node i at time t. Following the procedure in Section 3.6.1 on a properly constructed cloud layered graph (see Appendix B.6 for the full derivation), the processing decisions for each node i at time t are given by:

(i) Calculate the duplication utility weight w_i^{(m,q,s)}(t) for each (m, q, s) tuple:

w_i^{(m,q,s)}(t) = \frac{Q_i^{(m,q)}(t) - \xi^{(m)} Q_i^{(m+1,s)}(t) - Q_i^{(m,q-s)}(t)}{r^{(m)}} - V e_i. \tag{3.13}

(ii) Find the (m, q, s) tuple with the largest weight:

(m^\star, q^\star, s^\star) = \arg\max_{(m,q,s)} w_i^{(m,q,s)}(t).

(iii) Assign the processing flow:

\mu_i^{(m,q,s)}(t) = \frac{C_i}{r^{(m)}}\, \mathbb{I}\{ w_i^{(m^\star,q^\star,s^\star)}(t) > 0,\ (m,q,s) = (m^\star,q^\star,s^\star) \}.

3.8.2 MEC Scenario

In line with [18], the packet transmission decisions can be modified to handle wireless distributed cloud network settings, e.g., MEC [24], as follows.

3.8.2.1 Wireless Transmission Model

Consider a MEC network composed of two types of nodes, UEs and edge servers (ESs), collected in V_UE and V_ES, respectively. Each ES, equipped with massive antennas, is assigned a separate frequency band of width B_0; with the aid of beamforming techniques, it can transmit/receive data to/from multiple UEs simultaneously without interference (assuming that the UEs are spatially well separated) [3], while each UE is assumed to associate with only one ES at each time slot. For each wireless link (i, j), the transmission power p_ij(t) is assumed to be constant during a time slot (with the maximum power budget of node i denoted by P_i), incurring cost (ẽ_i τ) p_ij(t), with ẽ_i denoting the unit energy cost at node i and τ the time slot length. Besides, we assume that the channel gain g_ij(t) is i.i.d. over time and known via estimation, and the noise power is denoted by σ²_ij.
The ESs are connected by wired links, as described in Section 3.3.2.

3.8.2.2 Wireless Transmission Decision

In addition to the flow assignment μ(t) = {μ_ij^{(q,s)}(t) ≥ 0 : (i,j), (q,s)}, the control policy makes decisions on power allocation, i.e., p(t) = {p_ij(t) ≥ 0 : (i,j)}, and link activation (or UE-ES association), i.e., χ(t) = {χ_ij(t) ∈ {0,1} : (i,j)}, where χ_ij(t) indicates whether link (i,j) is activated (χ_ij(t) = 1) or not (χ_ij(t) = 0). The LDP bound (3.9) remains valid, and the resulting problem (3.11) is modified as

\max \sum_{(i,j)\in\mathcal{E}} \chi_{ij}(t)\,\Psi_{ij}(t), \quad \Psi_{ij}(t) \triangleq \sum_{(q,s)\in\Omega} w_{ij}^{(q,s)}(t)\,\mu_{ij}^{(q,s)}(t) - V(\tilde{e}_i\tau)\,p_{ij}(t), \tag{3.14a}

\text{s.t.} \quad \sum_{j\in\delta_i^+} \chi_{ij}(t) \le 1, \ \forall i\in\mathcal{V}_{\mathrm{UE}}; \qquad \chi_{ij}(t) = 1, \ \forall i\in\mathcal{V}_{\mathrm{ES}}, \tag{3.14b}

\sum_{(q,s)\in\Omega} \mu_{ij}^{(q,s)}(t) \le \tau R_{ij}(t), \quad \text{with } R_{ij}(t) \triangleq B_0 \log_2\!\Big(1 + \frac{g_{ij}(t)\,p_{ij}(t)}{\sigma_{ij}^2}\Big), \tag{3.14c}

\sum_{j\in\delta_i^+} \chi_{ij}(t)\,p_{ij}(t) \le P_i, \ \forall i\in\mathcal{V}, \tag{3.14d}

where w_ij^{(q,s)}(t) = Q_i^{(q)}(t) − Q_i^{(q−s)}(t) − Q_j^{(s)}(t). Following the procedure in [18] to solve (3.14), we can derive the wireless transmission decisions for each link (i, j) at time t as

Power allocation: p⋆_ij(t) = min{ p_ij(t; 0), P_i } for i ∈ V_UE, and p⋆_ij(t) = p_ij(t; ν⋆) for i ∈ V_ES,  (3.15a)

Flow assignment: μ⋆_ij^{(q,s)}(t) = τ R_ij(t) · I{(q,s) = (q⋆,s⋆)}, with (q⋆,s⋆) = argmax_{(q,s)} w_ij^{(q,s)}(t),  (3.15b)

Link activation: χ⋆_ij(t) = I{j = j⋆, Ψ⋆_ij(t) > 0} for i ∈ V_UE, and χ⋆_ij(t) = 1 for i ∈ V_ES,  (3.15c)

where

p_{ij}(t;\nu) \triangleq \max\Big[ \frac{B_0\, w_{ij}^{(q^\star,s^\star)}(t)}{(\tilde{e}_i V + \nu)\ln 2} - \frac{\sigma_{ij}^2}{g_{ij}(t)},\ 0 \Big],

ν⋆ = max{0, ν₀} with ν₀ satisfying Σ_{j∈δ_i^+} p_ij(t; ν₀) = P_i, and j⋆ = argmax_j Ψ⋆_ij(t).

3.8.3 EGDCNC with Enhanced Delay

In line with [78, 33], a biased queue that incorporates network topology information can be designed to enhance the delay performance of multicast flow control. In the unicast setting, [78] defines the bias term for each node i, H_U(i, d), as the minimum hop-distance between node i and destination d, and combines it with the physical queue as ˜Q_i(t) = Q_i(t) + η H_U(i, d).
The bias term creates an intrinsic pressure difference that pushes packets along the shortest path to the destination. The parameter η can be found by grid search to optimize the combined effect of hop-distance and queue backlog on the total network delay.

In the multicast setting, we propose to modify the biased queue as

\tilde{Q}_i^{(q)}(t) = \|\boldsymbol{q}\|_1\, Q_i^{(q)}(t) + \eta H_M(i,\boldsymbol{q}), \quad \text{with } H_M(i,\boldsymbol{q}) \triangleq \sum_{k=1}^{D} q_k H_U(i, d_k). \tag{3.16}

Differently from the unicast case, (i) the bias term H_M(i, q) is now the sum of the minimum hop-distances to all current destinations, and (ii) the backlog term Q_i^{(q)}(t) is now weighted by ∥q∥₁ in order to capture the impact on all involved copies, as shown in Appendix B.5. EGDCNC works just like GDCNC, but uses ˜Q(t) in place of Q(t) to make forwarding and duplication decisions. Following the procedure in [78], we can show that EGDCNC (using any fixed η ≥ 0) does not lose throughput or cost optimality as compared to GDCNC.

3.9 Numerical Results

3.9.1 Network Setup

Table 3.1: Network Resources and Operational Costs of the Studied System

  Processing:   UE: C_i = 1 GHz, e_i = 2/GHz;   ES: C_i = 5 GHz, e_i = 1/GHz
  Transmission: UE-ES: P_i = 200 mW, ẽ_i = .01/J for i ∈ V_UE; P_i = 1 W, ẽ_i = .005/J for i ∈ V_ES
                ES-ES: C_ij = 1 Gbps, e_ij = 1/Gbps for (i,j) with distance(i,j) = 100 m

We consider a MEC network within a square area of 200 m × 200 m, including 9 UEs and 4 ESs. In a Cartesian coordinate system with the origin located at the square center, the ESs are placed at (±50 m, ±50 m). Each user moves according to a Gaussian random walk (reflected when hitting the boundary), with the displacement in each time slot distributed as N(0, 10⁻⁴ I) m. The length of each time slot is τ = 1 ms.
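The reflected Gaussian random-walk mobility model above can be simulated in a few lines. A minimal sketch, assuming per-axis displacement standard deviation σ = 0.01 m (variance 10⁻⁴ m²) and a single reflection per step at the ±100 m square boundary (valid since steps are tiny relative to the area):

```python
import random

def step(pos, sigma=0.01, half=100.0, rng=random):
    """One slot of the reflected Gaussian random walk: displacement
    ~ N(0, sigma^2) per axis, reflected at the boundary [-half, half]."""
    new = []
    for x in pos:
        x += rng.gauss(0.0, sigma)
        if x > half:            # reflect at the upper boundary
            x = 2 * half - x
        elif x < -half:         # reflect at the lower boundary
            x = -2 * half - x
        new.append(x)
    return tuple(new)

rng = random.Random(7)
pos = (0.0, 0.0)
for _ in range(1000):           # simulate 1000 slots (1 s at tau = 1 ms)
    pos = step(pos, rng=rng)
print(pos)
```

With σ = 0.01 m per 1 ms slot, a user drifts on the order of centimeters per second, so the topology changes much more slowly than the queuing dynamics, consistent with the overhead discussion in Section 3.9.2.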
We employ the transmission model described in Section 3.8.2, where ESs are connected by wired links, while UEs and ESs communicate via wireless links with the following parameters: bandwidth B_0 = 100 MHz, path-loss = 32.4 + 20 log₁₀(f_c) + 31.9 log₁₀(distance) dB with f_c = 30 GHz (urban microcell [1]), communication range = 150 m, standard deviation of shadow fading σ_SF = 8.2 dB, and σ²_ij = N_0 B_0 with noise spectral density N_0 = −174 dBm/Hz. The processing/transmission capacities and associated costs are shown in Table 3.1.

Consider two AgI services, each composed of 2 functions, with parameters given by (the subscript of each parameter denotes the associated service ϕ, and the workload r_ϕ^{(m)} is in GHz/Mbps):

ξ_1^{(1)} = 1, r_1^{(1)} = 1/300, ξ_1^{(2)} = 2, r_1^{(2)} = 1/400;  ξ_2^{(1)} = 1/3, r_2^{(1)} = 1/200, ξ_2^{(2)} = 1/2, r_2^{(2)} = 1/100.

Each service has 1 source node and D = 3 destination nodes, randomly selected from the UEs, and the arrival process is modeled as i.i.d. Poisson with rate λ Mbps.

3.9.2 Uniform Resource Allocation

First, we compare the proposed GDCNC algorithms with two state-of-the-art cloud network control policies: UCNC (a throughput-optimal source routing algorithm) [81] and EDSPA (a widely used multicast routing technique) [6]. Since resource allocation exceeds the scope of the benchmark algorithms, the following policy is employed for all algorithms for fair comparison: each node/link uses its maximum processing/transmission capacity, and each ES allocates equal transmission power to each UE. In the GDCNC algorithms, we select V = 0 to optimize the delay performance; in GDCNC-R, we select K = 1 duplication tree by k-means clustering using geographic distance; in UCNC, route selection is based on delayed information resulting from hop-by-hop transmission of the queuing states from all nodes to the source.
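The wireless link budget above can be reproduced directly. A minimal sketch of the quoted path-loss and noise formulas, assuming f_c is expressed in GHz (consistent with the urban-microcell formula cited above) and omitting shadow fading:

```python
import math

def path_loss_db(distance_m, fc_ghz=30.0):
    """Urban-microcell path loss in dB (fc in GHz, distance in meters)."""
    return 32.4 + 20 * math.log10(fc_ghz) + 31.9 * math.log10(distance_m)

def noise_power_dbm(bandwidth_hz, n0_dbm_hz=-174.0):
    """Thermal noise power over the band, sigma^2 = N0 * B0, in dBm."""
    return n0_dbm_hz + 10 * math.log10(bandwidth_hz)

# Example: a 100 m UE-ES link with B0 = 100 MHz at fc = 30 GHz.
pl = path_loss_db(100.0)    # 32.4 + 29.54 + 63.8 dB
n = noise_power_dbm(100e6)  # -174 dBm/Hz + 80 dB-Hz
print(round(pl, 1), n)
```

At 100 m the path loss is roughly 125.7 dB and the noise floor is −94 dBm, which together with the per-node power budgets in Table 3.1 determine the achievable link rates R_ij(t) in (3.14c).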
Figure 3.3: Network stability regions attained by the benchmark and proposed algorithms. (a) Uniform allocation (benchmark: EDSPA, UCNC). (b) Optimal allocation (benchmark: DCNC, EDCNC).

Fig. 3.3a depicts the average delay attained by the algorithms under different arrival rates. First, we focus on the throughput performance. We observe an identical critical point (≈ 430 Mbps) for GDCNC, EGDCNC, and UCNC, at which the average delay blows up; this point is indicative of the stability region boundary, validating Proposition 3, i.e., the throughput-optimality of GDCNC/EGDCNC (because UCNC is throughput-optimal [81]). GDCNC-R/EGDCNC-R achieve sub-optimal throughput by only ≈ 2%, illustrating the marginal performance loss from the precluded duplication trees. Finally, EDSPA only achieves a maximum rate of ≈ 70 Mbps, far from the stability region boundary, due to its lack of route diversity.

Turning to the delay performance, we observe that the enhanced variants EGDCNC/EGDCNC-R effectively reduce the delay of the initial solutions GDCNC/GDCNC-R. While in low-congestion regimes EDSPA (by delivering each packet along the shortest path) and UCNC (by selecting acyclic paths) can outperform EGDCNC/EGDCNC-R, the gap vanishes as the network congestion grows.

Finally, we study the complexity of the algorithms. We first sort the algorithms by communication overhead, i.e., the information required for decision making.
Noting that the network topology changes much more slowly than the queuing states, we have: EDSPA (requiring network topology) ≪ GDCNC (requiring local queuing states) ≈ EGDCNC (requiring local queuing states and network topology) ≪ UCNC (requiring global queuing states). We then present the running time of the algorithms as a measure of computational complexity: EDSPA (1.3 s) < GDCNC ≈ EGDCNC (2.7 s) ≪ UCNC (5.8 min), where: EDSPA runs fastest because it uses fixed routes; GDCNC operates efficiently, completing the simple algebraic operations (3.10); and UCNC requires solving the NP-complete Steiner tree problem (in a 39-node layered graph) at each time slot, which incurs a high computational complexity that grows further when applied to larger networks.

To sum up: in low-congestion regimes, EDSPA is a good choice considering its low complexity and superior delay performance; in high-congestion regimes, UCNC works better for small networks that impose low overhead for information collection and decision making; and EGDCNC is competitive in all regimes and especially suitable for large-scale distributed cloud networks.

3.9.3 Optimal Resource Allocation

Next, we demonstrate the cost performance of the GDCNC algorithms, employing DCNC [33] as the benchmark, which is throughput- and cost-optimal for unicast cloud network flow control.

3.9.3.1 Network Stability Region

Fig. 3.3b shows the network stability regions attained by the algorithms. We find that the designed power allocation policy (3.15) boosts the stability region (≈ 580 Mbps) compared to uniform power allocation (≈ 430 Mbps, as shown in Fig. 3.3a). Another observation is that, compared to DCNC (which attains the unicast stability region of ≈ 210 Mbps), the multicast framework enlarges the stability region by a factor of 2.76, which is bounded by the destination set size D = 3, validating Proposition 5. Finally, the performance loss of GDCNC-R is marginal (≈ 2%) compared to GDCNC, which, together with the results shown in Fig.
3.3a, validates its competitive throughput performance.

Figure 3.4: Delay-cost tradeoffs attained by the three algorithms under different V parameters and destination set sizes D. (a) Average delay (versus V). (b) Operational cost (versus V). (c) Effects of destination set size D.

3.9.3.2 Delay and Cost Performance

Next, we present the tunable delay and cost performance of the algorithms, under λ = 150 Mbps and η = 0 (we fix η and focus on the effects of V). As shown in Fig. 3.4a, the average delay attained by each algorithm increases linearly with V. Note, however, that DCNC can achieve a better delay than GDCNC (and almost the same as GDCNC-R) for arrival rates within the unicast stability region (the low-congestion regime of the enlarged multicast stability region), because it can select separate paths for each copy to optimize the individual delays. This is expected, as the throughput-optimal design of GDCNC, which jointly selects the copies' paths to reduce network traffic and enlarge the stability region, can lead to higher delay in low-congestion regimes.

Fig. 3.4b shows the reduction in operational cost with growing V, validating Theorem 3. By pushing V → ∞, the curves converge to the corresponding optimal costs: DCNC (7.7) > GDCNC-R (2.8) ≈ GDCNC (2.65). For V < 10⁷, GDCNC-R attains an even lower cost than GDCNC, despite its sub-optimal asymptotic performance (e.g., for V > 10⁷). However, note that a large V also leads to excessive delay, making it a sub-optimal choice in practical systems.
For example, when increasing V from 10⁶ to 10⁷, the cost attained by GDCNC reduces from 21.4 to 6.7, while the delay grows from 1.3 to 13.4 seconds (these results are for comparison purposes, and can be improved by EGDCNC, as shown in Fig. 3.3a and 3.3b).

3.9.3.3 Effects of Destination Set Size

Finally, we present in Fig. 3.4c the delay-cost tradeoffs attained by the GDCNC algorithms, under destination set sizes varying from D = 2 to 5, and λ = 300 Mbps. We make the following observations. A larger destination set size D results in: (i) increasing delay and cost, since more network resources are consumed to handle copies for additional destinations, which also results in longer waits for available network resources; (ii) growing running time, due to the greater number of duplication choices, e.g., GDCNC (3.2 s) ≈ GDCNC-R (2.9 s) when D = 3, and GDCNC (8.9 s) > GDCNC-R (5.8 s) when D = 5; (iii) a widening gap between GDCNC and GDCNC-R, especially in the delay dimension, which validates the benefit of GDCNC-R in reducing the queuing system and decision space. When selecting appropriate V values, GDCNC-R can attain suitable delay-cost performance pairs (0.8 s, 20), (1.3 s, 23), (2 s, 27) under destination set sizes D = 3, 4, 5, respectively.

To sum up, although we cannot provide an analytical bound on the performance loss of GDCNC-R, numerical results validate that it remains competitive in throughput performance (with negligible loss), while striking an even better delay-cost tradeoff than GDCNC.

3.10 Conclusions

We addressed the problem of decentralized control of mixed-cast AgI services in distributed cloud networks. Under the multicast framework, we characterized the enlarged network stability region and analyzed the benefit of exploiting in-network packet duplication.
By extending LDP control to a novel queuing system that accommodates the duplication operation and makes flow control decisions driven by data packets' current destinations, we designed GDCNC, the first decentralized, throughput- and cost-optimal packet processing, routing, and duplication policy for generalized (multicast) flow control, as well as practical variants targeting reduced complexity, enhanced delay, and extended scenarios. Via numerical experiments, we validated the performance gain attained via effective in-network packet duplication, as well as the benefits of joint processing, routing, and duplication optimization for the efficient delivery of multicast AgI services over distributed cloud networks.

Chapter 4

Efficient Delivery of Data-intensive Services

4.1 Overview

In this chapter, we investigate the problem of joint 3C control for online data-intensive service delivery. A data-intensive service can include multiple functions, and each function requires multiple input streams that can be live data (generated by device sensors), static objects (pre-stored in the network), or intermediate streams (generated by previous processing functions), as illustrated in Fig. 4.1. Compared to existing service models, e.g., MEC, DECO, and SFC, two challenges arise in the delivery of data-intensive AgI services: (i) the processing location selection impacts not only the resulting computation load, but also the communication load of all input streams; (ii) a static data input can be created (by replication) at any caching location (that stores a copy of the content) in an on-demand manner (i.e., per service function's request), which is fundamentally different from a live data input, associated with a fixed source and a live streaming rate.
Existing cloud network control policies [24, 33, 81, 46], designed for simpler service models, cannot efficiently handle these challenges, not to mention the coupling between them, i.e., the joint selection of processing and caching locations, as well as live and static data routing paths. We term this problem multi-pipeline flow control.

Figure 4.1: Network and service models studied in this chapter and related works. Data-intensive AgI [61] (this chapter): distributed cloud network, service DAG (see Section 4.3.2) with both live and static data. DECO [46]: distributed cloud network, one-step processing with static data. MEC [24]: single server, one-step processing with live data. SFC [33, 81]: distributed cloud network, SFC with live data.

Another key element impacting the performance of service delivery is the caching policy design, including "which databases to cache" and "where to place the databases". As mentioned in existing works on caching-communication integration, database placement shall be jointly optimized with flow control decisions, going beyond flow routing to also include flow processing, especially for heterogeneous networks with highly-distributed 3C resources. Furthermore, when service request distributions are time-varying, the service delivery performance can benefit from the dynamic adjustment of the caching policy. These problems are collectively referred to as joint 3C resource orchestration.

This chapter addresses the above problems, and our contributions are summarized as follows:

1. We characterize the stability region of distributed cloud networks supporting data-intensive AgI service delivery, in the settings of fixed and dynamic database placement.

2.
We design the first throughput-optimal control policy for online data-intensive service delivery, termed DI-DCNC, which coordinates joint decisions on (i) routing paths and processing locations for live data streams, and (ii) cache selection and distribution paths for the associated static data objects, under a given database placement.

3. We propose a database placement policy targeting throughput maximization, and derive an equivalent mixed-integer linear programming (MILP) problem to implement the design.

4. We develop two database replacement policies able to adapt to time-varying service demand statistics, based on online estimations of the service request distribution and of a database score, respectively.

The rest of the chapter is organized as follows. In Section 4.2, we review the existing works related to this topic. In Section 4.3, we introduce the models for the cache-enabled edge cloud and the data-intensive service. In Section 4.4, we define the policy spaces and characterize the network stability regions. We derive the DI-DCNC algorithm for cloud network flow control in Section 4.5, followed by a max-throughput database placement policy in Section 4.6, as well as two replacement policies in Section 4.7. Section 4.8 presents the numerical results, and conclusions are drawn in Section 4.9.

4.2 Related Work

4.2.1 Caching and Communication

Over the past decade, the dramatic growth of user demand for multimedia content has fueled rapid advances in caching techniques, especially at the wireless edge. By storing copies of popular content close to users, the network traffic and latency for content retrieval and distribution can be significantly reduced [44, 64, 36]. Caching and delivery policies are two key elements in content distribution network design, dealing with (i) content placement in the network, and (ii) content delivery to users, respectively.
Various caching policies have been designed to optimize different performance metrics, e.g., throughput [44], delay [64], and energy efficiency [37]. In addition, the overall content distribution performance can benefit from the joint optimization of the caching policy and the employed communication technique, e.g., non-orthogonal multiple access (NOMA) [30], multiple-input multiple-output (MIMO) [50], and coded multicast [45].

In multi-hop networks, flow routing plays an important role in the delivery policy design, i.e., selecting the caching location to provision, and the path to deliver, the required content. Similarly, the overall network performance can benefit from the joint optimization of caching and routing [60]. Some existing studies propose formulations targeting either throughput maximization [51] or service cost minimization [42], and approximation algorithms have been developed to address the resulting MIP problems.

4.2.2 Joint 3C Optimization

While there is a large body of work on the integration of computation-communication and caching-communication technologies into network design, 3C integration is a less explored topic with fewer known results.

Two combinations, computing-assisted information-centric networking (ICN) and cache-enabled MEC, are studied in [84] as promising directions for 3C integration. In cache-enabled MEC, a key aspect is service caching, dealing with service functions (software) with non-trivial storage requirements [62]; another aspect is data caching, i.e., caching frequently used data [54], such as processed results (from previous tasks) that might be repeatedly requested [56], [72], to save the extra computation resources and latency of content generation.
In this chapter, we focus on integrating data caching into the delivery of data-intensive AgI services (in particular, Metaverse applications), assuming that service functions process a combination of cached digital objects (static data) and user-specific streams (live data) to generate highly personalized experiences for end users. Under this assumption, [61] develops approximation algorithms for the data-intensive service embedding problem in a static setting (i.e., with known average demands); a dynamic (but simplified) setting is investigated in [46], focusing on static object distribution and processing without considering the live service chain routing and processing pipeline.

4.3 System Model

Fig. 4.1 illustrates the cache-enabled MEC network as the supporting infrastructure for the delivery of data-intensive AgI services, described as follows.

4.3.1 Cache-Enabled MEC Network Model

Consider a distributed cloud network, modeled by a directed graph G = (V, E), with V and E denoting the node and edge sets, respectively. Each vertex i ∈ V represents a node equipped with computation resources (e.g., an edge server) for service function processing. Each edge (i, j) ∈ E represents a point-to-point communication link, which can support data transmission from node i to j. Let δ⁻(i) and δ⁺(i) denote the incoming and outgoing neighbor sets of node i, respectively. Time is slotted, and the network processing and transmission resources are quantified as follows:

• Processing capacity C_i: the maximum number of processing instructions (e.g., floating point operations) that can be executed in one time slot at node i.

• Transmission capacity C_ij: the maximum number of data units (e.g., packets) that can be transmitted in one time slot over link (i, j).

The network nodes are also equipped with storage resources* to cache databases composed of digital objects whose access may be required for service function processing. Let K denote the set of databases.
Define the caching vector as

x = \{x_{i,k} \in \{0,1\} : i \in \mathcal{V},\, k \in \mathcal{K}\} \tag{4.1}

where x_{i,k} is a binary variable indicating whether database k ∈ K is cached at node i (x_{i,k} = 1) or not (x_{i,k} = 0). Let V(k) = {i ∈ V : x_{i,k} = 1} ⊂ V denote the static sources of database k, i.e., the set of nodes that cache database k. A caching vector must satisfy the following storage resource constraint:

\sum_{k\in\mathcal{K}} F_k\, x_{i,k} \le S_i, \quad \forall i \in \mathcal{V}, \tag{4.2}

where F_k denotes the size of database k ∈ K, and S_i the storage capacity of node i ∈ V, i.e., the maximum number of static data units (e.g., databases) that can be cached at node i. Let X denote the set of caching vectors x satisfying (4.2).

We assume that there exists a cloud datacenter in the network, serving as an external trusted source storing all databases, from which the edge servers can download databases for caching. We assume that such downloads happen at a longer timescale, and we neglect their impact on the network communication resources.†

Figure 4.2: The studied data-intensive service model composed of multiple functions (denoted by F), with each function requiring one live data input and one static data input. We depict an AR application with a single processing step (F1) as a special case, as well as extensions to multiple live and static inputs (using F_{M_ϕ} as an example).

* In this chapter, "storage resource" refers to memory or disk used for database caching. Data packets emanating from live or static data that travel through the network are collected in separate buffers, referred to as "actual queues" (see Section 4.3.4).
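Checking whether a caching vector belongs to the feasible set X of (4.2) is a per-node capacity test. A minimal sketch with hypothetical node, database, and size values:

```python
def is_feasible(x, F, S):
    """Check the storage constraint (4.2): at every node i, the total
    size of cached databases must not exceed the capacity S[i].
    x: dict (i, k) -> 0/1; F: dict k -> database size; S: dict i -> capacity."""
    used = {}
    for (i, k), cached in x.items():
        used[i] = used.get(i, 0) + F[k] * cached
    return all(used.get(i, 0) <= S[i] for i in S)

# Hypothetical 2-node, 2-database example.
F = {"k1": 3, "k2": 2}
S = {"a": 4, "b": 5}
x_ok = {("a", "k1"): 1, ("a", "k2"): 0, ("b", "k1"): 1, ("b", "k2"): 1}
x_bad = {("a", "k1"): 1, ("a", "k2"): 1, ("b", "k1"): 0, ("b", "k2"): 0}
print(is_feasible(x_ok, F, S), is_feasible(x_bad, F, S))
```

The second vector violates (4.2) at node a (3 + 2 > 4), so it lies outside X; this is exactly the constraint that the MILP placement formulation of Section 4.6 enforces.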
4.3.2 Data-Intensive AgI Service Model

We assume a data-intensive service ϕ is composed of a sequence of M_ϕ functions, through which the user-specific data, referred to as live data, must be processed to produce consumable streams, resulting in the end-to-end data stream being divided into M_ϕ + 1 stages. We refer to the output stream of function m ∈ {1, ..., M_ϕ} as stage-m live packets, with source packets denoted as stage-0 packets. In order to process each live packet, the associated service function requires access to a pre-stored digital object, referred to as static data [61, 46]. We then use stage-m static packets to denote the static object input stream to function m + 1. Each processing step can take place at different network locations hosting the required service functions: for example, in Fig. 4.1, the two service functions F1 and F2 (in blue) are executed at different edge servers.

Each service function, say the m-th function of service ϕ, can be described by four parameters (ξ_m^{(ϕ)}, r_m^{(ϕ)}, k_m^{(ϕ)}, ζ_m^{(ϕ)}), defined as follows:

• Object name k_m^{(ϕ)}: the name (or index) of the database to which the static object belongs.

• Merging ratio ζ_m^{(ϕ)}: the number of static packets per input live packet.

• Workload r_m^{(ϕ)}: the amount of computation resource (e.g., instructions per time slot) required to process one input live packet.

• Scaling factor ξ_m^{(ϕ)}: the number of output packets per input live packet.

† Database replacement can be supported by the backhaul connections between the cloud datacenter and the edge servers, subject to restricted communication rates.

We also define the cumulative scaling factor Ξ_m^{(ϕ)} as the number of stage-m live packets per stage-0 live packet (i.e., the live packet prior to any processing operation), given by

\Xi_m^{(\phi)} = \begin{cases} 1 & m = 0 \\ \xi_m^{(\phi)}\, \Xi_{m-1}^{(\phi)} & m = 1, \cdots, M_\phi. \end{cases} \tag{4.3}

Remark 15 (Extended Models). We note that, in general, a data-intensive service may be described by a DAG with multiple live and static data inputs per service function.
While, for ease of exposition, we illustrate the proposed design in the context of one live and one static input per function, the generalization to multiple inputs is straightforward: (i) for functions with multiple static inputs, we can extend k_m^{(ϕ)} and ζ_m^{(ϕ)} from scalars to sets; (ii) for functions with multiple live inputs, we can create a tree node for each service function and describe its inputs as child nodes until reaching the source data (i.e., the leaf nodes).

4.3.3 Client Model

We define each client c by a 3-tuple (s, d, ϕ), denoting the source node s (where the live packets arrive to the network), the destination node d (where the final packets are requested for consumption), and the requested service ϕ (which defines the sequence of service functions and the static packets required to process the live packets and create the final packets), respectively.‡

4.3.3.1 Live Packet Arrival

Let a^{(c)}(t) be the number of live packets of client c arriving to the network at time t. For each client c, we assume the arrival process {a^{(c)}(t) : t ≥ 0} is i.i.d. over time, with mean arrival rate λ^{(c)} and a bounded maximum number of arrivals per slot. Each live packet is immediately admitted to the network upon arrival.

‡ We use packet to refer to the minimum data unit that can be processed independently by the service or application, such as a video segment or frame in video-based applications.

Remark 16. In Section 4.6 (and subsequent sections), we assume that there exists a service request distribution (4.32), p^{(c)} : ∀c, governing the arrival rates of all clients, i.e., λ^{(c)} ∝ p^{(c)}, when designing database placement policies targeting throughput maximization.

Remark 17. We assume i.i.d. arrivals for ease of exposition. The analytical results (Theorem 4, Theorem 5, and Proposition 7) remain valid under the general assumption of Markov-modulated arrivals, i.e., the arrival rate is time-varying and follows a Markov process (see [58, Section 4.9]).
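The arrival model of Remarks 16 and 17 can be simulated directly: i.i.d. per-slot arrivals with λ^(c) = total_rate · p^(c). A minimal sketch with hypothetical client names, using Knuth's inverse-transform Poisson sampler (adequate for the small per-slot rates considered here):

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's inverse-transform Poisson sampler (for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def draw_arrivals(p_dist, total_rate, slots, seed=0):
    """i.i.d. arrivals a^(c)(t) with mean rate lambda^(c) = total_rate * p^(c)."""
    rng = random.Random(seed)
    return {c: [poisson_sample(total_rate * pc, rng) for _ in range(slots)]
            for c, pc in p_dist.items()}

# Hypothetical request distribution over two clients.
arr = draw_arrivals({"c1": 0.75, "c2": 0.25}, total_rate=4.0, slots=20000)
mean = {c: sum(a) / len(a) for c, a in arr.items()}
print(mean)
```

Over a long horizon the empirical means approach λ^(c1) = 3 and λ^(c2) = 1, i.e., arrival rates proportional to the request distribution, as assumed by the placement design of Section 4.6.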
4.3.3.2 Static Packet Provisioning

Upon a live packet arrival, for each required static packet, one static source is selected to generate the required copy, which is loaded into the network immediately.

Definition 8 (Packet-level request). We refer to a live packet and all static packets required for its (multi-stage) processing as belonging to the same packet-level request. In the following, we use packet-level request and request interchangeably.

Remark 18 (Static Object). A database is composed of multiple objects, e.g., the scene object library in an AR application, and distinct objects may be required by different service requests. We assume that the live packet and the static packets belonging to the same packet-level request get associated, i.e., the static packets are dedicated to the processing of the associated live packet.

4.3.4 Queuing System

Each packet (live or static) admitted to the network gets associated with a route for its delivery, and we establish actual queues to accommodate packets waiting for processing or routing.

For each link $(i,j) \in \mathcal{E}$, we create one transmission queue collecting all packets – regardless of client, stage, or type (i.e., live or static) – waiting to cross the link, i.e., packets currently held at node $i$ and having node $j$ as the next hop in their routes.

[Figure 4.3: Illustration of the paired-packet queuing system. Different shapes denote packets associated with different requests, blue and red colors the live and static packets, and solid and dashed lines the current and subsequent time slot, respectively.]
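Beyond the per-link transmission queue, the per-node bookkeeping of this queuing system (detailed next) must pair each live packet with its associated static packet before processing. A minimal sketch of that pairing logic, with illustrative names not taken from the text:

```python
class PairedPacketQueue:
    """Sketch of a node's waiting/processing queues: a live (or static)
    packet waits until its associate from the same packet-level request
    arrives, at which point the pair joins the processing queue."""

    def __init__(self):
        self.waiting = {}      # request id -> the packet type still waiting
        self.processing = []   # list of (request id, live, static) pairs

    def arrive(self, req_id, pkt_type):
        """pkt_type is 'live' or 'static' (one associate each, cf. Remark 15)."""
        if req_id in self.waiting:
            other = self.waiting.pop(req_id)
            live, static = (pkt_type, other) if pkt_type == "live" else (other, pkt_type)
            self.processing.append((req_id, live, static))
        else:
            self.waiting[req_id] = pkt_type
```

The processing queue thus only ever holds complete pairs, which is what makes a packet-pair schedulable for joint processing.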
In contrast, a novel paired-packet queuing system is constructed at each node $i \in \mathcal{V}$, composed of the following queues: (i) the processing queue, collecting the paired live and static packets concurrently present at node $i$, which are ready for joint processing; and (ii) the waiting queue, collecting the unpaired live or static packets waiting for their in-transit associates, which do not qualify for processing until they join the processing queue upon their associates' arrival.

An illustrative example is shown in Fig. 4.3. At the current time slot, the paired-packet processing queue holds a blue and red circle pair, representing live and static packets belonging to the same request, which are ready for processing. At the next time slot, when node $i$ receives the red square packet, it gets paired with the blue square packet held in the waiting queue, and together they enter the paired-packet processing queue to be scheduled for processing.

4.4 Policy Space and Stability Region

In this section, we propose an augmented layered graph (ALG) model to analyze and optimize the data-intensive AgI service delivery problem, based on which we characterize the network stability region.

[Figure 4.4: Illustration of the ALG model for the delivery of the AR service.]

4.4.1 Augmented Layered Graph

Recent studies have shown that the AgI service (modeled by an SFC) control problem can be transformed into a packet routing problem on a properly constructed layered graph [81].

4.4.1.1 Topology of the ALG

The ALG associated with service $\phi$ is composed of $M_\phi + 1$ layers, indexed by layers $0, \dots, M_\phi$, respectively.
Within each layer $m$ there are two pipelines, referred to as the stage-$m$ live and static pipelines, respectively, except for layer $M_\phi$, which only includes the live pipeline. Each live pipeline has the same topology as the actual network, while the stage-$m$ static pipeline includes an additional super static source node $o'_m$ and its outgoing edges to all static sources, i.e., $(o'_m, v'_m)$ for all $v \in \mathcal{V}(k_m^{(\phi)})$. We note that: (i) the live and static pipelines in layer $m$ accommodate stage-$m$ live and static packets, respectively, and represent their associated routing over the network; (ii) with the super static source $o'_m$ created for the static pipeline, it is equivalent to assume that $o'_m$ is the only static source of database $k_m^{(\phi)}$;§ (iii) there are inter-layer edges connecting corresponding nodes in adjacent layers $m$ and $m+1$, which represent processing operations, i.e., the stage-$m$ live and static packets pushed through these edges are processed into stage-$(m+1)$ live packets in the actual network.

§ To wit, if node $o'_m$ can provide static packets to node $i'_m$ along the path $(o'_m, v'_m, \dots, i'_m)$ in the ALG, then, in the actual network, we can select the static source $v$ to produce the packets and send them to node $i$ along the rest of the path, and vice versa.

The example in Fig. 4.4 illustrates the delivery of a single-function AR application (as shown in Fig. 4.2) over a 4-node network. The stage $m = 0$ live packet, which arrives to the network at the source node $s$, and the stage $m = 0$ static packet, which is generated via replication at the static source $v$, are routed following the blue and green paths to node $p$, respectively. After being processed at node $p$, the produced stage $m+1 = 1$ packet is delivered along the red path to destination $d$.
In the ALG, the highlighted links in different pipelines indicate the routing paths of each packet: $(o'_1, v'_1)$ indicates selecting the static source $v \in \mathcal{V}(k_1^{(\phi)}) = \{v, d\}$ to create the static packet, and $(p_1, p_2)$ and $(p'_1, p_2)$ indicate packet processing at node $p$.

Mathematically, given the actual network $\mathcal{G}$ and the database placement $x$, the ALG of service $\phi$, denoted by $\mathcal{G}^{(\phi)} = (\mathcal{V}^{(\phi)}, \mathcal{E}^{(\phi)})$, is defined as

$$\mathcal{V}^{(\phi)} = \bigcup_{m=0}^{M_\phi} \mathcal{V}_{L,m}^{(\phi)} \cup \bigcup_{m=0}^{M_\phi - 1} \mathcal{V}_{S,m}^{(\phi)} \quad (4.4a)$$

$$\mathcal{E}^{(\phi)} = \bigcup_{m=0}^{M_\phi} \mathcal{E}_{L,m}^{(\phi)} \cup \bigcup_{m=0}^{M_\phi - 1} \mathcal{E}_{S,m}^{(\phi)} \cup \bigcup_{m=0}^{M_\phi - 1} \mathcal{E}_{m,m+1}^{(\phi)} \quad (4.4b)$$

in which (with L/S in the subscripts denoting live/static)

$$\mathcal{V}_{L,m}^{(\phi)} = \{i_m : i \in \mathcal{V}\}, \qquad \mathcal{V}_{S,m}^{(\phi)} = \{i'_m : i \in \mathcal{V}\} \cup \{o'_m\}$$
$$\mathcal{E}_{L,m}^{(\phi)} = \{(i_m, j_m) : (i,j) \in \mathcal{E}\}$$
$$\mathcal{E}_{S,m}^{(\phi)} = \{(i'_m, j'_m) : (i,j) \in \mathcal{E}\} \cup \{(o'_m, v'_m) : v \in \mathcal{V}(k_m^{(\phi)})\}$$
$$\mathcal{E}_{m,m+1}^{(\phi)} = \{(i_m, i_{m+1}), (i'_m, i_{m+1}) : i \in \mathcal{V}\}.$$

Remark 19. Note that, on one hand, the proposed ALG model can capture special cases such as services modeled as SFCs (in which static objects are not relevant) by removing the static pipelines, which reduces to the layered graph model in existing works [81]. On the other hand, extended service models with multiple live/static inputs (see Remark 15) can be flexibly incorporated by adding extra pipelines to each layer.

4.4.1.2 Flow in the ALG

Let $f_{\imath\jmath} \ge 0$ denote the network flow associated with edge $(\imath, \jmath) \in \mathcal{E}^{(\phi)}$ in the ALG, defined as the average packet rate traversing the edge (in packets per slot). In particular:

• $f_{i_m j_m}$ and $f_{i'_m j'_m}$ denote the transmission rates of stage-$m$ live and static packets over link $(i,j) \in \mathcal{E}$.
• $f_{o'_m v'_m}$ denotes the local replication rate of stage-$m$ static packets at the static source $v \in \mathcal{V}(k_m^{(\phi)})$.
• $f_{i_m i_{m+1}}$ and $f_{i'_m i_{m+1}}$ denote the processing rates of stage-$m$ live and static packets at node $i \in \mathcal{V}$.
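The construction in (4.4) is mechanical; the sketch below builds the ALG node and edge sets for a single live/static input per function, labeling $i_m$ as `(i, m, "L")`, $i'_m$ as `(i, m, "S")`, and the super source $o'_m$ as `("o", m, "S")` (labels and inputs are illustrative choices, not notation from the text):

```python
def build_alg(V, E, M, static_sources):
    """Build the ALG of Eq. (4.4). `static_sources[m]` is the set of nodes
    caching database k_m (a hypothetical input encoding V(k_m))."""
    nodes, edges = set(), set()
    for m in range(M + 1):
        nodes |= {(i, m, "L") for i in V}                      # live pipeline V_L,m
        edges |= {((i, m, "L"), (j, m, "L")) for (i, j) in E}  # E_L,m
        if m < M:
            nodes |= {(i, m, "S") for i in V} | {("o", m, "S")}          # V_S,m
            edges |= {(("o", m, "S"), (v, m, "S")) for v in static_sources[m]}
            edges |= {((i, m, "S"), (j, m, "S")) for (i, j) in E}        # E_S,m
            # inter-layer processing edges E_{m,m+1}: live and static inputs
            edges |= {((i, m, "L"), (i, m + 1, "L")) for i in V}
            edges |= {((i, m, "S"), (i, m + 1, "L")) for i in V}
    return nodes, edges
```

For the extended models of Remark 15, one would simply add further `"S"`-type pipelines (one per static input) to each layer.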
The flow rates must satisfy the following constraints:

(i) Live flow conservation: for all $i \in \mathcal{V}$, $0 \le m \le M_\phi$,

$$\sum_{j \in \delta^+(i)} f_{i_m j_m} + f_{i_m i_{m+1}} = \sum_{j \in \delta^-(i)} f_{j_m i_m} + \xi_m^{(\phi)} f_{i_{m-1} i_m}, \quad (4.5)$$

i.e., for the stage-$m$ live flow, the total outgoing rate of packets that are transmitted and processed equals the total incoming rate of packets that are received and generated by processing. Note that processing stage-$(m-1)$ live packets at rate $f_{i_{m-1} i_m}$ (by function $m$) produces stage-$m$ live packets at rate $\xi_m^{(\phi)} f_{i_{m-1} i_m}$ at node $i$, by the definition of "scaling factor" in Section 4.3.2. Define $f_{i_{-1} i_0} = f_{i_{M_\phi} i_{M_\phi + 1}} = 0$.

(ii) Static flow conservation: for all $i \in \mathcal{V}$, $0 \le m \le M_\phi - 1$,

$$\sum_{j \in \delta^+(i)} f_{i'_m j'_m} + f_{i'_m i_{m+1}} = \sum_{j \in \delta^-(i)} f_{j'_m i'_m} + f_{o'_m i'_m}, \quad (4.6)$$

i.e., for the stage-$m$ static flow, the total outgoing rate of packets that are transmitted and processed equals the total incoming rate of packets that are received and generated by replication. Define $f_{o'_m i'_m} = 0$ for $i \notin \mathcal{V}(k_m^{(\phi)})$, i.e., for nodes that are not static sources.

(iii) Data merging: for all $i \in \mathcal{V}$, $1 \le m \le M_\phi$,

$$f_{i'_{m-1} i_m} = \zeta_m^{(\phi)} f_{i_{m-1} i_m}, \quad (4.7)$$

i.e., the processing rates of static and live packets at each node $i$ are related by the merging ratio $\zeta_m^{(\phi)}$, as defined in Section 4.3.2.

Remark 20. We note that multiple edges in the ALG are associated with the same node/link in the actual network. For example, the edges $(i_m, j_m)$ and $(i'_m, j'_m)$, for all $m$, are associated with link $(i,j)$, and all operations on these edges consume the link's communication resource.

4.4.2 Policy Space

Next, we define the space of policies for data-intensive service delivery under a given database placement, encompassing joint packet processing, routing, and replication decisions. In line with [81] and the increasingly adopted software-defined networking (SDN) paradigm, we focus on source/centralized routing and distributed scheduling policies.
To be specific, upon a packet-level request and the associated live packet arrival at its source, the policy decides (i) routing paths and processing locations for the live packets, and (ii) cache selections (i.e., the static sources chosen to replicate each static packet) and distribution paths for the associated static packets. In addition, each node/link needs to schedule packets for processing/transmission at each time slot.

4.4.2.1 Decision Variables

An admissible policy thus consists of two actions.

Route Selection: For each packet-level request, choose a set of edges in the ALG and associated flow rates $f$ satisfying (4.5) – (4.7), based on which: (i) the cache selection decision for static packet $k_m^{(\phi)}$ is specified by the replication rate $f_{o'_m i'_m}$; (ii) the routing path for each packet (live or static) is given by the edges with non-zero rates in the corresponding pipeline; and (iii) the processing location selection is indicated by the processing rates $f_{i_{m-1} i_m}$ and $f_{i'_{m-1} i_m}$, which guarantee that the live and static packets meet at the same node $i$ due to (4.7).

Packet Scheduling: At each time slot, for each node $i$ and link $(i,j)$, schedule packets from the processing queues (which hold paired live and static packets) and transmission queues for the corresponding operations, without exceeding the associated resource capacities $C_i$ and $C_{ij}$.

4.4.2.2 Efficient Policy Space

In this section, we define an efficient policy space in which the routing path of each packet is required to be acyclic, without compromising the achievable performance (e.g., throughput, delay, and resource consumption). More concretely, each request gets associated with an efficient route (ER) $\sigma$ in the ALG, defined as follows:

• $\sigma$ includes a sequence of processing locations, denoted by $\theta^{(m)} \in \mathcal{V}$, $1 \le m \le M_\phi$, with corresponding edges in the ALG given by $\{(\theta^{(m)}_{m-1}, \theta^{(m)}_m), ([\theta^{(m)}]'_{m-1}, \theta^{(m)}_m)\}_{m=1}^{M_\phi}$, where function $m$ is executed at node $\theta^{(m)}$, and we define $\theta^{(0)} = s$ and $\theta^{(M_\phi + 1)} = d$.
• $\sigma$ includes acyclic routing paths for all packets, i.e.,

$$\sigma_{1,m} = \big(\theta^{(m)}_m, \dots, \theta^{(m+1)}_m\big), \quad 0 \le m \le M_\phi,$$
$$\sigma_{2,m} = \big(o'_m, \dots, [\theta^{(m+1)}]'_m\big), \quad 0 \le m \le M_\phi - 1,$$

denote the (multi-hop) paths of the stage-$m$ live packet (from $\theta^{(m)}_m$ to $\theta^{(m+1)}_m$) and the stage-$m$ static packet (from $o'_m$ to $[\theta^{(m+1)}]'_m$), respectively.

In the efficient policy space, for each client $c$, the set of all possible ERs, denoted by $\mathcal{F}_c(x)$, is finite, and the route selection decision can be represented by

$$A(t) = \{a^{(c,\sigma)}(t) : \sigma \in \mathcal{F}_c(x), \forall c\} \quad (4.8)$$

where $a^{(c,\sigma)}(t) \ge 0$ denotes the number of requests raised by client $c$ at time $t$ that get associated with $\sigma$ for delivery, which satisfies

$$\sum_{\sigma \in \mathcal{F}_c(x)} a^{(c,\sigma)}(t) = a^{(c)}(t), \quad \forall c. \quad (4.9)$$

Note, however, that $\mathcal{F}_c(x)$ includes an exponential number of ERs, i.e., $|\mathcal{F}_c(x)| = \Omega(|\mathcal{V}|^{M_\phi})$.

4.4.3 Network Stability Region

In this section, we characterize the network stability region, which describes the network's capability to support service requests, defined as follows.

Definition 9. The network stability region is defined as the set of arrival vectors $\lambda$ under which there exists an admissible policy to stabilize the actual queues, i.e.,

$$\lim_{t \to \infty} \frac{1}{t} \Big[ \sum_{i \in \mathcal{V}} \big(R_i(t) + R'_i(t)\big) + \sum_{(i,j) \in \mathcal{E}} R_{ij}(t) \Big] = 0$$

where $R_i(t)$, $R'_i(t)$, and $R_{ij}(t)$ denote the backlogs of the processing queue at node $i$, the waiting queue at node $i$, and the transmission queue of link $(i,j)$ at time $t$.

Let $\Lambda(x)$ and $\Lambda$ denote the network stability regions under fixed database placement $x$ and when allowing dynamic replacement, respectively, which are characterized in the following.

Theorem 4.
For any fixed database placement $x \in \mathcal{X}$, an arrival vector $\lambda$ is interior to the stability region $\Lambda(x)$ if and only if, for each client $c$, there exist probability values

$$P_c(\sigma) : \ \sum_{\sigma \in \mathcal{F}_c(x)} P_c(\sigma) = 1 \ \text{ and } \ P_c(\sigma) \ge 0,$$

such that for each node $i$ and link $(i,j)$:

$$\sum_c \lambda^{(c)} \sum_{\sigma \in \mathcal{F}_c(x)} \rho_i^{(c,\sigma)} P_c(\sigma) \le C_i \quad (4.10a)$$
$$\sum_c \lambda^{(c)} \sum_{\sigma \in \mathcal{F}_c(x)} \rho_{ij}^{(c,\sigma)} P_c(\sigma) \le C_{ij} \quad (4.10b)$$

where $\rho_i^{(c,\sigma)}$ and $\rho_{ij}^{(c,\sigma)}$ denote the processing and transmission resource loads imposed on node $i$ and link $(i,j)$ if a packet-level request of client $c$ is delivered by ER $\sigma$, given by:

$$\rho_i^{(c,\sigma)} = \sum_m w^{(c)}_{i_{m-1} i_m} 1_{\{(i_{m-1}, i_m) \in \sigma\}} \quad (4.11)$$
$$\rho_{ij}^{(c,\sigma)} = \sum_m w^{(c)}_{i_m j_m} 1_{\{(i_m, j_m) \in \sigma\}} + w^{(c)}_{i'_m j'_m} 1_{\{(i'_m, j'_m) \in \sigma\}}$$

in which $w^{(c)} = \{w^{(c)}_{\imath\jmath} : (\imath,\jmath) \in \mathcal{E}^{(\phi)}\}$ is given by

$$w^{(c)}_{\imath\jmath} = \begin{cases} \Xi_m^{(\phi)} r_{m+1}^{(\phi)} & (\imath,\jmath) = (i_m, i_{m+1}) \\ 0 & (\imath,\jmath) = (i'_m, i_{m+1}) \text{ or } (o'_m, i'_m) \\ \Xi_m^{(\phi)} & (\imath,\jmath) = (i_m, j_m) \\ \Xi_m^{(\phi)} \zeta_{m+1}^{(\phi)} & (\imath,\jmath) = (i'_m, j'_m). \end{cases} \quad (4.12)$$

Proof. The proof of necessity is given in Appendix C.1; we show sufficiency by designing an admissible policy, DI-DCNC, in the subsequent section, and proving that it can support any arrival vector $\lambda \in \Lambda(x)$.

In Theorem 4: (i) the "sum" operation in (4.11) results from multiple ALG edges sharing a common node/link (see also Remark 20); (ii) a randomized policy for route selection can be defined based on the probability values $P_c(\sigma)$, operating as follows: at each time slot, select the ER $\sigma \in \mathcal{F}_c(x)$ to deliver the requests of client $c$ with probability (w.p.) $P_c(\sigma)$; (iii) the result is valid under the general assumption of Markov-modulated arrivals, in which case the stability region is defined with respect to (w.r.t.) the time-average arrival rate $\lambda^{(c)} = \lim_{T \to \infty} (1/T) \sum_{t=0}^{T-1} a^{(c)}(t)$.

Proposition 6.
When allowing database replacement, an arrival vector $\lambda$ is interior to the stability region $\Lambda$ if and only if there exist probability values

$$P(x) : \ \sum_{x \in \mathcal{X}} P(x) = 1 \ \text{ and } \ P(x) \ge 0,$$

and, for each database placement $x \in \mathcal{X}$ and client $c$,

$$P_{c,x}(\sigma) : \ \sum_{\sigma \in \mathcal{F}_c(x)} P_{c,x}(\sigma) = 1 \ \text{ and } \ P_{c,x}(\sigma) \ge 0,$$

such that for each node $i$ and link $(i,j)$:

$$\sum_{x \in \mathcal{X}} P(x) \sum_c \lambda^{(c)} \sum_{\sigma \in \mathcal{F}_c(x)} \rho_i^{(c,\sigma)} P_{c,x}(\sigma) \le C_i \quad (4.13a)$$
$$\sum_{x \in \mathcal{X}} P(x) \sum_c \lambda^{(c)} \sum_{\sigma \in \mathcal{F}_c(x)} \rho_{ij}^{(c,\sigma)} P_{c,x}(\sigma) \le C_{ij} \quad (4.13b)$$

with $\rho_i^{(c,\sigma)}$ and $\rho_{ij}^{(c,\sigma)}$ given by (4.11).

Proof. See Appendix C.4.

In the above proposition, $P(x)$ represents the distribution of the caching vector $x$ over time, resulting from the employed replacement policy, and $P_{c,x}(\sigma)$ plays a role equivalent to that of $P_c(\sigma)$ in Theorem 4, i.e., the probability values specifying the route selection policy under placement $x$.

Comparing Theorem 4 and Proposition 6, we find that allowing database replacement is a promising way to enlarge the stability region. To wit, setting $P(x) = 1_{\{x = x_0\}}$ in Proposition 6 leads to the same characterization as Theorem 4 with $x = x_0$. The improvement brought by replacement is intuitive under the assumption that replacement can be performed instantaneously, as one can then adjust the database placement $x$ at each time slot according to the received requests to optimize service delivery performance. In Proposition 8, we will show that this result remains valid under a restricted communication rate for database replacement, in line with footnote †.

4.5 Multi-Pipeline Flow Control

In this section, we present the proposed algorithm, referred to as data-intensive dynamic cloud network control (DI-DCNC), which makes joint routing and processing decisions for the live and static pipelines, as well as packet scheduling decisions, under a fixed database placement $x$.

We first introduce a single-hop virtual system (in Section 4.5.1) to derive packet routing decisions (in Section 4.5.2).
Then, we present the packet scheduling policy and summarize the actions taken in the actual network (in Section 4.5.3).

4.5.1 Virtual System

4.5.1.1 Precedence Constraint

In line with [67, 81], we create a virtual network that has the same topology as the actual network, with a virtual queue associated with each node and link. The precedence constraint, which requires a packet to be transmitted hop-by-hop along its route, is relaxed by allowing a packet, upon route selection, to be immediately inserted into the virtual queues associated with all links in the route. The virtual queue measures the processing/transmission resource load of the node/link in the virtual system, which is interpreted as the anticipated resource load of the corresponding node/link in the actual network.

For example, suppose that a packet gets associated with the route $(i_1, i_2, i_3)$. Then, it immediately impacts the queuing states of links $(i_1, i_2)$ and $(i_2, i_3)$ in the virtual system, as opposed to the actual network, where it cannot enter the queue of link $(i_2, i_3)$ before crossing $(i_1, i_2)$. We emphasize that the virtual system is only used for route selection; it is not relevant to packet scheduling.

4.5.1.2 Virtual Queues

Let $\tilde{Q}_i(t)$ and $\tilde{Q}_{ij}(t)$ denote the virtual queues for node $i \in \mathcal{V}$ and link $(i,j) \in \mathcal{E}$, respectively. The queuing dynamics are given by:

$$\tilde{Q}_i(t+1) = \big[\tilde{Q}_i(t) - C_i + \tilde{a}_i(t)\big]^+ \quad (4.14a)$$
$$\tilde{Q}_{ij}(t+1) = \big[\tilde{Q}_{ij}(t) - C_{ij} + \tilde{a}_{ij}(t)\big]^+ \quad (4.14b)$$

where $C_i$ and $C_{ij}$ are interpreted as the amounts of processing/transmission resource "served" at time $t$, and $\tilde{a}_i(t)$ and $\tilde{a}_{ij}(t)$ are the additional resource loads imposed on the node/link by newly arriving packets.
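A one-slot update of a single virtual queue, following the dynamics in (4.14), can be sketched as below (the `load` argument corresponds to $\tilde a_i(t)$ or $\tilde a_{ij}(t)$):

```python
def update_virtual_queue(Q, capacity, load):
    """One-slot virtual queue update, cf. Eq. (4.14):
    Q(t+1) = [Q(t) - C + a(t)]^+, where C is the per-slot service
    (resource capacity) and a(t) the newly imposed resource load."""
    return max(Q - capacity + load, 0.0)
```

Since the update is the same for node and link queues up to the capacity constant, the normalization $Q_i(t) = \tilde Q_i(t)/C_i$ used later amounts to a linear rescaling of this recursion.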
Recall that each request gets associated with a route for delivery upon arrival, which immediately impacts the queuing states of all links in the route; thus:

$$\tilde{a}_i(t) = \sum_c \sum_{\sigma \in \mathcal{F}_c(x)} \rho_i^{(c,\sigma)} a^{(c,\sigma)}(t) \quad (4.15a)$$
$$\tilde{a}_{ij}(t) = \sum_c \sum_{\sigma \in \mathcal{F}_c(x)} \rho_{ij}^{(c,\sigma)} a^{(c,\sigma)}(t) \quad (4.15b)$$

with $\rho_i^{(c,\sigma)}$ and $\rho_{ij}^{(c,\sigma)}$ given by (4.11).

4.5.2 Optimal Virtual Network Decisions

4.5.2.1 Lyapunov Drift Control

Next, we leverage Lyapunov drift control theory to derive a policy that stabilizes the normalized virtual queues:

$$Q(t) = \Big\{ Q_i(t) \triangleq \frac{\tilde{Q}_i(t)}{C_i} : i \in \mathcal{V} \Big\} \cup \Big\{ Q_{ij}(t) \triangleq \frac{\tilde{Q}_{ij}(t)}{C_{ij}} : (i,j) \in \mathcal{E} \Big\}, \quad (4.16)$$

which have the same stability properties as the virtual queues (due to the linear scaling) and can be interpreted as queuing delays in the virtual system.

Define the Lyapunov function $L(t) \triangleq \|Q(t)\|^2 / 2$ and the Lyapunov drift $\Delta(Q(t)) \triangleq L(t+1) - L(t)$. Then we can derive (as shown in Appendix C.2.1) the following upper bound on the drift $\Delta(Q(t))$:

$$\Delta(Q(t)) \le B - \|Q(t)\|_1 + \sum_c \sum_{\sigma \in \mathcal{F}_c(x)} O^{(c,\sigma)}(t) \, a^{(c,\sigma)}(t) \quad (4.17)$$

where $B$ is a constant, and $O^{(c,\sigma)}(t)$ is referred to as the weight of ER $\sigma$, given by:

$$O^{(c,\sigma)}(t) = \sum_{i \in \mathcal{V}} \frac{Q_i(t)}{C_i} \rho_i^{(c,\sigma)} + \sum_{(i,j) \in \mathcal{E}} \frac{Q_{ij}(t)}{C_{ij}} \rho_{ij}^{(c,\sigma)} \quad (4.18a)$$

$$= \sum_{i \in \mathcal{V}} \sum_m \tilde{w}^{(c)}_{i_{m-1} i_m}(t) 1_{\{(i_{m-1}, i_m) \in \sigma\}} + \sum_{(i,j) \in \mathcal{E}} \sum_m \Big[ \tilde{w}^{(c)}_{i_m j_m}(t) 1_{\{(i_m, j_m) \in \sigma\}} + \tilde{w}^{(c)}_{i'_m j'_m}(t) 1_{\{(i'_m, j'_m) \in \sigma\}} \Big] \quad (4.18b)$$

in which we plug in (4.11), and

$$\tilde{w}^{(c)}_{\imath\jmath}(t) = \begin{cases} w^{(c)}_{\imath\jmath} \, \dfrac{Q_i(t)}{C_i} & (\imath,\jmath) = (i_m, i_{m+1}), (i'_m, i_{m+1}) \\[4pt] w^{(c)}_{\imath\jmath} \, \dfrac{Q_{ij}(t)}{C_{ij}} & (\imath,\jmath) = (i_m, j_m), (i'_m, j'_m) \end{cases} \quad (4.19)$$

with $w^{(c)}$ given by (4.12).

The proposed algorithm is designed to minimize the upper bound (4.17) over the route selection decision $A(t)$ given by (4.8), or equivalently,

$$\min_{A(t)} \ \sum_c \sum_{\sigma \in \mathcal{F}_c(x)} O^{(c,\sigma)}(t) \, a^{(c,\sigma)}(t), \quad \text{s.t. (4.9).}$$
(4.20)

4.5.2.2 Route Selection

Given the linear structure of (4.20), the optimal route selection decision is given by:

$$a^{\star(c,\sigma)}(t) = a^{(c)}(t) \, 1_{\{\sigma = \sigma^\star\}} \quad (4.21)$$

where

$$\sigma^\star = \arg\min_{\sigma \in \mathcal{F}_c(x)} O^{(c,\sigma)}(t), \quad (4.22)$$

i.e., all requests of client $c$ arriving at time $t$ are delivered by the min-ER, i.e., the ER with the minimum weight. The remaining problem is to find the min-ER among the exponential number of ERs in $\mathcal{F}_c(x)$. To this end, we create a weighted ALG in which each edge $(\imath,\jmath)$ is assigned the weight $\tilde{w}^{(c)}_{\imath\jmath}(t)$ given by (4.19), so that the weight of an ER $\sigma$ in (4.18b) equals the sum of its individual edge weights. In the rest of this section, we propose a dynamic programming algorithm to find the min-ER based on the weighted ALG. Define:

• ER weight matrix $W$ of size $(M_\phi + 1) \times |\mathcal{V}|$, where $W(m,i)$ is the minimum weight to deliver the stage-$m$ live packet to node $i$, optimized over all previous packet processing, routing, and replication decisions.

• Processing location matrix $P$ of size $M_\phi \times |\mathcal{V}|$, where $P(m,i)$ is the optimal processing location of function $m$ in order to deliver the stage-$m$ live packet to node $i$.

The ultimate goal is to find $W(M_\phi, d)$ and the associated ER $\sigma^\star$, and we propose to derive $W$ row-by-row (or layer-by-layer in the ALG). To be specific, suppose row $m$ of $W$, i.e., $\{W(m,j) : j \in \mathcal{V}\}$, is given. Then, we can derive each element of row $m+1$, e.g., $W(m+1, i)$, in two steps.

First, assume that function $m+1$ is executed at node $j$. Then, we can optimize the cache selection and routing decisions to minimize the weight, i.e.,

$$W_j(m+1, i) = W(m,j) + \mathrm{SPW}(o'_m, j'_m) + \tilde{w}^{(c)}_{j_m j_{m+1}}(t) + \mathrm{SPW}(j_{m+1}, i_{m+1}) \quad (4.23)$$

[Figure 4.5: Illustration of the weight components in Eq. (4.23).]

where $\mathrm{SPW}(\imath, \jmath)$ denotes the weight of the shortest path (SP) $\mathrm{SP}(\imath, \jmath)$ from node $\imath$ to $\jmath$ in the weighted ALG. As depicted in Fig.
4.5, the four terms represent: (i) the min-weight to deliver the stage-$m$ live packet to node $j$; (ii) the min-weight to replicate and route the stage-$m$ static packet to node $j$; (iii) the computation load at node $j$; and (iv) the min-weight to route the stage-$(m+1)$ live packet to node $i$, respectively.

Second, we optimize the processing location decision to minimize the overall weight, i.e.,

$$W(m+1, i) = \min_{j \in \mathcal{V}} W_j(m+1, i), \quad (4.24a)$$
$$P(m+1, i) = \arg\min_{j \in \mathcal{V}} W_j(m+1, i). \quad (4.24b)$$

Repeating the above procedure derives all entries of $W$ and $P$. We then propose the following back-tracing procedure to derive the min-ER $\sigma^\star$ delivering the stage-$M_\phi$ live packet to node $d$: starting from destination $d$, the optimal processing location of function $M_\phi$, $\theta^{(M_\phi)}$, is the $(M_\phi, d)$ entry of matrix $P$. The remaining problem is to find the optimal decisions to deliver the stage-$(M_\phi - 1)$ live packet to node $\theta^{(M_\phi)}$, which has the same structure as the original problem and can be solved by repeating the above procedure, as described in Algorithm 4 (steps 6 to 8).

Algorithm 4: Dynamic Programming to Find the min-ER
Input: $\tilde{w}(t)$. Output: min-ER $\sigma^\star$, optimal weight $W^\star(x)$.
1: Initialization: $W(0,i) \leftarrow \mathrm{SPW}(s_0, i_0)$ for all $i \in \mathcal{V}$.
2: for $m = 0, \dots, M_\phi - 1$ and $i \in \mathcal{V}$ do
3:   Calculate row $m+1$ of $W$ and $P$ by (4.24).
4: end for
5: Let $\theta^{(M_\phi)} = P(M_\phi, d)$ and $\sigma^\star = \mathrm{SP}([\theta^{(M_\phi)}]_{M_\phi}, d_{M_\phi})$.
6: for $m = M_\phi, \dots, 1$ do
7:   $\theta^{(m-1)} \leftarrow P(m-1, \theta^{(m)})$ (note that $\theta^{(0)} = s$), and
     $\sigma^\star \leftarrow \sigma^\star \cup \mathrm{SP}(o'_{m-1}, [\theta^{(m)}]'_{m-1}) \cup ([\theta^{(m)}]'_{m-1}, [\theta^{(m)}]_m) \cup \mathrm{SP}([\theta^{(m-1)}]_{m-1}, [\theta^{(m)}]_{m-1}) \cup ([\theta^{(m)}]_{m-1}, [\theta^{(m)}]_m)$.
8: end for
9: Return (i) the min-ER $\sigma^\star$ and (ii) $W^\star(x) = W(M_\phi, d)$.

4.5.3 Optimal Actual Network Decisions

Next, we present the control decisions in the actual network. We adopt the route selection decisions made in the virtual network in Section 4.5.2.
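To make the route-selection step concrete, the dynamic program of Algorithm 4 can be sketched as below. This is a simplified, hypothetical rendering: the per-pipeline shortest-path weights $\mathrm{SPW}(\cdot,\cdot)$ are collapsed into a single table `spw` shared by all layers, and the static-delivery term $\mathrm{SPW}(o'_m, j'_m)$ is folded into `static_cost[m][j]`; all inputs are illustrative.

```python
def min_er_sketch(M, V, spw, static_cost, proc_cost, s, d):
    """Simplified DP for the min-ER, cf. Eqs. (4.23)-(4.24): W[i] holds the
    current row (min weight to deliver the stage-m live packet to node i);
    P[m][i] records the best processing location of function m+1."""
    W = {i: spw[(s, i)] for i in V}   # row 0, cf. step 1 of Algorithm 4
    P = []
    for m in range(M):
        Wn, Pn = {}, {}
        for i in V:
            # the four terms of Eq. (4.23), with the static term pre-folded
            cands = {j: W[j] + static_cost[m][j] + proc_cost[m][j] + spw[(j, i)]
                     for j in V}
            Pn[i] = min(cands, key=cands.get)
            Wn[i] = cands[Pn[i]]
        W, P = Wn, P + [Pn]
    # back-trace the optimal processing locations (steps 6-8 of Algorithm 4)
    locs, node = [], d
    for m in range(M - 1, -1, -1):
        node = P[m][node]
        locs.append(node)
    return W[d], locs[::-1]
```

The full algorithm additionally stitches the shortest paths themselves (not only their weights) into $\sigma^\star$, and uses per-layer edge weights $\tilde w^{(c)}_{\imath\jmath}(t)$ rather than a shared table.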
In addition, we adopt the extended nearest-to-origin (ENTO) policy [81] for packet scheduling: at each time slot $t$, each node/link gives priority to the packets in the corresponding processing/transmission queue that have crossed the smallest number of edges in the ALG. We note that ENTO is a distributed packet scheduling policy. Since the processing queue holds paired live and static packets, we define the number of crossed hops of a packet pair to be that of its live packet component.

To sum up, the proposed DI-DCNC algorithm is described in Algorithm 5.

Algorithm 5: DI-DCNC
1: for $t \ge 0$ do
2:   For each client $c$, all requests received at time $t$ get associated with the min-ER found by Algorithm 4.
3:   Each link transmits packets and each node processes paired packets according to ENTO.
4:   Update the virtual queues by (4.14).
5: end for

4.5.4 Performance Analysis

4.5.4.1 Throughput

Under any fixed database placement, DI-DCNC is throughput-optimal, as stated in the following theorem.

Theorem 5. For any fixed database placement $x \in \mathcal{X}$ and arrival rate $\lambda$ interior to the network stability region $\Lambda(x)$, all actual queues are rate stable under DI-DCNC.

Proof. See Appendix C.2.

4.5.4.2 Complexity

We can take advantage of the following facts to simplify the calculation of (4.23) when implementing Algorithm 4:

$$\mathrm{SPW}(\imath, \jmath) = w^{(c)}_{\imath\jmath} \, \mathrm{SPW}_0(i,j), \quad \forall (\imath,\jmath) = (i_m, j_m), (i'_m, j'_m)$$

where $\mathrm{SPW}_0(i,j)$ denotes the SP distance in the weighted graph, which has the same topology as the actual network, with the weight of each edge $(i,j) \in \mathcal{E}$ given by $Q_{ij}(t)/C_{ij}$. In addition, we note that

$$\mathrm{SPW}(o'_m, j'_m) = \min_{i \in \mathcal{V}(k_m^{(\phi)})} \mathrm{SPW}(i'_m, j'_m). \quad (4.25)$$

Therefore, we can implement Algorithm 4 as follows:

(i) Calculate the pairwise SP distances, i.e., $\{\mathrm{SPW}_0(i,j) : (i,j) \in \mathcal{V} \times \mathcal{V}\}$, by Floyd–Warshall [27, Section 25.2], with complexity $O(|\mathcal{V}|^3)$.

(ii) In each iteration (step 3 of Algorithm 4): calculate $\mathrm{SPW}(o'_m, j'_m)$ for all $j \in \mathcal{V}$ by (4.25), with complexity $O(|\mathcal{V}|^2)$.
Then, calculate (4.23) for each $(i,j)$ pair, with complexity $O(|\mathcal{V}|^2)$. The total complexity to calculate the entire matrix is thus $O(M_\phi |\mathcal{V}|^2)$.

(iii) Perform back-tracing, with complexity $O(M_\phi)$.

To sum up, the overall complexity of Algorithm 4 is $O(|\mathcal{V}|^3 + M_\phi |\mathcal{V}|^2)$.

4.5.4.3 Discussion of Delay Performance

As observed in the numerical experiments (in Section 4.8), the designed DI-DCNC algorithm achieves good delay performance. Note, however, that we cannot claim DI-DCNC is delay-optimal because: (i) it focuses on queuing delay without taking into account the hop-distance of the selected path, which can become an important delay component in low-congestion regimes; and (ii) the actual service delay should be taken as the maximum over the concurrent live and static pipelines, while the ER weight (4.18b) is indicative of the aggregate delay (i.e., the sum of the two). Addressing these challenges is of interest for future work.

4.6 Max-Throughput Database Placement

The second part of this chapter tackles the joint 3C resource orchestration problem, with this section focusing on the setting of fixed database placement (the next section designs dynamic database replacement policies).

4.6.1 Problem Formulation

The goal is to design a fixed database placement policy that optimizes the network's throughput performance, together with the flow (processing and routing) control decisions.

4.6.1.1 Variables

We define two sets of variables, representing the database placement and flow control decisions, respectively:

• Caching vector $x = \{x_{i,k} : i \in \mathcal{V}, k \in \mathcal{K}\}$, where $x_{i,k}$ is a binary variable indicating whether database $k$ is cached at node $i$ ($x_{i,k} = 1$) or not ($x_{i,k} = 0$).

• Flow variables $f = \{f^{(c)}_{\imath\jmath} : (\imath,\jmath) = (i_m, j_m) \in \mathcal{E}^{(\phi)}, \forall c\}$ and $f' = \{f'^{(k)}_{ij} : (i,j) \in \mathcal{E}, k \in \mathcal{K}\}$, where $f^{(c)}_{\imath\jmath}$ denotes the flow rate of live packets of client $c$ on edge $(\imath,\jmath) \in \mathcal{E}^{(\phi)}$ in the ALG, and $f'^{(k)}_{ij}$ the flow rate of static packets of database $k$ on link $(i,j) \in \mathcal{E}$ in the actual network, respectively.¶
4.6.1.2 Constraints

We impose two classes of constraints on the variables:

(i) Capacity constraints, which limit the 3C network resource usage, i.e., the incurred resource consumption shall not exceed the corresponding capacities. To be specific, the processing rate at each node $i \in \mathcal{V}$ and the transmission rate over each link $(i,j) \in \mathcal{E}$ must satisfy

$$\sum_{c,m} r_m^{(\phi)} f^{(c)}_{i_{m-1} i_m} \le C_i, \qquad \sum_{c,m} f^{(c)}_{i_m j_m} + \sum_{k \in \mathcal{K}} f'^{(k)}_{ij} \le C_{ij}, \quad (4.26)$$

together with the storage constraint (4.2): $\sum_{k \in \mathcal{K}} F_k x_{i,k} \le S_i$, for all $i \in \mathcal{V}$.

¶ The static flow defined in this section represents the sum of the individual static flows (on the same link $(i,j)$ and of the same database $k$) over all clients, i.e., $f'^{(k)}_{ij} = \sum_{c,m} f_{i'_m j'_m} 1_{\{k_m^{(\phi)} = k\}}$.

(ii) Service chaining constraints, which impose the relationship between input and output flows as they traverse network nodes and undergo service function processing. For live flows, the conservation law is given by:

$$\sum_{j \in \delta^+(i)} f^{(c)}_{i_m j_m} + f^{(c)}_{i_m i_{m+1}} = \sum_{j \in \delta^-(i)} f^{(c)}_{j_m i_m} + \xi_m^{(\phi)} f^{(c)}_{i_{m-1} i_m} + \lambda^{(c)} 1_{\{i_m = s_0\}}, \quad \forall c, m, i : i_m \ne d_{M_\phi}, \quad (4.27)$$

and for the destination node:

$$f^{(c)}_{d_{M_\phi} j_{M_\phi}} = 0, \quad \forall j \in \delta^+(d). \quad (4.28)$$

For the static flows (of database $k$), the conservation law can be summarized as

$$(1 - x_{i,k}) \Big( \sum_{j \in \delta^+(i)} f'^{(k)}_{ij} + f'^{(k)}_i - \sum_{j \in \delta^-(i)} f'^{(k)}_{ji} \Big) = 0, \quad (4.29)$$

with the processing rate of static packets at node $i$ given by

$$f'^{(k)}_i \triangleq \sum_{c,m} \zeta_m^{(\phi)} f^{(c)}_{i_{m-1} i_m} 1_{\{k_m^{(\phi)} = k\}}. \quad (4.30)$$

The static flow conservation law (4.29) can be described as follows: for each node $i$ that is not a static source, i.e., $x_{i,k} = 0$, the static flow of database $k$ must satisfy the flow conservation constraint (see (4.6) for a detailed illustration)

$$\sum_{j \in \delta^+(i)} f'^{(k)}_{ij} + f'^{(k)}_i = \sum_{j \in \delta^-(i)} f'^{(k)}_{ji}, \quad (4.31)$$

and (4.29) is satisfied; for any static source $i$, i.e., $x_{i,k} = 1$ (and thus $1 - x_{i,k} = 0$), (4.29) holds trivially.
We note that (4.31) does not necessarily hold at the static sources, because they can perform in-network packet replication, an operation known to violate the flow conservation law [15].

4.6.1.3 Objective

We assume that the arrival rates of all clients' requests, $\{\lambda^{(c)} : \forall c\}$, are governed by a service request distribution. To be specific, the arrival rates are given by

$$\lambda^{(c)} = p^{(c)} \lambda, \quad \text{with } \sum_c p^{(c)} = 1, \quad (4.32)$$

and we use $\lambda$ to measure the throughput performance. This objective is employed to achieve better fairness, compared to the other widely used metric of sum arrival rate, $\sum_c \lambda^{(c)}$, which favors service requests with lighter resource loads (to improve the total throughput, given the same network resources).

Remark 21. Note that the service request distribution defined above is w.r.t. clients $c = (s, d, \phi)$ (see Section 4.3.3). The service popularity distribution, which is w.r.t. services $\phi$, can be derived as its marginal distribution. Furthermore, the content popularity distribution, which is w.r.t. databases $k$, can be derived from the service popularity distribution and the associated service parameters (i.e., scaling factors and merging ratios).

Remark 22. If the actual service request distribution is unknown, a uniform distribution is used by default. Besides representing the service request distribution, the values of $p^{(c)}$ can also be designed for admission control, customer prioritization, etc.

4.6.2 Proposed Design

To sum up, the problem is formulated as follows:

$$\max \ \lambda \quad (4.33a)$$
$$\text{s.t.} \ \lambda^{(c)} \ge p^{(c)} \lambda, \quad \forall c \quad (4.33b)$$
$$\text{capacity constraints (4.26), (4.2)} \quad (4.33c)$$
$$\text{chaining constraints (4.27) – (4.30)} \quad (4.33d)$$
$$x \in \{0,1\}^{|\mathcal{K}| \times |\mathcal{V}|} \ \text{and} \ f, f' \succeq 0. \quad (4.33e)$$

We note that (4.33) is an MIP problem: due to the cross terms of $x$ and $f'$, constraint (4.29) is not linear.
To improve tractability, we propose to replace (4.29) with the following linear constraint:

$$\sum_{j \in \delta^+(i)} f'^{(k)}_{ij} + f'^{(k)}_i - \sum_{j \in \delta^-(i)} f'^{(k)}_{ji} \le C^{\max}_{i,k} \, x_{i,k} \quad (4.34)$$

where $C^{\max}_{i,k}$ is a constant, given by

$$C^{\max}_{i,k} = \sum_{j \in \delta^+(i)} C_{ij} + C_i \max_{m : k_m^{(\phi)} = k} \frac{\zeta_m^{(\phi)}}{r_m^{(\phi)}}. \quad (4.35)$$

We claim that the resulting MILP problem,

$$\max \ \lambda, \quad \text{s.t. (4.33b), (4.33c), (4.27), (4.28), (4.34), (4.35), (4.30), (4.33e),} \quad (4.36)$$

has the same optimal solution as (4.33).

Proof. See Appendix C.5.

In general, the MILP problem (4.36) is still NP-hard, so finding the exact solution can incur high complexity. However, many software toolboxes are designed to handle general MILP problems and can find approximate solutions that trade off accuracy against running time; for example, we use the widely adopted intlinprog function in MATLAB to implement the proposed design. The formulation can also serve as a good starting point for future studies designing approximation algorithms.

4.6.3 Performance Analysis

In this section, we present an equivalent characterization of the stability region under a given database placement.

Proposition 7. For any fixed database placement $x \in \mathcal{X}$, an arrival vector $\lambda$ is interior to the stability region $\Lambda(x)$ if and only if there exist flow variables $f, f' \succeq 0$ satisfying (4.26) – (4.30).

Proof. In Appendix C.3, we show that the sets of $\lambda$ described in this proposition and in Theorem 4 are equal, completing the proof. Furthermore, the result also applies to Markov-modulated arrivals, as Theorem 4 does.

Proposition 7 shows that the proposed database placement policy achieves maximum throughput.

4.7 Database Replacement Policies

In this section, we show that the benefit of database replacement in enlarging the stability region remains unchanged when the communication rate available for database replacement (referred to as the replacement rate) is restricted, using a proposed time frame structure, under which we develop two replacement policies to handle time-varying service demand statistics.
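Looking back at the linearization in Section 4.6.2: the constant $C^{\max}_{i,k}$ in (4.35) upper-bounds the net static outflow node $i$ can generate, so that (4.34) becomes inactive whenever $x_{i,k} = 1$. A sketch of its computation, with hypothetical input encodings:

```python
def big_m_constant(out_link_caps, C_i, functions_using_k):
    """C^max_{i,k} of Eq. (4.35): the sum of outgoing link capacities plus
    the node's processing capacity C_i scaled by the largest zeta_m / r_m
    ratio over the functions m that use database k. `functions_using_k`
    is an illustrative list of (zeta_m, r_m) pairs."""
    return sum(out_link_caps) + C_i * max(
        zeta / r for (zeta, r) in functions_using_k)
```

Any larger constant would also make (4.34) valid; (4.35) is simply a readily computable bound from the capacity constraints (4.26).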
4.7.1 Low-Rate Replacement

In Section 4.4.3, we illustrated the benefit of database replacement assuming that replacement can be performed instantaneously, which can impose a high requirement on the replacement rate. In the following proposition, we show that the same throughput performance can be achieved under an arbitrarily low replacement rate.

Proposition 8. For any arrival vector interior to the stability region, there exists a replacement policy achieving an [O(T), O(1/T)] tradeoff between average virtual queue backlog and replacement rate.

Proof. See Appendix C.4.2.2, where we propose the time frame structure and design a reference policy (with T as a parameter), shown to achieve the tunable performance.

In the reference policy, we consider a two-timescale system, where processing and transmission decisions are made on a per-time-slot basis, while database replacement decisions are made on a per-time-frame basis, with each frame T including T consecutive slots. The replacement is launched at the beginning of each frame and must be completed by the end of the frame. The policy is throughput-optimal for any given T, and the required replacement rate can be made arbitrarily close to zero by pushing T → ∞, at the cost of a larger queue backlog (and thus worse delay performance).

In the rest of the section, we adopt the time frame structure to design two heuristic database replacement policies, based on the estimated service request distribution and the database score, respectively. We note that the proposed design can flexibly incorporate advanced prediction techniques, which play a role equivalent to estimation but can enhance the timeliness of the estimated quantities. ∥

∥ We note that an estimation-based policy derives estimates over frame τ and executes the replacement decisions during frame τ + 1. The new placement takes effect in frame τ + 2.

4.7.2 Rate-Based Replacement

The first policy takes advantage of the max-throughput database placement policy described in Section 4.6.
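The rate-based update relies on counting request arrivals per client over a frame and normalizing them into an empirical distribution; a minimal sketch of this estimation step (hypothetical per-slot arrival counts):

```python
def empirical_distribution(arrivals_per_slot):
    """Empirical request distribution over one frame: for each client c,
    p_hat[c] = (arrivals of c in the frame) / (total arrivals in the frame)."""
    totals = {}
    for slot in arrivals_per_slot:          # one dict {client: #arrivals} per slot
        for c, a in slot.items():
            totals[c] = totals.get(c, 0) + a
    grand = sum(totals.values())
    return {c: n / grand for c, n in totals.items()}

frame = [{"c1": 2, "c2": 0}, {"c1": 1, "c2": 1}]   # a two-slot frame
p_hat = empirical_distribution(frame)
assert p_hat == {"c1": 0.75, "c2": 0.25}
```

The resulting estimate then replaces the (unknown) true request distribution when the placement problem is re-solved for the next frame.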
To handle time-varying demand statistics, we calculate the empirical service request distribution over each frame T as follows:

\hat{p}^{(c)} = \frac{\sum_{t\in T} a^{(c)}(t)}{\sum_{c'}\sum_{t\in T} a^{(c')}(t)}. \tag{4.37}

We then solve the MILP problem (4.36) based on {p̂^(c)} to derive the updated database placement, and each node can perform the replacement accordingly.

While straightforward, this policy exhibits three limitations. First, it neglects the existing resource loads in the network, which can lead to sub-optimal solutions. Second, the updated database placement is designed independently of the current placement, which can impose a high requirement on the replacement rate. Finally, it requires solving the MILP problem in an online manner, which can reduce the accuracy of the approximate solution.

4.7.3 Score-Based Replacement

The second policy is motivated by the "min-ER" rule for route selection (derived in Section 4.5.2), in which we propose to evaluate the benefit for each node i of caching database k by a database score (or score for brevity), defined as follows.

Definition 10 (Score). For each instance, i.e., a given request ψ and network states (queuing states, database placement), the score of database k at node i is the difference of the min-ER weights assuming node i does not cache the database, and the opposite, i.e.,

u_{i,k}(\psi) = W^{\star}(x^{-}(i,k)) - W^{\star}(x^{+}(i,k)), \tag{4.38}

where W⋆(x) is the min-ER weight given by Algorithm 4, and x^−(i,k) and x^+(i,k) denote caching vectors equal to x, but with x_{i,k} = 0 and x_{i,k} = 1, respectively.

Figure 4.6: The min-ERs (denoted by red, blue, and green arrows) for the delivery of an AR service over the network, assuming nodes 1, 2, and 3 are selected to provision the static packet, respectively.
In particular, for a given database k: for a static source node, the score is the increment of the min-ER weight if it does not cache database k; otherwise, the score is the reduction of the min-ER weight if the node caches database k.

We illustrate the definition by the example in Fig. 4.6. Let W_1, W_2, and W_3 denote the min-ER weights assuming that nodes 1, 2, and 3 are selected to provision the static packet, respectively (note that node 3 is not a static source, and it is assumed to cache the database to derive the green ER and associated weight W_3), with W_3 < W_1 < W_2. Then, node 1 is selected as the static source, serving as the benchmark. The database score at each node is derived as follows.

Node 1: if it does not cache database k, node 2 will be selected, leading to a greater weight W_2, and thus u_{1,k} = W_2 − W_1.

Node 2: if it does not cache database k, the cache selection decision does not change, leading to the same weight W_1, and thus u_{2,k} = W_1 − W_1 = 0.

Node 3: if it caches database k, node 3 will be selected as the static source, leading to a reduced weight W_3, and thus u_{3,k} = W_1 − W_3.

We note that the above definition of score (i) assumes unchanged caching policies at the other network nodes, (ii) requires finding for x^−(i,k) and x^+(i,k) the corresponding min-ERs (which, in particular, can include different processing locations), and (iii) accounts for a single instance; the sum score of all instances within a time frame is a proper metric to evaluate the overall score, i.e.,

U_{i,k} = \sum_{t\in T}\sum_{\psi\in A(t)} u_{i,k}(\psi), \tag{4.39}

where A(t) denotes the received requests at time t.

Given the obtained scores, we formulate an optimization problem with the goal of maximizing the total score to find the updated database placement for each node i ∈ V, i.e.,

\max_{x_i\in\{0,1\}^{|K|}} \sum_{k\in K} U_{i,k}\, x_{i,k}, \quad \text{s.t.} \ \sum_{k\in K} F_k\, x_{i,k} \le S_i. \tag{4.40}

The above problem, known as the 0/1 knapsack problem, admits a dynamic programming solution with pseudo-polynomial complexity O(|K| S_i) [47].
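The per-node placement problem (4.40) is exactly a 0/1 knapsack; a minimal sketch of the standard dynamic program (illustrative scores and sizes, with the storage budget S_i treated as an integer number of size units):

```python
def knapsack(scores, sizes, capacity):
    """0/1 knapsack: maximize total score subject to a storage budget.
    Classic DP over (item prefix, used capacity): O(|K| * S_i) table entries."""
    n = len(scores)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for s in range(capacity + 1):
            dp[i][s] = dp[i - 1][s]                      # skip database i-1
            if sizes[i - 1] <= s:                        # or cache it, if it fits
                dp[i][s] = max(dp[i][s],
                               dp[i - 1][s - sizes[i - 1]] + scores[i - 1])
    chosen, s = [], capacity                             # backtrack the optimal x_i
    for i in range(n, 0, -1):
        if dp[i][s] != dp[i - 1][s]:
            chosen.append(i - 1)
            s -= sizes[i - 1]
    return dp[n][capacity], sorted(chosen)

# Scores U_{i,k} for 3 databases, sizes F_k, budget S_i = 2 units:
best, x_opt = knapsack(scores=[10, 4, 7], sizes=[2, 1, 1], capacity=2)
assert (best, x_opt) == (11, [1, 2])   # caching databases 1 and 2 beats database 0 alone
```

Running one such knapsack per node, and then keeping only the node with the largest optimal value, matches the asynchronous-update rule described next.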
Let U⋆_i and x⋆_i denote the optimal value and solution, respectively. Finally, we find the node with the largest total score, i.e., i⋆ = argmax_i U⋆_i, and only replace its databases according to x⋆_{i⋆}, which is referred to as an asynchronous update, in line with the definition of the score assuming unchanged caching policies at the other network nodes.

There are three factors that can impact the performance of this policy. First, the proposed score metric focuses on each individual node (for tractability), and cannot capture the coupling between nodes. Second, we use the observed queuing states to calculate the score, which in turn are impacted by the database placement. Finally, the asynchronous update can be less efficient for database replacement and lead to slower adaptation. ∗∗

∗∗ Multiple nodes, whose database scores are calculated independently, can update their caching policies synchronously. While out of this chapter's scope, this extension is promising to accelerate adaptation.

Figure 4.7: The studied edge cloud network, including 9 edge servers (nodes 1 to 9) and a cloud datacenter (node 10). Arrows of the same color indicate the source-destination pairs of each client.

4.8 Numerical Results

4.8.1 Experiment Setup

Consider a mesh MEC network composed of 9 edge servers and a cloud datacenter, connected by wired links, as shown in Fig. 4.7. Each edge server is equipped with 4 processors of frequency 2.5 GHz, and each link between edge servers has 1 Gbps transmission capacity. The cloud datacenter is equipped with 8 identical processors; it is connected to all edge servers, and each such link has 20 Mbps transmission capacity. †† The length of each time slot is 1 ms.

Consider 4 clients requesting different services. Each service is composed of 2 functions, with parameters shown in Table 4.1.
To recall, each client is a 3-tuple (source s, destination d, service ϕ), and each service function is described by (scaling factor ξ^(ϕ)_m, workload r^(ϕ)_m [GHz/Gbps], object name k^(ϕ)_m, merging ratio ζ^(ϕ)_m). The size of each packet is 1 kb, and the arrivals are modeled by i.i.d. Poisson processes with rate λ Mbps. There are |K| = 8 databases, and each database has the same size of F = 1 Gb. In the following, we quantify the storage capacity of edge servers in number of databases. Assume that the cloud datacenter has all databases stored.

†† The communication resources are dedicated to the delivery of packets belonging to service requests. Requirements for the (database) replacement rate are depicted in Fig. 4.10b.

Table 4.1: Clients and Service Function Specs

| Client | (1, 9, ϕ1) | (3, 7, ϕ2) | (7, 3, ϕ3) | (9, 1, ϕ4) |
| Func 1 | (0.83, 7.1, 1, 0.92) | (0.94, 10.0, 3, 0.52) | (0.75, 8.7, 5, 1.48) | (0.60, 8.4, 7, 0.91) |
| Func 2 | (1.06, 5.8, 2, 1.06) | (1.22, 7.7, 4, 0.65) | (1.31, 9.2, 6, 1.97) | (1.34, 7.4, 8, 1.22) |

4.8.2 Multi-Pipeline Flow Control

We first demonstrate the performance of DI-DCNC under a given database placement, where databases k = 1, ..., 8 are stored at nodes i = 1, ..., 4, 6, ..., 9, respectively. Two benchmark algorithms are employed for comparison: ‡‡

• Static-to-live (S2L), which makes individual routing decisions for the live packet [81], and then routes the static packet to the selected processing node from the nearest static source along the shortest path (in the weighted ALG).

• Live-to-static (L2S), which makes routing decisions for the live packet by restricting processing locations to the static sources (and uses the local static packet).

4.8.2.1 Network Stability Region

First, we study the network stability regions attained by the algorithms, and depict the average delay under different arrival rates (set to ∞ if the queues are not stable). As shown in Fig.
4.8a, DI-DCNC attains good delay performance over a wide range of arrival rates; when λ crosses a critical point (≈ 1.05 Gbps), the average delay blows up, indicative of the stability region boundary. Similar behaviors are observed for S2L and L2S. Comparing the three algorithms, DI-DCNC outperforms the benchmark algorithms in terms of achieved throughput: 1.05 Gbps (DI-DCNC) > 920 Mbps (L2S) > 660 Mbps (S2L); in other words, DI-DCNC can better exploit network resources to improve the throughput. We clarify that the throughput attained by S2L improves when the communication resources are increased, and can then outperform L2S [16].

‡‡ Both S2L and L2S do not consider joint control of multiple pipelines: they focus on the communication loads of either live or static packets.

Figure 4.8: Performance of DI-DCNC (under a given database placement). (a) Network stability region. (b) Resource occupation.

We also notice that the delay attained by DI-DCNC is very similar to, but not lower than, that of the benchmarks in low-congestion regimes. As discussed in Section 4.5.4.3, DI-DCNC is designed to reduce the aggregate queuing delay of both live and static data pipelines; this objective, while closely related to the actual service delay (especially in high-congestion regimes), is not exactly equivalent to it, since the service delay depends on the maximum delay between the two concurrent pipelines. In addition, DI-DCNC neglects the hop-distance of the selected path, which is the dominant delay component in low-congestion regimes.

4.8.2.2 Resource Occupation

Next, we study the resource occupation of the algorithms.
We assume that the available processing and transmission capacities of each node and link are given by fractions α_1 and α_2 of the corresponding maximum budgets, respectively. We then define the feasible region as the collection of (α_1, α_2) pairs under which the delay requirement is fulfilled. Let the arrival rate be λ = 500 Mbps and the average delay requirement be 30 ms.

Fig. 4.8b depicts the feasible regions attained by the algorithms. Since lower latency can be attained with more resources, i.e., (α_1, α_2) → (1, 1), the feasible regions are to the upper-right of the border lines. §§ As we can observe, DI-DCNC saves the most network resources. In particular, when α_1 = α_2 = α, the resource saving ratios, i.e., 1 − α, of the algorithms are: 50% (DI-DCNC) > 43% (S2L) > 24% (L2S). Another observation is that S2L is communication-constrained, compared to its sensitivity to computation resources. To wit: in the horizontal direction (when α_2 = 1), it can achieve a maximum saving ratio of ≈ 70%, which is comparable to DI-DCNC; while the maximum saving ratio is ≈ 25% for communication resources (when α_1 = 1), and there is a large gap between its performance and that of DI-DCNC (≈ 50%). The reason is that S2L neglects the communication load of static packet routing, leading to additional communication resource consumption. In contrast, L2S is processing-constrained, because only the processing resources at the static sources are available for use.

4.8.3 Joint 3C Resource Orchestration

Next, we evaluate the proposed database placement and replacement policies, employing DI-DCNC for flow control.

4.8.3.1 Fixed Placement Policy

First, we focus on the setting of fixed database placement, assuming that each edge server can cache S databases. We evaluate the proposed max-throughput policy, employing two random policies as benchmarks:

• Random selection: each edge server randomly selects S different databases to cache.

§§ We note that the feasible region achieved by S2L is not convex.
• Random placement: each edge server caches S different databases, which are jointly selected to maximize the diversity of the databases stored at the edge servers. ¶¶

Similar performance metrics, i.e., network stability region and resource occupation, are studied, and the results are shown in Fig. 4.9.

(i) Network Stability Region: Fig. 4.9a shows the throughput performance of the three policies. For the proposed policy, we solve the MILP problem (4.36) and present the obtained result, as well as the observed stability region under the derived max-throughput database placement; for the benchmark policies, we plug randomly generated placements x into (4.36), solve the remaining linear programs, and present the averages and standard deviations of the results (over 100 realizations). As we can observe, for each policy, the attained throughput grows with the storage resource. Among the three policies, the proposed max-throughput policy outperforms the random benchmarks, especially when the storage resource is limited. For example, when S = 1, the proposed policy achieves the highest throughput (≈ 1.59 Gbps), which is 37% better than random placement and far beyond random selection. The results validate the effects of caching policy design on throughput performance, including (i) which databases to cache (comparing random placement and random selection), and (ii) where to store the databases (comparing the proposed policy and random placement). Finally, we note that the results given by the MILP (4.36) agree with the observed stability regions, validating Proposition 7.

(ii) Resource Occupation: Next, we study the tradeoff between the 3C network resources, assuming λ = 1 Gbps, an average delay requirement of 30 ms, and α_1 = α_2 = α. For each random benchmark, we select a representative placement that attains a throughput performance closest to the corresponding mean value (shown in Fig. 4.9a).

¶¶ The cached databases at the edge servers are selected as follows.
Generate a random permutation of the sequence {1, ..., |K|} and repeat it ⌈|V|S/|K|⌉ times. Let D and D_i denote the resulting sequence and its i-th element, respectively. Then, databases D_{(i−1)S+1}, ..., D_{iS} are cached at edge server i (note that these are different databases, since every sub-sequence of D of length S ≤ |K| includes distinct values).

Figure 4.9: Performance of the max-throughput database placement policy. (a) Network stability region (with standard deviations). (b) Resource occupation.

As we can observe in Fig. 4.9b, increasing the storage resource at the edge servers leads to a larger computation/communication resource saving ratio. In particular, the performance of the policies converges when each node has sufficient space to cache all databases (i.e., S = |K|). When the storage resource is limited (e.g., S = 1), the proposed policy can achieve a computation/communication resource saving ratio (69%) that is close to the optimal value (≈ 67%), outperforming the random benchmarks.

4.8.3.2 Replacement Policies

Finally, we evaluate the proposed database replacement policies. For each client, we model the arrivals of service requests by a Markov-modulated process, i.e., the arrival rate follows a Markov process (described in the following), and the number of arrivals in a single time slot is a Poisson variable. At each time slot, the service request distribution is a permutation of the Zipf distribution with γ = 1 [61]: we sort the clients by arrival rate in descending order, and w.p. 10^−6, we randomly select φ ∈ {1, 2, 3} and exchange the arrival rates of the φ-th and (φ+1)-th clients.
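The demand process described above can be sketched as follows (a simplified simulation under the stated parameters; `zipf_rates` and the swap step are illustrative helpers, not the exact experiment code):

```python
import random

def zipf_rates(num_clients, total_rate, gamma=1.0):
    """Zipf(gamma) split of the total arrival rate across rank-ordered clients."""
    w = [1.0 / (r + 1) ** gamma for r in range(num_clients)]
    z = sum(w)
    return [total_rate * wi / z for wi in w]

def step(rates, swap_prob=1e-6, rng=random):
    """One slot of the modulating Markov chain: w.p. swap_prob, exchange the
    rates of the phi-th and (phi+1)-th clients, phi drawn uniformly from {1,2,3}."""
    if rng.random() < swap_prob:
        phi = rng.randint(1, 3)                  # 1-based client index
        rates[phi - 1], rates[phi] = rates[phi], rates[phi - 1]
    return rates

rates = zipf_rates(num_clients=4, total_rate=1.0)
assert abs(sum(rates) - 1.0) < 1e-9              # swaps preserve the total rate
assert rates[0] > rates[1] > rates[2] > rates[3] # initially sorted in descending order
```

Because the swaps only permute the rates, the long-run average arrival rate of every client is equal, consistent with the footnote below.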
∗∗∗ Under this setting, the expected time for the service request distribution to change is 10^6 ms ≈ 15 min. We note that the time-average arrival rates of all clients are equal (i.e., uniform distribution).

Figure 4.10: Performance of rate- and score-based replacement policies. (a) Network stability regions (frame size = 100 s). (b) Effects of frame size on throughput and replacement rate.

Each edge server, except nodes i = 4, 5, 6, is allowed to cache S = 1 database. ††† The initial database placement is given by the proposed max-throughput policy assuming a uniform service request distribution. The two proposed replacement policies are evaluated; for fair comparison, we set the same running time for both of them (to calculate estimated quantities and solve the corresponding problems).

Fig. 4.10a shows the throughput performance of the two replacement policies, both of which effectively boost the throughput (≈ 1 Gbps) compared to fixed placement (≈ 0.5 Gbps), despite slightly worse delay performance in low-congestion regimes (e.g., λ ≤ 400 Mbps). Fig. 4.10b demonstrates the effects of the time frame size on the policies' throughput performance and replacement rate requirements (i.e., the average downloading rate from the cloud datacenter to all edge servers). For each policy, as the frame size grows, the frequency of database replacement reduces, leading to a reduced replacement rate, with sub-optimal throughput.
Comparing the two proposed policies, we find that the rate-based policy achieves better throughput, which is also less sensitive to the frame size (note that the blue solid curve is flatter), while the score-based policy imposes a much lower requirement on the replacement rate, due to the asynchronous update of the database placement, which is designed based on current network states.

††† Under this setting, the total storage resource at the edge servers is 6, which cannot support caching all the databases (since |K| = 8).

‡‡‡ The time frame size controls the tradeoff between the accuracy and timeliness of the estimates, as can be observed from the score-based policy (a frame size of 5 ms leads to the largest throughput).

4.9 Conclusions

We investigated the problem of joint 3C control for the efficient delivery of data-intensive services, composed of multiple service functions with multiple (live/static) input streams. We first characterized network stability regions based on the proposed ALG model, which incorporates multiple pipelines for input streams. Two problems were addressed: (i) multi-pipeline flow control, for which we derived a throughput-optimal policy, DI-DCNC, to coordinate packet processing, routing, and replication decisions for multiple pipelines, and (ii) joint 3C resource orchestration, for which we proposed a max-throughput database placement policy that jointly optimizes 3C decisions, as well as rate- and score-based database replacement policies. Via numerical experiments, we demonstrated the superior performance of multi-pipeline coordination and integrated 3C design in delivering next-generation data-intensive real-time stream-processing services.

Chapter 5

Outlook

This thesis is centered around the development of dynamic control policies for distributed computing networks to deliver emerging AgI services with challenging requirements.
We studied three categories of AgI services that have received the most recent attention: delay-critical, mixed-cast, and data-intensive services, covering numerous use cases from industrial automation to Metaverse experiences. The goal is to design control policies with an increasing level of compute-communication integration to enable the efficient delivery of such services.

We studied a general setting, where AgI services are modeled by service DAGs (including the special case of SFCs), and distributed cloud networks are modeled as dynamic systems with heterogeneous devices, going beyond the relatively simplified network and service models in the current literature. With a carefully designed ALG (including the special case of the layered graph), we can transform the AgI service delivery problem into a packet routing problem, and the latter is an extensively explored topic with many well-known results. Indeed, techniques developed for packet routing can be modified to handle packet processing as well, and we focus on those optimizing delay, throughput, and cost performance, which are essential QoE metrics.

Given the available network resources and service requirements, we provided an operational characterization of the network stability region for each category of AgI services. Compared to existing works, our main contribution is to derive generalized flow conservation laws, taking into account strict deadlines, multiple destinations, and data merging. In addition, an important finding from the characterization is that there exists a stationary randomized policy to support any given point in the stability region.

Due to the more challenging requirements raised by the AgI services, the data packets are designed to include additional information that can drive decision making, e.g., the remaining lifetime in the delay-critical service. To accommodate such packets at each network node, new queuing systems are established.
To be specific, we propose lifetime queues, duplication status queues, and paired packet queues for the three AgI services, to keep track of a data packet's remaining lifetime, current destinations, and qualification for processing, respectively. We then leverage Lyapunov optimization theory to design the control policy, with the goal of alleviating network congestion (i.e., stabilizing the queues) while minimizing the overall operational cost. In many cases, we are able to derive throughput-optimal, fully-distributed control policies, with a manually selected parameter V that controls the tradeoff between the average delay, ∼ O(V), and the operational cost (the gap from the attained cost to the optimum, ∼ O(1/V), can be pushed arbitrarily close to zero by setting V → ∞).

Last but not least, in the following, we point out some open problems in these studies:

• (Complexity reduction) In Chapter 2, the RCNC algorithm operates in a centralized manner with a high complexity that increases with the network size and the largest packet lifetime. The development of distributed variants, as well as finding an appropriate time slot length, to reduce the complexity of decision making is of interest for future work.

• (Parameter selection) In Chapter 3, GDCNC-R is proposed to reduce the complexity of GDCNC by focusing on a subset of duplication trees selected by a clustering procedure. However, the impacts of the clustering method and the number of duplication trees are not thoroughly studied, nor is their joint optimization with the proposed approach.

• (3C integration) In Chapter 4, we focus on the 3C integration resulting from the placement of databases in distributed cloud networks. Another closely related model of practical relevance is to cache the data objects, which have smaller file sizes (but a larger number of files), under which the 3C operations are on the same timescale; their joint optimization needs further investigation.
• (Combination of the results) Practical AgI services might belong to multiple categories, e.g., the social VR application (which is delay-critical, multi-cast, and data-intensive), making the problem more challenging. Combining the derived results (in particular, resolving tradeoffs among different requirements) is an important aspect for the practical application of the developed techniques.

Bibliography

[1] 3GPP. "Study on channel model for frequencies from 0.5 to 100 GHz". In: 3rd Generation Partnership Project (3GPP), Tech. Rep. 38 (2018).

[2] Bernardetta Addis, Dallal Belabed, Mathieu Bouet, and Stefano Secci. "Virtual network functions placement and routing optimization". In: IEEE Int. Conf. Cloud Netw. (CLOUDNET). Niagara Falls, Canada, Oct. 2015, pp. 171–177.

[3] Ansuman Adhikary, Junyoung Nam, Jae-Young Ahn, and Giuseppe Caire. "Joint spatial division and multiplexing – The large-scale array regime". In: IEEE Trans. Inf. Theory 59.10 (Oct. 2013), pp. 6441–6463.

[4] Ethem Alpaydin. Introduction to machine learning. Cambridge, MA, USA: MIT Press, 2020.

[5] Eitan Altman. Constrained Markov decision processes. Boca Raton, FL, USA: CRC Press, 1999.

[6] Mahardeka Tri Ananta, Jehn-Ruey Jiang, and Muhammad Aziz Muslim. "Multicasting with the extended Dijkstra's shortest path algorithm for software defined networking". In: Int. J. Appl. Eng. Res. 9.23 (2014), pp. 21017–21030.

[7] Carles Anton-Haro and Mischa Dohler. Machine-to-machine (M2M) communications: architecture, performance and applications. Boca Raton, FL, USA: Elsevier, 2014.

[8] Marc Barcelo, Alejandro Correa, Jaime Llorca, Antonia M. Tulino, Jose Lopez Vicario, and Antoni Morell. "IoT-cloud service optimization in next generation smart environments". In: IEEE J. Sel. Areas Commun. 34.12 (Oct. 2016), pp. 4077–4090.

[9] Marc Barcelo, Jaime Llorca, Antonia M. Tulino, and Narayan Raman. "The cloud service distribution problem in distributed cloud networks". In: Proc. IEEE Int. Conf. Commun.
London, UK, May 2015, pp. 344–350.

[10] Md. Faizul Bari, Shihabur Rahman Chowdhury, Reaz Ahmed, and Raouf Boutaba. "On orchestrating virtual network functions in NFV". In: Int. Conf. on Netw. Service Manag. (CNSM). Barcelona, Spain, Nov. 2015, pp. 50–56.

[11] Stephen P. Boyd and Lieven Vandenberghe. Convex optimization. New York, NY, USA: Cambridge University Press, 2004.

[12] Loc Bui, R. Srikant, and Alexander Stolyar. "Novel architectures and algorithms for delay reduction in back-pressure scheduling and routing". In: Proc. IEEE INFOCOM. Rio de Janeiro, Brazil, Apr. 2009, pp. 2936–2940.

[13] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Compute- and data-intensive networks: The key to the Metaverse". In: 2022 1st International Conference on 6G Networking (6GNet). Paris, France, July 2022, pp. 1–8.

[14] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Decentralized control of distributed cloud networks with generalized network flows". In: submitted to IEEE Trans. Commun. (2022).

[15] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. Decentralized control of distributed cloud networks with generalized network flows. arXiv:2204.09030. [Online]. Available: https://arxiv.org/abs/2204.09030. Apr. 2022.

[16] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Dynamic control of data-intensive services over edge computing networks". In: Proc. IEEE Global. Telecomm. Conf. Rio de Janeiro, Brazil, Dec. 2022, pp. 1–6.

[17] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Joint compute-caching-communication control for online data-intensive service delivery". In: submitted to IEEE Trans. Mobile Comput. (2022).

[18] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Mobile edge computing network control: Tradeoff between delay and cost". In: Proc. IEEE Global. Telecomm. Conf. Taipei, Taiwan, Dec. 2020, pp. 1–6.

[19] Yang Cai, Jaime Llorca, Antonia M.
Tulino, and Andreas F. Molisch. "Optimal cloud network control with strict latency constraints". In: Proc. IEEE Int. Conf. Commun. Montreal, Canada, June 2021, pp. 1–6.

[20] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Optimal multicast service chain control: Packet processing, routing, and duplication". In: Proc. IEEE Int. Conf. Commun. Montreal, Canada, June 2021, pp. 1–7.

[21] Yang Cai, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Ultra-reliable distributed cloud network control with end-to-end latency constraints". In: IEEE/ACM Trans. Netw. 1.99 (2022), pp. 1–16.

[22] Yang Cai and Andreas F. Molisch. "On the multi-activation oriented design of D2D-aided caching networks". In: Proc. IEEE Global. Telecomm. Conf. Waikoloa, HI, USA, Dec. 2019, pp. 1–6.

[23] Kun Chen and Longbo Huang. "Timely-throughput optimal scheduling with prediction". In: IEEE/ACM Trans. Netw. 26.6 (Sept. 2018), pp. 2457–2470.

[24] Min Chen and Yixue Hao. "Task offloading for mobile edge computing in software defined ultra-dense network". In: IEEE J. Sel. Areas Commun. 36.3 (Mar. 2018), pp. 587–597.

[25] Xu Chen, Lei Jiao, Wenzhong Li, and Xiaoming Fu. "Efficient multi-user computation offloading for mobile-edge cloud computing". In: IEEE/ACM Trans. Netw. 24.5 (Oct. 2016), pp. 2795–2808.

[26] Konstantinos Chorianopoulos and George Lekakos. "Introduction to social TV: Enhancing the shared experience with interactive TV". In: Intl. Journal of Human-Computer Interaction 24.2 (2008), pp. 113–120.

[27] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to algorithms. Cambridge, MA, USA: MIT Press, 2009.

[28] Eduardo Cuervo, Krishna Chintalapudi, and Manikanta Kotaru. "Creating the perfect illusion: What will it take to create life-like virtual reality headsets?" In: HotMobile '18. Tempe, AZ, USA, Feb. 2018, pp. 7–12.

[29] Sheng Di, Derrick Kondo, and Walfredo Cirne. "Host load prediction in a Google compute cloud with a Bayesian model". In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. Salt Lake City, Utah, Nov. 2012, pp. 1–11.

[30] Zhiguo Ding, Pingzhi Fan, George K. Karagiannidis, Robert Schober, and H. Vincent Poor. "NOMA assisted wireless caching: Strategies and performance analysis". In: IEEE Trans. Commun. 66.10 (Oct. 2018), pp. 4854–4876.

[31] Hao Feng, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "On the delivery of augmented information services over wireless computing networks". In: Proc. IEEE Int. Conf. Commun. Paris, France, May 2017, pp. 1–7.

[32] Hao Feng, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Optimal control of wireless computing networks". In: IEEE Trans. Wireless Commun. 17.12 (Dec. 2018), pp. 8283–8298.

[33] Hao Feng, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch. "Optimal dynamic cloud network control". In: IEEE/ACM Trans. Netw. 26.5 (Oct. 2018), pp. 2118–2131.

[34] Guo Freeman and Divine Maloney. "Body, avatar, and me: The presentation and perception of self in social virtual reality". In: Proceedings of the ACM on Human-Computer Interaction 4.CSCW3 (Jan. 2021), pp. 1–27.

[35] Michael R. Garey and David S. Johnson. A guide to the theory of NP-completeness. New York: W. H. Freeman, 1979.

[36] Negin Golrezaei, Andreas F. Molisch, Alexandros G. Dimakis, and Giuseppe Caire. "Femtocaching and device-to-device collaboration: A new architecture for wireless video distribution". In: IEEE Commun. Mag. 51.4 (Apr. 2013), pp. 142–149.

[37] Maria Gregori, Jesús Gómez-Vilardebó, Javier Matamoros, and Deniz Gündüz. "Wireless content caching for small cell and D2D networks". In: IEEE J. Sel. Areas Commun. 34.5 (May 2016), pp. 1222–1234.

[38] Ying He, Nan Zhao, and Hongxi Yin. "Integrated networking, caching, and computing for connected vehicles: A deep reinforcement learning approach". In: IEEE Trans. Veh. Technol.
67.1 (Jan. 2018), pp. 44–55. [39] Liang Hu, Xi-Long Che, and Si-Qing Zheng. “Online system for grid resource monitoring and machine learning-based prediction”. In: IEEE Trans. Parallel Distrib. Syst. 23.1 (Jan. 2012), pp. 134–145. [40] Longbo Huang, Scott Moeller, Michael J. Neely, and Bhaskar Krishnamachari. “LIFO-backpressure achieves near-optimal utility-delay tradeoff”. In: IEEE/ACM Trans. Netw. 21.3 (Sept. 2013), pp. 831–844. [41] Meitian Huang, Weifa Liang, Yu Ma, and Song Guo. “Maximizing throughput of delay-sensitive NFV-enabled request admissions via virtualized network function placement”. In: IEEE Trans. Cloud Comput. 9.4 (Oct. 2021), pp. 1535–1548. [42] Stratis Ioannidis and Edmund Yeh. “Jointly optimal routing and caching for arbitrary network topologies”. In: IEEE J. Sel. Areas Commun. 36.6 (June 2018), pp. 1258–1275. [43] Salekul Islam, Nasif Muslim, and J. William Atwood. “A survey on multicasting in software-defined networking”. In: IEEE Commun. Surveys Tuts. 20.1 (Firstquarter 2018), pp. 355–387. [44] Mingyue Ji, Giuseppe Caire, and Andreas F. Molisch. “Wireless device-to-device caching networks: Basic principles and system performance”. In: IEEE J. Sel. Areas Commun. 34.1 (Jan. 2016), pp. 176–189. [45] Mingyue Ji, Antonia M. Tulino, Jaime Llorca, and Giuseppe Caire. “Order-optimal rate of caching and coded multicasting with random demands”. In: IEEE Trans. Inf. Theory 63.6 (June 2017), pp. 3923–3949. [46] Khashayar Kamran, Edmund Yeh, and Qian Ma. “DECO: Joint computation scheduling, caching, and communication in data-intensive computing networks”. In: IEEE/ACM Trans. Netw. 1.99 (2021), pp. 1–15. [47] Jon Kleinberg and Éva Tardos. Algorithm design. Delhi, India: Pearson Education India, 2006. [48] Sina Lashgari and Amir Salman Avestimehr. “Timely throughput of heterogeneous wireless networks: Fundamental limits and algorithms”. In: IEEE Trans. Inf. Theory 59.12 (Sept. 2013), pp. 8414–8433.
[49] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. “Continuous control with deep reinforcement learning”. In: International Conference on Learning Representations (ICLR). San Juan, Puerto Rico, May 2016, pp. 1–10. [50] An Liu and Vincent K. N. Lau. “Exploiting base station caching in MIMO cellular networks: Opportunistic cooperation for video streaming”. In: IEEE Trans. Signal Process. 63.1 (Jan. 2015), pp. 57–69. [51] Boxi Liu, Konstantinos Poularakis, Leandros Tassiulas, and Tao Jiang. “Joint caching and routing in congestible networks of arbitrary topology”. In: IEEE Internet Things J. 6.6 (Dec. 2019), pp. 10105–10118. [52] Xinchen Lyu, Hui Tian, Wei Ni, Yan Zhang, Ping Zhang, and Ren Ping Liu. “Energy-efficient admission of delay-sensitive tasks for mobile edge computing”. In: IEEE Trans. Commun. 66.6 (June 2018), pp. 2603–2616. [53] Pavel Mach and Zdenek Becvar. “Mobile edge computing: A survey on architecture and computation offloading”. In: IEEE Commun. Surveys Tuts. 19.3 (Mar. 2017), pp. 1628–1656. [54] Yuyi Mao, Changsheng You, Jun Zhang, Kaibin Huang, and Khaled B. Letaief. “A survey on mobile edge computing: The communication perspective”. In: IEEE Commun. Surveys Tuts. 19.4 (Fourthquarter 2017), pp. 2322–2358. [55] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, et al. “Human-level control through deep reinforcement learning”. In: Nature 518.7540 (2015), pp. 529–533. [56] Anselme Ndikumana, Nguyen H. Tran, Tai Manh Ho, Zhu Han, Walid Saad, Dusit Niyato, and Choong Seon Hong. “Joint communication, computation, caching, and control in big data multi-access edge computing”. In: IEEE Trans. Mobile Comput. 19.6 (June 2020), pp. 1359–1374. [57] Michael J. Neely. “Opportunistic scheduling with worst case delay guarantees in single and multi-hop networks”.
In: Proc. IEEE INFOCOM. Shanghai, China, Apr. 2011, pp. 1728–1736. [58] Michael J. Neely. Stochastic network optimization with application to communication and queueing systems. San Rafael, CA, USA: Morgan & Claypool, 2010. [59] Michael J. Neely and Sucha Supittayapornpong. “Dynamic Markov decision policies for delay constrained wireless scheduling”. In: IEEE Trans. Autom. Control 58.8 (Aug. 2013), pp. 1948–1961. [60] Georgios S. Paschos, George Iosifidis, Meixia Tao, Don Towsley, and Giuseppe Caire. “The role of caching in future communication systems and networks”. In: IEEE J. Sel. Areas Commun. 36.6 (June 2018), pp. 1111–1125. [61] Konstantinos Poularakis, Jaime Llorca, Antonia M. Tulino, and Leandros Tassiulas. “Approximation algorithms for data-intensive service chain embedding”. In: Mobihoc ’20. Virtual Event, USA, Oct. 2020, pp. 131–140. [62] Konstantinos Poularakis, Jaime Llorca, Antonia M. Tulino, Ian Taylor, and Leandros Tassiulas. “Service placement and request routing in MEC networks with storage, computation, and communication constraints”. In: IEEE/ACM Trans. Netw. 28.3 (June 2020), pp. 1047–1060. [63] Thorsten Quandt and Sonja Kröger. Multiplayer: The social aspects of digital gaming. Routledge, 2013. [64] Karthikeyan Shanmugam, Negin Golrezaei, Alexandros G. Dimakis, Andreas F. Molisch, and Giuseppe Caire. “FemtoCaching: Wireless content delivery through distributed caching helpers”. In: IEEE Trans. Inf. Theory 59.12 (Dec. 2013), pp. 8402–8413. [65] Rahul Singh and P. R. Kumar. “Adaptive CSMA for decentralized scheduling of multi-hop networks with end-to-end deadline constraints”. In: IEEE/ACM Trans. Netw. 29.3 (June 2021), pp. 1224–1237. [66] Rahul Singh and P. R. Kumar. “Throughput optimal decentralized scheduling of multihop networks with end-to-end deadline constraints: Unreliable links”. In: IEEE Trans. Autom. Control 64.1 (Oct. 2018), pp. 127–142. [67] Abhishek Sinha and Eytan Modiano.
“Optimal control for generalized network flow problems”. In: IEEE/ACM Trans. Netw. 26.1 (Feb. 2018), pp. 506–519. [68] Jingzhou Sun, Lehan Wang, Zhiyuan Jiang, Sheng Zhou, and Zhisheng Niu. “Age-optimal scheduling for heterogeneous traffic with timely throughput constraints”. In: IEEE J. Sel. Areas Commun. 39.5 (May 2021), pp. 1485–1498. [69] Yaping Sun, Zhiyong Chen, Meixia Tao, and Hui Liu. “Communications, caching, and computing for mobile virtual reality: Modeling and tradeoff”. In: IEEE Trans. Commun. 67.11 (Nov. 2019), pp. 7573–7586. [70] Richard S. Sutton and Andrew G. Barto. Reinforcement learning: An introduction. MIT Press, 2018. [71] Leandros Tassiulas and Anthony Ephremides. “Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks”. In: IEEE Trans. Autom. Control 37.12 (Dec. 1992), pp. 1936–1948. [72] Tuyen X. Tran and Dario Pompili. “Joint task offloading and resource allocation for multi-server mobile-edge computing networks”. In: IEEE Trans. Veh. Technol. 68.1 (Jan. 2019), pp. 856–868. [73] Marcus Weldon. The future X network: A Bell Labs perspective. Boca Raton, FL, USA: CRC Press, 2016. [74] Weitao Xu, Yiran Shen, Neil Bergmann, and Wen Hu. “Sensor-assisted multiview face recognition system on smart glass”. In: IEEE Trans. Mobile Comput. 17.1 (Jan. 2018), pp. 197–210. [75] Zhiyuan Xu, Jian Tang, Chengxiang Yin, Yanzhi Wang, and Guoliang Xue. “Experience-driven congestion control: When multi-path TCP meets deep reinforcement learning”. In: IEEE J. Sel. Areas Commun. 37.6 (June 2019), pp. 1325–1336. [76] Zichuan Xu, Zhiheng Zhang, Weifa Liang, Qiufen Xia, Omer Rana, and Guowei Wu. “QoS-aware VNF placement and service chaining for IoT applications in multi-tier mobile edge networks”. In: ACM Trans. Sen. Netw. 16.3 (Aug. 2020), pp. 1–27. [77] Ji Xue, Feng Yan, Robert Birke, Lydia Y. Chen, Thomas Scherer, and Evgenia Smirni. “PRACTISE: Robust prediction of data center time series”.
In: 2015 11th International Conference on Network and Service Management (CNSM). Barcelona, Spain, Nov. 2015, pp. 126–134. [78] Lei Ying, Sanjay Shakkottai, Aneesh Reddy, and Shihuan Liu. “On combining shortest-path and back-pressure routing over multihop wireless networks”. In: IEEE/ACM Trans. Netw. 19.3 (June 2011), pp. 841–854. [79] Yi Yue, Bo Cheng, Xuan Liu, Meng Wang, Biyi Li, and Junliang Chen. “Resource optimization and delay guarantee virtual network function placement for mapping SFC requests in cloud networks”. In: IEEE Trans. Netw. Service Manag. 18.2 (June 2021), pp. 1508–1523. [80] Yi Yue, Bo Cheng, Meng Wang, et al. “Throughput optimization and delay guarantee VNF placement for mapping SFC requests in NFV-enabled networks”. In: IEEE Trans. Netw. Service Manag. 18.4 (Dec. 2021), pp. 4247–4262. [81] Jianan Zhang, Abhishek Sinha, Jaime Llorca, Antonia M. Tulino, and Eytan Modiano. “Optimal control of distributed computing networks with mixed-cast traffic flows”. In: IEEE/ACM Trans. Netw. 29.4 (Aug. 2021), pp. 1760–1773. [82] Jing Zhang, Jun Du, Yuan Shen, and Jian Wang. “Dynamic computation offloading with energy harvesting devices: A hybrid-decision-based deep reinforcement learning approach”. In: IEEE Internet of Things Journal 7.10 (2020), pp. 9303–9317. [83] Kaiqing Zhang, Zhuoran Yang, and Tamer Başar. “Multi-agent reinforcement learning: A selective overview of theories and algorithms”. In: Handbook of Reinforcement Learning and Control (2021), pp. 321–384. [84] Yuchen Zhou, Fei Richard Yu, Jian Chen, and Yonghong Kuo. “Communications, caching, and computing for next generation HetNets”. In: IEEE Wirel. Commun. 25.4 (Aug. 2018), pp. 104–111.

Appendix A: Proofs in Chapter 2

A.1 Proof for Proposition 1

Consider any sample path (the randomness comes from both the arrival process and the policy). Denote by $x(t)$ the decisions made by the policy, which satisfy the availability constraint (2.12d), i.e.,

x_{i\to}^{(l)}(t) \le Q_i^{(l)}(t), \quad \forall i \in \mathcal{V},\ l \in \mathcal{L}.    (A.1)

In particular, we focus on the intermediate nodes $i \in \mathcal{V} \setminus \{d\}$. Recall the queuing dynamics (2.3), and sum up the equations for $\ell = l, \cdots, L$, which leads to

Q_i^{(\ge l)}(t+1) = Q_i^{(\ge l+1)}(t) + x_{\to i}^{(\ge l+1)}(t) - x_{i\to}^{(\ge l+1)}(t) + a_i^{(\ge l)}(t),    (A.2)

where $Q_i^{(\ge l)}(t) \triangleq \sum_{\ell=l}^{L} Q_i^{(\ell)}(t)$ (the other aggregated terms are defined in the same way). By definition,

Q_i^{(\ge l)}(t+1) = Q_i^{(l)}(t+1) + Q_i^{(\ge l+1)}(t+1).    (A.3)

By the availability constraint (2.10d), we know

Q_i^{(l)}(t+1) \ge x_{i\to}^{(l)}(t+1).    (A.4)

Applying this relationship to (A.3), combining with (A.2), and rearranging terms, we obtain

Q_i^{(\ge l+1)}(t+1) - Q_i^{(\ge l+1)}(t) \le x_{\to i}^{(\ge l+1)}(t) + a_i^{(\ge l)}(t) - \big[ x_{i\to}^{(l)}(t+1) + x_{i\to}^{(\ge l+1)}(t) \big].    (A.5)

We then apply a telescoping sum to the above inequality [58]: fix some $T$, sum the inequalities for $t = 0, \cdots, T-1$, and (w.l.o.g., assuming $Q(0) = 0$) obtain

Q_i^{(\ge l+1)}(T) \le \sum_{t=0}^{T-1} \big[ x_{\to i}^{(\ge l+1)}(t) + a_i^{(\ge l)}(t) \big] - \sum_{t=0}^{T-1} \big[ x_{i\to}^{(l)}(t+1) + x_{i\to}^{(\ge l+1)}(t) \big].    (A.6)

Next, we perform some standard operations: take the expectation (since the above inequality holds for each sample path), divide by $T$, and push $T \to \infty$. Using the fact that $Q(t)$ is bounded (due to the assumptions of bounded arrivals and packet drops), and thus stable, we obtain

\big\{ \mathbb{E}\{ x_{\to i}^{(\ge l+1)}(t) \} \big\} + \lambda_i^{(\ge l)} - \big\{ \mathbb{E}\{ x_{i\to}^{(l)}(t+1) + x_{i\to}^{(\ge l+1)}(t) \} \big\} \ge 0.    (A.7)

Furthermore, we can replace $x_{i\to}^{(l)}(t+1)$ by $x_{i\to}^{(l)}(t)$ in the second line, because the two expressions lead to identical long-term averages (they differ by a one-slot shift), and the two outgoing terms can be combined, i.e.,

\big\{ \mathbb{E}\{ x_{\to i}^{(\ge l+1)}(t) \} \big\} + \lambda_i^{(\ge l)} \ge \big\{ \mathbb{E}\{ x_{i\to}^{(\ge l)}(t) \} \big\},    (A.8)

which holds for all $l = 1, \cdots, L$. It also holds for $l = 0$, but that relationship is trivial (it is implied by the case $l = 1$). This is the causality constraint (2.13d).

A.2 Stability region of $\mathcal{P}_1$

A.2.1 Necessity

Suppose $(\lambda, \gamma) \in \Lambda_1$.
By definition, there exists a policy under which the constraints (2.12b) – (2.12e) are satisfied, while achieving the optimal cost. Denote by $X_{ij}^{(l)}(t) \ge 0$ the number of successfully delivered packets within the first $t$ time slots that are transmitted on link $(i,j)$ at lifetime $l$, and define

X_{ij}(t) = \sum_{l \in \mathcal{L}} X_{ij}^{(l)}(t).    (A.9)

In addition, denote by $A_i^{(l)}(t)$ the total number of lifetime-$l$ packets that arrive at node $i$ during the first $t$ time slots. The basic facts are listed in the following:

\sum_{j \in \delta_d^-} X_{jd}(t) + A_d(t) \ge \gamma \sum_{i \in \mathcal{V}} A_i(t),    (A.10a)
\lim_{t \to \infty} X_{ij}(t) / t \le C_{ij},    (A.10b)
\sum_{j \in \delta_i^-} X_{ji}^{(\ge l+1)}(t) + A_i^{(\ge l)}(t) \ge \sum_{j \in \delta_i^+} X_{ij}^{(\ge l)}(t),    (A.10c)
X_{ij}^{(0)}(t) = X_{dj}^{(l)}(t) = 0,    (A.10d)

where (A.10b) holds due to the average capacity constraint, and (A.10c) is implied by (A.6) (using the fact that $Q_i^{(\ge l+1)}(T) \ge 0$). Divide by $t$, take the limit $t \to \infty$, and note that

x_{ij} \triangleq \lim_{t \to \infty} X_{ij}(t) / t, \quad \lim_{t \to \infty} A_i^{(l)}(t) / t = \lambda_i^{(l)}.    (A.11)

Then we obtain the characterization (2.16).

A.2.2 Sufficiency

For a given pair $(\lambda, \gamma)$, suppose there exists a flow assignment $x$ that satisfies (2.16). We prove $(\lambda, \gamma) \in \Lambda_1$ by constructing a randomized policy and showing that the achieved flow assignment equals $x$.

A.2.2.1 Randomized Policy

The policy is constructed as follows. For any node $i \in \mathcal{V}$ and lifetime $l \in \mathcal{L}$, we define a set of probability values $\{ \alpha_i^{(l)}(j) : j \in \delta_i^+ \}$ with

\alpha_i^{(l)}(j) = \frac{ x_{i\to}^{(l)} }{ \tilde{x}_{\to i}^{(\ge l)} - x_{i\to}^{(\ge l+1)} } \cdot \frac{ x_{ij}^{(l)} }{ x_{i\to}^{(l)} }    (A.12)

(the expression can be simplified, but we preserve this form for ease of exposition), where

\tilde{x}_{\to i}^{(l)} = x_{\to i}^{(l+1)} + \lambda_i^{(l)}, \quad \tilde{x}_{\to i}^{(\ge l)} = x_{\to i}^{(\ge l+1)} + \lambda_i^{(\ge l)}.    (A.13)

We claim that this is a set of valid probability values, since by (2.16c) we have $\tilde{x}_{\to i}^{(\ge l)} \ge x_{i\to}^{(\ge l)} = x_{i\to}^{(l)} + x_{i\to}^{(\ge l+1)}$, and thus*

\alpha_i^{(l)} \triangleq \sum_{j \in \delta_i^+} \alpha_i^{(l)}(j) = \frac{ x_{i\to}^{(l)} }{ \tilde{x}_{\to i}^{(\ge l)} - x_{i\to}^{(\ge l+1)} } \le 1.    (A.14)

The developed policy * operates in the following way: at every time slot, node $i$ makes independent transmission decisions for each packet of lifetime $l$ (i.e., to transmit or not; if yes, through which outgoing interface) according to the pmf $\{ \alpha_i^{(l)}(j) : j \in \delta_i^+ \} \cup \{ 1 - \alpha_i^{(l)} \}$, where the complement $1 - \alpha_i^{(l)}$ accounts for the idle operation.

* If the sum is strictly less than 1, the complementary part corresponds to the idle operation (i.e., the packet is not scheduled for transmission at the current time slot).

A.2.2.2 Validate the Constraints

Next, we show that the decisions made by this policy satisfy (2.12). Since the policy makes an independent decision for each packet in the queue, the availability constraint (2.12d) is satisfied. We then prove that the decisions $\mu(t)$ made by this policy satisfy the remaining constraints, by showing $\{\mathbb{E}\{\mu(t)\}\} = x$.

The decision of policy * can be decomposed into two steps: 1) whether to transmit the packet or not; 2) if yes, through which interface $j$. For the first step, define $\alpha_i^{(\ell, l)}$ as the probability that a packet of lifetime $\ell$ (when first loaded into the queuing system at node $i$) gets transmitted at lifetime $l$. By definition, $\alpha_i^{(\ell, l)} = \alpha_i^{(\ell)}$ when $l = \ell$.

We first present some useful results for the subsequent proofs. The transition probability follows the recurrence

\alpha_i^{(\ell, l)} = \big( 1 - \alpha_i^{(\ell)} \big) \alpha_i^{(\ell - 1, l)};    (A.15)

on the other hand, for any $k \le \ell - 2$,

\alpha_i^{(\ell, k)} = \prod_{l=k+1}^{\ell} \big( 1 - \alpha_i^{(l)} \big) \alpha_i^{(k)} = \prod_{l=k+2}^{\ell} \big( 1 - \alpha_i^{(l)} \big) \alpha_i^{(k+1)} \cdot \frac{ 1 - \alpha_i^{(k+1)} }{ \alpha_i^{(k+1)} } \alpha_i^{(k)} = \alpha_i^{(\ell, k+1)} \frac{ 1 - \alpha_i^{(k+1)} }{ \alpha_i^{(k+1)} } \alpha_i^{(k)},    (A.16)

and the above relationship also holds for $k = \ell - 1$.

Lemma 1. The probabilities $\alpha$ defined in (A.12) lead to the following relationship:

x_{i\to}^{(l)} = \sum_{\ell \ge l} \alpha_i^{(\ell, l)} \tilde{x}_{\to i}^{(\ell)}.    (A.17)

Proof. We give a proof by induction on $l$. Base case: we show that the result holds for $l = L$. By (A.14),

\sum_{\ell \ge L} \alpha_i^{(\ell, L)} \tilde{x}_{\to i}^{(\ell)} = \alpha_i^{(L)} \tilde{x}_{\to i}^{(L)} = \frac{ x_{i\to}^{(L)} }{ \tilde{x}_{\to i}^{(\ge L)} - 0 } \, \tilde{x}_{\to i}^{(L)} = x_{i\to}^{(L)}.    (A.18)

Inductive step: assume that the result holds for $l = k+1$; we show that it also holds for $l = k$. Note that

\sum_{\ell \ge k} \alpha_i^{(\ell, k)} \tilde{x}_{\to i}^{(\ell)} = \alpha_i^{(k)} \tilde{x}_{\to i}^{(k)} + \sum_{\ell \ge k+1} \alpha_i^{(\ell, k)} \tilde{x}_{\to i}^{(\ell)}    (A.19a)
= \alpha_i^{(k)} \Big[ \tilde{x}_{\to i}^{(k)} + \frac{ 1 - \alpha_i^{(k+1)} }{ \alpha_i^{(k+1)} } \sum_{\ell \ge k+1} \alpha_i^{(\ell, k+1)} \tilde{x}_{\to i}^{(\ell)} \Big]    (A.19b)
= \alpha_i^{(k)} \Big[ \tilde{x}_{\to i}^{(k)} + \frac{ \tilde{x}_{\to i}^{(\ge k+1)} - x_{i\to}^{(\ge k+1)} }{ x_{i\to}^{(k+1)} } \, x_{i\to}^{(k+1)} \Big]    (A.19c)
= \alpha_i^{(k)} \big[ \tilde{x}_{\to i}^{(\ge k)} - x_{i\to}^{(\ge k+1)} \big] = x_{i\to}^{(k)},    (A.19d)

where we use the result for $l = k+1$, and substitute the definition (A.14) into (A.19b) to derive (A.19c). To sum up, by mathematical induction, the expression (A.17) holds, which concludes the proof.

On the other hand, an equivalent way to describe the randomized policy * is the following: as soon as a packet (of lifetime $\ell$) arrives, we decide how many time slots it waits before transmission (and through which interface); more concretely, the probability of a wait time of $\tau$ is $\alpha_i^{(\ell, \ell - \tau)}$. Exploiting this interpretation, the outgoing packets of lifetime $l$ at node $i$ in time slot $t$ are those of lifetime $\ell + 1$ ($\ell \ge l$) at $t' = t - (\ell + 1 - l)$ (where $\ell$ is the lifetime of the packet when it is first loaded into the queuing system of node $i$), leading to the following relationship:

\mathbb{E}\big\{ \mu_{i\to}^{(l)}(t) \big\} = \mathbb{E}\Big\{ \sum_{\ell \ge l} \alpha_i^{(\ell, l)} \big[ \mu_{\to i}^{(\ell+1)}(t') + a_i^{(\ell)}(t') \big] \Big\} = \sum_{\ell \ge l} \alpha_i^{(\ell, l)} \Big[ \mathbb{E}\big\{ \mu_{\to i}^{(\ell+1)}(t') \big\} + \lambda_i^{(\ell)} \Big].    (A.20)

By taking the long-term average, the effect of the finite (at most $L$) time-slot shift vanishes, and we obtain

\big\{ \mathbb{E}\{ \mu_{i\to}^{(l)}(t) \} \big\} = \sum_{\ell \ge l} \alpha_i^{(\ell, l)} \Big[ \big\{ \mathbb{E}\{ \mu_{\to i}^{(\ell+1)}(t) \} \big\} + \lambda_i^{(\ell)} \Big].    (A.21)

Comparing (A.17) (where $\tilde{x}_{\to i}^{(\ell)} = x_{\to i}^{(\ell+1)} + \lambda_i^{(\ell)}$) and (A.21), we find that $\{\mathbb{E}\{\mu(t)\}\}$ and $x$ are solutions to the same linear system, which implies

\big\{ \mathbb{E}\{ \mu_{ij}^{(l)}(t) \} \big\} = x_{ij}^{(l)}.    (A.22)

Thus we can replace $x$ by $\{\mathbb{E}\{\mu(t)\}\}$ in all the constraints (2.16) that it satisfies (in particular, (2.16a) and (2.16b)). Therefore, we conclude that policy * is an admissible policy for $\mathcal{P}_1$.
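As an illustrative aside (not part of the original proof), the per-packet randomized rule of Appendix A.2.2.1 can be sketched in a few lines of code: each packet independently draws an outgoing interface according to the probabilities $\alpha_i^{(l)}(j)$, with the complementary mass interpreted as the idle operation. The interface names and probability values below are hypothetical placeholders.

```python
import random

def transmit_decision(alpha, rng):
    """Sample one transmission decision for a single packet.

    alpha: dict mapping outgoing interface j -> probability alpha_i^{(l)}(j).
    The probabilities may sum to less than 1; the remaining mass is the
    idle operation (the packet waits one more slot), signaled by None.
    """
    u = rng.random()
    acc = 0.0
    for j, p in alpha.items():
        acc += p
        if u < acc:
            return j
    return None  # idle: packet stays in the queue this slot

# Hypothetical probabilities for one (node, lifetime) pair; idle prob = 0.2.
alpha = {"j1": 0.3, "j2": 0.5}
counts = {"j1": 0, "j2": 0, None: 0}
rng = random.Random(0)
for _ in range(100_000):
    counts[transmit_decision(alpha, rng)] += 1
```

Because the draws are independent per packet and per slot, the empirical interface frequencies converge to the $\alpha$ values, which is exactly the mechanism behind the long-term flow identity (A.22).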
A.2.2.3 Optimality

For any point within the stability region, let $x$ be the flow assignment corresponding to the cost-optimal policy. We can construct the randomized policy by the procedure introduced in Appendix A.2.2.1, which leads to (A.22). Since the operational cost is a linear function of the flow assignment, the designed randomized policy achieves the same objective value as the cost-optimal policy.

A.3 Stability region of $\mathcal{P}_2$

A.3.1 Necessity

Suppose $(\lambda, \gamma) \in \Lambda_2$. By definition, there exists a policy under which the constraints (2.12b) – (2.12e) are satisfied, while achieving the optimal cost. Define $X(t)$ as in Appendix A.2.1. The basic facts are listed in the following:

\sum_{j \in \delta_d^-} X_{jd}(t) + A_d(t) \ge \gamma \sum_{i \in \mathcal{V}} A_i(t),    (A.23a)
X_{ij}(t) \le C_{ij}\, t,    (A.23b)
\lim_{t \to \infty} \sum_{j \in \delta_i^-} \frac{ X_{ji}^{(\ge l+1)}(t) }{t} + \frac{ A_i^{(\ge l)}(t) }{t} \ge \lim_{t \to \infty} \sum_{j \in \delta_i^+} \frac{ X_{ij}^{(\ge l)}(t) }{t},    (A.23c)
X_{ij}^{(0)}(t) = X_{dj}^{(l)}(t) = 0,    (A.23d)

where (A.23c) holds due to the causality constraint. Divide by $t$, take the limit $t \to \infty$, and note that

x_{ij} \triangleq \lim_{t \to \infty} X_{ij}(t) / t, \quad \lim_{t \to \infty} A_i^{(l)}(t) / t = \lambda_i^{(l)}.    (A.24)

Then we obtain the characterization (2.16).

A.3.2 Sufficiency

For a given pair $(\lambda, \gamma)$, suppose there exists a flow assignment $x$ that satisfies (2.16). We prove $(\lambda, \gamma) \in \Lambda_2$ by constructing a stationary randomized policy and showing that the achieved flow assignment equals $x$.

A.3.2.1 Randomized Policy

For any link $(i,j) \in \mathcal{E}$, define a set of probability values $\{ \alpha_{ij}^{(l)} : l \in \mathcal{L} \}$ with

\alpha_{ij}^{(l)} \triangleq x_{ij}^{(l)} / C_{ij}.    (A.25)

We claim that this is a set of valid probability values, since they sum up to at most 1 due to (2.16b).† The policy * operates in the following way: at every time slot, for each link $(i,j)$, choose a lifetime $l^*$ according to the probability values $\{ \alpha_{ij}^{(l)} : l \in \mathcal{L} \}$ independently. The assigned flow is given by $\nu_{ij}^{(l)} = C_{ij} \mathbb{I}\{ l \equiv l^* \}$, i.e., we borrow $C_{ij}$ lifetime-$l^*$ packets from the reservoir to transmit on link $(i,j)$.

A.3.2.2 Validate the Constraints

Next, we show that this policy makes decisions satisfying (2.13).
Since the decisions are made in an i.i.d. manner over time slots, for any time slot we have

\mathbb{E}\big\{ \nu_{ij}^{(l)}(t) \big\} = \alpha_{ij}^{(l)} C_{ij} = x_{ij}^{(l)}.    (A.26)

The above equation still holds when we take its long-term average (since it is true for all time slots), i.e.,

\big\{ \mathbb{E}\{ \nu_{ij}^{(l)}(t) \} \big\} = x_{ij}^{(l)}.    (A.27)

Thus we can replace $x$ by $\{\mathbb{E}\{\nu(t)\}\}$ in all the constraints (2.16) that it satisfies (in particular, (2.16a) and (2.16c)). Therefore, we conclude that policy * is an admissible policy for $\mathcal{P}_2$.

† If the sum is strictly less than 1, the complementary part corresponds to the idle operation (i.e., no packet is transmitted).

A.3.2.3 Optimality

The argument is the same as in Appendix A.2.2.3 (the only difference is that the randomized policy is constructed according to Appendix A.3.2.1).

A.4 Distribution of $x_{ij}(t)$

In the studied packet routing problem (where flow scaling is not relevant), we make the following assumptions: (i) the arrival process of packets with any lifetime is Poisson, and (ii) the randomized policies of the nodes are not time-varying. We will show that, under these assumptions, $x_{ij}(t)$ follows a Poisson distribution for all $(i,j)$ and $t$.

We present two facts that will be used in the derivation. Fact A: define $X = \sum_{k=1}^{N} X_k$, where $N$ is a Poisson random variable (r.v.) and the $X_k$ are i.i.d. Bernoulli r.v.s; then $X$ is a Poisson r.v. Fact B: suppose the $X_k$ are independent Poisson r.v.s; then $X = \sum_{k=1}^{n} X_k$ (where $n$ is a constant) is a Poisson r.v.

Note that the total flow size is the sum over flows of individual lifetimes:

x_{ij}(t) = \sum_{l=1}^{L} x_{ij}^{(l)}(t) = \sum_{l=1}^{L} \Big[ \sum_{s \in \mathcal{V}} \sum_{l_0 = l}^{L} y_s^{(l_0)}\big(t - (l_0 - l)\big) \Big] = \sum_{s \in \mathcal{V}} \sum_{l_0 = 1}^{L} \sum_{l=1}^{l_0} y_s^{(l_0)}\big(t - (l_0 - l)\big),    (A.28)

where we exchange the order of summation in the last equality, and define

y_s^{(l_0)}\big(t - (l_0 - l)\big) = \sum_{k=1}^{ a_s^{(l_0)}(t - (l_0 - l)) } y_k.    (A.29)

In the above expression, we define the event $\mathcal{A}_{ij}(s, l_0, l)$: a packet, which is of initial lifetime $l_0$ when arriving at node $s$ at time slot $t - (l_0 - l)$, crosses link $(i,j)$ at time slot $t$ (when its lifetime is $l$); and we denote by $p_{ij}(s, l_0, l)$ its probability (which is fixed given the randomized policies of all the nodes). Let the $y_k$ be Bernoulli r.v.s indicating whether event $\mathcal{A}_{ij}(s, l_0, l)$ is true for packet $k \in \{1, \cdots, a_s^{(l_0)}(t - (l_0 - l))\}$; they are i.i.d. because the decisions for each packet are made independently.

Applying Fact A to (A.29), we obtain that $y_s^{(l_0)}(t - (l_0 - l))$ is a Poisson r.v. Next, for different $(s, l, l_0)$: the $a_s^{(l_0)}(t - (l_0 - l))$ are independent by assumption, and thus the $y_s^{(l_0)}(t - (l_0 - l))$ are also independent; hence $x_{ij}(t)$ is a Poisson r.v. by Fact B, concluding the proof.

A.5 Proof for Proposition 3

In this section, we analyze the performance of the proposed max-weight algorithm for the virtual network. The goal is to find the decision $\nu(t)$ that minimizes the upper bound of the LDP at any time slot $t$, and thus

\Delta(U(t)) + V h(\nu(t)) \le B - \langle \tilde{a}, U(t) \rangle - \langle w(t), \nu(t) \rangle \le B - \langle \tilde{a}, U(t) \rangle - \langle w(t), \varphi(t) \rangle,    (A.30)

where $\varphi(t)$ is the decision chosen by any other feasible policy; specifically, we take it to be the randomized policy introduced in Appendix A.3.2.1 for the point $(\lambda + (\epsilon'/\gamma)\mathbf{1}, \gamma)$. The existence of $\epsilon' > 0$ is guaranteed by the assumption that $(\lambda, \gamma)$ lies in the interior of the stability region. To recall, this policy makes i.i.d. decisions at each time slot (independent of $U(t)$), while achieving the optimal cost for that point, denoted by $\mathbb{E}\{h_2(\varphi(t))\} = h_2^\star(\lambda + (\epsilon'/\gamma)\mathbf{1}, \gamma) = \tilde{h}_2^\star$.
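As a brief numerical aside on Appendix A.4: Fact A (a Poisson number of i.i.d. Bernoulli trials, i.e., Poisson thinning, is again Poisson) can be checked by simulation. The rate, retention probability, and seed below are arbitrary choices for illustration; a Poisson(λp) variable should exhibit equal mean and variance.

```python
import math
import random

def thinned_poisson(lam, p, rng):
    """Draw N ~ Poisson(lam), then keep each of the N arrivals
    independently with probability p (the setting of Fact A)."""
    # Knuth's multiplication method for Poisson sampling.
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    # Bernoulli thinning of the k arrivals.
    return sum(1 for _ in range(k) if rng.random() < p)

rng = random.Random(1)
lam, p, n = 4.0, 0.25, 100_000
samples = [thinned_poisson(lam, p, rng) for _ in range(n)]
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
# For Poisson(lam * p) = Poisson(1.0), mean and variance both approach 1.0.
```

The matching empirical mean and variance are consistent with the thinned process being Poisson with rate λp, which is the step used to show that each $y_s^{(l_0)}$ in (A.29) is Poisson.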
Take the expectation, rearrange the terms on the right-hand side of the inequality, and we obtain:

\mathbb{E}\{ \Delta(U(t)) + V h(\nu(t)) \}
\le B + V \mathbb{E}\{ h_2(\varphi(t)) \} - \mathbb{E}\big\{ \big[ \varphi_{\to d}(t) - \gamma A(t) \big] U_d(t) \big\} - \sum_{i \in \mathcal{V} \setminus \{d\}} \sum_{l \in \mathcal{L}} \mathbb{E}\big\{ \big[ \varphi_{\to i}^{(\ge l+1)}(t) + a_i^{(\ge l)}(t) - \varphi_{i\to}^{(\ge l)}(t) \big] U_i^{(l)}(t) \big\}
\le B + V \tilde{h}_2^\star - \big[ \mathbb{E}\{ \varphi_{\to d}(t) \} - \gamma \|\lambda\|_1 \big] \mathbb{E}\{ U_d(t) \} - \sum_{i \in \mathcal{V} \setminus \{d\}} \sum_{l \in \mathcal{L}} \big[ \mathbb{E}\{ \varphi_{\to i}^{(\ge l+1)}(t) \} + \lambda_i^{(\ge l)} - \mathbb{E}\{ \varphi_{i\to}^{(\ge l)}(t) \} \big] \mathbb{E}\{ U_i^{(l)}(t) \}.    (A.31)

The second inequality follows from the fact that $\varphi(t)$ and $U(t)$ are independent. Besides, since the policy is admissible, $\varphi(t)$ satisfies (2.13b) – (2.13d) (specifically, the reliability and causality constraints), and it is straightforward to obtain

\mathbb{E}\{ \varphi_{\to d}(t) \} - \gamma \|\lambda\|_1 \ge \epsilon',    (A.32)
\mathbb{E}\{ \varphi_{\to i}^{(\ge l+1)}(t) \} + \lambda_i^{(\ge l)} - \mathbb{E}\{ \varphi_{i\to}^{(\ge l)}(t) \} \ge \epsilon'.    (A.33)

Note that in the above expressions we omit the long-term average operator, since $\varphi(t)$ is i.i.d. over time slots, and thus the long-term average equals the value at any time slot. Substituting (A.32) and (A.33) into (A.31) leads to

\mathbb{E}\{ \Delta(U(t)) + V h(\nu(t)) \} \le B + V \tilde{h}_2^\star - \epsilon' \, \mathbb{E}\{ \|U(t)\|_1 \}.    (A.34)

A.5.1 Cost Performance

Fix some $T > 0$. Apply the telescoping sum to (A.34) and, w.l.o.g., assume $U(0) = 0$. We obtain

V \sum_{t=0}^{T-1} \mathbb{E}\{ h(\nu(t)) \} \le BT + VT \tilde{h}_2^\star - \sum_{t=0}^{T-1} \epsilon' \, \mathbb{E}\{ \|U(t)\|_1 \} - \frac{ \mathbb{E}\{ \|U(T)\|_2^2 \} }{2} + \frac{ \|U(0)\|_2^2 }{2}    (A.35)
\le BT + VT \tilde{h}_2^\star.    (A.36)

Divide the inequality by $VT$ and push $T \to \infty$; moreover, note that the above inequality holds for any $\epsilon' > 0$, and in particular for a sequence $\{\epsilon'_n \downarrow 0\}$, which leads to

\{ \mathbb{E}\{ h(\nu(t)) \} \} \le h_2^\star + \frac{B}{V},    (A.37)

which is (2.26).

A.5.2 $\varepsilon$-Convergence Time

First, we show that the proposed algorithm stabilizes the virtual queues.
Similar to the previous section, we apply the telescoping sum for some fixed $T > 0$ and obtain

\mathbb{E}\{ \|U(T)\|_2^2 \} \le 2(B + V \tilde{h}_2^\star) T + \mathbb{E}\{ \|U(0)\|_2^2 \} - 2 \sum_{t=0}^{T-1} \big[ V \mathbb{E}\{ h(\nu(t)) \} + \epsilon' \, \mathbb{E}\{ \|U(t)\|_1 \} \big] \le 2 (B + V \tilde{h}_2^\star) T.    (A.38)

Furthermore, by the definition of the $\ell_2$-norm, the relationship $U_d(t) \le \|U(t)\|_2$ always holds, and thus

\mathbb{E}\{ U_d(T) \} \le \mathbb{E}\{ \|U(T)\|_2 \} \le \sqrt{ \mathbb{E}\{ \|U(T)\|_2^2 \} } \le \sqrt{ 2(B + V \tilde{h}_2^\star) T },    (A.39)

where the second inequality follows from the Cauchy–Schwarz inequality (or the fact that $\mathbb{E}\{X\}^2 \le \mathbb{E}\{X^2\}$ for any random variable $X$). As a result, for any finite $V$,

\lim_{T \to \infty} \frac{ \mathbb{E}\{ U_d(T) \} }{ T } \le \lim_{T \to \infty} \sqrt{ \frac{ 2(B + V \tilde{h}_2^\star) }{ T } } = 0,    (A.40)

which indicates that $U_d(t)$ is mean rate stable, and hence the reliability constraint is satisfied. The same argument applies to the other elements of $U(t)$, i.e., each $U_i^{(l)}(t)$ is also mean rate stable, which implies the causality constraint.

Next, we study the relationship between the $\varepsilon$-convergence time, denoted by $t_\varepsilon$, and the parameter $V$. Recall the queuing dynamics of $U_d(t)$, i.e., (2.14a), which imply that

U_d(t+1) \ge U_d(t) + \gamma A(t) - \nu_{\to d}(t).    (A.41)

Apply the telescoping sum to the above inequality from $t = 0$ to $T - 1$ (for any fixed $T > 0$) and take the expectation; we obtain

\gamma \lambda - \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{ \nu_{\to d}(t) \} \le \frac{ \mathbb{E}\{ U_d(T) \} }{ T } - \frac{ \mathbb{E}\{ U_d(0) \} }{ T } \le \sqrt{ \frac{ 2(B + V \tilde{h}_2^\star) }{ T } },    (A.42)

where the second inequality is obtained from (A.39). The left-hand side can be interpreted as the gap between the desired and achieved reliability levels, which is bounded by a non-increasing function converging to $0$. Hence, there must exist some time point after which the gap is always less than $\varepsilon$, which justifies the definition of the $\varepsilon$-convergence time.

It is difficult to derive an exact form for the $\varepsilon$-convergence time under various $V$ parameters; instead, we can derive an upper bound on it (in the following, we fix the value of $\varepsilon$). When the parameter $V$ is used, the quantity $T(V)$ defined by

\sqrt{ \frac{ 2(B + V \tilde{h}_2^\star) }{ T(V) } } = \varepsilon    (A.43)

can serve as an upper bound for the convergence time $t_\varepsilon(V)$.
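Solving (A.43) for $T(V)$ gives the closed form $T(V) = 2(B + V \tilde{h}_2^\star)/\varepsilon^2$; a small numerical sketch makes the scaling behavior of this bound concrete. The constants `B`, `h_star`, and `eps` below are invented placeholders, not values from the dissertation.

```python
def T_bound(V, B=10.0, h_star=2.0, eps=0.05):
    """Upper bound on the eps-convergence time implied by (A.43):
    sqrt(2*(B + V*h_star) / T) = eps  =>  T = 2*(B + V*h_star) / eps**2."""
    return 2.0 * (B + V * h_star) / eps ** 2

# The bound grows at most linearly in V: T(alpha*V) <= alpha*T(V).
for V in (1.0, 10.0, 100.0):
    for alpha in (2.0, 5.0):
        assert T_bound(alpha * V) <= alpha * T_bound(V)
```

Scaling $V$ by a factor $\alpha > 1$ increases the bound by strictly less than $\alpha$ (the constant $B$ is effectively divided by $\alpha$), which is the observation exploited in the argument that follows.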
When we use $V' = \alpha V$ (with $\alpha > 1$) and evaluate the bound at $T' = \alpha T(V)$, we find that

\sqrt{ \frac{ 2(B + V' \tilde{h}_2^\star) }{ T' } } = \sqrt{ \frac{ 2(B/\alpha + V \tilde{h}_2^\star) }{ T(V) } } < \varepsilon,    (A.44)

and therefore $T(\alpha V) \le \alpha T(V)$ (since the bound is non-increasing in $T$). The upper bound $T(V)$ thus grows at most linearly in $V$, and so does the exact convergence time $t_\varepsilon(V)$; in other words, $t_\varepsilon(V) \sim \mathcal{O}(V)$.

A.6 Impacts of Estimation Error

In this section, we analyze the impact of the estimation error on the attained cost performance. Assume that the empirical averages of the (virtual and actual) network flows have converged to $x$, which represents the cost-optimal flow assignment obtained using the inaccurate estimate $\hat{\lambda} = \lambda_0 + \Delta\lambda$, with $\lambda_0$ and $\Delta\lambda$ denoting the true rate and the estimation error, respectively. In particular, $x$ can be derived by solving the following LP problem (based on (2.13)):‡

\min_{x, \lambda} \ \langle e, x \rangle    (A.45a)
\text{s.t.} \ x_{\to d} \ge \gamma \lambda,    (A.45b)
x_{s\to}^{(\ge l)} - x_{\to s}^{(\ge l+1)} \le \lambda, \quad \forall l \in \mathcal{L},    (A.45c)
x_{i\to}^{(\ge l)} - x_{\to i}^{(\ge l+1)} \le 0 \quad (i \ne s),    (A.45d)
x_{ij} \le C_{ij}, \quad x_{dj}^{(l)} = 0, \quad x_{ij}^{(l)} \ge 0,    (A.45e)
\lambda \ge \hat{\lambda} \ \Leftrightarrow \ \lambda_0 - \lambda \le -\Delta\lambda,    (A.45f)

in which we introduce an auxiliary variable $\lambda$ to represent the estimated rate, and separate (2.13d) into (A.45c) and (A.45d) for the source node and the intermediate nodes, to clearly indicate the constraints involving $\lambda$ (for illustrative purposes). The optimal (cost) value, denoted by $h^\star(\hat{\lambda})$, is affected by the estimation error via (A.45f), as shown in the following sensitivity analysis.§

We define an unperturbed LP problem by setting $\Delta\lambda = 0$ in (A.45f), and denote by $h^\star(\lambda_0)$ its optimal value, i.e., the optimal cost. Consider the optimal solution to its dual problem, and let $y \ge 0$ be the multiplier associated with (A.45f); in the studied problem, it depends on the network (topology, link capacities) and service (deadline constraint, reliability level) parameters, as well as the true arrival rate $\lambda_0$.
By the general inequality [11, (5.57)], we obtain: for any estimation error $\Delta\lambda$,

h^\star(\hat{\lambda}) \ge h^\star(\lambda_0) + y \Delta\lambda,    (A.46)

leading to the following qualitative conclusions:

• If the multiplier $y \ge 0$ is large and we overestimate the rate (i.e., $\Delta\lambda > 0$), the attained cost is significantly higher than the optimal cost.

• If the multiplier $y \ge 0$ is small and we underestimate the rate (i.e., $\Delta\lambda < 0$), the attained cost can be slightly less than the optimal cost.

We note that the above analysis does not cover all the cases (e.g., $y$ small and the rate overestimated); in those cases the impact is indefinite and can vary case by case.

‡ We consider one client, and assume that the packets arrive to the network at the source node with maximum lifetime.
§ We introduce (A.45f) to summarize the effect of the approximation error; it does not change the optimal solution if we replace $\lambda$ by $\hat{\lambda}$ in (A.45b) and (A.45c). To wit, if (A.45f) holds with $\hat{\lambda} > \lambda_0$, more network resources will be consumed to handle the additional packets.

A.7 Hybrid Queuing System

Consider a scenario with two groups of users, collected in $\Phi$ and $\Psi$, respectively. The users $\phi \in \Phi$ are deadline-constrained and are treated as presented in the paper: for each user $\phi$, we establish lifetime queues $Q_i^{(\phi, l)}(t)$ at each node $i$, and denote by $\mu_{ij}^{(\phi, l)}(t)$ the actual flows transmitted from node $i$ to node $j$. The other users $\psi \in \Psi$ are unconstrained (i.e., without deadline constraints); for each user $\psi$, we create one queue $\tilde{Q}_i^{(\psi)}(t)$ at each node $i$ to accommodate all of its packets (regardless of their lifetimes), and the assigned flow transmitted from node $i$ to node $j$ is denoted by $\tilde{\mu}_{ij}^{(\psi)}(t)$.
In general, the queuing dynamics of the unconstrained users are given by [58]:

\tilde{Q}_i^{(\psi)}(t+1) \le \max\Big[ \tilde{Q}_i^{(\psi)}(t) - \sum_{j \in \delta_i^+} \tilde{\mu}_{ij}^{(\psi)}(t), 0 \Big] + \sum_{j \in \delta_i^-} \tilde{\mu}_{ji}^{(\psi)}(t) + \tilde{a}_i^{(\psi)}(t),    (A.47)

where $\tilde{a}_i^{(\psi)}(t)$ is the number of arrivals of user $\psi$ at node $i$. The overall drift for the queues of the unconstrained users, i.e., $\tilde{Q}(t) = \{ \tilde{Q}_i^{(\psi)}(t) : i \in \mathcal{V}, \psi \in \Psi \}$, can be derived as:¶

\Delta(\tilde{Q}(t)) \le B' + \sum_{i \in \mathcal{V}} \sum_{\psi \in \Psi} \tilde{Q}_i^{(\psi)}(t)\, \tilde{a}_i^{(\psi)}(t) - \sum_{(i,j) \in \mathcal{E}} \sum_{\psi \in \Psi} \big[ \tilde{Q}_i^{(\psi)}(t) - \tilde{Q}_j^{(\psi)}(t) \big] \tilde{\mu}_{ij}^{(\psi)}(t),    (A.48)

where $B'$ is a constant. Our goal is to stabilize the entire queuing system, including the request queues $R(t)$ for the deadline-constrained users and $\tilde{Q}(t)$ for the unconstrained ones. We propose to minimize the sum drift $\Delta(R(t)) + \Delta(\tilde{Q}(t))$, leading to the following problem:

\min_{\mu, \tilde{\mu}} \ \Delta(R(t)) - \sum_{(i,j)} \sum_{\psi \in \Psi} \big[ \tilde{Q}_i^{(\psi)}(t) - \tilde{Q}_j^{(\psi)}(t) \big] \tilde{\mu}_{ij}^{(\psi)}(t)    (A.49a)
\text{s.t.} \ \sum_{\phi \in \Phi} \sum_{l \in \mathcal{L}_\phi} \mu_{ij}^{(\phi, l)}(t) + \sum_{\psi \in \Psi} \tilde{\mu}_{ij}^{(\psi)}(t) \le C_{ij},    (A.49b)
\sum_{j \in \delta_i^+} \mu_{ij}^{(\phi, l)}(t) \le Q_i^{(\phi, l)}(t); \quad \mu(t), \tilde{\mu}(t) \succeq 0,    (A.49c)

where (A.49b) captures the interaction between the two groups of users in sharing the transmission resource, and $\Delta(R(t))$ is given by (2.31). In addition, we find that the optimal decisions for the unconstrained users, $\tilde{\mu}(t)$, follow the max-weight rule, and it suffices to focus on the interaction of the selected user (with maximum weight) with the deadline-constrained users.

To sum up, we can solve the problem in two steps. First, find the unconstrained user with the maximum weight for each $(i,j)$:

\psi_{ij}^\star = \arg\max_{\psi} \ \tilde{Q}_i^{(\psi)}(t) - \tilde{Q}_j^{(\psi)}(t).    (A.50)

Then, solve the following problem:

\min_{\mu, \tilde{\mu}} \ \Delta(R(t)) - \sum_{(i,j)} \big[ \tilde{Q}_i^{(\psi_{ij}^\star)}(t) - \tilde{Q}_j^{(\psi_{ij}^\star)}(t) \big] \tilde{\mu}_{ij}^{(\psi_{ij}^\star)}(t)    (A.51a)
\text{s.t.} \ \sum_{\phi \in \Phi} \sum_{l \in \mathcal{L}_\phi} \mu_{ij}^{(\phi, l)}(t) + \tilde{\mu}_{ij}^{(\psi_{ij}^\star)}(t) \le C_{ij},    (A.51b)
\sum_{j \in \delta_i^+} \mu_{ij}^{(\phi, l)}(t) \le Q_i^{(\phi, l)}(t); \quad \mu(t), \tilde{\mu}(t) \succeq 0,    (A.51c)

while the other unconstrained users are not served, i.e., $\tilde{\mu}_{ij}^{(\psi)}(t) = 0$ if $\psi \ne \psi_{ij}^\star$. Compared to the original problem, (A.51) includes one additional variable for each link and look-ahead slot (and no new constraints), which represents the entire group of unconstrained users.

¶ We illustrate the approach using the one-slot drift as an example; it can be extended to $n$ look-ahead slots using the multi-slot drift [58, Lemma 4.11].

A.8 The Multi-Commodity AgI Problem

A.8.1 AgI Service Model

The cloud network offers a set of AgI services $\Phi$. Each service $\phi \in \Phi$ is modeled by an ordered chain of $(M_\phi - 1)$ service functions, through which incoming packets must be processed to be transformed into results that are consumable by the corresponding destination nodes. Service functions can be executed at different network locations. While, for ease of exposition, we assume every cloud node can host any service function, it is straightforward to extend our model to limit the set of functions available at each cloud node. Two parameters are associated with each function: for the $m$-th function of service $\phi$, we define

• $\xi_\phi^{(m)}$: the scaling factor, i.e., the output data-stream size per unit of input data-stream;

• $r_\phi^{(m)}$: the workload, i.e., the computation resource required to process one unit of input data-stream.

In addition, the input and output data-streams of the $m$-th function are referred to as the stage-$m$ and stage-$(m+1)$ data-streams of the service, respectively.

A.8.2 Constructing the Layered Graph

Denote the topology of the actual network by $G = (\mathcal{V}, \mathcal{E})$. For a particular service $\phi$ (consisting of $M_\phi - 1$ functions), the layered graph $G^{(\phi)}$ is constructed by the following steps:

1) make $M_\phi$ copies of the actual network, indexed as layers $1, \cdots, M_\phi$ from top to bottom; specifically, node $i \in \mathcal{V}$ on the $m$-th layer is denoted by $i_m$;

2) add directed links connecting corresponding nodes between adjacent layers, i.e., $(i_m, i_{m+1})$ for all $i \in \mathcal{V}$.
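The two construction steps above can be sketched directly in code. This is a minimal illustration assuming a simple list-based graph representation; the node labels and example topology are our own and not taken from the dissertation.

```python
def build_layered_graph(nodes, edges, M):
    """Build the layered graph of a service with M stages:
    M copies of the network, transmission edges (i_m, j_m) within each
    layer, and processing edges (i_m, i_{m+1}) between adjacent layers."""
    V_phi = [(i, m) for i in nodes for m in range(1, M + 1)]
    # Processing edges: one per node per pair of adjacent layers.
    E_pr = [((i, m), (i, m + 1)) for i in nodes for m in range(1, M)]
    # Transmission edges: one copy of each physical link per layer.
    E_tr = [((i, m), (j, m)) for (i, j) in edges for m in range(1, M + 1)]
    return V_phi, E_pr, E_tr

# 4-node example in the spirit of Fig. A.1 (topology is illustrative only).
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")]
V_phi, E_pr, E_tr = build_layered_graph(nodes, edges, M=2)
# Expected sizes: |V_phi| = M*|V|, |E_pr| = (M-1)*|V|, |E_tr| = M*|E|.
```

For a one-function service ($M_\phi = 2$), the edge $((B,1),(B,2))$ plays the role of processing at node B, and $((A,1),(B,1))$ the role of transmitting a stage-1 packet over link (A, B), matching the interpretation given below.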
To sum up, the layered graph G^{(ϕ)} = (V^{(ϕ)}, E^{(ϕ)}), with E^{(ϕ)} = {E^{(ϕ)}_{pr,i} : i ∈ V} ∪ {E^{(ϕ)}_{tr,ij} : (i,j) ∈ E}, is defined as

    V^{(ϕ)} = { i_m : i ∈ V, 1 ≤ m ≤ M_ϕ }                         (A.52a)
    E^{(ϕ)}_{pr,i} = { (i_m, i_{m+1}) : 1 ≤ m ≤ M_ϕ − 1 }          (A.52b)
    E^{(ϕ)}_{tr,ij} = { (i_m, j_m) : (i,j) ∈ E, 1 ≤ m ≤ M_ϕ }.     (A.52c)

Each layer m in G^{(ϕ)} only deals with packets of the specific stage m. The edges in E^{(ϕ)}_{pr,i} and E^{(ϕ)}_{tr,ij} indicate the processing and transmission operations in the actual network, respectively. More concretely, the flow on (i_m, i_{m+1}) denotes the processing of stage-m packets by function m at node i, while (i_m, j_m) denotes the transmission of stage-m packets through the link (i,j). In addition, d^{(ϕ)}_{M_ϕ} is the only destination node (since only stage-M_ϕ packets can be consumed), where d^{(ϕ)} is the destination in the actual network.

Figure A.1: An example of delivering a packet, requiring one processing step, over a 4-node network ((a) physical network) and its associated layered graph ((b) layered graph). The stage-1 packet arrives at the source node A and is transmitted to node B (along the red path), where it also gets processed. The produced stage-2 packet is then transmitted to the destination node D (along the pink path). We use red and pink bars to represent queues for packets at different stages, and depict the packet trajectory over the queuing system using green arrows.

We define two parameters (ζ^{(ϕ)}_{ıȷ}, ρ^{(ϕ)}_{ıȷ}) for each link (ı,ȷ) in the layered graph:

    (ζ^{(ϕ)}_{ıȷ}, ρ^{(ϕ)}_{ıȷ}) = (ξ^{(m)}_ϕ, r^{(m)}_ϕ)  if (ı,ȷ) = (i_m, i_{m+1}),
                                    (1, 1)                  if (ı,ȷ) = (i_m, j_m).      (A.53)

The two parameters can be interpreted as the generalized scaling factor and workload; specifically, for transmission edges (the second case), ζ^{(ϕ)}_{ıȷ} = 1 since flow is neither expanded nor compressed by the transmission operation, and ρ^{(ϕ)}_{ıȷ} = 1 since the flow and the transmission capacity are quantified on the same basis.
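The construction in (A.52), together with the edge parameters of (A.53), is mechanical and can be sketched in a few lines of Python. The snippet below is an illustrative sketch only: the tuple encoding `(i, m)` for layered-graph nodes, the function and variable names, and the 4-node example topology are our own choices, not notation from the text.

```python
def build_layered_graph(V, E, M, xi, r):
    """Build the layered graph of (A.52) for a service with M - 1 functions.

    V: iterable of physical nodes; E: iterable of directed links (i, j).
    xi[m-1], r[m-1]: scaling factor and workload of function m.
    Layered-graph nodes are encoded as tuples (i, m), m = 1..M.
    Returns (nodes, edges), where edges maps (u, v) -> (zeta, rho) per (A.53).
    """
    nodes = {(i, m) for i in V for m in range(1, M + 1)}
    edges = {}
    # Processing edges (i_m, i_{m+1}): function m applied at node i.
    for i in V:
        for m in range(1, M):
            edges[((i, m), (i, m + 1))] = (xi[m - 1], r[m - 1])
    # Transmission edges (i_m, j_m): stage-m packets sent over link (i, j).
    for (i, j) in E:
        for m in range(1, M + 1):
            edges[((i, m), (j, m))] = (1, 1)
    return nodes, edges

# A 4-node example with one function (M = 2); this link set is assumed
# for illustration only.
V = ["A", "B", "C", "D"]
E = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")]
nodes, edges = build_layered_graph(V, E, M=2, xi=[1.5], r=[2.0])
```

Note that the sizes come out as |V^{(ϕ)}| = |V| M_ϕ, |E^{(ϕ)}_{pr}| = |V|(M_ϕ − 1), and |E^{(ϕ)}_{tr}| = |E| M_ϕ, matching (A.52).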
A.8.2.1 Interpretation An example is presented in Fig. A.1. In the physical network, two queues are created at each node for packets of different stages. A transmitted packet moves between queues of the same stage (but) at different locations, while a processed packet moves between queues of different stages at the same node. In the associated layered graph, we create two layers to deal with stage 1 and 2 packets, respectively, and only one queue is created at each node, which accommodates packets of the corresponding stage. Packets traversing 162 nodes in layerm are stagem packets transmitted over the corresponding links in the physical network, and packets crossing from layerm tom+1 are stagem packets processed at the corresponding location to create stagem+1 packets. For example,(A 1 ,B 1 ) indicates that the stage1 packet is transmitted over link(A,B), and(B 1 ,B 2 ) indicates that the stage1 packet is processed at nodeB into a stage2 packet. A.8.3 RelevantQuantities The flow variable x (ϕ,l ) ıȷ (t) is defined for link (ı,ȷ) in the layered graph, which is the amount of packets sent to the corresponding interface. By this definition, for ∀ϕ ∈Φ andı∈G (ϕ ) , the queuing dynamics are modified as Q (ϕ,l ) ı (t+1)=Q (ϕ,l +1) ı (t)− x (ϕ,l +1) ı→ (t)+ X ȷ∈δ − ı ζ (ϕ ) ȷı x (ϕ,l +1) ȷı (t)+a (ϕ,l ) ı (t), (A.54) where the proposed framework takes the arrival of intermediate stage-packets into account. The causality constraint can be derived as n x (ϕ, ≥ l) ı→ (t) o ≤ X ȷ∈δ − ı n ζ (ϕ ) ȷı x (ϕ, ≥ l+1) ȷı (t) o +λ (ϕ, ≥ l) ı . 
(A.55) The capacity constraint is given by X ϕ ∈Φ X (ı,ȷ)∈E (ϕ ) pr,i X l∈L ρ ıȷ x (ϕ,l ) ıȷ (t)≤ C i (A.56a) X ϕ ∈Φ X (ı,ȷ)∈E (ϕ ) tr,ij X l∈L ρ ıȷ x (ϕ,l ) ıȷ (t)≤ C ij , (A.56b) 163 and the corresponding operational cost is h(t)= X ϕ ∈Φ X (ı,ȷ)∈E (ϕ ) e ıȷ X l∈L ρ ıȷ x (ϕ,l ) ıȷ (t) (A.57) where e imi m+1 = e i and e imjm = e ij , with C i and e i denoting the computation capacity and the corre- sponding cost at each network locationi, respectively. Finally, the reliability constraint is given by 1 Ξ (M ϕ ) ϕ X ı∈δ − d n E n ζ (ϕ ) ıd x (ϕ ) ıd (t) oo ≥ γ (ϕ ) ∥λ (ϕ ) ∥ 1 (A.58) where the overall scaling factor (for stagem packet of serviceϕ ) is defined as Ξ (m) ϕ = m− 1 Y s=1 ξ (m) ϕ , andΞ (1) ϕ =1 (A.59) and we abused = d (ϕ ) M ϕ for the simplicity of notation. Out of consideration for fairness, in this paper, we calculate the throughput on the basis of input flow size, i.e., the throughput can be interpreted as the rate of served requests, which is defined as the left-hand-side of (A.58). A.8.4 ModificationsofAlgorithm The major difference lies in deriving the solution to the virtual network P 2 . The modified constraints (A.58) and (A.55) lead to the following definition of the virtual queues U (ϕ ) d (t+1)=max n 0, U (ϕ ) d (t)+Ξ (M ϕ ) ϕ γ (ϕ ) A (ϕ ) (t)− X ı∈δ − d ζ (ϕ ) ıd x (ϕ ) ıd (t) o , (A.60) U (ϕ,l ) ı (t+1)=max n 0, U (ϕ,l ) ı (t)− a (ϕ, ≥ l) ı (t)+x (ϕ, ≥ l) ı→ (t)− X ȷ∈δ − ı ζ (ϕ ) ȷı x (ϕ, ≥ l+1) ȷı (t) o . (A.61) 164 The Lyapunov function is defined as L(U(t))= 1 2 ∥Σ U(t)∥ 2 2 (A.62) whereΣ =diag n β (ϕ ) d ,β (ϕ,l ) im o is a diagonal matrix with β (ϕ ) d = 1 Ξ (M ϕ ) ϕ , β (ϕ,l ) im =β (ϕ ) im = 1 Ξ (m) ϕ . (A.63) The reason to define the coefficients β as above is the following. For any service ϕ , the virtual queue U (ϕ,l ) im (t) deals with packets of stagem (and thus the virtual queues are of different scale); we define the coefficients to normalize the virtual queues (to the basis of the input flow size). 
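The normalization coefficients β of (A.63) are just reciprocals of the cumulative scaling factors of (A.59) (the product in (A.59) runs over the factors ξ^{(s)}_ϕ for s < m). A minimal sketch, with names of our own choosing:

```python
def cumulative_scaling(xi):
    """Cumulative scaling factors of (A.59): Xi[m-1] = prod_{s < m} xi[s-1],
    with Xi[0] = 1 for the input (stage-1) stream."""
    Xi = [1.0]
    for x in xi:
        Xi.append(Xi[-1] * x)
    return Xi  # one entry per stage m = 1, ..., M

# Example: the first function doubles the stream, the second halves it.
Xi = cumulative_scaling([2.0, 0.5])
beta = [1.0 / x for x in Xi]  # queue-normalization weights, as in (A.63)
```

In this example the stage-2 queue counts twice as much data per input unit, so its weight β is halved, which is exactly the rebalancing described above.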
The LDP in this case is given by (A.64) (where ˜ a is defined in (2.19)) ∆( t)+Vh(t) ≤ B+V X ϕ, (ı,ȷ),l e ıȷ ρ (ϕ ) ıȷ x (ϕ,l ) ıȷ (t)−⟨ ˜ a,Σ U(t)⟩ − X ϕ β (ϕ ) d U (ϕ ) d (t) X ı∈δ − d ζ (ϕ ) ıd x (ϕ ) ıd (t)− X ϕ,ı,l β (ϕ ) ı U (ϕ,l ) ı (t) h X ȷ∈δ − ı ζ (ϕ ) ȷı x (ϕ, ≥ l+1) ȷı (t)− x (ϕ, ≥ l) ı→ (t) i =B−⟨ ˜ a,Σ U(t)⟩− X ϕ, (ı,ȷ),l ρ (ϕ ) ıȷ x (ϕ,l ) ıȷ (t) | {z } ˜ x (ϕ,l ) ıȷ (t) × " − Ve ıȷ − β (ϕ ) ı U (ϕ, ≤ l) ı (t) ρ (ϕ ) ıȷ + ζ (ϕ ) ıȷ β (ϕ ) ȷ ρ (ϕ ) ıȷ U (ϕ ) d (t) ȷ=d (ϕ ) M ϕ U (ϕ, ≤ l− 1) ȷ (t) ȷ∈V (ϕ ) \ d (ϕ ) M ϕ # | {z } w (ϕ,l ) ıȷ (t) . (A.64) 165 As a result, themin LDP problem is equivalent to max X ϕ, (ı,ȷ),l w (ϕ,l ) ıȷ (t)˜ x (ϕ,l ) ıȷ (t) (A.65a) s.t. (A.56), ˜ x(t)⪰ 0. (A.65b) The solution to the problem is in the max-weight fashion. For each transmission link (i,j), we serve the packets of optimal commodity ϕ , (ı,ȷ) ∈ E (ϕ ) tr,ij and l (with the maximum, positive weight) with all the available resourceC ij . The processing decisions are made in the same way (note that the solution to the problem ˜ x represents the allocated resource, and the scheduled computation flow equals to the result divided by the corresponding workload parameter). 166 AppendixB ProofsinChapter3 B.1 ConditionsonReplicationOperation Conditiona) andb) are justified in the body text. Duplication: For any general policyP 1 , which can create (and operate on) multiple copies at each time slot, we construct a policyP 2 as follows: for each multicast packet,P 2 performs replication and transmits the resulting packets along the same paths as inP 1 , but creating two copies at each time slot and spending multiple slots to complete a replication producing multiple copies underP 1 . In the following, we show that the transmission rate of each link underP 2 is the same as that ofP 1 . Consider the single commodity setting, and denote byς a possible route to deliver the multicast packet. 
Let f (ς) ij be the average rate of packets traversing link (i,j) that are associated with route ς underP 1 , which satisfies P ς f (ς) ij ≤ C ij , and we divide the link capacity into separate components for each route, i.e.,{C (ς) ij =f (ς) ij :∀ς}. UnderP 2 , we focus on the transmission of packets associated with routeς, in the network where the capacity of link(i,j) isC (ς) ij . Since each packet is transmitted along the same path, it leads to an identical arrival ratef (ς) ij to every link(i,j), which can be supported by the capacityC (ς) ij , and therefore the queue collecting packets waiting to cross the link (i,j) and associated with route ς is stable [58, Theorem 2.4]. This result holds for each routeς and link(i,j), concluding the proof. 167 Non-consolidation: Assume that a general policyP 1 consolidates two copies (of status q 1 and q 2 ) at node i, and we construct a policyP 2 to eliminate this operation as follows. Suppose underP 1 , the ancestor packet splits into two copies of status(q 1 +s 1 ) and(q 2 +s 2 ) at nodej, sent along different routes, wheres 1 ands 2 denote the possible replication operations after the two packets leave nodej and before they rejoin at nodei. Then inP 2 , we can create the two copies with status(q 1 +q 2 +s 1 ) ands 2 at node j, and send them along the same routes. The system state remains unchanged afterP 1 consolidates the two packets, while it saves the network resources to transmit packetq 2 from where it is created to nodei. Therefore,P 2 can achieve the same performance asP 1 , if not better. B.2 MulticastNetworkStabilityRegion B.2.1 ProofofTheorem2 B.2.1.1 Necessity We need to show: for∀λ ∈Λ , there existf andβ satisfying (3.4). Consider the cost-optimal policy under the arrival vector λ . 
Let X (q,s) ij (t) be the number of pack- ets successfully delivered by time slot t (to all destinations), that underwent the (forwarding and dupli- cation) operation with forwarding choice (i,j) and duplication choice (q,s), during the delivery. Define lim t→∞ X (q,s) ij (t) t≜f (q,s) ij ≥ 0 (non-negativity), and we obtain: X j∈δ − i X s∈2 ¯q X (q+s,q) ji (t)+ X j∈δ + i X s∈2 ¯q X (q+s,s) ij (t)+ t X τ =1 a (q) i (τ )≤ X j∈δ + i X s∈2 q X (q,s) ij (t) (flow conservation) X (q,s) X (q,s) ij (t)≤ tC ij (link capacity); X (q,s) d k j (t)=0, ∀q :q k =1 (boundary) where the flow conservation constraint is w.r.t. status q packets, and it is an inequality because not all arrival packets (last term) are delivered by timet. Divide byt and we obtain (3.4). 168 B.2.1.2 Sufficiency We need to show: if there existf,β ,λ satisfying (3.4), thenλ ∈Λ . Consider the stationary randomized policy defined in Theorem 2 (using β (q,s) ij ≜ f (q,s) ij /C ij ), and denote byµ ∗ (q,s) ij (t) the associated decisions; then,E µ ∗ (q,s) ij (t) = f (q,s) ij C ij C ij = f (q,s) ij . Substitute it to (3.4a), and we obtain E n µ ∗ (q) i→ (t)− µ ∗ (q) →i (t)− λ (q) i o ≥ 0 ⇐⇒ ∃ϵ ≥ 0:E n µ ∗ (q) →i (t)+λ (q) i − µ ∗ (q) i→ (t) o ≤− ϵ (B.1) whereµ ∗ (q) i→ (t) andµ ∗ (q) →i (t) are defined in (3.6). Furthermore, in (B.13b) (in Appendix B.3), we show E n ∆ Q (q) i (t) o ≤ B+E n µ ∗ (q) →i (t)+λ (q) i − µ ∗ (q) i→ (t) Q (q) i (t) o ≤ B− ϵ E n Q (q) i (t) o , (B.2) which implies the stability ofQ (q) i (t) [58, Section 3.1.4], and thusλ ∈Λ . B.2.2 ProofofProposition5 B.2.2.1 UnicastStabilityRegion The unicast stability region is characterized in [33, Theorem 1], and we rephrase it as follows. 
An arrival vector λ is within the stability region Λ_0 if and only if there exist flow variables ˆf = {ˆf^{(k)}_{ij} ≥ 0} and probability values ˆβ = {ˆβ^{(k)}_{ij} ≥ 0 : Σ^D_{k=1} ˆβ^{(k)}_{ij} ≤ 1} such that:

    ˆf^{(k)}_{→i} + λ_i ≤ ˆf^{(k)}_{i→},    ∀ i, k,        (B.3a)
    ˆf^{(k)}_{ij} ≤ ˆβ^{(k)}_{ij} C_{ij},   ∀ (i,j), k,    (B.3b)
    ˆf^{(k)}_{d_k→} = 0,                    ∀ k,           (B.3c)

where ˆf^{(k)}_{→i} = Σ_{j∈δ^−_i} ˆf^{(k)}_{ji} and ˆf^{(k)}_{i→} = Σ_{j∈δ^+_i} ˆf^{(k)}_{ij} (throughout this section, we use the subscripts "→i" and "i→" to denote the "Σ_{j∈δ^−_i}" and "Σ_{j∈δ^+_i}" operations on the corresponding quantities).

B.2.2.2 Proof for Λ_0 ⊂ Λ

We aim to show: for any λ, ˆf, ˆβ satisfying (B.3), the flow variable f defined as follows, together with the probability values β^{(q,s)}_{ij} = f^{(q,s)}_{ij}/C_{ij} and λ, satisfies (3.4):

    f^{(q,s)}_{ij} = Σ^D_{k=1} (ˆf^{(k)}_{ij}/ˆf^{(k)}_{i→}) [ λ_i I{(q,s) = (1 − b_1 − ··· − b_{k−1}, b_k)} + ˆf^{(k)}_{→i} I{(q,s) = (b_k, b_k)} ].

The flow variable defined above describes the unicast approach that "creates one copy for each destination of a multicast packet upon arrival, and treats them as individual unicast packets", thus satisfying (3.4a). For any link (i,j), the associated transmission rate is given by

    f_{ij} = Σ_{(q,s)∈Ω} f^{(q,s)}_{ij} = Σ^D_{k=1} (ˆf^{(k)}_{ij}/ˆf^{(k)}_{i→}) (λ_i + ˆf^{(k)}_{→i}) = Σ^D_{k=1} ˆf^{(k)}_{ij} = ˆf_{ij},      (B.4)

thus satisfying (3.4b). In addition, this also indicates that the policy attains a cost performance of h⋆_0(λ). Therefore, the optimal cost under the multicast framework satisfies h⋆(λ) ≤ h⋆_0(λ).

B.2.2.3 Proof for Λ ⊂ DΛ_0

Next, we show that: for any λ, f, β satisfying (3.4), the following flow variable ˆf, together with the probability values ˆβ^{(k)}_{ij} = ˆf^{(k)}_{ij}/C_{ij} and λ′ = λ/D, satisfies (B.3):

    ˆf^{(k)}_{ij} = (1/D) Σ_{(q′,s′)∈S_1} f^{(q′,s′)}_{ij},      (B.5)

in which

    S_1 = {(q+s, q) : q_k = 1, s ∈ 2^{q̄}},      (B.6a)
    S_2 = {(q, s) : q_k = 1, s ∈ 2^q},           (B.6b)
    S_3 = {(q+s, s) : q_k = 1, s ∈ 2^{q̄}}.
(B.6c) To verify the unicast flow conservation law (B.3a), we note that: ˆ f (k) →i +λ ′ i = X (q ′ ,s ′ )∈S 1 f (q ′ ,s ′ ) →i D + λ i D (a) ≤ X (q ′ ,s ′ )∈S 2 f (q ′ ,s ′ ) i→ D − X (q ′ ,s ′ )∈S 3 f (q ′ ,s ′ ) i→ D (b) = X (q ′ ,s ′ )∈S 1 f (q ′ ,s ′ ) i→ D = ˆ f (k) i→ where (a) is obtained by first summing up (3.4a) over {q :q k =1} X {q:q k =1} h X s∈2 ¯q f (q+s,q) →i + X s∈2 ¯q f (q+s,s) i→ +λ (q) i i ≤ X {q:q k =1} X s∈2 q f (q,s) i→ , (B.7) and then plugging in the definition (B.6): X (q ′ ,s ′ )∈S 1 f (q ′ ,s ′ ) →i + X (q ′ ,s ′ )∈S 3 f (q ′ ,s ′ ) i→ +λ i ≤ X (q ′ ,s ′ )∈S 2 f (q ′ ,s ′ ) i→ , (B.8) and (b) results fromS 1 =S 2 \S 3 as proved in Lemma 2. For any link(i,j), the associated transmission rate is given by D X k=1 ˆ f (k) ij = 1 D D X k=1 X (q ′ ,s ′ )∈S 1 f (q ′ ,s ′ ) ij (c) ≤ 1 D D X k=1 X (q,s)∈Ω f (q,s) ij = X (q,s)∈Ω f (q,s) ij ≤ C ij (B.9) where (c) is becauseS 1 ⊂ Ω , and thus (B.3b) is verified. Lemma2. S 1 =S 2 \S 3 . 171 Proof. We show the following results: (i)S 1 ∩S 3 =∅, and (ii)S 13 ≜S 1 ∪S 3 =S 2 . To show (i), note that s k = 1 if (q,s) ∈ S 1 , while s k = 0 if (q,s) ∈ S 3 , and thusS 1 ∩S 3 = ∅. We prove (ii) in two steps: a) First, we showS 13 ⊆S 2 . Take any(q,s)∈S 13 , • If(q,s)∈S 1 , i.e.,q =q ′ +s ′ ,s=q ′ withq ′ k =1, which satisfies q k =1 ands∈2 q , and thus (q,s)∈S 2 . • If(q,s)∈S 3 , i.e.,q =q ′ +s ′ ,s=s ′ withq ′ k =1, which satisfies q k =1 ands∈2 q , and thus (q,s)∈S 2 . b) Next, we showS 2 ⊆S 13 . Take any(q,s)∈S 2 (and by definition q k =1), • Ifs k =1, then we can represent(q,s) as(q ′ +s ′ ,q ′ ) withq ′ =s ands ′ =q− s, which satisfy q ′ k =s k =1 ands ′ =q− s∈2 ¯s =2 q ′ , and thus(q,s)∈S 1 ⊂S 13 . • Ifs k =0, then we can represent(q,s) as(q ′ +s ′ ,s ′ ) withq ′ =q− s ands ′ =s, which satisfy q ′ k =q k − s k =1 ands ′ =s∈2 q− s =2 q ′ , and thus(q,s)∈S 3 ⊂S 13 . Combining (i) and (ii) leads toS 1 =S 2 \S 3 , concluding the proof. 
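Lemma 2 is also easy to check exhaustively for small D by encoding each status q ∈ {0,1}^D as a D-bit mask. The brute-force check below is our own illustration, not part of the proof; 2^q is taken as all sub-statuses of q, including 0, matching the s_k = 0 case in the proof.

```python
def lemma2_sets(D, k):
    """Enumerate S1, S2, S3 of (B.6) for destination k (0-indexed bit),
    with statuses encoded as D-bit masks."""
    bit = 1 << k
    full = range(1 << D)
    # S1: (q + s, q), q_k = 1, s a sub-status of the complement of q.
    S1 = {(q | s, q) for q in full if q & bit for s in full if s & q == 0}
    # S2: (q, s), q_k = 1, s a sub-status of q.
    S2 = {(q, s) for q in full if q & bit for s in full if s & q == s}
    # S3: (q + s, s), q_k = 1, s a sub-status of the complement of q.
    S3 = {(q | s, s) for q in full if q & bit for s in full if s & q == 0}
    return S1, S2, S3

for D in (2, 3, 4):
    for k in range(D):
        S1, S2, S3 = lemma2_sets(D, k)
        assert not (S1 & S3)   # (i) S1 and S3 are disjoint
        assert S1 | S3 == S2   # (ii) S1 union S3 equals S2
        assert S1 == S2 - S3   # hence S1 = S2 \ S3
```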
B.3 DerivationofLDPBound Square the queuing dynamics (3.5): Q (q) i (t+1) 2 ≤ Q (q) i (t)− µ (q) i→ (t) 2 + µ (q) →i (t)+a (q) i (t) 2 +2 µ (q) →i (t)+a (q) i (t) Q (q) i (t) (B.10a) = Q (q) i (t) 2 − 2 µ (q) i→ (t)− µ (q) →i (t)− a (q) i (t) Q (q) i (t) (B.10b) + µ (q) i→ (t) 2 + µ (q) →i (t)+a (q) i (t) 2 . (B.10c) 172 We first study the sum of (B.10c) over q∈{0,1} D : X q∈{0,1} D µ (q) i→ (t) 2 = X q∈{0,1} D X s∈2 q X j∈δ + i µ (q,s) ij (t) 2 ≤ X (q,s)∈Ω X j∈δ + i µ (q,s) ij (t) 2 ≤ X j∈δ + i C ij 2 . (B.11) Similarly, we can obtain X q∈{0,1} D µ (q) →i (t)+a (q) i (t) 2 ≤ X j∈δ − i C ji + X j∈δ + i C ij +A i,max 2 . (B.12) Therefore, we can define B =(2|δ max |C max +A i,max ) 2 /2 as a constant bound on the sum of them, where |δ max | is the maximum node degree, C max is the maximum link capacity, and A i,max is the maximum arrival at nodei. The Lyapunov drift of the entire network is given by ∆( Q(t))≜ X i∈V X q∈{0,1} D Q (q) i (t+1) 2 − Q (q) i (t) 2 2 (B.13a) ≤|V| B+ X i∈V X q∈{0,1} D a (q) i (t)Q (q) i (t) (B.13b) − X i∈V X q∈{0,1} D X s∈2 q X j∈δ + i µ (q,s) ij (t)− X s∈2 ¯q X j∈δ − i µ (q+s,q) ji (t)− X s∈2 ¯q X j∈δ + i µ (q+s,s) ij (t) | {z } =µ (q) i→ (t)− µ (q) →i (t) Q (q) i (t) (a) =|V|B+ X i∈V X q∈{0,1} D a (q) i (t)Q (q) i (t) − X (i,j)∈E X (q,s)∈Ω Q (q) i (t)− Q (s) j (t)− Q (q− s) i (t) µ (q,s) ij (t) (B.13c) where we rearrange the order of summations in (a). Combining the above result with the operational cost model (3.3) leads to the weightw (q,s) ij (t) given by (3.10). 
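The bracketed backlog difference in (B.13c) yields a max-weight rule over the O(3^D) duplication choices per link. The sketch below is our own illustration (bitmask-encoded statuses, dictionary-backed queues); we assemble the weight as the backlog difference of (B.13c) minus the V-scaled link cost, which is the usual drift-plus-penalty form, while the authoritative expression is (3.10) in the body text.

```python
def duplication_weight(Q_i, Q_j, q, s, V, e_ij):
    """Differential-backlog weight for serving a status-q packet on link
    (i, j) with duplication choice s (a nonzero sub-status of q): the
    status-s copy is sent to j, the status q - s remainder stays at i;
    cf. the bracketed term in (B.13c). Q_i, Q_j map status bitmasks to
    queue backlogs; missing statuses (including 0) count as empty."""
    assert s != 0 and s & q == s, "s must be a nonzero sub-status of q"
    remainder = q & ~s  # status q - s (0 when s == q, i.e., pure forwarding)
    return (Q_i.get(q, 0.0) - Q_j.get(s, 0.0)
            - Q_i.get(remainder, 0.0) - V * e_ij)

def best_choice(Q_i, Q_j, D, V, e_ij):
    """Scan all duplication choices (q, s) on one link and return the
    maximizer; the link stays idle unless the best weight is positive."""
    best, best_w = None, 0.0
    for q in range(1, 1 << D):
        s = q
        while s:                     # iterate nonzero sub-statuses of q
            w = duplication_weight(Q_i, Q_j, q, s, V, e_ij)
            if w > best_w:
                best, best_w = (q, s), w
            s = (s - 1) & q
    return best, best_w

# Toy example with D = 2 destinations and zero link cost.
Q_i = {0b01: 10.0, 0b11: 8.0}
Q_j = {0b01: 1.0, 0b10: 2.0}
choice, weight = best_choice(Q_i, Q_j, D=2, V=0.0, e_ij=0.0)
```

In the toy example, forwarding the status-(0,1) packet wholesale wins: its weight 10 − 1 = 9 beats every split of the status-(1,1) packet.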
173 B.4 ProofofTheorem3 Using (B.13b), we can derive the following LDP bound: ∆( Q(t))+Vh(t)≤|V| B− X i∈V X q∈{0,1} D µ (q) i→ (t)− µ (q) →i (t)− a (q) i (t) Q (q) i (t)+Vh(t) ≤|V| B− X i∈V X q∈{0,1} D E n µ ∗ (q) i→ (t)− µ ∗ (q) →i (t)− a (q) i (t) Q (q) i (t) o +VE{h ∗ (t)} (a) ≤|V| B− ϵ X i∈V X q∈{0,1} D E n Q (q) i (t) o +Vh ⋆ (λ +ϵ 1) (B.14) whereµ ∗ (t) denotes the decisions associated with the cost-optimal stationary randomized policy under arrival vectorλ +ϵ 1 (defined in Theorem 2). In (a), note that: (i) the decisions are independent with Q (q) i (t), and thus expectation multiplies, (ii)E n µ ∗ (q) i→ (t)− µ ∗ (q) →i (t)− (λ (q) i +ϵ ) o ≥ 0 because the randomized policy stabilizesλ +ϵ 1, (iii) the policy is cost-optimal underλ +ϵ 1. Fix any T > 0, and apply telescope sum [58] to the interval [0,T − 1] (assuming empty queues at t=0). Divide the result byT , pushT →∞, and we obtain: V{E{h(t)}}≤|V| B− ϵ {E{∥Q(t)∥ 1 }}+Vh ⋆ (λ +ϵ 1). (B.15) Based on this inequality, we can derive: (i) Average queue backlog: note that{E{h(t)}}≥ h ⋆ (λ ), and we obtain {E{∥Q(t)∥ 1 }}≤ |V|B ϵ + h ⋆ (λ +ϵ 1)− h ⋆ (λ ) ϵ V. (B.16) (ii) Operational cost: dropping the second term in the right hand side, we obtain {E{h(t)}}≤ h ⋆ (λ +ϵ 1)+ |V|B V {ϵ n}↓0 −−−−−→ {E{h(t)}}≤ h ⋆ (λ )+ |V|B V (B.17) 174 where we take a sequence{ϵ n } ↓ 0 and use the fact that the inequality holds for anyϵ n >0. B.5 AverageQueuingDelay Each duplication treeT is a possible way to accomplish the goal of multicast packet delivery. We divide the physical queueQ (q) (t) = P i∈V Q (q) i (t) into sub-queuesQ (q) T (t) associated with each duplication treeT , andQ (q) (t)= P T∈U(q) Q (q) T (t) whereU(q)={T :q∈T} denotes the set of duplication trees including statusq as a tree node. 
With this model, the average delay can be derived in two steps: (i) calculate the average delay∆ T for each duplication treeT , (ii) calculate the average delay of all trees (weighted byλ T , which is the rate of packets selecting each duplication treeT ): (i) Consider a given duplication treeT , with an associated packet rate of λ T . First, we focus on the delivery of the k-th copy, i.e., the path from the root node1 to the leaf node b k , denoted by ω k (which is composed of all tree nodes in the path). By Little’s Theorem, the average delay of this path is given by ∆ T (k) = P q∈ω k ¯ Q (q) T /λ T , in which ¯ Q (q) T = n E n Q (q) T (t) oo , and we average over all the copies k =1,··· ,D: ∆ T = 1 D D X k=1 ∆ T (k)= 1 D D X k=1 X q∈ω k ¯ Q (q) T λ T (a) = 1 Dλ T X q∈T ∥q∥ 1 ¯ Q (q) T (B.18) where we exchange the order of summations in (a), and use the fact that nodeq in the duplication treeT is included in∥q∥ 1 different paths (e.g., each leaf node b k belongs to one pathω k , the root node1 belongs to all pathsω 1 ,··· ,ω D ). 175 (ii) With the average delay of each duplication tree given by (B.18), the overall average delay can be derived as follows ∆= X T λ T ∥λ ∥ 1 ∆ T = X T λ T ∥λ ∥ 1 h 1 Dλ T X q∈T ∥q∥ 1 ¯ Q (q) T i = 1 D∥λ ∥ 1 X T X q∈T ∥q∥ 1 ¯ Q (q) T (b) = 1 D∥λ ∥ 1 X q∈{0,1} D ∥q∥ 1 X T∈U(q) ¯ Q (q) T = 1 D∥λ ∥ 1 X q∈{0,1} D ∥q∥ 1 ¯ Q (q) = 1 D∥λ ∥ 1 X q∈{0,1} D ∥q∥ 1 E Q (q) (t) (B.19) where we exchange the order of summations in (b). Therefore, the average delay is linear in the queue backlog, and the coefficient of the status q queue is proportional to∥q∥ 1 . B.6 TransformationofAgIServiceDeliveryintoPacketRouting B.6.1 ConstructingtheLayeredGraph Denote the topology of the actual network by G = (V,E). 
For any service ϕ (consisting of M ϕ − 1 functions), the layered graphG (ϕ ) is constructed as follows: (i) make M ϕ copies of the actual network, indexed as layer 1,··· ,M ϕ from top to bottom, and we denote nodei∈V on them-th layer byi m ; (ii) adddirected links connecting corresponding nodes between adjacent layers, i.e.,(i m ,i m+1 ) for∀i∈ V,m=1,··· ,M ϕ − 1. 176 To sum up, the layered graphG (ϕ ) =(V (ϕ ) ,E (ϕ ) ) is defined as V (ϕ ) ={i m :i∈V,1≤ m≤ M ϕ } (B.20a) E (ϕ ) pr,i ={(i m ,i m+1 ):,1≤ m≤ M ϕ − 1} (B.20b) E (ϕ ) tr,ij ={(i m ,j m ):(i,j)∈E,1≤ m≤ M ϕ } (B.20c) withE (ϕ ) ={E (ϕ ) pr,i :i∈V}∪{E (ϕ ) tr,ij :(i,j)∈E}. Physicalinterpretation: Layerm inG (ϕ ) accommodates packets of stagem. The edges inE (ϕ ) pr,i andE (ϕ ) tr,ij indicate the processing and transmission operations in the actual network, respectively. More concretely, the flow on (i m ,i m+1 ) denotes the processing of stagem packets by functionm at nodei, while(i m ,j m ) denotes the transmission of stagem packets over link(i,j). In addition,D (ϕ ) M ϕ denotes the destination set in the graph (since only stageM ϕ packets are consumable), whereD (ϕ ) is the destination set of serviceϕ in the actual network. We define two parameters (ζ (ϕ ) ıȷ ,ρ (ϕ ) ıȷ ) for each link(ı,ȷ) in the layered graph (ζ (ϕ ) ıȷ ,ρ (ϕ ) ıȷ )= (ξ (m) ϕ ,r (m) ϕ ) (ı,ȷ)=(i m ,i m+1 ) (1,1) (ı,ȷ)=(i m ,j m ) . (B.21) The two parameters can be interpreted as the generalized scaling factor and workload: for transmission edges (the second case), ζ (ϕ ) ıȷ = 1 since flow is neither expanded or compressed by the transmission op- eration, and ρ (ϕ ) ıȷ = 1 since the flow and the transmission capability are quantified based on the same unit. 177 B.6.2 RelevantQuantities We define flow variable µ (ϕ,q,s ) ıȷ (t) for link(ı,ȷ) in the layered graph, which is the flow sent to the corre- sponding interface, leading to the received flow by node ȷ given byζ (ϕ ) ıȷ µ (ϕ,q,s ) ıȷ (t). 
By this definition, for ∀ϕ andı∈G (ϕ ) , the queuing dynamics are modified by Q (ϕ,q ) ı (t+1)≤ max Q (ϕ,q ) ı (t)− µ (ϕ,q ) ı→ (t),0 +µ (ϕ,q ) →ı (t)+a (ϕ,q ) ı (t) (B.22) where the incoming and outgoing flows are given by µ (ϕ,q ) ı→ (t)= X ȷ∈δ + ı X s∈2 q µ (ϕ,q,s ) ıȷ (t), (B.23a) µ (ϕ,q ) →ı (t)= X ȷ∈δ − ı X s∈2 ¯q ζ (ϕ ) ȷı µ (ϕ,q +s,q) ȷı (t)+ X ȷ∈δ + ı X s∈2 ¯q µ (ϕ,q +s,s) ıȷ (t). (B.23b) The modified link capacity constraints are X (ı,ȷ)∈E (ϕ ) pr,i X ϕ X (q,s)∈Ω ρ (ϕ ) ıȷ µ (ϕ,q,s ) ıȷ (t)≤ C i , (B.24a) X (ı,ȷ)∈E (ϕ ) tr,ij X ϕ X (q,s)∈Ω ρ (ϕ ) ıȷ µ (ϕ,q,s ) ıȷ (t)≤ C ij , (B.24b) and the modified resource operational cost is h(t)= X (ı,ȷ)∈E (ϕ ) e ıȷ X ϕ X (q,s)∈Ω ρ (ϕ ) ıȷ µ (ϕ,q,s ) ıȷ (t) (B.25) wheree imi m+1 =e i ande imjm =e ij . Similarly, the goal is to minimize the time average cost while stabilizing the modified queues, and we can follow the procedure in Section 3.6.1, i.e., deriving the upper bound for LDP and formulating an 178 optimization problem to minimize the bound, and the derived solution is in the max-weight fashion as shown in Section 3.6.2 and 3.8.1. Remark23. Incontrasttodatatransmission,theprocessingoperationcanexpand/compressdatastreamsize, and queues of expanding data streams can attract more attention in the developed algorithm. To address this problem, we can normalize the queues by ˜ Q (ϕ,q ) ı (t)=Q (ϕ,q ) ı (t) Ξ (m) ϕ ifı=i m , in which Ξ (m) ϕ = m− 1 Y s=1 ξ (m) ϕ , andΞ (1) ϕ =1 (B.26) is interpreted as the cumulative scaling factor till stagem. The resulting design, i.e., optimize the drift of the normalized queues, can achieve a better balance among the queues, while preserving (throughput and cost) optimality. B.7 ComplexityAnalysis B.7.1 GDCNC The number of all duplication choices(q,s) is given by |Ω |= D X k=1 C(k;D)(2 k − 1)=3 D − 2 D ∼O (3 D ) (B.27) whereC(k;D) denotes the combinatorial number ofchoosingk fromD. 
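The count (B.27) can be confirmed by direct enumeration over bitmask-encoded statuses; this sanity check is our own illustration.

```python
def count_choices(D):
    """Count duplication choices (q, s): q a nonzero status in {0,1}^D,
    s a nonzero sub-status of q; statuses encoded as D-bit masks."""
    return sum(1 for q in range(1, 1 << D)
                 for s in range(1, 1 << D) if s & q == s)

# |Omega| = 3^D - 2^D for every D, as in (B.27).
for D in range(1, 7):
    assert count_choices(D) == 3 ** D - 2 ** D
```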
To wit, we divide the elements (q,s) of Ω into D groups based on the first element q: group k (k = 1, ..., D) collects the (q,s) pairs such that q has k entries equal to 1, comprising C(k;D) different q; in addition, for a fixed q, there are |2^q| − 1 = 2^k − 1 different s (other than 0). Therefore, the complexity of GDCNC, which is proportional to |Ω|, is O(3^D).

B.7.2 GDCNC-R

The key problem is to calculate the number of nodes in each duplication tree. We note that each duplication tree includes D − 1 duplication operations (which can be shown by mathematical induction); in addition, each duplication operation involves 3 nodes, i.e., the parent node q and two child nodes s and r, so the total count of node appearances is 3(D − 1). However, every node is counted twice (once as the parent q, the packet to duplicate, and once as a child s or r, the created copy), except for the root node and the D leaf nodes: there is no duplication operation with the root node 1 as a child, or with a leaf node b_k as the parent, so these D + 1 nodes are counted only once. As a result, the number of internal nodes other than the root node is

    [3(D − 1) − (D + 1)] / 2 = D − 2,      (B.28)

and together with the root node, each tree has D − 1 = (D − 2) + 1 internal nodes (each associated with 3 duplication choices) and D leaf nodes (each associated with 1 duplication choice), leading to a complexity of O(D).

B.8 Notes on Generalized Flow Conservation Constraint

In [20, Eq. (4)], the following flow conservation law is presented:

    Σ_{q: q_k=1} [ ˘f^{(q)}_{→i} + λ^{(q)}_i ] = Σ_{q: q_k=1} ˘f^{(q)}_{i→}      (B.29)

where ˘f^{(q)}_{→i} and ˘f^{(q)}_{i→} are the total incoming/outgoing flow rates of status-q packets to/from node i, given by

    ˘f^{(q)}_{→i} = Σ_{s∈2^{q̄}} Σ_{j∈δ^−_i} f^{(q+s,q)}_{ji},   ˘f^{(q)}_{i→} = Σ_{s∈2^{q̄}} Σ_{j∈δ^+_i} f^{(q+s,q)}_{ij}.
(B.30)

Despite the neat form and clear insight of this relationship, we will show that the generalized flow conservation constraint (3.2) is a more fundamental characterization of the in-network packet duplication operation.

First, we show (3.2) ⇒ (B.29). Substitute (B.30) into (B.29); the result to be shown is:

    Σ_{q: q_k=1} [ Σ_{s∈2^{q̄}} Σ_{j∈δ^−_i} f^{(q+s,q)}_{ji} + λ^{(q)}_i ] = Σ_{q: q_k=1} Σ_{s∈2^{q̄}} Σ_{j∈δ^+_i} f^{(q+s,q)}_{ij}.      (B.31)

Sum up (3.4a) over {q : q_k = 1}, and we obtain

    Σ_{q: q_k=1} [ Σ_{s∈2^{q̄}} Σ_{j∈δ^−_i} f^{(q+s,q)}_{ji} + Σ_{s∈2^{q̄}} Σ_{j∈δ^+_i} f^{(q+s,s)}_{ij} + λ^{(q)}_i ] = Σ_{q: q_k=1} Σ_{s∈2^q} Σ_{j∈δ^+_i} f^{(q,s)}_{ij}.      (B.32)

Compare (B.31) and (B.32); it remains to be shown that

    Σ_{j∈δ^+_i} Σ_{(q,s): q_k=1, s∈2^q} f^{(q,s)}_{ij} = Σ_{j∈δ^+_i} Σ_{(q,s): q_k=1, s∈2^{q̄}} [ f^{(q+s,q)}_{ij} + f^{(q+s,s)}_{ij} ],      (B.33)

or equivalently,

    Σ_{j∈δ^+_i} [ Σ_{(q,s)∈S_2} f^{(q,s)}_{ij} − Σ_{(q,s)∈S_1} f^{(q,s)}_{ij} − Σ_{(q,s)∈S_3} f^{(q,s)}_{ij} ] = 0      (B.34)

where S_1, S_2, S_3 are defined in (B.6) and satisfy S_1 = S_2 \ S_3 as shown in Lemma 2. Therefore, each term in the summation (over j ∈ δ^+_i) equals 0.

Next, we present a counterexample to show that (B.29) is not sufficient to guarantee the existence of a feasible policy. Consider D = 3 destinations. Let ˘f^{(q)}_{→i} = 1 for q = 1 = (1,1,1) and q = b_1 = (1,0,0), ˘f^{(q)}_{i→} = 1 for q = (1,1,0) and q = (1,0,1), and let the other flow variables and λ^{(q)}_i be 0. This flow assignment satisfies (B.29). The corresponding operation is described as follows: (i) first "consolidate" the incoming packets of status (1,1,1) and (1,0,0), and (ii) then create (by duplication) outgoing packets of status (1,1,0) and (1,0,1). However, the operation in step (i) cannot be realized in the actual network, because the packets are not of identical content (the status-1 packet is prior to any duplication, so its content differs from that of the status-b_1 packet) and thus cannot be consolidated. In fact, there does not exist a flow variable satisfying (3.2) that describes the above operation.
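The counterexample can be checked mechanically. The snippet below (our own illustration, with statuses encoded as bit tuples) confirms that the stated flow assignment balances the aggregated conservation law (B.29) at node i for every destination k, even though no physically realizable duplication produces it.

```python
# Aggregate incoming/outgoing rates of the D = 3 counterexample.
f_in = {(1, 1, 1): 1, (1, 0, 0): 1}    # breve-f^{(q)}_{-> i}
f_out = {(1, 1, 0): 1, (1, 0, 1): 1}   # breve-f^{(q)}_{i ->}

def balanced(k, f_in, f_out, lam=None):
    """Check (B.29) for destination k: total incoming rate (plus exogenous
    arrivals, zero here) over statuses with q_k = 1 must equal the total
    outgoing rate over the same statuses."""
    lam = lam or {}
    lhs = (sum(r for q, r in f_in.items() if q[k] == 1)
           + sum(r for q, r in lam.items() if q[k] == 1))
    rhs = sum(r for q, r in f_out.items() if q[k] == 1)
    return lhs == rhs

print([balanced(k, f_in, f_out) for k in range(3)])  # [True, True, True]
```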
182 AppendixC ProofsinChapter4 C.1 NecessityofTheorem4 Consider an arrival processa(t)={a (c) (t):∀c} and the policy able to support it, which keeps the actual queues stable. LetA (c,σ ) (t) be the number of requests of clientc that are successfully delivered along ER σ by timet− 1. By the conservation of service requests: Y (c) (t)+ X σ ∈Fc(x) A (c,σ ) (t)= t− 1 X τ =0 a (c) (τ ) (C.1) whereY (c) (t) is the number of service requests of clientc that are not delivered by timet. Divide it byt, lett→∞, and note the following facts: lim t→∞ Y (c) (t) t =0, lim t→∞ 1 t t− 1 X τ =0 a (c) (τ )=λ (c) (C.2) 183 because the queues are stable, and the arrival process is i.i.d. (when the arrival is modeled by Markov- modulated process, the above relationship results from the definition of λ (c) , i.e., time average arrival rate). In addition, define λ (c,σ ) ≜ lim t→∞ A (c,σ ) (t) t , P c (σ )≜ λ (c,σ ) λ (c) , (C.3) and (C.1) becomes X σ ∈Fc(x) P c (σ )=1. (C.4) On the other hand, the total resource consumption at each interfacee (in the rest of the paper, we use interface e to refer to a node i ∈ V or link (i,j) ∈ E, at which packets are scheduled for processing or transmission) cannot exceed the corresponding capacity: X c X σ ∈Fc(x) ρ (c,σ ) e A (c,σ ) (t)≤ C e t. (C.5) Divide it byt, lett→∞: X c X σ ∈Fc(x) ρ (c,σ ) e lim t→∞ A (c,σ ) (t) t = X c X σ ∈Fc(x) ρ (c,σ ) e λ (c,σ ) = X c λ (c) X σ ∈Fc(x) ρ (c,σ ) e P c (σ )≤ C e , (C.6) concluding the proof. C.2 Throughput-optimalityofDI-DCNC We will prove the theorem in two steps, by showing (i) the stability of virtual queues under virtual routing decisions (4.21), and (ii) the stability of actual queues under DI-DCNC (that incorporates ENTO scheduling). 
184 C.2.1 StabilityofVirtualQueues Suppose the arrival vectorλ satisfies (4.10), i.e., there exists probability values P c (σ ) such that for each interfacee: X c λ (c) X σ ∈Fc(x) ρ (c,σ ) e P c (σ )≤ C e , (C.7) or equivalently: there existsϵ ≥ 0, such that C e − X c λ (c) X σ ∈Fc(x) ρ (c,σ ) e P c (σ )≥ ϵC e (C.8) for∀e (note that the condition is valid forϵ =0). Define a reference policy ∗ , which operates as follows: at each time slot, select ER σ to deliver the requests of clientc w.p.P c (σ ). Leta ∗ (c,σ ) (t) denote the associated route selection decision. Then, in the virtual system, the additional resource load imposed on interfacee at timet is given by E{˜ a ∗ e (t)}= X c X σ ∈Fc(x) ρ (c,σ ) e E n a ∗ (c,σ ) (t) o = X c λ (c) X σ ∈Fc(x) ρ (c,σ ) e P c (σ ), (C.9) and thus: C e − E{˜ a ∗ e (t)} C e ≥ ϵ. (C.10) 185 On the other hand, the virtual queue drift can be derived as follows. Take the square of (4.14): ˜ Q e (t+1) 2 ≤ h ˜ Q e (t)− C e +˜ a e (t) i 2 = ˜ Q e (t) 2 +[C e − ˜ a e (t)] 2 − 2[C e − ˜ a e (t)] ˜ Q e (t) ≤ ˜ Q e (t) 2 +2B e − 2[C e − ˜ a e (t)] ˜ Q e (t) (C.11) where B e = 1 2 max ( C e , X c A (c) max max σ ∈Fc(x) ρ (c,σ ) e ) 2 . (C.12) withA (c) max denoting the maximum arrival number. Thus, ∆( ˜ Q e (t))≜ ˜ Q e (t+1) 2 − ˜ Q e (t) 2 2 ≤ B e − [C e − ˜ a e (t)] ˜ Q e (t), (C.13) and the overall drift of normalized queues is bounded by ∆( Q(t))≤ B− X e∈V∪E C e − ˜ a e (t) C e Q e (t) (a) ≤ B− X e∈V∪E C e − ˜ a ∗ e (t) C e Q e (t) (C.14) withB = P e∈V∪E B e /C 2 e , and inequality (a) is due to the route selection decisionminimizing the bound. C.2.1.1 I.I.D.Arrival Take expectation of (C.14), and we obtain: E{∆( Q(t))} (b) ≤ B− X e∈V∪E C e − E{˜ a ∗ e (t)} C e E{Q e (t)}≤ B− ϵ ∥E{Q(t)}∥ 1 . (C.15) 186 In inequality (b), we use the fact that the decisions made by∗ are independent with the queuing states, and thus expectation multiplies. In fact, (C.15) implies that the virtual queues are rate stable [58, Section 3.1.4]. 
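The drift argument can be illustrated numerically: with strictly feasible load, E{ã_e(t)} < C_e, the virtual queue recursion keeps ˜Q_e(t) bounded, so ˜Q_e(t)/t vanishes. The small simulation below is our own illustration; the uniform arrival distribution is an arbitrary stand-in for the aggregate resource load, chosen only so that its mean leaves slack below capacity.

```python
import random

random.seed(0)
C_e = 5.0   # interface capacity per slot
T = 20000
Q = 0.0     # virtual queue, cf. the recursion in (4.14)
for t in range(T):
    a = random.uniform(0.0, 9.0)   # E{a} = 4.5 < C_e, slack eps = 0.5
    Q = max(Q + a - C_e, 0.0)

print(Q / T)  # stays tiny: the queue is O(1), so Q(T)/T -> 0 (rate stability)
```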
C.2.1.2 Markov-ModulatedArrival Under Markov-modulated arrivals, we employ multi-slot drift [58, Section 4.9] to show the stability of virtual queues. Lemma3. The multi-slot drift of interval[t,t+T − 1] ensures: ∆ T (Q(t))≜ ∥Q(t+T − 1)∥ 2 −∥ Q(t)∥ 2 2 ≤ BT 2 − X e∈V∪E Q e (t) t+T− 1 X s=t C e − ˜ a ∗ e (s) C e . (C.16) Proof. For each interfacee, we can verify: ˜ Q e (t+1)≥ ˜ Q e (t)− C e − ˜ a e (t) ≥ ˜ Q e (t)− p 2B e , (C.17) ˜ Q e (t+1)≤ ˜ Q e (t)+ C e − ˜ a e (t) ≤ ˜ Q e (t)+ p 2B e . (C.18) in which we use the fact that C e − ˜ a e (t) ≤ √ 2B e . Iterating the queuing dynamics gives ˜ Q e (t)− p 2B e (s− t)≤ ˜ Q e (s)≤ ˜ Q e (t)+ p 2B e (s− t) (C.19) for anys≥ t, and furthermore (C e − ˜ a ∗ e (s)) ˜ Q e (s)≤ (C e − ˜ a ∗ e (s)) ˜ Q e (t)+ p 2B e (s− t)|C e − ˜ a ∗ e (s)| ≤ (C e − ˜ a ∗ e (s)) ˜ Q e (t)+2B e (s− t), (C.20) 187 therefore, t+T− 1 X s=t (C e − ˜ a ∗ e (s)) ˜ Q e (s)≤ t+T− 1 X s=t (C e − ˜ a ∗ e (s)) ˜ Q e (t)+2B e t+T− 1 X s=t (s− t) = ˜ Q e (t) t+T− 1 X s=t (C e − ˜ a ∗ e (s))+B e T(T − 1). (C.21) Divide the result byC 2 e and substitute it into the multi-slot drift, i.e., the sum of one-slot drifts (C.14), and we obtain: ∆ T (Q(t))= t+T− 1 X s=t ∆( Q(s)) ≤ BT − X e∈V∪E t+T− 1 X s=t C e − ˜ a ∗ e (s) C e Q e (s) ≤ BT 2 − X e∈V∪E Q e (t) t+T− 1 X s=t C e − ˜ a ∗ e (s) C e , (C.22) concluding the proof. Assume the state space (of the underlying Markov process) has a state “0” that we designate as a “renewal” state. Let sequence{t r : r≥ 0} represent the recurrence times to state 0, andT r = t r+1 − t r . By renewal theory, we know thatT r are i.i.d. random variables (and letE{T} andE T 2 denote its first and second moment, respectively), and E ( tr+Tr− 1 X s=tr ˜ a ∗ e (s) Q(t) ) =E ( T− 1 X s=0 ˜ a ∗ e (s) ) = X c E ( T− 1 X s=0 a (c) (s) ) X σ ∈Fc(x) ρ (c,σ ) e P c (σ ) =E{T} X c λ (c) X σ ∈Fc(x) ρ (c,σ ) e P c (σ ). 
(C.23)

Taking the (conditional) expectation of (C.16) and using the above result, we obtain:

\[ \mathbb{E}\{\Delta_{T_r}(Q(t_r)) \mid Q(t_r)\} \le B\,\mathbb{E}\{T^2\} - \mathbb{E}\{T\} \sum_{e\in\mathcal{V}\cup\mathcal{E}} \frac{C_e - \sum_c \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} P_c(\sigma)}{C_e}\, Q_e(t_r) \overset{(c)}{\le} B\,\mathbb{E}\{T^2\} - \mathbb{E}\{T\}\,\epsilon\, \|Q(t_r)\|_1, \tag{C.24} \]

where we plug in (C.8) to derive (c). Taking the expectation w.r.t. $Q(t_r)$, the result implies that the virtual queues are rate stable [58, Theorem 4.12].

C.2.2 Stability of Actual Queues

We will show that the total backlog of the actual queues, i.e.,

\[ R_{\mathrm{tot}}(t) = R'(t) + R(t), \tag{C.25} \]

is rate stable, where

\[ R'(t) = \sum_{i\in\mathcal{V}} R'_i(t), \qquad R(t) = \sum_{i\in\mathcal{V}} R_i(t) + \sum_{(i,j)\in\mathcal{E}} R_{ij}(t) \tag{C.26} \]

denote the backlogs of the waiting queues and the "processing and transmission" queues, respectively.

C.2.2.1 An Equivalent Problem

First, we show that the waiting queue backlog $R'(t)$ is bounded by a linear function of the "processing and transmission" queue backlog $R(t)$, i.e.,

\[ R'(t) \le M_{\max} Z_{\max}\, R(t), \tag{C.27} \]

where $M_{\max} = \max_\phi M_\phi$ and $Z_{\max} = \max_\phi Z_\phi$, with $Z_\phi$ denoting the largest ratio of the sizes of any two packets belonging to service $\phi$. The reason is the following: for any packet held in a waiting queue counted by $R'(t)$, its associates must be in transit and thus collected in some processing/transmission queue counted by $R(t)$. In the coefficient, the factor $M_{\max}$ results from multi-step processing, i.e., for a given request, multiple packets can stay in the waiting queue, and their number is bounded by $M_\phi$ (in which case $M_\phi$ static packets wait for one in-transit live packet); the factor $Z_{\max}$ accounts for the difference in the data sizes of the waiting and in-transit packets. Substituting (C.27) into (C.25), $R_{\mathrm{tot}}(t)$ is also bounded by a linear function of $R(t)$, which implies that $R_{\mathrm{tot}}(t)$ and $R(t)$ have the same stability property.
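Both stability arguments in Appendix C.2.1 reduce to showing $Q(t)/t \to 0$. As a quick numerical illustration of the Markov-modulated case, the following toy simulation uses an assumed two-state chain and uniform arrival sizes (not the dissertation's actual model): the high-rate state temporarily overloads the interface, but the time-average backlog still vanishes because the stationary load stays below capacity.

```python
import random

def simulate_markov_modulated_queue(horizon=50000, seed=1):
    """Two-state Markov chain modulates the arrival rate; the stationary
    mean load 0.5*0.4 + 0.5*1.2 = 0.8 is below capacity 1.0, so the
    queue should be rate stable: Q(t)/t -> 0."""
    rng = random.Random(seed)
    capacity = 1.0
    rates = {0: 0.4, 1: 1.2}   # per-state mean arrival load (assumed)
    stay = 0.9                 # P(remain in current state); stationary dist (1/2, 1/2)
    state, q = 0, 0.0
    for _ in range(horizon):
        if rng.random() > stay:
            state = 1 - state
        a = rng.uniform(0.0, 2.0 * rates[state])
        q = max(q - capacity + a, 0.0)
    return q / horizon

avg = simulate_markov_modulated_queue()
```

Bursts in state 1 (mean load 1.2) build backlog that drains during sojourns in state 0, mirroring the renewal argument.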
Furthermore, we define the processing/transmission resource load process for each node/link as follows:

\[ \tilde{R}_i(t) = \sum_{\psi\in\mathcal{R}_i(t)} r_\psi\, |\psi|, \qquad \tilde{R}_{ij}(t) = R_{ij}(t), \tag{C.28} \]

where $\mathcal{R}_i(t)$ denotes the set of paired packets in the processing queue of node $i$, and $r_\psi$ the workload to process packet $\psi$. Due to the linear relationship between $R(t)$ and $\tilde{R}(t)$, it is equivalent to show that $\tilde{R}(t)$ is rate stable.

C.2.2.2 Stability of $\tilde{R}(t)$

This part of the proof is an extension of [67, Appendix D], which deals with the simplified packet routing setting. Let $\tilde{R}^{(\kappa)}(t)$ denote the resource load incurred by all hop-$\kappa$ packets (which have crossed $\kappa$ hops in the ALG), $\tilde{R}_e(t)$ the resource load at interface $e$, and $\tilde{R}^{(\kappa)}_e(t)$ the resource load at interface $e$ incurred by hop-$\kappa$ packets, respectively.

For an interface $e$ and times $t_0$ and $t$, let $A_e(t_0,t)$ denote the resource load imposed on, and $S_e(t_0,t)$ the resource served at, interface $e$ during $[t_0,t]$. Then [81, Eq. (16)],

\[ A_e(t_0,t) \le S_e(t_0,t) + O(t), \tag{C.29} \]

where $O(t)$ is a non-decreasing function satisfying

\[ \lim_{t\to\infty} O(t)/t = 0. \tag{C.30} \]

Let $\tilde{a}^{(\kappa)}_e(t_0,t)$ denote the additional resource load imposed by requests arriving exogenously during $[t_0,t]$ such that interface $e$ is the $\kappa$-th hop of the associated route. Note that

\[ A_e(t_0,t) \ge \sum_\kappa \tilde{a}^{(\kappa)}_e(t_0,t), \tag{C.31} \]

because $A_e(t_0,t)$ also includes the resource loads incurred by packets that arrive to the network before time $t_0$.

Proposition: For any $\kappa \ge 0$, there exists a non-decreasing function $B^{(\kappa)}(t)$ satisfying

\[ \lim_{t\to\infty} B^{(\kappa)}(t)/t = 0, \tag{C.32} \]

such that $\tilde{R}^{(\kappa)}(t) \le B^{(\kappa)}(t)$.

Proof. We assume empty queues at time $0$ without loss of generality, and show the result by induction.

Base case: Let $t_0$ be the last time at which no hop-$0$ packets are waiting for operation at interface $e$.
Then,

\[ \tilde{R}^{(0)}_e(t) = \tilde{a}^{(0)}_e(t_0,t) - S_e(t_0,t) \le A_e(t_0,t) - S_e(t_0,t) \le O(t), \tag{C.33} \]

and thus

\[ \tilde{R}^{(0)}(t) = \sum_{e\in\mathcal{V}\cup\mathcal{E}} \tilde{R}^{(0)}_e(t) \le (|\mathcal{V}|+|\mathcal{E}|)\, O(t) \triangleq B^{(0)}(t), \tag{C.34} \]

and by assumption (C.30), the Proposition holds.

Induction step: Assume $\tilde{R}^{(j)}(t) \le B^{(j)}(t)$, where $B^{(j)}(t)$ is a non-decreasing function satisfying $\lim_{t\to\infty} B^{(j)}(t)/t = 0$, for $j = 0,\cdots,\kappa-1$. Let $t_0$ be the last time at which no hop-$\kappa$ packets are waiting for operation at interface $e$.

First, we focus on the resource load incurred by hop-$\kappa$ packets arriving at $e$ during $[t_0,t]$, which are either (i) packets that were hop $0,\cdots,\kappa-1$ at time slot $t_0$, or (ii) requests arriving exogenously during $[t_0,t]$ that have $e$ as their $\kappa$-th hop. Therefore, the resource load incurred by hop-$\kappa$ packets at interface $e$ is bounded by

\[ A^{(\kappa)}_e(t_0,t) \le \sum_{j=0}^{\kappa-1} \tilde{\Xi}_{\max} B^{(j)}(t_0) + \tilde{a}^{(\kappa)}_e(t_0,t), \tag{C.35} \]

where $\tilde{\Xi}_{\max}$ denotes the largest ratio of the resource loads of any two edges in the ALG.

Next, we consider the resource allocated to hop-$\kappa$ packet operation, denoted by $S^{(\kappa)}_e(t_0,t)$, which satisfies

\[ S^{(\kappa)}_e(t_0,t) \ge S_e(t_0,t) - \sum_{j=0}^{\kappa-1} \Big[\tilde{\Xi}_{\max} B^{(j)}(t_0) + \tilde{a}^{(j)}_e(t_0,t)\Big], \tag{C.36} \]

where the second term on the right-hand side is the resource consumed by packets of higher priority (i.e., hop-$0,\cdots,\kappa-1$ packets). By definition, the remaining resource load incurred by hop-$\kappa$ packets satisfies

\[ \tilde{R}^{(\kappa)}_e(t) = A^{(\kappa)}_e(t_0,t) - S^{(\kappa)}_e(t_0,t) \le 2\tilde{\Xi}_{\max} \sum_{j=0}^{\kappa-1} B^{(j)}(t_0) + \sum_{j=0}^{\kappa} \tilde{a}^{(j)}_e(t_0,t) - S_e(t_0,t) \overset{(a)}{\le} 2\tilde{\Xi}_{\max} \sum_{j=0}^{\kappa-1} B^{(j)}(t) + A_e(t_0,t) - S_e(t_0,t) \overset{(b)}{\le} 2\tilde{\Xi}_{\max} \sum_{j=0}^{\kappa-1} B^{(j)}(t) + O(t), \tag{C.37} \]

where (a) and (b) result from (C.31) and (C.29), respectively. Thus

\[ \tilde{R}^{(\kappa)}(t) = \sum_{e\in\mathcal{V}\cup\mathcal{E}} \tilde{R}^{(\kappa)}_e(t) \le B^{(\kappa)}(t), \tag{C.38} \]

in which

\[ B^{(\kappa)}(t) \triangleq (|\mathcal{V}|+|\mathcal{E}|)\Big[2\tilde{\Xi}_{\max} \sum_{j=0}^{\kappa-1} B^{(j)}(t) + O(t)\Big] \tag{C.39} \]

satisfies (C.32). Therefore, the Proposition holds for case $\kappa$.
By induction, $\tilde{R}^{(\kappa)}(t)$ is stable for every $\kappa$, and so is their sum $\tilde{R}(t)$, concluding the proof.

C.3 Flow-based Characterization

We will show that an arrival vector $\lambda$ satisfies (4.10) (referred to as the route-based characterization) if and only if it satisfies (4.26) – (4.30) (referred to as the flow-based characterization).

C.3.1 Necessity

First, we show that for any $\lambda$ satisfying (4.10), the flow variables defined as follows (taking the live flows as an example):

\[ f^{(c)}_{i_{m-1} i_m} = \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \frac{w^{(c)}_{i_{m-1} i_m}}{r^{(\phi)}_m}\, \mathbb{1}_{\{(i_{m-1},i_m)\in\sigma\}}\, P_c(\sigma), \tag{C.40a} \]

\[ f^{(c)}_{i_m j_m} = \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} w^{(c)}_{i_m j_m}\, \mathbb{1}_{\{(i_m,j_m)\in\sigma\}}\, P_c(\sigma), \tag{C.40b} \]

satisfy (4.26) – (4.30).

C.3.1.1 Capacity Constraints

First, we verify the capacity constraints. Note that the processing resource consumption at node $i$ is given by

\[ \sum_{c,m} r^{(\phi)}_m f^{(c)}_{i_{m-1} i_m} = \sum_{c,m} \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} r^{(\phi)}_m\, \frac{w^{(c)}_{i_{m-1} i_m}}{r^{(\phi)}_m}\, \mathbb{1}_{\{(i_{m-1},i_m)\in\sigma\}}\, P_c(\sigma) = \sum_c \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \Big[\sum_m w^{(c)}_{i_{m-1} i_m}\, \mathbb{1}_{\{(i_{m-1},i_m)\in\sigma\}}\Big] P_c(\sigma) = \sum_c \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \rho_i^{(c,\sigma)} P_c(\sigma) \le C_i, \tag{C.41} \]

which satisfies the processing capacity constraint. The transmission capacity constraint can be verified in a similar way and is thus omitted.

C.3.1.2 Chaining Constraints

Next, we verify the chaining constraints, taking an intermediate node $i_m$ (neither source nor destination) as an example; the goal is to show (4.27), i.e.,

\[ \xi^{(\phi)}_m f^{(c)}_{i_{m-1} i_m} + \sum_{j\in\delta^-(i)} f^{(c)}_{j_m i_m} - f^{(c)}_{i_m i_{m+1}} - \sum_{j\in\delta^+(i)} f^{(c)}_{i_m j_m} = \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} P_c(\sigma) \times \Big[\xi^{(\phi)}_m \frac{w^{(c)}_{i_{m-1} i_m}}{r^{(\phi)}_m}\, \mathbb{1}_{\{(i_{m-1},i_m)\in\sigma\}} + \sum_{j\in\delta^-(i)} w^{(c)}_{j_m i_m}\, \mathbb{1}_{\{(j_m,i_m)\in\sigma\}} - \frac{w^{(c)}_{i_m i_{m+1}}}{r^{(\phi)}_{m+1}}\, \mathbb{1}_{\{(i_m,i_{m+1})\in\sigma\}} - \sum_{j\in\delta^+(i)} w^{(c)}_{i_m j_m}\, \mathbb{1}_{\{(i_m,j_m)\in\sigma\}}\Big] = 0. \tag{C.42} \]

We will show that for every ER $\sigma$, the square bracket equals $0$. For each node $i_m$ in the ALG, one of the following two cases must be true:

1. $i_m \notin \sigma$: all terms are equal to $0$, and (C.42) holds;

2.
$i_m \in \sigma$: the node must have exactly one incoming and one outgoing edge in the live data pipeline (i.e., one positive term and one negative), and by (4.12):

\[ \xi^{(\phi)}_m\, w^{(c)}_{i_{m-1} i_m}/r^{(\phi)}_m = \xi^{(\phi)}_m\, \Xi^{(\phi)}_{m-1} = \Xi^{(\phi)}_m, \qquad w^{(c)}_{i_m i_{m+1}}/r^{(\phi)}_{m+1} = \Xi^{(\phi)}_m, \qquad w^{(c)}_{i_m j_m} = w^{(c)}_{j_m i_m} = \Xi^{(\phi)}_m. \]

Therefore, in either case, the square bracket equals $\Xi^{(\phi)}_m - \Xi^{(\phi)}_m = 0$, and (C.42) holds.

To sum up, (C.42) holds for each intermediate node in the ALG. The conservation of the live flow at the source and destination nodes, as well as the static flow conservation, can be verified in a similar way and is thus omitted.

C.3.2 Sufficiency

Next, we will show that for any flow variables $f, f'$ satisfying (4.26) – (4.30), the probability values (associated with each ER) derived in this section satisfy (4.10).

C.3.2.1 Path-Finding for Stage-$m$ Live Flow

We divide the delivery procedure by (i) live and static pipelines, and (ii) different processing stages. We first focus on the stage-$m \in \{0,\cdots,M_\phi\}$ live flow of client $c$. Given the live flow $f$, we construct a network as follows: the node and edge sets are given by $\mathcal{V}' = \mathcal{V}\cup\{u,v\}$ and $\mathcal{E}' = \mathcal{E}\cup\{(u,i) : i\in\mathcal{V}\}\cup\{(i,v) : i\in\mathcal{V}\}$, where $u$ and $v$ are referred to as the super source and destination nodes, respectively; the capacity of each edge is given by $\tilde{C}_{ui} = \xi^{(\phi)}_m f^{(c)}_{i_{m-1} i_m}$, $\tilde{C}_{iv} = f^{(c)}_{i_m i_{m+1}}$, and $\tilde{C}_{ij} = f^{(c)}_{i_m j_m}$.

For the above network, take $(u,v)$ as the source–destination pair; then the max-flow is given by $\sum_{i\in\mathcal{V}} \tilde{C}_{ui}$, because: on one hand, $\{\tilde{C}_{ij}\}$ (i.e., setting the flow of each edge equal to the corresponding capacity) is a set of feasible flow variables that satisfies the flow conservation and capacity constraints imposed by the linear programming formulation of the Max Flow problem [27, Section 29.2], and thus $\sum_{i\in\mathcal{V}} \tilde{C}_{ui}$ is achievable; on the other hand, the max-flow cannot exceed the total capacity of the outgoing edges of the source node, which also equals $\sum_{i\in\mathcal{V}} \tilde{C}_{ui}$.
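The claim that the max-flow of the constructed network equals $\sum_{i\in\mathcal{V}} \tilde{C}_{ui}$ can be checked on a toy instance. The sketch below implements the shortest-augmenting-path (BFS) scheme the text invokes; the graph, node names, and capacity values are illustrative assumptions, with the super-source edge capacities playing the role of $\tilde{C}_{ui}$.

```python
from collections import deque

def max_flow(cap, s, t):
    """Shortest-augmenting-path max-flow. `cap` maps a directed edge
    (u, v) to its capacity; antiparallel edge pairs are not expected."""
    res = dict(cap)                       # residual capacities
    for (u, v) in cap:
        res.setdefault((v, u), 0.0)       # reverse edges start at 0
    adj = {}
    for (u, v) in res:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
    total = 0.0
    while True:
        parent = {s: None}                # BFS for a shortest s -> t path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        path, v = [], t                   # recover path and its bottleneck
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bott = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= bott
            res[(v, u)] += bott
        total += bott

# Toy stage network: since the interior capacities already form a feasible
# flow, the max-flow should equal the super-source capacity 0.3 + 0.4 = 0.7.
cap = {
    ("u", "a"): 0.3, ("u", "b"): 0.4,   # super-source edges (C~_ui)
    ("a", "v"): 0.3, ("b", "v"): 0.4,   # super-destination edges (C~_iv)
}
f = max_flow(cap, "u", "v")
```

Here the algorithm saturates both super-source edges, matching the argument that $\{\tilde{C}_{ij}\}$ itself is a feasible flow achieving the upper bound.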
By running a standard max-flow algorithm, we can find a set of paths (from $u$ to $v$) achieving the max-flow, e.g., the Edmonds–Karp algorithm, which finds the shortest augmenting path [27, Section 26.2] in each iteration (and is thus acyclic). We denote a path found by the algorithm by $p$, with associated rate (i.e., the bottleneck of the augmenting path in the algorithm) denoted by $\beta^{(p)}_1$. Let $p_{\mathrm{src}}$ and $p_{\mathrm{des}}$ denote the source and destination nodes of path $p$ (excluding nodes $u$ and $v$), respectively. We can find paths for the static flows by the same procedure, except that we use the super static source $o'_m$ instead of the super source $u$. A found path is denoted by $q$, with an associated rate of $\beta^{(q)}_2$.

Figure C.1: Composition of live and static packets' paths into ERs. The segment $p_i + q_j$ denotes the ER that combines paths $p_i$ and $q_j$ for the live and static packets, and its length represents the associated probability.

C.3.2.2 Composition of Individual Paths

Next, we compose the separate paths into ERs and define the associated probability values. First, we compose the paths of the live and static flows of the same stage. For each node $i$, denote the sets of all incoming live and static paths by

\[ \varsigma^{(m,i)}_1 = \big\{p : p_{\mathrm{des}} = i\big\}, \qquad \varsigma^{(m,i)}_2 = \big\{q : q_{\mathrm{des}} = i\big\}, \tag{C.43} \]

and the percentage of each path by

\[ \tilde{\beta}^{(p)}_1 = \frac{\beta^{(p)}_1}{\sum_{p'\in\varsigma^{(m,i)}_1} \beta^{(p')}_1}, \qquad \tilde{\beta}^{(q)}_2 = \frac{\beta^{(q)}_2}{\sum_{q'\in\varsigma^{(m,i)}_2} \beta^{(q')}_2}. \tag{C.44} \]

As illustrated in Fig. C.1, we align the percentiles (i.e., cdfs) of the live and static paths, resulting in $|\varsigma^{(m,i)}_1| + |\varsigma^{(m,i)}_2| - 1$ non-overlapping segments. Each gray segment denotes a route composing the corresponding live and static paths. The route and its percentage (i.e., the length of the segment) are denoted by $z^{(m)}$ and $\tilde{\beta}^{(m)}$, and we define $z^{(m)}_{\mathrm{src}} = p^{(m)}_{\mathrm{src}}$ and $z^{(m)}_{\mathrm{des}} = p^{(m)}_{\mathrm{des}}$, where $p^{(m)}$ is the path of the live flow component of $z^{(m)}$.
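The cdf-alignment step of Fig. C.1 can be sketched directly: sweep through both path sets in parallel, emitting one segment per overlap of their probability intervals. The path names and weights below are illustrative (chosen to mimic the figure's structure), not values from the dissertation.

```python
def compose(paths_a, paths_b):
    """Align the cumulative distributions of two path sets: each output
    segment pairs one path from each set, with length equal to the
    overlap of their probability intervals (cf. Fig. C.1)."""
    segments = []
    ia = ib = 0
    ca = cb = 0.0                 # probability mass consumed of current paths
    while ia < len(paths_a) and ib < len(paths_b):
        na, wa = paths_a[ia]
        nb, wb = paths_b[ib]
        step = min(wa - ca, wb - cb)
        segments.append((na, nb, step))
        ca += step
        cb += step
        if abs(ca - wa) < 1e-12:  # current live path fully consumed
            ia += 1
            ca = 0.0
        if abs(cb - wb) < 1e-12:  # current static path fully consumed
            ib += 1
            cb = 0.0
    return segments

# Illustrative normalized rates (beta-tilde values):
live = [("p1", 0.4), ("p2", 0.6)]
static = [("q1", 0.5), ("q2", 0.3), ("q3", 0.2)]
ers = compose(live, static)
```

With 2 live and 3 static paths whose boundaries do not coincide, this yields $2 + 3 - 1 = 4$ segments whose lengths sum to 1, as claimed in the text.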
Then, we compose the routes of different stages, say stages $m$ and $m+1$. For each node $i$, denote the sets of all stage-$m$ incoming routes and stage-$(m+1)$ outgoing routes by

\[ \varsigma^{(m,\to i)} = \big\{z^{(m)} : z^{(m)}_{\mathrm{des}} = i\big\}, \qquad \varsigma^{(m+1,i\to)} = \big\{z^{(m+1)} : z^{(m+1)}_{\mathrm{src}} = i\big\}. \tag{C.45} \]

Similar to the procedure illustrated in Fig. C.1, we align the percentiles of the stage-$m$ and stage-$(m+1)$ routes, and each resulting segment represents a composed route spanning stages $m$ to $m+1$. Treating the result as an aggregated stage, we repeat the above procedure to compose the routes of all stages. For example, we can treat the routes composing stages $m$ and $m+1$ as a single stage and compose them with the stage-$(m-1)$ routes (to generate routes spanning stages $m-1$ to $m+1$). The results are a set of ERs $\{\sigma\}$ and the associated probability values $\{P_c(\sigma)\}$, which equal the lengths of the corresponding segments.

It is clear that no additional resource load is imposed on any node/link, and thus the incurred resource consumption does not violate the capacity constraints (4.26). Therefore, the probability values defined above satisfy (4.10).

C.4 Stability Region with Dynamic Replacement

C.4.1 Necessity

Consider an arrival process $a(t) = \{a^{(c)}(t) : \forall c\}$ and a policy able to support it, i.e., one that keeps the actual queues stable. Fix a time interval $[0, t-1]$, and let $x(\tau)$ denote the database placement at time $\tau$, $\mathcal{T}(x)$ the set of time slots at which the database placement is $x$, and $A^{(c,\sigma)}(\tau)$ the number of requests of client $c$ received at time $\tau$ and successfully delivered along ER $\sigma$ by time $t-1$. Define

\[ P(x) \triangleq \lim_{t\to\infty} \frac{|\mathcal{T}(x)|}{t} \;\Rightarrow\; \sum_{x\in\mathcal{X}} P(x) = 1, \tag{C.46} \]

and without loss of generality, assume $P(x) > 0$ (otherwise, we can exclude $x$ from $\mathcal{X}$).

By the conservation of the service requests received at times $\tau\in\mathcal{T}(x)$:

\[ Y^{(c,x)}(t) + \sum_{\tau\in\mathcal{T}(x)} \sum_{\sigma\in\mathcal{F}_c(x)} A^{(c,\sigma)}(\tau) = \sum_{\tau\in\mathcal{T}(x)} a^{(c)}(\tau), \tag{C.47} \]

where $Y^{(c,x)}(t)$ is the number of such requests that are not delivered by time $t$. Dividing by $|\mathcal{T}(x)|$ and letting $t\to\infty$:

\[ \lim_{t\to\infty} \frac{Y^{(c,x)}(t)}{|\mathcal{T}(x)|} = 0, \qquad \lim_{t\to\infty} \frac{1}{|\mathcal{T}(x)|} \sum_{\tau\in\mathcal{T}(x)} a^{(c)}(\tau) = \lambda^{(c)}. \]
(C.48)

In addition, define

\[ \lambda^{(\sigma;c,x)} \triangleq \lim_{t\to\infty} \frac{1}{|\mathcal{T}(x)|} \sum_{\tau\in\mathcal{T}(x)} A^{(c,\sigma)}(\tau), \qquad P_{c,x}(\sigma) \triangleq \frac{\lambda^{(\sigma;c,x)}}{\lambda^{(c)}}, \tag{C.49} \]

and (C.47) becomes

\[ \sum_{\sigma\in\mathcal{F}_c(x)} P_{c,x}(\sigma) = 1, \quad \forall c. \tag{C.50} \]

On the other hand, the total resource consumption (up to time $t-1$) at each interface $e$ satisfies

\[ C_e\, t \ge \sum_{\tau=0}^{t-1} \sum_{x\in\mathcal{X}} \mathbb{1}_{\{x(\tau)=x\}} \sum_c \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} A^{(c,\sigma)}(\tau) = \sum_{x\in\mathcal{X}} \sum_c \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} \sum_{\tau\in\mathcal{T}(x)} A^{(c,\sigma)}(\tau). \tag{C.51} \]

Dividing by $t$ and letting $t\to\infty$:

\[ C_e \ge \sum_{x\in\mathcal{X}} \lim_{t\to\infty} \frac{|\mathcal{T}(x)|}{t} \sum_c \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} \lim_{t\to\infty} \frac{1}{|\mathcal{T}(x)|} \sum_{\tau\in\mathcal{T}(x)} A^{(c,\sigma)}(\tau) = \sum_{x\in\mathcal{X}} P(x) \sum_c \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} P_{c,x}(\sigma), \tag{C.52} \]

concluding the proof.

C.4.2 Sufficiency

Suppose the arrival vector $\lambda$ satisfies (4.13), i.e., there exist probability values $P(x)$, $P_{c,x}(\sigma)$ such that for each interface $e$:

\[ C_e - \sum_{x\in\mathcal{X}} P(x) \sum_c \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} P_{c,x}(\sigma) \ge \epsilon C_e. \tag{C.53} \]

In the following, we define a route selection policy that keeps the virtual queues stable. (Then, following the same procedure as in Appendix C.2.2, one can show that, when combined with the ENTO policy for packet scheduling, the actual queues are also rate stable.)

C.4.2.1 Initial Design

First, we assume database replacement can be performed instantaneously, i.e., the caching vector $x(t)\in\mathcal{X}$ can be different at each time slot. Define a reference policy $*$ that operates as follows: at each time slot, (i) select the database placement $x$ w.p. $P(x)$, and (ii) select ER $\sigma$ to deliver the requests of client $c$ w.p. $P_{c,x}(\sigma)$. Following the procedure in Appendix C.2.1, the normalized virtual queue drift is given by

\[ \mathbb{E}\{\Delta(Q_e(t))\} \le B'_e - \frac{C_e - \mathbb{E}\{\tilde{a}^*_e(t)\}}{C_e}\, \mathbb{E}\{Q_e(t)\}, \tag{C.54} \]

in which

\[ B'_e = \frac{1}{2C_e^2} \max\Big\{C_e,\ \max_{x\in\mathcal{X}} \sum_c A^{(c)}_{\max} \max_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)}\Big\}^2 \tag{C.55} \]

satisfies $\big|C_e - \tilde{a}_e(t)\big|/C_e \le \sqrt{2B'_e}$, and

\[ \mathbb{E}\{\tilde{a}^*_e(t)\} = \mathbb{E}\big\{\mathbb{E}\{\tilde{a}^*_e(t) \mid x(t)\}\big\} = \sum_{x\in\mathcal{X}} P(x) \sum_c \lambda^{(c)} \sum_{\sigma\in\mathcal{F}_c(x)} \rho_e^{(c,\sigma)} P_{c,x}(\sigma). \]
(C.56)

Substituting (C.56), together with (C.53), into (C.54), we obtain

\[ \mathbb{E}\{\Delta(Q_e(t))\} \le B'_e - \epsilon\, \mathbb{E}\{Q_e(t)\}, \tag{C.57} \]

which implies that the virtual queue of each interface $e$ is rate stable [58, Section 3.1.4].

C.4.2.2 Low-Rate Replacement

In this section, we design a policy operating with a restricted replacement rate. Consider a two-timescale system, where processing and transmission decisions are made on a per-time-slot basis, while database replacement decisions are made on a per-time-frame basis, with each frame including $T$ consecutive slots. Define a reference policy $*$ that operates as follows: (i) at the beginning of each frame, select the database placement $x$ w.p. $P(x)$; (ii) at each time slot, select ER $\sigma$ to deliver the requests of client $c$ w.p. $P_{c,x}(\sigma)$.

Let $\{rT : r \ge 0\}$ denote the starting times of the frames. Following the procedure in Appendix C.2.1.2, the multi-slot drift over the interval $[rT, (r+1)T-1]$ is given by

\[ \mathbb{E}\{\Delta_T(Q_e(rT))\} \triangleq \frac{\mathbb{E}\{Q_e^2((r+1)T)\} - \mathbb{E}\{Q_e^2(rT)\}}{2} \le B'_e T^2 - \epsilon T\, \mathbb{E}\{Q_e(rT)\}. \tag{C.58} \]

Applying the telescoping sum [58] for $r\in[0, R-1]$ (recall that the queues are empty at time $0$):

\[ 0 \le \frac{\mathbb{E}\{Q_e^2(RT)\}}{2} = \sum_{r=0}^{R-1} \mathbb{E}\{\Delta_T(Q_e(rT))\} \le R B'_e T^2 - \epsilon T \sum_{r=0}^{R-1} \mathbb{E}\{Q_e(rT)\}, \tag{C.59} \]

and therefore, for any arrival vector in the interior of the stability region, i.e., $\epsilon > 0$,

\[ \sum_{r=0}^{R-1} \mathbb{E}\{Q_e(rT)\} \le \frac{R B'_e T}{\epsilon}. \tag{C.60} \]

In addition, we note that

\[ \sum_{s=0}^{T-1} \mathbb{E}\{Q_e(rT+s)\} \le \sum_{s=0}^{T-1} \Big[\mathbb{E}\{Q_e(rT)\} + \sqrt{2B'_e}\, s\Big] \le T\, \mathbb{E}\{Q_e(rT)\} + \frac{\sqrt{2B'_e}\, T^2}{2}. \tag{C.61} \]

Therefore, the average virtual queue backlog over the interval $[0, t-1]$, with $t = RT + \Delta t$ ($0 \le \Delta t < T$), is given by

\[ \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{Q_e(\tau)\} = \frac{1}{RT+\Delta t} \sum_{\tau=0}^{RT+\Delta t-1} \mathbb{E}\{Q_e(\tau)\} \le \frac{1}{RT} \sum_{\tau=0}^{(R+1)T-1} \mathbb{E}\{Q_e(\tau)\} = \frac{1}{RT} \sum_{r=0}^{R} \sum_{s=0}^{T-1} \mathbb{E}\{Q_e(rT+s)\} \overset{(a)}{\le} \frac{1}{RT} \sum_{r=0}^{R} \Big[T\, \mathbb{E}\{Q_e(rT)\} + \frac{\sqrt{2B'_e}\, T^2}{2}\Big] \overset{(b)}{\le} \frac{R+1}{R} \Big(\frac{B'_e}{\epsilon} + \frac{\sqrt{2B'_e}}{2}\Big) T, \tag{C.62} \]

where (a) and (b) result from (C.61) and (C.60), respectively. Letting $t\to\infty$:

\[ \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{Q_e(\tau)\} \le \lim_{R\to\infty} \frac{R+1}{R} \Big(\frac{B'_e}{\epsilon} + \frac{\sqrt{2B'_e}}{2}\Big) T = \Big(\frac{B'_e}{\epsilon} + \frac{\sqrt{2B'_e}}{2}\Big) T \sim O(T), \tag{C.63} \]

i.e., the virtual queues are mean rate stable for any fixed $T$.
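The resulting $[O(T), O(1/T)]$ tradeoff is easy to tabulate: the backlog bound of (C.63) grows linearly in the frame length $T$, while the replacement-rate bound decays as $1/T$, so their product is constant. The constants below ($B'_e$, $\epsilon$, and the replaceable-content total) are assumed illustrative values, not quantities from the dissertation.

```python
import math

B_prime = 2.0      # drift constant B'_e (assumed)
eps = 0.1          # stability-region slack epsilon (assumed)
n_objects = 5.0    # stands in for |V| * sum_k F_k (assumed)

def backlog_bound(T):
    # average virtual-queue backlog bound, (B'_e/eps + sqrt(2 B'_e)/2) * T
    return (B_prime / eps + math.sqrt(2.0 * B_prime) / 2.0) * T

def replacement_rate_bound(T):
    # at most one full replacement per frame of length T
    return n_objects / T

pairs = [(T, backlog_bound(T), replacement_rate_bound(T)) for T in (1, 10, 100)]
```

Increasing $T$ tenfold multiplies the backlog bound by ten and divides the replacement-rate bound by ten, which is exactly the tradeoff the section describes.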
On the other hand, policy $*$ incurs a replacement rate of

\[ \text{Replacement rate} \le \frac{|\mathcal{V}| \sum_{k\in\mathcal{K}} F_k}{T} \sim O\Big(\frac{1}{T}\Big). \tag{C.64} \]

To sum up, policy $*$ achieves an $[O(T), O(1/T)]$ tradeoff between the average (virtual) queue backlog and the replacement rate. The policy is throughput-optimal for any given $T$, and the required replacement rate can be made arbitrarily close to zero by pushing $T\to\infty$, at the cost of queue backlog (and thus delay performance).

C.5 Equivalence of the MILP Formulation

Fix the database placement $x$; we will show that the remaining problems of (4.33) and (4.36) are equivalent.

Suppose $x_{i,k} = 1$. We note that

\[ \sum_{j\in\delta^+(i)} f'^{(k)}_{ij} + f'^{(k)}_i \le C^{\max}_{i,k}, \tag{C.65} \]

since the two terms on the left-hand side are bounded by the corresponding components of $C^{\max}_{i,k}$ (4.35), given that the flow variables satisfy the capacity constraints (4.26). Therefore, no additional constraints are imposed on the flow variables by replacing (4.29) with (4.34).

Suppose $x_{i,k} = 0$, in which case (4.34) transforms to

\[ \sum_{j\in\delta^+(i)} f'^{(k)}_{ij} + f'^{(k)}_i \le \sum_{j\in\delta^-(i)} f'^{(k)}_{ji}, \tag{C.66} \]

which holds with equality for any solution satisfying (4.29). Conversely, consider any flow assignment under which (C.66) holds with strict inequality, i.e., the total incoming flow is strictly greater than the outgoing flow, which essentially results from the excess static flow produced by the static sources. Then, we can reduce the incoming flow rates $f'^{(k)}_{ji}$ to restore equality at node $i$, and repeat the same procedure for all neighbor nodes $j\in\delta^-(i)$ impacted by the modification; the procedure terminates at the static sources $v\in\mathcal{V}(k^{(\phi)}_m)$.
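The flow-trimming argument for (C.66) can be sketched as a sweep from the sinks back toward the static sources: at each non-source node, if the incoming flow exceeds the outgoing plus consumed flow, the incoming edges are scaled down so the balance holds with equality, and the reduction propagates upstream. The proportional scaling rule, the graph, and all rates are illustrative assumptions; the appendix only requires that some such reduction exists.

```python
def trim_to_equality(f, cons, sources, order):
    """f: dict mapping edge (j, i) -> static flow; cons[i]: flow consumed
    at node i; sources: nodes allowed to over-produce; order: a
    topological order of the (acyclic) flow graph. Nodes are visited
    from the sinks backward, so each node's outgoing flow is final
    before its incoming edges are trimmed."""
    f = dict(f)
    for i in reversed(order):
        if i in sources:
            continue
        outgoing = sum(v for (a, b), v in f.items() if a == i)
        incoming_edges = [e for e in f if e[1] == i]
        have = sum(f[e] for e in incoming_edges)
        need = outgoing + cons.get(i, 0.0)
        if have > need + 1e-12 and have > 0:
            scale = need / have
            for e in incoming_edges:   # proportional cut (assumed rule)
                f[e] *= scale
    return f

# Source s over-produces: node a receives 2.0 but forwards only 1.0,
# so the edge (s, a) is cut back to restore equality at a.
flows = {("s", "a"): 2.0, ("a", "b"): 1.0}
consumed = {"b": 1.0}
trimmed = trim_to_equality(flows, consumed, sources={"s"}, order=["s", "a", "b"])
```

After trimming, every non-source node satisfies incoming = outgoing + consumed, mirroring the termination of the procedure at the static sources.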
Asset Metadata
Creator: Cai, Yang (author)
Core Title: Efficient delivery of augmented information services over distributed computing networks
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Degree Conferral Date: 2022-12
Publication Date: 10/25/2022
Defense Date: 10/20/2022
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: augmented/virtual reality, caching, data-intensive, distributed cloud, fog computing, latency, metaverse, mobile edge computing, multicast, network control, OAI-PMH Harvest, operational cost, routing, stability region
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Molisch, Andreas (committee chair), Neely, Michael (committee member), Raghavan, Barath (committee member)
Creator Email: caiy.tsinghua@gmail.com, yangcai@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC112195840
Unique identifier: UC112195840
Identifier: etd-CaiYang-11285.pdf (filename)
Legacy Identifier: etd-CaiYang-11285
Document Type: Dissertation
Rights: Cai, Yang
Internet Media Type: application/pdf
Type: texts
Source: 20221026-usctheses-batch-988 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu