Joint Routing, Scheduling, and Resource Allocation in Multi-hop Networks: from Wireless Ad-hoc Networks to Distributed Computing Networks

Hao Feng

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

Supervisor: Prof. Andreas F. Molisch

December 2017

Copyright 2017 Hao Feng

Dedication

To my Mom and Dad: Kunxiang Zhang and Jianchang Feng

Acknowledgements

First and foremost, I would like to thank my adviser, Prof. Andreas Molisch. This body of Ph.D. work could never have been completed without his guidance and help over these years. I am grateful to Prof. Molisch for taking me on as his Ph.D. student, helping me overcome technical challenges, and advising me on improving my working style. I first saw him when he visited Zhejiang University, where I was pursuing my Master's degree, and gave a talk on cooperative communications in parallel relay networks. At that time, I thought that this was exactly the kind of work I wanted to do in my future Ph.D. Throughout my Ph.D. years, Prof. Molisch held high standards for the quality of research and provided correspondingly professional and dedicated guidance. I still remember him spending a whole weekend with me, going through the proof of the DIVBAR-RMIA work step by step for the final round; in the later review process, he made a remarkable effort guiding me in addressing the comments and requirements of the reviewers and editors. These experiences are truly valuable to me, and what I learned from them, in both technical and non-technical respects, will serve my future career well. I would also like to express my sincere thanks to my internship mentors at Bell Labs, Dr. Antonia Tulino and Dr. Jaime Llorca.
My internships at Bell Labs are truly memorable, and the subsequent collaboration with Antonia and Jaime gradually opened up the exciting research topic of computing network control for my Ph.D. study. Throughout the whole collaboration, they gave me dedicated and detailed help that considerably supported me in overcoming the difficulties I encountered in research. Many times when I confronted technical challenges, Antonia kept discussing them with me, in person or remotely, tracked the status of my work, and provided in-depth technical support and feedback. Looking back at our collaborative works, Jaime provided so much effort and guidance that, from the initial system model to the final simulations, nearly every aspect was constructed and polished through our frequent discussions and close collaboration. Thanks to the joint effort of Antonia and Jaime, my research progress was significantly accelerated. I would like to say thank you to Prof. Michael Neely. The fundamental drift-plus-penalty analysis methodology I used in most of my Ph.D. works was originally developed by Prof. Neely. In my first Ph.D. year, Prof. Molisch advised me to follow Prof. Neely's works, which were promising for research on cooperative communications in wireless ad-hoc networks. Later I took Prof. Neely's stochastic network optimization class, EE649, and was so impressed by his teaching and the drift-plus-penalty techniques taught in that class that I made up my mind to explore further along this direction to tackle optimization problems in wireless ad-hoc networks. Prof. Neely's influence was even greater than I expected: my works on distributed computing networks also used methodology extended from the drift-plus-penalty prototype. Throughout my Ph.D. study, I received much valuable advice and help from him, which I truly appreciate. My Ph.D.
study was made more memorable by Prof. Stark Draper. My second work, on OFDMA-based networks, was born during a discussion with him and Prof. Molisch, and his efforts in the follow-up collaboration greatly improved the quality of that work. In 2011, he invited me to visit his group, and we had inspiring discussions; I am thankful for these pleasant and memorable experiences. I am thankful to Prof. Leana Golubchik for serving on my defense and qualifying exam committees, and to Prof. Konstantinos Psounis and Prof. Ashutosh Nayyar for serving on my qualifying exam committee. I am also thankful to Prof. Giuseppe Caire and Prof. Urbashi Mitra for advising and helping me at the beginning of my Ph.D. study. My experience of pursuing a Ph.D. degree was always accompanied by my friends. First of all, I would like to thank Mingyue Ji, a friend and colleague who influenced my Ph.D. trajectory: through his referral, Antonia and Jaime got to know me, and I then had the chance to intern at Bell Labs and start working on the topic of computing networks. I would also like to thank my friend Haoshuo Chen, whom I got to know during my first internship at Bell Labs. Throughout my two internships and my visit to Bell Labs, from first settling in to daily life there, Haoshuo selflessly helped me a great deal, and we gradually became good friends. Both Mingyue and Haoshuo are excellent researchers, and I have learned a lot from them. I feel lucky to have come to know my friends Sucha Supittayapornpong, Run Chen, Ruisheng Wang, Guanbo Chen, Hao Yu, Xiaohan Wei, Srinivas Yerramalli, and Sunav Choudhary during my Ph.D. study; we shared many good and bad experiences together. Of course, my time in the WiDeS group is one of the most important parts of my Ph.D. life, thanks to my colleagues Junyang Shen, Zheda Li, Rui Wang, Vinod Kristem, Celalettin Umit Bas, Daoud Burghal, Sundar Aditya, Vishnu Ratnam, Pei-Lan Hsu, and Seun Sangodoyin.
Finally, words are not enough to express my gratitude to my parents. My Mom and Dad have always been the strongest force supporting me through the difficulties of my Ph.D. life. Without their love, this Ph.D. work and thesis could never have come to be. Because they believed in me, I was able to get through the dark periods and break through the "bottlenecks" of my Ph.D. life. I would also like to say thank you to my aunt Kunsheng Zhang and my cousin Ying Han, who have constantly cared about and encouraged me; I am deeply grateful to them as well.

Contents

Abstract
1 Introduction
  1.1 Motivations to Research on Multi-hop Networks
  1.2 Throughput Maximization of Multi-hop Wireless Networks
  1.3 Routing, Scheduling and Resource Allocation with Advanced Physical Layer Techniques in Wireless Ad-hoc Networks
  1.4 Service Distribution in Cloud Networks
  1.5 Delivery of Augmented Information Services
  1.6 Organization of Later Chapters
2 Diversity Backpressure Routing with Mutual Information Accumulation in Wireless Ad-hoc Networks
  2.1 Network Model
    2.1.1 Mutual Information Accumulation Technique
    2.1.2 Timing Diagram in One Timeslot and Queuing Dynamics
    2.1.3 The RMIA and FMIA transmission schemes
  2.2 Network Capacity Region with Renewal Mutual Information Accumulation
    2.2.1 The network capacity region with RMIA
    2.2.2 Network capacity region: RMIA versus REP
  2.3 Diversity Backpressure Routing Algorithms with Mutual Information Accumulation
    2.3.1 Diversity Backpressure Routing with Renewal Mutual Information Accumulation (DIVBAR-RMIA)
    2.3.2 Diversity Backpressure Routing with Full Mutual Information Accumulation (DIVBAR-FMIA)
  2.4 Performance Analysis
    2.4.1 Throughput optimality of DIVBAR-RMIA among all possible policies with RMIA
    2.4.2 Throughput performance of DIVBAR-FMIA
  2.5 Simulations
  2.6 Conclusions
3 Linearization-based Cross-Layer Design for Throughput Maximization in OFDMA Wireless Ad-hoc Networks
  3.1 System Model
    3.1.1 Network Settings
    3.1.2 System Constraints
  3.2 Linearization-based JRSPA Optimization
    3.2.1 Problem Formulation
    3.2.2 Linearization-based Approach
    3.2.3 Implementation of the Iterative Approach
  3.3 Simulation Results
    3.3.1 Throughput of Static Line Networks
    3.3.2 Throughput of Networks with Irregular Topologies
  3.4 Conclusion
4 Dynamic Network Service Optimization in Distributed Cloud Networks
  4.1 Related Work
  4.2 Model and Problem Formulation
    4.2.1 Cloud Network Model
    4.2.2 Service Model
    4.2.3 Queuing Model
    4.2.4 Problem Formulation
  4.3 Cloud Network Capacity Region
  4.4 Dynamic Cloud Network Control Algorithms
    4.4.1 Cloud Network Lyapunov drift-plus-penalty
    4.4.2 Linear Dynamic Cloud Network Control (DCNC-L)
    4.4.3 Quadratic Dynamic Cloud Network Control (DCNC-Q)
    4.4.4 Dynamic Cloud Network Control with Shortest Transmission-plus-Processing Distance Bias
  4.5 Performance Analysis
    4.5.1 Average Cost and Network Stability
    4.5.2 Convergence Time
  4.6 Numerical Results
    4.6.1 Cost-Delay Tradeoff
    4.6.2 Convergence Time
    4.6.3 Capacity Region
    4.6.4 Processing Distribution
  4.7 Extensions
    4.7.1 Location-Dependent Service Functions
    4.7.2 Propagation Delay
    4.7.3 Service Tree Structure
  4.8 Conclusions
5 Optimal Control of Wireless Computing Networks
  5.1 System Model
    5.1.1 Network Model
    5.1.2 Augmented Information Service Model
    5.1.3 Computing Model
    5.1.4 Wireless Transmission Model
    5.1.5 Communication Protocol
    5.1.6 Queuing Model
    5.1.7 Network Objective
  5.2 Wireless Computing Network Capacity Region
  5.3 Dynamic Wireless Computing Network Control Algorithm
    5.3.1 The DWCNC Algorithm
    5.3.2 Computing Transmission Utility Weight with Independent Links and Discrete Code Layers
  5.4 Performance Analysis
  5.5 Numerical Experiments
    5.5.1 Network Structure
    5.5.2 Service Setups
    5.5.3 Communication Setups
    5.5.4 Performance of DWCNC with the broadcast approach and the outage approach
    5.5.5 Processing Distribution Across the Network under
  5.6 Conclusion
6 Approximation Algorithms for the NFV Service Distribution Problem
  6.1 Related Work
  6.2 System Model
    6.2.1 Cloud network model
    6.2.2 Service model
  6.3 The NFV Service Distribution Problem
  6.4 Fractional NSDP
    6.4.1 QNSD algorithm
    6.4.2 Performance of QNSD
  6.5 Integer NSDP
  6.6 Simulation Results
    6.6.1 QNSD
    6.6.2 C-QNSD
  6.7 Extensions
  6.8 Conclusions
A Proofs in Chapter 2
  A.1 Proof of Lemma 1
  A.2 Proof of Lemma 2
  A.3 Proof of the necessity part of Theorem 2
  A.4 Proof of the sufficiency part of Theorem 2
  A.5 Proof of Theorem 3
  A.6 Proof of Theorem 4
  A.7 Proof of Corollary 1
  A.8 Proof of Lemma 3
  A.9 Proof of Theorem 5
    A.9.1 Comparison on the key metric between ~Policy and Policy0
    A.9.2 Comparison on the key metric between ^Policy and ~Policy
    A.9.3 Strong stability achieved under ^Policy
  A.10 Proof of Theorem 6
    A.10.1 Comparison on the key metric between ~Policy and Policy0
    A.10.2 Comparison on the key metric between ^^Policy and ~Policy
    A.10.3 Strong stability achieved under ^^Policy
  A.11 Proof of Lemma 4
  A.12 Proof of Lemma 1
  A.13 Proof of Lemma 5
    A.13.1 Comparison between ^Policy and ~~Policy over a single epoch
    A.13.2 Comparison between ^Policy and ~~Policy over ^M_n(t_0, t) (or ~~M_n(t_0, t)) epochs
  A.14 Proof of Lemma 6
B Proofs in Chapter 4
  B.1 Proof of Theorem 7
    B.1.1 Proof of Necessity
    B.1.2 Proof of Sufficiency
  B.2 Proof of Theorem 8
    B.2.1 DCNC-L
    B.2.2 DCNC-Q
    B.2.3 EDCNC-L and EDCNC-Q
    B.2.4 Network Stability and Average Cost Convergence with Probability 1
  B.3 Proof of Theorem 9
  B.4 Proof of Lemma B.3.1
C Proofs in Chapter 5
  C.1 Proof of Theorem 10
    C.1.1 Proof of Necessity
    C.1.2 Proof of Sufficiency
  C.2 Proof of Theorem 11
  C.3 Proof of Lemma C.2.1
D Proofs in Chapter 6
  D.1 Proof of Theorem 12
Bibliography

Abstract

This thesis comprises the five works completed during my Ph.D. study to date. In the first work, algorithms are proposed and analyzed for routing in multi-hop wireless ad-hoc networks that exploit mutual information accumulation as the physical-layer transmission scheme; the algorithms are capable of routing multiple packet streams (commodities) when only average channel state information is available, and only locally. The proposed algorithms are modifications of the Diversity Backpressure (DIVBAR) algorithm, under which the packet whose commodity has the largest "backpressure metric" is chosen to be transmitted and is forwarded through the link with the largest differential backlog (queue length). In contrast to traditional DIVBAR, each receiving node stores and accumulates the partially received packet in a separate "partial packet queue", thus increasing the probability of successful reception during a possible later retransmission. Two variants of the algorithm are presented: DIVBAR-RMIA, under which all receiving nodes clear the received partial information of a packet as soon as one or more receiving nodes first decode the packet; and DIVBAR-FMIA, under which all receiving nodes retain the partial information of a packet until the packet has reached its destination.
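To make the forwarding rule concrete, the following minimal Python sketch picks the commodity and next hop with the largest differential backlog. The node names and queue lengths are invented for illustration; this is not the thesis's implementation, which additionally accounts for channel states and reception probabilities.

```python
def backpressure_decision(n, queues, neighbors):
    """Return (commodity, next_hop, weight) with the largest
    differential backlog Q_n(c) - Q_j(c) over node n's outgoing links;
    transmit only if the returned weight is positive."""
    best = (None, None, 0)
    for c in queues[n]:
        for j in neighbors:
            w = queues[n][c] - queues[j].get(c, 0)  # differential backlog
            if w > best[2]:
                best = (c, j, w)
    return best

# Toy example: node 'a' holds two commodities and has neighbors 'b', 'c'.
queues = {'a': {'x': 5, 'y': 2}, 'b': {'x': 1, 'y': 4}, 'c': {'x': 3, 'y': 0}}
print(backpressure_decision('a', queues, ['b', 'c']))  # ('x', 'b', 4)
```

Commodity x toward b has the largest backlog difference (5 - 1 = 4), so it is selected; commodity y toward b would even have a negative weight and is never chosen.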
The network capacity region with the Renewal Mutual Information Accumulation (RMIA) transmission scheme is characterized and proved to be (under certain mild conditions) strictly larger than the network capacity region with the Repetition (REP) transmission scheme used by traditional DIVBAR. DIVBAR-RMIA is proved to be throughput-optimal among all policies with RMIA, i.e., it achieves the network capacity region with RMIA, which in turn demonstrates that DIVBAR-RMIA outperforms traditional DIVBAR in achievable throughput. Moreover, DIVBAR-FMIA is proven to perform at least as well as DIVBAR-RMIA with respect to throughput. Simulations confirm these results. In the second work, joint routing, scheduling, and resource allocation to maximize the throughput of OFDMA-based wireless ad-hoc networks is considered, subject to MAC-layer and network-layer constraints. In contrast to previous work (e.g., Rashtchi et al., ICC 2012), which assumes each subchannel is orthogonally accessed by all network links through time sharing, here orthogonal multiple access (e.g., CDMA) is scheduled only among the outgoing links of each node on each subchannel, and the interference caused by other nodes is treated as noise. An iterative heuristic approach is proposed that decomposes the original problem into subproblems, each of which can be solved approximately through linearization. Simulations demonstrate that, particularly in large networks, the proposed cross-layer design significantly outperforms the previously proposed "orthogonal-only" access. Distributed cloud networking enables the deployment of a wide range of services in the form of interconnected software functions instantiated over general-purpose hardware at multiple cloud locations distributed throughout the network. The third work investigates the problem of optimal service delivery over a distributed cloud network, in which nodes are equipped with both communication and computation resources.
The design of distributed online solutions that drive flow processing and routing decisions is addressed, along with the associated allocation of cloud and network resources. For a given set of services, each described by a chain of service functions, the cloud network capacity region is characterized and a family of dynamic cloud network control (DCNC) algorithms is designed that stabilizes the underlying queuing system while achieving a cost arbitrarily close to the minimum, with a tradeoff in network delay. The proposed DCNC algorithms make local decisions based on the online minimization of linear and quadratic metrics obtained from an upper bound on the Lyapunov drift-plus-penalty of the cloud network queuing system. Minimizing a quadratic rather than a linear metric is shown to improve the cost-delay tradeoff at the expense of increased computational complexity. Our algorithms are further enhanced with a shortest transmission-plus-processing distance bias that improves delay performance without compromising throughput or overall cloud network cost. Throughput and cost optimality guarantees, a convergence-time analysis, and extensive simulations in representative cloud network scenarios are provided. Augmented information (AgI) services allow users to consume information that results from the execution of a chain of service functions that process source information to create real-time augmented value. Applications include real-time analysis of remote sensing data, real-time computer vision, personalized video streaming, and augmented reality, among others. In the fourth work, the problem of optimal distribution of AgI services over a wireless computing network is further considered, where nodes are equipped with both wireless communication and computing resources.
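The flavor of the drift-plus-penalty decisions underlying both the DCNC algorithms and the wireless computing network control of the fourth work can be illustrated with a toy sketch. The candidate allocations, costs, and rates below are invented; the actual algorithms operate per commodity and per transmission/processing resource.

```python
# Toy drift-plus-penalty decision (illustrative, not the thesis's exact rule):
# among candidate (cost, service-rate) allocations for one queue, pick the
# one minimizing V*cost - backlog*rate, where V trades cost against delay.

def dpp_choose(candidates, backlog, V):
    """candidates: list of (resource cost, service rate) pairs;
    backlog: current differential queue backlog; V: tradeoff parameter."""
    return min(candidates, key=lambda a: V * a[0] - backlog * a[1])

options = [(0, 0), (1, 2), (3, 5)]          # (cost, rate), made-up numbers
print(dpp_choose(options, backlog=4, V=1))  # large backlog: serve hard -> (3, 5)
print(dpp_choose(options, backlog=0, V=1))  # empty queue: stay idle -> (0, 0)
```

Increasing V shifts decisions toward lower cost at the price of larger queues and hence delay, which is the cost-delay tradeoff referred to above.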
The wireless computing network capacity region is characterized and a joint flow scheduling and resource allocation algorithm is designed that stabilizes the underlying queuing system while achieving a network cost arbitrarily close to the minimum, with a tradeoff in network delay. The solution captures the unique chaining and flow-scaling aspects of AgI services, while exploiting the broadcast-approach coding scheme over the wireless channel. In the fifth work, the design of fast approximation algorithms for the network function virtualization (NFV) service distribution problem (NSDP) is addressed, whose goal is to determine, in a centralized way, the placement of virtual network functions (VNFs), the routing of service flows, and the associated allocation of cloud and network resources that satisfy client demands with minimum cost. We show that, in the case of load-proportional costs, the resulting fractional NSDP can be formulated as a multi-commodity-chain flow problem on a cloud-augmented graph, and we design a virtual-queue-length-based algorithm, named QNSD, that provides an O(ε) approximation in time O(1/ε). We then address the case in which resource costs are a function of the integer number of allocated resources and design a variation of QNSD that effectively pushes for flow consolidation into a limited number of active resources to minimize overall cloud network cost.

Chapter 1
Introduction

Research on multi-hop networks is important for factory automation, sensor networks, security systems, cloud computing, and many other applications, and has therefore drawn significant attention in recent years. A set of fundamental problems in such networks is the scheduling and routing of data packets, i.e., which nodes should transmit which packets, in which sequence and directions, together with the associated network resource allocation. These problems take specific and challenging forms in multi-hop networks under different application scenarios. In this thesis, five works of my Ph.D.
research, dealing with routing, scheduling, and resource allocation in two types of multi-hop networking scenarios (wireless ad-hoc networks and cloud networks), are described. The remaining sections of this chapter describe the background of my research, summarize the related literature, and outline the organization of the later chapters.

1.1 Motivations to Research on Multi-hop Networks

Traditional communication is point-to-point, i.e., only a sender and a receiver are involved in the transfer of data. The sender-receiver pair can be, e.g., a base station and a mobile device in wireless communications, or two directly connected routers in a computer network. In contrast, my Ph.D. research focuses on communications in larger-scale networks, where the information source and the intended destination are either geographically far apart or connected by a channel that is too poor for direct communication. In these situations, some nodes, possibly in sequence, help other nodes to get the information from the source to the destination, and the information delivery requires communication over multiple hops in the network [1]. Communication in a multi-hop network creates more degrees of freedom in the system design, which can improve performance but also complicates the design process. An important task, which is also a challenge, is to determine which nodes should act as relays that forward the information, and in what sequence, i.e., what route the information should take from the source to the destination. A related issue is to determine what resources (e.g., power, bandwidth) should be allocated to those nodes. The joint optimization of routing and resource allocation involves not only the optimization objective but also constraints spanning the communication layers, e.g., the physical layer (PHY), the medium access control (MAC) layer, the network layer, and the application layer.
Based on these motivations, my Ph.D. work focuses on the cross-layer design of multi-hop networks to maximize throughput and/or minimize resource costs.

1.2 Throughput Maximization of Multi-hop Wireless Networks

Throughput performance becomes an issue when a single stream or multiple streams of packets intended for one or more destinations (i.e., multiple commodities) flow through a multi-hop network. In wired networks, the single-packet-stream case has been well explored by several approaches, such as the Ford-Fulkerson and Preflow-Push algorithms (see Ref. [2], Chapter 7 and references therein) and the Goldberg-Rao algorithm (see [3] and references therein); the simultaneous routing of multiple packet streams has also been extensively explored (see Refs. [4], [5] and references therein). However, this problem becomes more challenging in wireless scenarios, where the links are neither reliable nor precisely predictable. To deal with this issue, several studies focus on routing in wireless networks with unreliable channels and possibly multiple commodities. The ExOR algorithm [6] takes advantage of the broadcast effect, i.e., the packet transmitted by a node can be overheard by multiple receiving nodes. After each transmission attempt, the transmitting node confirms the successful receivers among all potential receiving nodes and determines the best one to forward the packet, according to the Expected Transmission Count (ETX) metric [7], which indicates each receiving node's proximity to the destination in terms of forward delivery probability. As a further improvement, the proactive SOAR algorithm [8] also uses ETX as the underlying routing metric but leverages path diversity through adaptive forwarding path selection.
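The ETX metric just mentioned is easy to state concretely: the expected number of transmissions on a link with forward delivery probability df and reverse (ACK) delivery probability dr is 1/(df*dr), and a path's ETX is the sum over its links. A small sketch, with made-up delivery probabilities:

```python
# Illustrative ETX computation; the probabilities below are invented.

def link_etx(df, dr):
    """Expected transmissions on one link: 1 / (df * dr)."""
    return 1.0 / (df * dr)

def path_etx(links):
    """Path ETX = sum of link ETX values; links is a list of (df, dr)."""
    return sum(link_etx(df, dr) for df, dr in links)

print(link_etx(0.8, 1.0))                   # 1.25
print(path_etx([(0.8, 1.0), (0.5, 1.0)]))   # 3.25
```

A route with a lower path ETX is preferred, since it is expected to need fewer total transmissions to deliver a packet.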
Both ExOR and SOAR have shown better throughput performance than traditional routing methods, but neither provides a theoretically throughput-optimal routing approach for multi-hop, multi-commodity wireless networks. Throughput maximization can be tackled by stochastic network optimization, which involves routing, scheduling, and resource allocation in networks whose links are not reliable or precisely predictable but exhibit certain stochastic features. Refs. [9], [10] systematically analyze problems of this kind using Lyapunov drift analysis, which originates from control theory and follows and generalizes the Backpressure algorithm proposed in [11], [12]. The backpressure algorithm establishes a max-weight-matching metric for each commodity on each available link that takes into account the local differential backlogs (queue lengths, i.e., the number of packets of the particular commodity at a node) as well as the channel state of the corresponding link observed over time. At each node, the packet of the commodity with the largest metric is transmitted. Thus, the backpressure algorithm achieves routing without ever designing an explicit route and without requiring centralized information, and is therefore considered a very promising approach to stochastic network optimization problems with multiple commodities. The idea of backpressure routing was later extended to many other communication applications, e.g., power and server allocation in satellite downlinks [13], routing and power allocation in time-varying wireless networks [14], and throughput-optimal routing in cooperative two-hop parallel relay networks [15].

1.3 Routing, Scheduling and Resource Allocation with Advanced Physical Layer Techniques in Wireless Ad-hoc Networks

Wireless ad-hoc networks support communication and data transfer without fixed infrastructure, thereby avoiding expensive and time-consuming deployments.
These characteristics make them an important category of multi-hop wireless networks. For throughput maximization in wireless ad-hoc networks, the design involves not only scheduling, routing, and resource allocation, but also the question of which physical layer technique should be used to relay the data among nodes. Specifically, the efficiency of transmissions can be greatly enhanced by Mutual Information Accumulation (MIA), where the receiving nodes store partial information of the packets that could not be decoded in previous transmission attempts. MIA can be implemented, e.g., by using Fountain codes (or rateless codes) [16][17][18]. The transmitter encodes and transmits the source information in a code stream of unbounded length, and the receiver can recover the original source information from any portion of the code stream, as long as the total accumulated information exceeds the entropy of the source information. Moreover, Fountain codes work at any SNR, and therefore the same code design can be used for broadcasting from one transmitter to multiple receivers whose links to the transmitter have different channel gains. Fountain codes can even accumulate partial information from multiple transmitters. Fountain codes have been suggested for various applications, e.g., point-to-point communications over quasi-static and block-fading channels [19], cooperative communications in single-relay networks [20][21], cooperative communications in two-hop multi-relay networks [22], incremental-redundancy Hybrid-ARQ protocols for the Gaussian collision channel in non-routing settings [23], and incremental-redundancy Hybrid-ARQ protocols in the downlink scheduling of MU-MIMO systems [24]. In these applications, Fountain codes have been shown to enhance robustness, save energy, reduce transmission time, and increase throughput. Refs.
[25], [26] introduce MIA into the routing of multi-hop ad-hoc networks, and show that the delay performance can be enhanced under constrained power and bandwidth resources. Therefore, for multi-hop, multi-commodity wireless ad-hoc networks, the throughput might also be increased by merging MIA with routing considerations. However, none of the above papers touches on the throughput performance of ad-hoc networks with MIA. Another important aspect of incorporating physical layer cooperation into wireless ad-hoc network scheduling and routing is interference management. The broadcast nature of wireless is double-edged: although many nodes can overhear a single transmission, multiple transmissions that overlap in frequency and time interfere with each other. Consequently, throughput maximization with interference management is a fundamental but challenging issue. Promising solutions require joint consideration of routing, scheduling, and power allocation (JRSPA) across different communication layers. Throughput maximization under interference has been investigated extensively over the past years. Some works characterize the achievable rates of multi-hop wireless networks with interference avoidance, e.g., [27], [28] and [29], where hard decisions are made to determine the interfering links either by comparing the geometric distances or the SIRs to corresponding thresholds. In other literature, e.g., [30], the JRSPA problem is formulated as a convex optimization problem. Ref. [30] models the link capacity as a linear function of the SINR, which is only valid when the SINR is low on all activated links.
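The low-SINR restriction mentioned above comes from linearizing the Shannon capacity, log2(1 + x) ≈ x/ln 2, which is tight only for small x. A quick numerical sketch (illustrative only, not the exact model of [30]) makes the range of validity concrete:

```python
import math

def capacity(sinr: float) -> float:
    """Shannon capacity (bits/s/Hz) of a link at a given SINR."""
    return math.log2(1.0 + sinr)

def linear_model(sinr: float) -> float:
    """Linear approximation log2(1 + x) ~ x / ln(2), valid for small x."""
    return sinr / math.log(2.0)

def rel_error(sinr: float) -> float:
    """Relative error of the linear model against the true capacity."""
    return abs(linear_model(sinr) - capacity(sinr)) / capacity(sinr)

# Good fit at low SINR, poor fit at high SINR:
print(round(rel_error(0.01), 4))  # ~0.005, i.e. 0.5% error at -20 dB
print(round(rel_error(10.0), 4))  # ~3.17, the linear model overshoots badly at 10 dB
```

This is why a linear-capacity formulation is only justified when all activated links operate at low SINR.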
1.4 Service Distribution in Cloud Networks

Distributed cloud networking is a new multi-hop networking paradigm that builds on network functions virtualization (NFV) and software defined networking (SDN) to enable the deployment of network services in the form of elastic virtual network functions, instantiated over commercial off-the-shelf (COTS) servers at multiple cloud locations and interconnected via a programmable network fabric [31][32][33]. In this evolved, programmable, virtualized environment, network operators can host a variety of highly adaptable services over a common physical infrastructure, reducing both capital and operational expenses while providing quality of service guarantees. Together with the evident opportunities of this attractive scenario come a number of technical challenges. A critical aspect that drives both cost and performance is the actual placement of the network services' virtual functions. The ample opportunities for running network functions at multiple locations open an interesting and challenging space for optimization. In addition, placement decisions must be accompanied by routing decisions that steer the network flows to the appropriate network functions, and by resource allocation decisions that determine the amount of resources (e.g., virtual machines) allocated to each function. The problem of placing virtual network functions in distributed cloud networks was first addressed in [34]. The problem is formulated as a generalization of Generalized Assignment (GA) and Facility Location (FL), and an (O(1), O(1)) bicriteria approximation with respect to both overall cost and capacity constraints is provided. Shortly after, [35] introduced the cloud service distribution problem (CSDP), whose goal is to find the placement of network functions and the routing of network flows that minimize the overall cloud network cost.
The CSDP is formulated as a minimum cost network flow problem, in which flows consume both network and cloud resources as they go through the required virtual functions. The CSDP is shown to admit polynomial-time solutions under linear costs and fractional flows.

1.5 Delivery of Augmented Information Services

With the increasing popularity of applications such as real-time analysis of remote sensing data, real-time computer vision, personalized video streaming, and augmented reality, Augmented Information (AgI) services arise with the digitization and interconnection of virtually everything, and aim at optimizing physical systems and processes as well as augmenting human knowledge, cognition, and life experiences. AgI services allow users to consume information that results from the execution of a chain of service functions that process source information to create real-time augmented value. One form of AgI service is the cloud network service, described in the previous section. Another type is automation services, where information sourced at sensing devices in physical infrastructures such as homes, offices, factories, and cities is processed in real time in order to deliver instructions that optimize and control the automated operation of physical systems. Examples include industrial internet services (e.g., smart factories), automated traffic services, smart buildings, and smart homes [36]. The third type of AgI service is augmented experience services, which allow users to consume multimedia streams that result from the combination of multiple live sources and contextual information of real-time relevance.
There is a large variety of services that fall into this category, including telepresence, real-time computer vision, virtual classrooms/labs/offices, and augmented/virtual reality in general, which (after some false starts in the past) now seems poised to gain enormous importance in both the professional and the entertainment sectors [37].

1.6 Organization of Later Chapters

My first two works focus on throughput maximization of wireless ad-hoc networks by incorporating advanced physical layer techniques into routing, scheduling, and resource allocation. Specifically, the contributions of these works are as follows:

1) Chapter 2: Assuming rateless coding, the first work incorporates the mutual information accumulation transmission scheme into Backpressure routing, while considering receiver diversity in the wireless transmissions. The corresponding analysis quantitatively shows that the capacity region of the wireless ad-hoc network is enlarged, and that the proposed algorithms are throughput optimal within the enlarged network capacity region. The publications on this topic are as follows:

Hao Feng and Andreas F. Molisch, “Diversity Backpressure Scheduling and Routing With Mutual Information Accumulation in Wireless Ad-Hoc Networks,” IEEE Transactions on Information Theory, vol. 62, no. 12, pp. 7299-7323, Dec. 2016.

Hao Feng and Andreas F. Molisch, “Diversity Backpressure Routing with Mutual Information Accumulation in Wireless Ad-hoc Networks,” IEEE International Conference on Communications (ICC), Ottawa, Jun. 2012, pp. 4055-4060.

2) Chapter 3: In a setting without queuing, the second work explores the throughput maximization of Orthogonal Frequency Division Multiple Access (OFDMA) based wireless ad-hoc networks by managing interference through joint routing, scheduling, and transmission power allocation.
This work proposes an iterative heuristic algorithm that schedules orthogonal multiple access (e.g., CDMA) among the outgoing links of each node on each subchannel and treats the interference caused by other nodes as noise. Simulations demonstrate significantly enhanced throughput performance compared with previous work based on the “orthogonal-only” access scheme. The publication is as follows:

Hao Feng, Andreas F. Molisch and Stark C. Draper, “Linearization-based cross-layer design for throughput maximization in OFDMA wireless ad-hoc networks,” IEEE International Conference on Communications (ICC), London, Jun. 2015, pp. 2674-2679.

The later three works focus on the optimization of the Network Service Distribution Problem (NSDP) over distributed computing networks, where the network nodes are equipped with both transmission and computing resources. Specifically, the contributions of these works are as follows:

3) Chapter 4: The third work designs and analyzes dynamic solutions to the NSDP in wired cloud networks, where transmission and processing are jointly optimized. The cloud network capacity region is characterized, which depends not only on the network topology but also on the service structure, and several versions of throughput-optimal dynamic cloud network control (DCNC) algorithms are proposed, which achieve a cost arbitrarily close to the minimum with a tradeoff in network delay. The different versions of the DCNC algorithms give different cost-delay tradeoff performances at different computational complexities. The publications on this topic are as follows:

Hao Feng, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch, “Optimal Dynamic Cloud Network Control,” submitted to IEEE Transactions on Networking.

Hao Feng, Jaime Llorca, Antonia M. Tulino and Andreas F. Molisch, “Optimal dynamic cloud network control,” IEEE International Conference on Communications (ICC), Kuala Lumpur, Jun. 2016, pp. 1-7.
Hao Feng, Jaime Llorca, Antonia M. Tulino and Andreas F. Molisch, “Dynamic network service optimization in distributed cloud networks,” IEEE INFOCOM WKSHPS, San Francisco, CA, May 2016, pp. 300-305.

4) Chapter 5: The fourth work investigates dynamic solutions to the NSDP under wireless network settings. The solution captures the unique chaining structure of AgI services, while exploiting the broadcast approach coding scheme over the wireless channel. The network capacity region is characterized based on the given AgI service and the assumption of using the broadcast approach coding scheme when average channel state information is available, and a throughput-optimal dynamic wireless computing network control (DWCNC) algorithm is proposed. The use of the broadcast approach is shown to improve the cost-delay tradeoff compared with the traditional outage approach. The publications on this topic are listed as follows:

Hao Feng, Jaime Llorca, Antonia M. Tulino, Andreas F. Molisch, “Optimal Control of Wireless Computing Networks,” to be submitted to IEEE Transactions on Wireless Communications.

Hao Feng, Jaime Llorca, Antonia M. Tulino, and Andreas F. Molisch, “On the Delivery of Augmented Information Services over Wireless Computing Networks,” IEEE International Conference on Communications (ICC), Paris, Jun. 2017, pp. 1-7.

5) Chapter 6: The fifth work explores centralized solutions to the NSDP and develops approximation algorithms with fast approximation speed. Two cases are considered: the case where the resource allocation solution can take fractional values, and the case where the resource allocation solution is restricted to limited integer values.
Correspondingly, two versions of the algorithm are proposed to tackle the two cases: for the case of fractional resources, the virtual queue-length based algorithm (QNSD) exhibits an O(ε) approximation in time O(1/ε) when the history factor is set to zero; for the case of integer resources, the proposed discrete-QNSD effectively pushes for flow consolidation into a limited number of active resources to minimize the overall cloud network cost. The publication on this topic is listed as follows:

H. Feng, J. Llorca, A. M. Tulino, D. Raz and A. F. Molisch, "Approximation algorithms for the NFV service distribution problem," IEEE Conference on Computer Communications (INFOCOM), Atlanta, 2017, pp. 1-9.

Chapter 2 Diversity Backpressure Routing with Mutual Information Accumulation in Wireless Ad-hoc Networks

Based on the principle of Backpressure, [38] developed the Diversity Backpressure (DIVBAR) algorithm for routing in multi-hop, multi-commodity wireless ad-hoc networks. Similar to ExOR and SOAR, DIVBAR assumes a network with no reliable or precisely predictable channel states and exploits the broadcast nature of the wireless medium. In essence, each node under DIVBAR locally uses the Backpressure concept to route packets in the direction of maximum differential backlog. Specifically, each transmitting node under DIVBAR chooses the optimal commodity to transmit by computing the max-weight-matching metric, whose factors include the observed differential backlogs and the link success probabilities resulting from the fading channels; after receiving the feedback indicating the successful receptions from all the receiving nodes, the transmitting node lets the successful recipient with the largest positive differential backlog take over the forwarding responsibility.
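The two DIVBAR decisions described above can be sketched in a few lines. This is a simplified illustration, not the exact metric of [38]: the commodity choice here scores each commodity by the best single receiver's expected positive differential backlog, while the full metric ranks all receivers; the forwarding step matches the description directly. All function and variable names are illustrative:

```python
def choose_commodity(Q_n, Q_nbrs, p_succ):
    """Transmission decision at node n (simplified DIVBAR-style rule).
    Q_n: {commodity: backlog at n}; Q_nbrs: {k: {commodity: backlog at k}};
    p_succ: {k: success probability of link (n, k)}."""
    best_c, best_w = None, 0.0
    for c, q in Q_n.items():
        # expected positive differential backlog over the best receiver
        w = max((p_succ[k] * max(q - Q_nbrs[k].get(c, 0), 0)
                 for k in p_succ), default=0.0)
        if w > best_w:
            best_c, best_w = c, w
    return best_c  # None means stay silent

def forward_to(c, Q_n_c, successful, Q_nbrs):
    """Forwarding decision after the ACK/NACK round: hand the commodity-c
    packet to the successful receiver with the largest positive differential
    backlog, or retain it (None) if no receiver offers a positive one."""
    best_k, best_diff = None, 0.0
    for k in successful:
        diff = Q_n_c - Q_nbrs[k].get(c, 0)
        if diff > best_diff:
            best_k, best_diff = k, diff
    return best_k

# Example: node n holds 10 packets of 'a' and 2 of 'b'.
Q_n = {'a': 10, 'b': 2}
Q_nbrs = {1: {'a': 3}, 2: {'a': 8, 'b': 0}}
p_succ = {1: 0.5, 2: 0.9}
```

With these numbers, commodity 'a' wins (weight 0.5·7 = 3.5 via node 1), and if both neighbors decode, node 1 (differential backlog 7) gets the forwarding responsibility.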
DIVBAR has been theoretically proven to be throughput-optimal in wireless ad-hoc networks under a set of assumptions, notably that any packet not correctly received by any receiving node must be completely retransmitted in future transmission attempts. We call this scheme of complete retransmission the Repetition (REP) transmission scheme. In this chapter, my first work also explores multi-hop, multi-commodity routing in the case of unreliable and not precisely predictable rates, but using MIA-based transmission schemes. We assume that the network has stationary channel fading, i.e., the distribution of the channel realization remains the same while the realization varies with time; although no precise channel state information (CSI) is available at the transmitter, the distribution of the channel realization of each link can be obtained by the transmitter beforehand, i.e., each transmitting node has the average CSI; the transmitting node can obtain the receiving (decoding) results through simple feedback sent by the receiving nodes over reliable control channels. The contributions of this work are summarized as follows: We analyze the network capacity region [9][10] of ad-hoc networks employing the Renewal Mutual Information Accumulation (RMIA) transmission scheme. Here “Renewal” stands for a clearing operation; RMIA is the transmission scheme in which all the receiving nodes accumulate the partial information of a certain packet and try to decode the packet when receiving it, but clear the partial information of a packet every time the corresponding packet is first decoded by one or more receiving nodes in the network. We prove that the network capacity region with RMIA is strictly larger than the network capacity region with REP under some mild assumptions, and quantitatively compute a guaranteed extension magnitude of the region boundary for each source-destination pair.
We propose and analyze two new routing algorithms that combine the concept of DIVBAR with MIA. The first version, DIVBAR-RMIA, is implemented with RMIA and is shown to be throughput-optimal among all routing algorithms with RMIA. Under the second version, DIVBAR-FMIA, all the received partial information of a packet is retained at all the nodes in the network until that packet has reached its destination, which is called the Full Mutual Information Accumulation (FMIA) transmission scheme. We prove that DIVBAR-FMIA's throughput performance is at least as good as DIVBAR-RMIA's. In summary, both proposed algorithms can achieve larger throughput limits than the original DIVBAR algorithm with REP. The remainder of this chapter is organized as follows: Section 2.1 presents the network model, including the implementation of MIA, the timing diagram for one timeslot and the queuing dynamics, and the RMIA and FMIA transmission schemes. Section 2.2 characterizes the network capacity region with RMIA and compares it with the network capacity region with REP. Section 2.3 describes the two proposed algorithms: DIVBAR-RMIA and DIVBAR-FMIA. Section 2.4 proves the throughput optimality of DIVBAR-RMIA under the RMIA assumption and proves the throughput performance guarantee of DIVBAR-FMIA. Section 2.5 presents the simulation results. Section 2.6 concludes the chapter. Mathematical details of the proofs are relegated to the Appendices.

2.1 Network Model

Consider a stationary wireless ad-hoc network with N nodes, denoted as node set N. Multiple packet streams indexed by c ∈ {1, …, N} are transmitted, possibly via multiple hops. Categorize all packets in the packet stream destined for a particular node c as commodity c packets, irrespective of their origin. Each directed wireless link in the network is denoted as (n, k), where n ∈ N is the transmitting node and k is the receiving node, belonging to the receiver set K_n of node n.
Data flows through the network in units of packets, all of which have the same fixed (positive) amount of information (entropy) H_0. Packets arriving at each node, either exogenously or endogenously, are stored in a queue waiting to be forwarded, except at the destination, where they leave the network immediately upon arrival/decoding. The transmission power of each node is constant. Time is slotted and normalized into integer units τ = 0, 1, 2, 3, …. The timeslot length is assumed to be equal to the coherence time of the wireless medium in the network, so that we can adopt the common block-fading model: for each link, the (instantaneous) channel gain is constant within a timeslot duration, while it is i.i.d. (independent and identically distributed) across timeslots. Correspondingly, the amount of information transmitted over link (n, k) in timeslot τ, denoted as R_nk(τ), is i.i.d. across timeslots. Let F_{R_nk}(x) represent the cdf (cumulative distribution function) of R_nk(τ). We make the following assumption on R_nk(τ) and F_{R_nk}(x) for all n ∈ N and k ∈ K_n:

Assumption 1. R_nk(τ) is continuously distributed on [0, ∞), and 0 < F_{R_nk}(H_0) < 1.

Statistics of the CSI of each link are known locally, i.e., at the node from which the link emanates; however, instantaneous CSI (i.e., the channel gains in a timeslot) is never known at any transmitting node. The exogenous packet arrival rate a_n^(c)(τ) (packets/timeslot) at node n for commodity c is i.i.d. across timeslots and is upper bounded by a constant value A_max. When a packet is transmitted by a node n in each timeslot, it can be simultaneously overheard by all the nodes in K_n (“multi-cast effect”).
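To make the block-fading model and Assumption 1 concrete, the following sketch draws i.i.d. realizations of R_nk(τ) = log2(1 + γ) under Rayleigh fading (SNR γ exponentially distributed, an illustrative channel choice consistent with the footnote on Assumption 1) and estimates F_{R_nk}(H_0) by Monte Carlo. Function names and parameter values are illustrative:

```python
import math
import random

def draw_rate(mean_snr: float) -> float:
    """One i.i.d. block-fading realization of R_nk(tau) = log2(1 + gamma),
    assuming Rayleigh fading, i.e. SNR gamma ~ Exponential with the given mean."""
    gamma = random.expovariate(1.0 / mean_snr)
    return math.log2(1.0 + gamma)

def outage_prob(h0: float, mean_snr: float, trials: int = 20000) -> float:
    """Monte Carlo estimate of F_{R_nk}(H_0) = P{R_nk(tau) < H_0}."""
    return sum(draw_rate(mean_snr) < h0 for _ in range(trials)) / trials

random.seed(1)
f = outage_prob(h0=1.0, mean_snr=2.0)
# Analytically P{gamma < 1} = 1 - exp(-1/2) ~ 0.39, strictly between 0 and 1,
# so Assumption 1 holds for this channel.
assert 0.0 < f < 1.0
```

With H_0 = 1 bit and mean SNR 2, a single-slot transmission fails roughly 39% of the time, which is exactly the regime where accumulating partial information across slots pays off.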
In this network model, a transmission of a packet over a link (n, k) can be interpreted as a process in which a new copy of the packet is created at the receiving node k while the original copy of the packet is retained at the transmitting node n; correspondingly, the multi-cast effect indicates that multiple copies of the same packet can be created at multiple receiving nodes simultaneously. However, after each forwarding decision is made among the nodes holding a complete copy of the packet (including the transmitting node and the successful receiving nodes), only the one node that gets the forwarding responsibility keeps the packet, while the others discard their copies. Here the forwarding decision can be either to choose a successful receiver to forward the packet to or to retain the packet at the transmitting node itself. Thus, defining b_nk^(c)(τ) as the number of packets of commodity c that flow from node n to node k ∈ K_n in timeslot τ, we have b_nk^(c)(τ) ∈ {0, 1} and Σ_{c ∈ N} Σ_{k ∈ K_n} b_nk^(c)(τ) ≤ 1, for all n ∈ N.

2.1.1 Mutual Information Accumulation Technique

Ref. [38] analyzes routing algorithms implemented based on REP, i.e., in each transmission, the packet either is successfully received at another node or has to be completely re-transmitted in a later timeslot. As described in Section 1.3, we propose to avoid the inefficiency of complete retransmission by enabling the MIA technique in the transmission scheme, e.g., by using Fountain codes.
In our scenario, we assume that each link uses a capacity-achieving coding scheme, so that a packet is received correctly in timeslot τ if the amount of partial information of the packet received by the end of timeslot τ exceeds the entropy of the packet H_0, i.e., a successful transmission from node n to node k in timeslot τ occurs when log_2(1 + γ_nk(τ)) + I_k(τ) ≥ H_0, where γ_nk(τ) is the SNR over link (n, k) in timeslot τ, whose distribution depends on the average channel state of link (n, k), and I_k(τ) is the pre-accumulated partial information before timeslot τ, i.e., the amount of partial information of the corresponding packet already accumulated at the receiving node k by timeslot τ − 1. Moreover, although each receiving node may simultaneously overhear signals transmitted from multiple neighbor nodes, we assume that there is no inter-channel interference among these signals and that the successful reception of each signal is independent of the signals transmitted over other links. [Footnote: The assumption on R_nk(τ) and F_{R_nk}(H_0) is mild and reasonable, since it is consistent with many practical wireless scenarios; for example, Rayleigh fading channels and Rice fading channels (see Ref. [1]) satisfy it. While the no-interference assumption is not practically realizable in wireless scenarios unless orthogonal channels are used, it is a standard assumption in the literature on stochastic network optimization for wireless networks [14][38].]

[Figure 2.1: Timing diagram of the working protocol within one timeslot, showing the exchange between sender and receiver of control instructions, data, the final instruction, and ACK/NACK, together with the queuing dynamic.]

To implement routing with the MIA technique, each node sets up two kinds of queues: the compact packet queue (CPQ) and the partial packet queue (PPQ).
CPQs are FIFO (first-in, first-out) buffers storing the packets that have already been decoded, categorized by the packets' commodities, while the pieces of partial information stored in the PPQ are distinguished by the packets they belong to. As soon as the partial information of a specific packet accumulated in the PPQ of a receiving node exceeds the entropy of that packet, the packet is decoded and moved out of the PPQ, and then put into the CPQ if this node gets the forwarding responsibility, or discarded otherwise.

2.1.2 Timing Diagram in One Timeslot and Queuing Dynamics

The timing diagram of the communication protocol between each pair of sending node (sender) and receiving node (receiver) within one timeslot is illustrated in Fig. 2.1, which resembles the protocol in [38]. As shown in this figure, at the beginning of each timeslot, the transmitting node and receiving node exchange control instructions, which include the backlog information of the CPQs. Then the transmitting node decides which commodity to transmit, or whether to stay silent, in timeslot τ. After the decision is made, the data transmission starts and lasts for a fixed time period (less than the timeslot length), during which the coded bits of a packet with entropy H_0 are transmitted and overheard by the receiving node(s). After the data transmission period ends, each receiving node sends an ACK/NACK signal back to the transmitting node through a stable control channel, indicating whether the packet was successfully decoded (as will be shown in Section 2.3, each receiving node under the proposed DIVBAR-FMIA algorithm sends two kinds of ACK/NACK signals back to the transmitting node). Based on the ACK/NACK signals gathered from all the receiving nodes, node n may make the forwarding decision on which successful receiver to transfer the forwarding responsibility to, or whether to retain the forwarding responsibility.
This decision is related to the transmission scheme being used (described in Subsection 2.1.3). If a forwarding decision is made, a final instruction carrying the forwarding decision is sent to the receiving nodes through the control channel; otherwise, no final instruction is sent. The queuing dynamics over each timeslot follow the above timing diagram. Let Q_n^(c)(τ) represent the backlog of the CPQ of commodity c at node n in timeslot τ. The backlog of commodity c at node n is updated over timeslot τ as follows:

Q_n^(c)(τ + 1) ≤ max{ Q_n^(c)(τ) − Σ_{k ∈ K_n} b_nk^(c)(τ), 0 } + Σ_{k ∈ K_n} b_kn^(c)(τ) + a_n^(c)(τ),   (2.1)

where the term Σ_{k ∈ K_n} b_nk^(c)(τ) is the total output rate; Σ_{k ∈ K_n} b_kn^(c)(τ) is the total endogenous input rate flowing from the neighbor nodes; and a_n^(c)(τ) is the exogenous input rate. The expression in (2.1) is an inequality instead of an equality because the endogenous input rate of data-carrying packets may be less than Σ_{k ∈ K_n} b_kn^(c)(τ). This occurs if a neighbor node k ∈ K_n has no data of commodity c to send (its CPQ of commodity c is empty) while the decision made by node k under an algorithm (policy) is to send a commodity c packet; node k then sends a fake commodity c packet that is counted into Σ_{k ∈ K_n} b_kn^(c)(τ) but not into Q_n^(c)(τ + 1).

2.1.3 The RMIA and FMIA transmission schemes

For each transmitting node n ∈ N in the network, under an arbitrary algorithm using the MIA technique, define an epoch for node n as the sequence of timeslots that node n uses to transmit a copy of a packet: in the sequence of timeslots in which node n transmits packets of a commodity c, an epoch of commodity c starts from the first timeslot after a forwarding decision is made for a previous copy of a commodity c packet and ends at the timeslot in which the forwarding decision is made for the current copy of the commodity c packet.
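The one-slot CPQ update in (2.1) is easy to sketch directly. This minimal illustration uses aggregate per-slot counts (names are illustrative); it implements the right-hand side of (2.1), i.e. the upper bound, with only genuine arrivals counted:

```python
def update_backlog(q: int, b_out: int, b_in: int, a_exo: int) -> int:
    """One-slot CPQ update for commodity c at node n, following (2.1):
    Q(tau+1) = max(Q(tau) - sum_k b_nk(tau), 0) + sum_k b_kn(tau) + a_n(tau).
    (2.1) is an inequality because some endogenous arrivals may be 'fake'
    packets from neighbors with empty queues; here b_in counts only
    genuine packet arrivals."""
    return max(q - b_out, 0) + b_in + a_exo

q = 3
q = update_backlog(q, b_out=1, b_in=0, a_exo=2)  # 3 - 1 + 2 = 4
q = update_backlog(q, b_out=1, b_in=1, a_exo=0)  # 4 - 1 + 1 = 4
# The max(., 0) matters: an empty queue cannot go negative.
empty = update_backlog(0, b_out=1, b_in=0, a_exo=0)  # stays 0
```

Note the outer max: the departure is clipped before arrivals are added, which is the standard queueing convention used throughout the Lyapunov analysis.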
In this chapter, we propose two transmission schemes based on the MIA technique: the Renewal Mutual Information Accumulation (RMIA) transmission scheme and the Full Mutual Information Accumulation (FMIA) transmission scheme. Under both schemes, each receiving node accumulates the received partial information to decode the corresponding packet. A key difference between the two schemes is the renewal operation:

RMIA: once a transmitting node n confirms that one or more receiving nodes in K_n have, for the first time, successfully decoded the copy of the packet being transmitted, node n makes the forwarding decision, which marks the end of an epoch. Immediately after the forwarding decision is made, all the partial information of this packet accumulated at each receiving node is cleared; this is the renewal operation. The timeslot in which the first successful reception occurs is called the first-decoding timeslot, and the set of successful receiving nodes is called the first successful receiver set.

FMIA: after the forwarding decision is made (end of an epoch), instead of clearing the partial information of the copy of the packet being transmitted, each receiving node k ∈ K_n that has not decoded the packet retains the partial information and may use it for decoding if node k overhears another copy of the same packet in later transmissions. Here the forwarding decisions made by each node n are taken in the first-decoding timeslots as under RMIA. In general, when to make the forwarding decision is part of the scheduling; however, the RMIA transmission scheme analyzed in this chapter has a fixed rule on when to make the forwarding decision, which effectively restricts the policy space, and the theory with RMIA in this chapter accordingly focuses on this restricted policy space.
The case of a general policy space, in which the time of making the forwarding decision is also controlled, is one of the future research directions. That is, once one or more receiving nodes first accumulate enough partial information to decode the packet, without using any partial information retained before the transmission of the current copy began, a forwarding decision is made.

2.2 Network Capacity Region with Renewal Mutual Information Accumulation

In this section, we characterize the throughput potential of a stationary wireless ad-hoc network. For a multi-hop, multi-commodity network, let (λ_n^(c)) represent the matrix of exogenous time-average input rates, where each entry λ_n^(c) = E{a_n^(c)(τ)}. Let Y_n^(c)(t) represent the number of packets with source node n that have been successfully delivered to destination node c within the first t timeslots. Then a routing algorithm is rate stable if

lim_{t→∞} Y_n^(c)(t) / t = λ_n^(c), with prob. 1, ∀ n, c ∈ N.   (2.2)

With the above definitions, Ref. [38] defines the network capacity region as the set of all exogenous input rate matrices (λ_n^(c)) that can be stably supported by the network using some rate-stable routing algorithm. However, considering the effect of the transmission scheme on the throughput performance, in this chapter we specify the network capacity region for each transmission scheme. For example, all the algorithms discussed in Ref. [38] are based on the Repetition (REP) transmission scheme, and therefore we refer to the network capacity region defined in Ref. [38] as the REP network capacity region, denoted as Λ_REP. In our work, we define the RMIA network capacity region, denoted as Λ_RMIA, as the set of all exogenous input rate matrices that can be stably supported by the network using some rate-stable routing algorithm with the RMIA transmission scheme. Throughout this chapter, most of the analysis is based on the network's d-timeslot average Lyapunov drift, which is defined as

Δ_d(Q(t_0)) = (1/d) Σ_{n,c} E_ω{ [Q_n^(c)(t_0 + d)]^2 − [Q_n^(c)(t_0)]^2 | Q(t_0) },   (2.3)

where d ≥ 1 is a positive interval length (in units of timeslots); t_0 is an arbitrary timeslot; the vector Q(t_0) represents the CPQ backlog state of the network in timeslot t_0; and E_ω is the expectation operator taken over ω, which represents any realization of the ensemble of channel states, exogenous packet arrivals, and policy decisions of the whole network over the whole time horizon. In the rest of the chapter, the expectation operator E stands for E_ω for notational simplicity. With the definition of the d-timeslot average Lyapunov drift, the following lemma will be used in the proofs of later theorems:

Lemma 1. If there exist a constant ε > 0 and an integer d > 0 such that, for each timeslot t_0 and backlog state Q(t_0) of the network in timeslot t_0, the d-timeslot average Lyapunov drift satisfies

Δ_d(Q(t_0)) ≤ B_0(d) − ε Σ_{n,c} Q_n^(c)(t_0),   (2.4)

where B_0(d) is a constant depending on d, then the mean time-average backlog of the whole network satisfies

lim sup_{t→∞} (1/t) Σ_{τ=0}^{t−1} Σ_{n,c} E{Q_n^(c)(τ)} ≤ B_0(d) / ε.   (2.5)

The proof of Lemma 1 is shown in Appendix A.1. A multi-hop, multi-commodity network is strongly stable when the mean time-average total backlog is finite, as in (2.5). Let F^(m)_{R_nk}(x) represent the cdf of Σ_{τ=1}^{m} R_nk(τ), where F^(1)_{R_nk}(x) = F_{R_nk}(x), and let F^(0)_{R_nk}(x) = 1. Then we have the following lemma characterizing the basic statistical properties of the flow rates:

Lemma 2. Based on Assumption 1, we have the following relations:

F^(m)_{R_nk}(H_0) < F^(m−1)_{R_nk}(H_0) F_{R_nk}(H_0), for m ≥ 2;   (2.6)

F^(m)_{R_nk}(H_0) < [F_{R_nk}(H_0)]^m, for m ≥ 2;   (2.7)

F^(m)_{R_nk}(H_0) < F^(m′)_{R_nk}(H_0), for m > m′ ≥ 0.   (2.8)

The proof of Lemma 2 is in Appendix A.2.

2.2.1 The network capacity region with RMIA

To begin with, we re-state the characterization of the REP network capacity region derived in Ref. [38]:

Theorem 1.
The (REP) network capacity region $\Lambda_{\mathrm{rep}}$ consists of all the exogenous time-average input rate matrices $(\lambda_n^{(c)})$, for each of which there exists a stationary randomized policy (with REP), denoted as Policy $\pi$, that chooses probabilities $\alpha_n^{(c),\pi}$ and $\theta_{nk}^{(c),\pi}(\Omega_n)$, and forms the time average flow rates taking values $b_{nk}^{(c),\pi}$ with prob. 1, for all nodes $n, c \in \mathcal{N}$, $k \in \mathcal{K}_n$, and all subsets $\Omega_n \subseteq \mathcal{K}_n$, such that:
\[
  b_{nk}^{(c),\pi} \ge 0, \quad b_{cn}^{(c),\pi} = 0, \quad b_{nn}^{(c),\pi} = 0, \quad \text{for } n \ne c; \tag{2.9}
\]
\[
  \sum_{k \in \mathcal{K}_n} b_{kn}^{(c),\pi} + \lambda_n^{(c)} \le \sum_{k \in \mathcal{K}_n} b_{nk}^{(c),\pi}, \quad \text{for } n \ne c; \tag{2.10}
\]
\[
  b_{nk}^{(c),\pi} = \alpha_n^{(c),\pi} \sum_{\Omega_n \subseteq \mathcal{K}_n} q_{n,\Omega_n}^{\mathrm{rep}}\, \theta_{nk}^{(c),\pi}(\Omega_n), \tag{2.11}
\]
where $\alpha_n^{(c),\pi}$ is the probability that node $n$ decides to transmit a packet of commodity $c$ in each timeslot; $q_{n,\Omega_n}^{\mathrm{rep}}$ is the probability that $\Omega_n$ is the successful receiver set for a packet transmitted by node $n$ with REP; and $\theta_{nk}^{(c),\pi}(\Omega_n)$ is the conditional probability that node $n$ forwards a packet of commodity $c$ to node $k$, given that the successful receiver set is $\Omega_n$.

In Theorem 1, the REP transmission scheme is used in Policy $\pi$, which belongs to the category of stationary randomized policies, under which each node $n$ uses a fixed probability to choose each commodity to transmit in each timeslot (the transmission decision) and a fixed probability to forward the decoded packet to each successful receiver (the forwarding decision). The superscript $\pi$ of a variable indicates that the variable is specified by Policy $\pi$; in the later part of the chapter, we use a similar notational convention for the variables specified by other specific policies. The detailed proof of Theorem 1 is given in Ref. [38]. In our work, an analogous statement can be made to characterize the RMIA network capacity region:

Theorem 2.
For a network satisfying Assumption 1, the RMIA network capacity region $\Lambda_{\mathrm{rmia}}$ consists of all the exogenous time-average input rate matrices $(\lambda_n^{(c)})$, for each of which there exists a stationary randomized policy (with RMIA), denoted as Policy $\pi'$, that chooses probabilities $\alpha_n^{(c),\pi'}$ and $\theta_{nk}^{(c),\pi'}(\Omega_n)$, and forms the time average flow rates taking values $b_{nk}^{(c),\pi'}$ with prob. 1, for all nodes $n, c \in \mathcal{N}$, $k \in \mathcal{K}_n$, and all the nonempty subsets $\Omega_n \subseteq \mathcal{K}_n$, such that:
\[
  b_{nk}^{(c),\pi'} \ge 0, \quad b_{cn}^{(c),\pi'} = 0, \quad b_{nn}^{(c),\pi'} = 0, \quad \text{for } n \ne c; \tag{2.12}
\]
\[
  \sum_{k \in \mathcal{K}_n} b_{kn}^{(c),\pi'} + \lambda_n^{(c)} \le \sum_{k \in \mathcal{K}_n} b_{nk}^{(c),\pi'}, \quad \text{for } n \ne c; \tag{2.13}
\]
\[
  b_{nk}^{(c),\pi'} = \alpha_n^{(c),\pi'} \mu_n^{\mathrm{rmia}} \sum_{\Omega_n \subseteq \mathcal{K}_n,\ \Omega_n \ne \emptyset} q_{n,\Omega_n}^{\mathrm{rmia}}\, \theta_{nk}^{(c),\pi'}(\Omega_n), \tag{2.14}
\]
where $\alpha_n^{(c),\pi'}$ is the probability that node $n$ decides to transmit a packet of commodity $c$ in each timeslot; $\mu_n^{\mathrm{rmia}}$ is the inverse of the expected epoch length for node $n$;⁴ $q_{n,\Omega_n}^{\mathrm{rmia}}$ is the probability that $\Omega_n$ is the first successful receiver set in each epoch for node $n$; and $\theta_{nk}^{(c),\pi'}(\Omega_n)$ is the conditional probability that node $n$ forwards a packet of commodity $c$ to node $k$, given that the first successful receiver set is $\Omega_n$.

The detailed proof of Theorem 2 consists of a necessity part and a sufficiency part, which are shown in Appendix A.3 and Appendix A.4, respectively. The necessity part is proven by showing that the constraints (2.12)-(2.14) are necessary for network stability; the sufficiency part is proven by showing that strong stability is achieved under Policy $\pi'$ with an input rate matrix $(\lambda_n^{(c)})$ interior to $\Lambda_{\mathrm{rmia}}$.

2.2.2 Network capacity region: RMIA versus REP

In contrast with REP, RMIA potentially increases the success probability of a transmission attempt over each link in the network by using the pre-accumulated information. Specifically, when a node $n$ transmits a copy of a packet, if using RMIA, the first successful receiver set is $\Omega_n \subseteq \mathcal{K}_n$ in the first-decoding timeslot, while if using REP with the same channel realizations, the successful receiver set in that timeslot is $\Omega'_n \subseteq \mathcal{K}_n$ instead, and we have $\Omega'_n \subseteq \Omega_n$.
In particular, $\Omega_n$ is non-empty, while $\Omega'_n$ may be empty in that timeslot; $\Omega'_n$ must be empty in the timeslots before the copy of the packet is first decoded with RMIA, i.e., the decoding timeslots (when $\Omega'_n \ne \emptyset$) for node $n$ with REP form a subset of the first-decoding timeslots for node $n$ with RMIA. Based on these facts, we define $q_{n,\Omega'_n,\Omega_n}^{\mathrm{rep,rmia}}$ as the probability that the first successful receiver set for node $n$ is $\Omega_n$ in the first-decoding timeslot of an epoch with RMIA, while the successful receiver set for node $n$ in the same timeslot is $\Omega'_n$ with REP. $q_{n,\Omega'_n,\Omega_n}^{\mathrm{rep,rmia}}$ is used in the proof of the following theorem, which shows that the RMIA network capacity region covers the REP network capacity region:

Theorem 3. For a network satisfying Assumption 1 and any input rate matrix $(\lambda_n^{(c)}) \in \Lambda_{\mathrm{rep}}$, there exists a stationary randomized policy with RMIA that can stably support $(\lambda_n^{(c)})$, which indicates that $\Lambda_{\mathrm{rep}} \subseteq \Lambda_{\mathrm{rmia}}$.

The detailed proof is in Appendix A.5. With Theorem 3, given a non-zero input rate matrix $(\lambda_n^{(c)}) \in \Lambda_{\mathrm{rep}}$ stably supported by Policy $\pi$ with REP, we can further quantitatively characterize the (time average) input rate increase that can be guaranteed by using RMIA instead of REP over a simple path $l_{n_0,c_0}$ from a source node $n_0$ to a destination node $c_0$, along which a positive time average flow has been formed under Policy $\pi$.⁵ This characterization is summarized as the following theorem:

Theorem 4. For a network satisfying Assumption 1, if an input rate matrix $(\lambda_n^{(c)}) \in \Lambda_{\mathrm{rep}}$ has a positive entry $\lambda_{n_0}^{(c_0)}$, where $(n_0, c_0)$ is a source-destination pair, then there exists an input rate matrix $(\lambda'^{(c)}_n(l_{n_0,c_0})) \in \Lambda_{\mathrm{rmia}}$, such that its $(n,c)$th entry satisfies
\[
  \lambda'^{(c)}_n(l_{n_0,c_0}) =
  \begin{cases}
    \lambda_{n_0}^{(c_0)} + \delta_{l_{n_0,c_0}}^{(c_0)}, & \text{if } n = n_0,\ c = c_0, \\
    \lambda_n^{(c)}, & \text{otherwise}.
  \end{cases} \tag{2.15}
\]
In (2.15), we have $\delta_{l_{n_0,c_0}}^{(c_0)} = \min_{(n,k)\in l_{n_0,c_0}} \{\delta_{nk}^{(c_0)}\}$, in which
\[
  \delta_{nk}^{(c_0)} =
  \begin{cases}
    \alpha_n^{(c_0),\pi} \mu_n^{\mathrm{rmia}} \left( \displaystyle\sum_{\Omega_n:\, k \in \Omega_n} q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}} + \sum_{\Omega_n:\, \{k,\, p_{l_{n_0,c_0}}(n)\} \subseteq \Omega_n} q_{n,\{p_{l_{n_0,c_0}}(n)\},\Omega_n}^{\mathrm{rep,rmia}} \right), & \text{if } n \ne n_0, \\[2ex]
    \alpha_{n_0}^{(c_0),\pi} \mu_{n_0}^{\mathrm{rmia}} \displaystyle\sum_{\Omega_{n_0}:\, k \in \Omega_{n_0}} q_{n_0,\emptyset,\Omega_{n_0}}^{\mathrm{rep,rmia}}, & \text{if } n = n_0,
  \end{cases} \tag{2.16}
\]
where $p_{l_{n_0,c_0}}(n)$ is the predecessor node of node $n$ on path $l_{n_0,c_0}$.

The intuition of Theorem 4 is based on characterizing the flow increase over a link $(n,k)$ on path $l_{n_0,c_0}$: in the first-decoding timeslots (with RMIA) when $\Omega'_n = \emptyset$ or $\Omega'_n = \{p_{l_{n_0,c_0}}(n)\}$ with REP, while $\Omega_n$ is nonempty and $k \in \Omega_n$ or $\{k, p_{l_{n_0,c_0}}(n)\} \subseteq \Omega_n$ with RMIA, the transmitting node $n$ retains the packet in these timeslots with REP,⁶ while it can forward the packet to node $k$ with RMIA. Therefore, a time average flow increase can be obtained on link $(n,k)$. The detailed proof is shown in Appendix A.6.

With the result of Theorem 4, it immediately follows that the RMIA network capacity region is strictly larger than the REP network capacity region under certain mild assumptions:

⁴ With Assumption 1, the expectation of the epoch length for each node exists.
⁵ A path having a positive time average flow under a stationary randomized policy is defined as a path on which each link has a positive time average flow rate (with prob. 1) under the policy.
⁶ Because the time average packet flow over path $l_{n_0,c_0}$ is positive under the policy with REP, the policy with REP can be assumed to assign zero forwarding probability on the reverse link $(n, p_{l_{n_0,c_0}}(n))$; otherwise, a time average flow loop would be formed between node $n$ and node $p_{l_{n_0,c_0}}(n)$, which can be eliminated without affecting the net flow value.

Corollary 1. For a network satisfying Assumption 1, if $\Lambda_{\mathrm{rep}} \ne \{O_{N\times N}\}$, the RMIA network capacity region $\Lambda_{\mathrm{rmia}}$ is strictly larger than the REP network capacity region $\Lambda_{\mathrm{rep}}$, i.e., $\Lambda_{\mathrm{rmia}} \supset \Lambda_{\mathrm{rep}}$.

Corollary 1 demonstrates that the network has the potential of supporting larger input data rates by using RMIA instead of REP; its proof is given in Appendix A.7.

2.3 Diversity Backpressure Routing Algorithms with Mutual Information Accumulation

In this section, we propose two routing algorithms using the MIA technique: DIVBAR-RMIA and DIVBAR-FMIA, which use the RMIA and FMIA transmission schemes, respectively, in order to further enhance the throughput performance in comparison with the traditional DIVBAR algorithm with the REP transmission scheme in [38].

2.3.1 Diversity Backpressure Routing with Renewal Mutual Information Accumulation (DIVBAR-RMIA)

We summarize the implementation of the DIVBAR-RMIA algorithm at each node $n$ for its $i$th epoch in the following steps, where the variables in the form $\hat{x}$ are specified by DIVBAR-RMIA:

1. In the starting timeslot $\hat{u}_{n,i}$ of each epoch $i$ for the transmitting node $n$, node $n$ observes the CPQ backlog of each commodity $c \in \mathcal{N}$ at each of its potential receivers $k \in \mathcal{K}_n$. With its own CPQ backlogs, node $n$ computes the differential backlog coefficients $\hat{W}_{nk}^{(c)}(\hat{u}_{n,i})$ as follows:
\[
  \hat{W}_{nk}^{(c)}(\hat{u}_{n,i}) = \max\left\{ \hat{Q}_n^{(c)}(\hat{u}_{n,i}) - \hat{Q}_k^{(c)}(\hat{u}_{n,i}),\ 0 \right\}. \tag{2.17}
\]

2. For each commodity $c$, the (receiving) nodes in $\mathcal{K}_n$ are ranked according to their differential backlog coefficients, sorted in descending order.
Define $\hat{\mathcal{R}}_{nk}^{\mathrm{high},(c)}(\hat{u}_{n,i})$ and $\hat{\mathcal{R}}_{nk}^{\mathrm{low},(c)}(\hat{u}_{n,i})$ respectively as the sets of nodes in $\mathcal{K}_n$ with higher and lower ranks than node $k \in \mathcal{K}_n$ in timeslot $\hat{u}_{n,i}$. Define $\hat{\varphi}_{nk}^{(c)}(i)$ as the probability that, in a first-decoding timeslot, node $k \in \mathcal{K}_n$ belongs to the first successful receiver set, while the nodes in $\hat{\mathcal{R}}_{nk}^{\mathrm{high},(c)}(\hat{u}_{n,i})$ do not successfully decode, i.e., node $k$ has the highest priority among the successful receivers in the first successful receiver set.

3. Define $\hat{c}_n(i)$ as the optimal commodity that maximizes the following backpressure metric:
\[
  \sum_{k \in \mathcal{K}_n} \hat{W}_{nk}^{(c)}(\hat{u}_{n,i})\, \hat{\varphi}_{nk}^{(c)}(i). \tag{2.18}
\]
Define $\hat{\eta}_n(i)$ as the corresponding maximum value:
\[
  \hat{\eta}_n(i) = \sum_{k \in \mathcal{K}_n} \hat{W}_{nk}^{(\hat{c}_n(i))}(\hat{u}_{n,i})\, \hat{\varphi}_{nk}^{(\hat{c}_n(i))}(i). \tag{2.19}
\]

4. If $\hat{\eta}_n(i) > 0$, node $n$ chooses a packet stored at the head of its CPQ of commodity $\hat{c}_n(i)$ to transmit for epoch $i$: node $n$ keeps transmitting a copy of the packet of commodity $\hat{c}_n(i)$ with the MIA technique in a contiguous sequence of timeslots, starting from timeslot $\hat{u}_{n,i}$ until the first-decoding timeslot, in which one or more nodes in $\mathcal{K}_n$ first accumulate enough partial information and decode the packet (this can be detected by checking the ACK/NACK feedbacks in each timeslot); else, node $n$ transmits a null packet for epoch $i$ with the same procedure, representing a silent epoch.

5. In the first-decoding timeslot (after the data transmission period in this timeslot), the forwarding decision is made: if the transmitted packet is not null, node $n$ finds the successful receiver $\hat{k}(i)$ with the largest differential backlog coefficient $\hat{W}_{n\hat{k}(i)}^{(\hat{c}_n(i))}(\hat{u}_{n,i})$ and checks the coefficient's value. If $\hat{W}_{n\hat{k}(i)}^{(\hat{c}_n(i))}(\hat{u}_{n,i}) > 0$, node $n$ shifts the forwarding responsibility to node $\hat{k}(i)$, while the other successful receivers discard their copies of the packet; else, node $n$ retains the forwarding responsibility, while all the successful receivers discard their copies of the packet. If the transmitted packet is null, each successful receiver discards its copy of the null packet.

6. By the end of the first-decoding timeslot, all the partial information accumulated at the nodes in $\mathcal{K}_n$ is cleared.

In step 3) of the above summary of the DIVBAR-RMIA algorithm, $\hat{\varphi}_{nk}^{(c)}(i)$ can be computed with knowledge of the average CSI:
\[
  \hat{\varphi}_{nk}^{(c)}(i) = \sum_{m=1}^{\infty} \left[ F_{R_{nk}}^{(m-1)}(H_0) - F_{R_{nk}}^{(m)}(H_0) \right] \prod_{j \in \hat{\mathcal{R}}_{nk}^{\mathrm{high},(c)}(\hat{u}_{n,i})} F_{R_{nj}}^{(m)}(H_0) \prod_{j \in \hat{\mathcal{R}}_{nk}^{\mathrm{low},(c)}(\hat{u}_{n,i})} F_{R_{nj}}^{(m-1)}(H_0). \tag{2.20}
\]
The DIVBAR-RMIA algorithm is a distributed algorithm because each node $n$ makes local scheduling and routing decisions based on the CPQ backlog information of itself and of its neighbor nodes in $\mathcal{K}_n$, together with the average channel state information of its outgoing links.

2.3.2 Diversity Backpressure Routing with Full Mutual Information Accumulation (DIVBAR-FMIA)

In contrast with DIVBAR-RMIA, DIVBAR-FMIA does not perform the "regular" renewal operations at the end of each epoch, but retains the partial information of each packet in the network until the packet has been delivered to its destination. Retaining the partial information can further facilitate the decoding of this packet if the retaining receiving node overhears later transmissions of this packet, possibly from a different transmitting node. In this chapter, we propose a version of the DIVBAR-FMIA algorithm that is associated with the DIVBAR-RMIA algorithm, in that it is set to operate in epochs synchronized with DIVBAR-RMIA, i.e., each node makes its forwarding decisions under DIVBAR-FMIA in the same timeslots as it would under DIVBAR-RMIA.

Define $\hat{\hat{p}}_{n,i}$ as the packet being transmitted by node $n$ in its $i$th epoch under DIVBAR-FMIA. Let $\hat{\hat{I}}_k^{\mathrm{pre}}(\hat{\hat{p}}_{n,i}, \hat{\hat{u}}_{n,i})$ represent the amount of pre-accumulated partial information of packet $\hat{\hat{p}}_{n,i}$ stored at node $k$ before timeslot $\hat{\hat{u}}_{n,i}$.
Additionally, define $\hat{\hat{I}}_{nk}^{\mathrm{rmia}}(\hat{\hat{p}}_{n,i}, \hat{\hat{u}}_{n,i}, \tau)$ as the amount of partial information of packet $\hat{\hat{p}}_{n,i}$ accumulated purely by the transmissions in the time interval from timeslot $\hat{\hat{u}}_{n,i}$ to timeslot $\tau$, where timeslot $\tau$ is in epoch $i$ for node $n$. The DIVBAR-FMIA algorithm for each node $n$ and each epoch $i$ is then summarized as follows, where we denote its specified variables in the form $\hat{\hat{x}}$:

1. At the beginning of timeslot $\hat{\hat{u}}_{n,i}$, node $n$ executes steps similar to steps 1) - 4) of DIVBAR-RMIA, based on the observations of $\hat{\hat{\mathbf{Q}}}(\hat{\hat{u}}_{n,i})$ and the $\hat{\hat{\varphi}}_{nk}^{(c)}(i)$, where $k \in \mathcal{K}_n$, in order to make the transmission decision on whether to choose a packet $\hat{\hat{p}}_{n,i}$ stored at the head of its CPQ of commodity $\hat{\hat{c}}_n(i)$ to transmit for epoch $i$.

2. On the receiver side, in each timeslot $\tau$ of epoch $i$, each receiving node $k$ sends two feedback signals back to node $n$ after the data transmission period in timeslot $\tau$: $(\mathrm{ACK/NACK})_{\mathrm{FMIA}}$ and $(\mathrm{ACK/NACK})_{\mathrm{RMIA}}$. $(\mathrm{ACK/NACK})_{\mathrm{FMIA}}$ indicates whether node $k$ successfully decodes the packet with the FMIA transmission scheme, which is true if $\hat{\hat{I}}_k^{\mathrm{pre}}(\hat{\hat{p}}_{n,i}, \hat{\hat{u}}_{n,i}) + \hat{\hat{I}}_{nk}^{\mathrm{rmia}}(\hat{\hat{p}}_{n,i}, \hat{\hat{u}}_{n,i}, \tau) \ge H_0$; $(\mathrm{ACK/NACK})_{\mathrm{RMIA}}$ indicates whether the partial information accumulated at node $k$ purely during the current epoch would have been enough to decode under the RMIA transmission scheme, which is true if $\hat{\hat{I}}_{nk}^{\mathrm{rmia}}(\hat{\hat{p}}_{n,i}, \hat{\hat{u}}_{n,i}, \tau) \ge H_0$.

3. After gathering all the $(\mathrm{ACK/NACK})_{\mathrm{FMIA}}$ and $(\mathrm{ACK/NACK})_{\mathrm{RMIA}}$ feedbacks from all the receiving nodes in timeslot $\tau$, node $n$ first checks the $(\mathrm{ACK/NACK})_{\mathrm{RMIA}}$ feedbacks to confirm whether there is any receiving node whose accumulated information $\hat{\hat{I}}_{nk}^{\mathrm{rmia}}(\hat{\hat{p}}_{n,i}, \hat{\hat{u}}_{n,i}, \tau)$ exceeds $H_0$. If not, node $n$ keeps transmitting in timeslot $\tau + 1$; otherwise, timeslot $\tau$ is the ending timeslot of epoch $i$ for node $n$ (also the first-decoding timeslot with RMIA), and the forwarding decision is made: if the transmitted packet is not null, node $n$ further checks the gathered $(\mathrm{ACK/NACK})_{\mathrm{FMIA}}$ feedbacks, based on which node $n$ finds the successful receiver $\hat{\hat{k}}(i)$ with the largest differential backlog coefficient $\hat{\hat{W}}_{n\hat{\hat{k}}(i)}^{(\hat{\hat{c}}_n(i))}(\hat{\hat{u}}_{n,i})$ and checks the coefficient's value. If $\hat{\hat{W}}_{n\hat{\hat{k}}(i)}^{(\hat{\hat{c}}_n(i))}(\hat{\hat{u}}_{n,i}) > 0$, node $n$ shifts the forwarding responsibility to node $\hat{\hat{k}}(i)$, while the other successful receivers discard their copies of the packet; else, node $n$ retains the forwarding responsibility, while all the successful receivers discard their copies of the packet. If the transmitted packet is null, each successful receiver discards its copy of the null packet, and each unsuccessful receiver also discards the partial information of the null packet it has received.

In addition to the steps in each epoch shown above, after a packet $\hat{\hat{p}}_{n,i}$ is delivered to its destination, all the partial information of packet $\hat{\hat{p}}_{n,i}$ stored in the network is cleared in order to free up memory.

According to the above algorithm summary, DIVBAR-FMIA is also a distributed algorithm, because each node $n$ makes local scheduling and routing decisions based on the CPQ backlog information of itself and of its neighbor nodes in $\mathcal{K}_n$ and the average channel state information of its outgoing links.
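To make the dual-feedback logic in steps 2) and 3) concrete, the following Python sketch simulates one DIVBAR-FMIA epoch for a single transmitter. It is an illustrative toy model, not the dissertation's implementation; the names (`rates_per_slot`, `pre_info`, `H0`) and the unit-free information amounts are assumptions introduced here.

```python
# Hedged sketch of one DIVBAR-FMIA epoch for a single transmitting node.
def run_fmia_epoch(rates_per_slot, pre_info, H0):
    """rates_per_slot[t][k]: mutual information receiver k accumulates in slot t.
    pre_info[k]: partial information node k retained from earlier epochs (FMIA only).
    Returns (epoch_length, indices of FMIA-successful receivers)."""
    n_receivers = len(pre_info)
    epoch_info = [0.0] * n_receivers  # information accumulated within this epoch only
    for t, slot_rates in enumerate(rates_per_slot, start=1):
        for k in range(n_receivers):
            epoch_info[k] += slot_rates[k]
        # (ACK/NACK)_RMIA: the epoch ends once within-epoch information alone suffices
        if any(info >= H0 for info in epoch_info):
            # (ACK/NACK)_FMIA: forwarding candidates may also use pre-accumulated info
            fmia_ok = [k for k in range(n_receivers)
                       if pre_info[k] + epoch_info[k] >= H0]
            return t, fmia_ok
    return len(rates_per_slot), []  # epoch did not terminate within the given horizon

# Receiver 1 decodes only thanks to its retained partial information (0.3 units).
length, winners = run_fmia_epoch(
    rates_per_slot=[[0.4, 0.3], [0.7, 0.5]], pre_info=[0.0, 0.3], H0=1.0)
```

Since `pre_info[k] >= 0`, every receiver satisfying the RMIA condition also satisfies the FMIA condition, which is exactly the receiver-set containment used in the next section to argue that DIVBAR-FMIA performs at least as well as DIVBAR-RMIA.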
However, the practical implementation of DIVBAR-FMIA is more challenging than that of DIVBAR-RMIA. For instance, efficiently accumulating the partial information of the same packet from two different transmitting nodes requires extra coordination in the implementation of the rateless codes; a notification signal has to be broadcast to inform the nodes in the network to clear the partial information of a delivered packet; and extra storage and an efficient retrieval mechanism for the partial information retained in the PPQs have to be set up. In the later part of the chapter, we assume that these implementation issues have been properly addressed for DIVBAR-FMIA and analyze its throughput performance.

2.4 Performance Analysis

In this section, the performances of DIVBAR-RMIA and DIVBAR-FMIA are evaluated. First, as will be shown in Theorem 5, we prove the throughput optimality of DIVBAR-RMIA among all possible algorithms with RMIA. Second, as will be shown in Theorem 6 and Corollary 3, we prove that DIVBAR-FMIA's throughput performance is at least as good as DIVBAR-RMIA's.

2.4.1 Throughput optimality of DIVBAR-RMIA among all possible policies with RMIA

In this subsection, our goal is to analyze the throughput performance of the DIVBAR-RMIA algorithm and show that it is throughput optimal among all possible policies with RMIA. To begin with, we give an initial analysis of the backpressure metric under DIVBAR-RMIA over a single epoch. Consider a policy under which each epoch consists of contiguous timeslots. Define $Z_n(i, \mathbf{Q}(\tau))$ as the following metric over the $i$th epoch under such a policy:
\[
  Z_n(i, \mathbf{Q}(\tau)) = \sum_{c} \sum_{\tau'=u_{n,i}}^{u_{n,i+1}-1} \sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}(\tau') \left[ Q_n^{(c)}(\tau) - Q_k^{(c)}(\tau) \right], \tag{2.21}
\]
where $u_{n,i}$ is the starting timeslot of the $i$th epoch. Let $\mathcal{P}$ represent the set of policies with RMIA consisting of DIVBAR-RMIA and all the policies having synchronous epochs with DIVBAR-RMIA.
With the definitions of $Z_n(i, \mathbf{Q}(\tau))$ and $\mathcal{P}$, we propose Lemma 3 as follows to demonstrate the origin of the backpressure metric formulation (2.18) in step 3) of the DIVBAR-RMIA algorithm summary.

Lemma 3. For each node $n$, the metric $E\{Z_n(i, \hat{\mathbf{Q}}(u_{n,i})) \mid \hat{\mathbf{Q}}(u_{n,i})\}$ under an arbitrary policy within policy set $\mathcal{P}$ is upper bounded by $\hat{\eta}_n(i)$, and this upper bound is achieved under the DIVBAR-RMIA algorithm.

Lemma 3 characterizes the key feature of DIVBAR-RMIA over a single epoch, and its detailed proof is given in Appendix A.8. Based on Lemma 3, the following theorem shows that strong stability is achieved by DIVBAR-RMIA for any input rate matrix in the interior of $\Lambda_{\mathrm{rmia}}$, which demonstrates that DIVBAR-RMIA is a throughput optimal algorithm among all the algorithms with the RMIA transmission scheme.

Theorem 5. For a network satisfying Assumption 1, DIVBAR-RMIA is throughput optimal among the algorithms with RMIA: for an exogenous input rate matrix $(\lambda_n^{(c)})$, if there exists an $\varepsilon > 0$ satisfying $(\lambda_n^{(c)} + \varepsilon) \in \Lambda_{\mathrm{rmia}}$, then there exists an integer $D > 0$ such that, when implementing the DIVBAR-RMIA algorithm, the mean time average CPQ backlog of the whole network can be upper bounded as follows:
\[
  \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n,c} E\left\{ \hat{Q}_n^{(c)}(\tau) \right\} \le \frac{2\left[B(D) + C(D)\right]}{\varepsilon}, \tag{2.22}
\]
where
\[
  B(D) = N^2 D \left[ 1 + (N + A_{\max})^2 \right], \qquad C(D) = 4ND(N + A_{\max} + 1).
\]
The proof of Theorem 5 is given in Appendix A.9. The proof follows the strategy of comparing the upper bounds (containing the backpressure metrics) formulated for the $D$-timeslot average Lyapunov drift under the DIVBAR-RMIA algorithm and under the stationary randomized policy with RMIA (Policy $\pi'$) that stably supports $(\lambda_n^{(c)})$. With Theorem 5, we claim the following corollary:

Corollary 2. For a network satisfying Assumption 1, if $\Lambda_{\mathrm{rep}} \ne \{O_{N\times N}\}$, the throughput performance of DIVBAR-RMIA is strictly better than that of DIVBAR (with REP).

Proof. According to Theorem 5, $\Lambda_{\mathrm{rmia}}$ is the set of input rate matrices that can be supported by DIVBAR-RMIA. Moreover, by the result of Corollary 1, $\Lambda_{\mathrm{rmia}}$ is strictly larger than $\Lambda_{\mathrm{rep}}$ given Assumption 1 and $\Lambda_{\mathrm{rep}} \ne \{O_{N\times N}\}$. Thus, under the same assumptions, DIVBAR-RMIA has strictly better throughput performance than DIVBAR, since $\Lambda_{\mathrm{rep}}$ is the set of input rate matrices that can be supported by DIVBAR (see [38]).

2.4.2 Throughput performance of DIVBAR-FMIA

Since the proposed DIVBAR-FMIA algorithm is set to have synchronous epochs with DIVBAR-RMIA, with the help of the information pre-accumulated at the receiving nodes by the beginning of each epoch, the successful receiver set at the end of each epoch under DIVBAR-FMIA includes the first successful receiver set under DIVBAR-RMIA. This intuition indicates that the throughput performance of DIVBAR-FMIA should be at least as good as that of DIVBAR-RMIA, and yields the following theorem:

Theorem 6. For a network satisfying Assumption 1, for any $(\lambda_n^{(c)})$ interior to $\Lambda_{\mathrm{rmia}}$, DIVBAR-FMIA yields strong stability of the network: there exists an integer $D > 0$ such that, when implementing the DIVBAR-FMIA algorithm, the mean time average CPQ backlog of the whole network can be upper bounded as follows:
\[
  \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n,c} E\left\{ \hat{\hat{Q}}_n^{(c)}(\tau) \right\} \le \frac{2\left[B(D) + C(D)\right]}{\varepsilon}, \tag{2.23}
\]
where the positive $\varepsilon$ satisfies $(\lambda_n^{(c)} + \varepsilon) \in \Lambda_{\mathrm{rmia}}$.

[Figure 2.2: The ad-hoc network being simulated.]

The proof of Theorem 6 is given in Appendix A.10, and the proof strategy is similar to that of Theorem 5. With Theorem 6, retaining the partial information does not affect the stability of the network under DIVBAR-FMIA either, assuming that the retained partial information of a packet is cleared once the packet is delivered.
To explain the reason, consider that in each timeslot $\tau$, a packet stored at a node $n$ can have at most $N-1$ pieces of partial information stored in the PPQs at the other $N-1$ nodes, and the backlog of each piece of partial information is less than 1 (in units of packets). Therefore, the total PPQ backlog in timeslot $\tau$ is no more than $(N-1)\sum_{n,c}\hat{\hat{Q}}_n^{(c)}(\tau)$, and according to (2.23), the mean time average PPQ backlog is also upper bounded. Moreover, based on Theorem 6, we can further compare DIVBAR-FMIA and DIVBAR-RMIA in throughput performance via the following corollary:

Corollary 3. For a network satisfying Assumption 1, the throughput performance of DIVBAR-FMIA is at least as good as that of DIVBAR-RMIA.

Proof. According to Theorem 6, DIVBAR-FMIA is able to support any input rate matrix within $\Lambda_{\mathrm{rmia}}$, which indicates that any input rate matrix that can be stably supported by DIVBAR-RMIA can also be stably supported by DIVBAR-FMIA, i.e., the throughput performance of DIVBAR-FMIA is at least as good as that of DIVBAR-RMIA.

2.5 Simulations

Example simulations are carried out in the ad-hoc wireless network shown in Fig. 2.2. All the links in the network are independent non-interfering links, each of which is subject to Rayleigh fading (independent among links and timeslots), while the average channel states are static. The number on each link represents the mean SNR value (linear scale) over that link; the time average exogenous input rates $\lambda_1^{(9)}$ and $\lambda_2^{(10)}$ are set to be the same.
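Before turning to the results, the qualitative effect of information accumulation on a single Rayleigh-fading link can be reproduced with a short Monte Carlo experiment. This is a rough sketch under assumed parameters (mean SNR of 3, per-slot rate $\log_2(1 + \mathrm{SNR}\cdot g)$ with exponentially distributed power gain $g$), not the simulation code behind Fig. 2.3.

```python
import math
import random

def slots_to_decode(mean_snr, H0, scheme, rng, max_slots=10000):
    """Timeslots until a packet of entropy H0 (normalized-units) is decoded over a
    Rayleigh-fading link. Per-slot rate: log2(1 + mean_snr * g), g ~ Exp(1) i.i.d."""
    acc = 0.0
    for t in range(1, max_slots + 1):
        rate = math.log2(1.0 + mean_snr * rng.expovariate(1.0))
        acc = acc + rate if scheme == "RMIA" else rate  # REP restarts every slot
        if acc >= H0:
            return t
    return max_slots

def mean_slots(mean_snr, H0, scheme, trials=2000, seed=0):
    rng = random.Random(seed)
    return sum(slots_to_decode(mean_snr, H0, scheme, rng)
               for _ in range(trials)) / trials

for H0 in (1.0, 2.0):
    rep, rmia = mean_slots(3.0, H0, "REP"), mean_slots(3.0, H0, "RMIA")
    print(f"H0={H0}: REP {rep:.2f} slots/packet, RMIA {rmia:.2f} slots/packet")
```

With these assumed parameters, the two schemes need a similar number of slots for $H_0 = 1$, while for $H_0 = 2$ accumulation reduces the per-packet transmission time noticeably, mirroring the trend discussed below for Fig. 2.3.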
[Figure 2.3: Throughput performance comparison among the DIVBAR-RMIA, DIVBAR-FMIA, and DIVBAR algorithms with different packet lengths ($H_0 = 1$ and $H_0 = 2$ normalized-units): time average occupancy (normalized-units) vs. time average input rate (normalized-units/timeslot).]

Simulations are conducted comparing the throughput performance of the three algorithms: DIVBAR-FMIA, DIVBAR-RMIA, and traditional DIVBAR (with REP). Fig. 2.3 shows the time average occupancy (total time average backlog in the network, measured in normalized-units) vs. the exogenous time average input rate, measured in normalized-units/timeslot. Here a normalized-unit has to be long enough (contain a sufficient number of bits) to allow the application of a capacity achieving code. The maximum supportable throughput corresponds to the exogenous time average input rate at which the occupancy grows toward very large values (due to the finite simulation time, it does not approach infinity in our simulations). As shown in the figure, the throughput under the DIVBAR-RMIA algorithm is smaller than that of DIVBAR-FMIA, and the throughput under both algorithms is larger than that of the regular DIVBAR algorithm. These observations are in line with the theoretical analysis.

The simulation of the throughput comparison is carried out under different packet entropy conditions. The entropy contained in each packet is denoted by $H_0$, as shown in the figure. When $H_0 = 1$ normalized-unit, Fig. 2.3 shows that the throughputs under the three algorithms are nearly identical. This phenomenon is caused by the fact that the packet length is small compared to the transmission capability of the links in the network: nodes can usually achieve a successful transmission over a link at the first attempt, so that using the MIA technique in the transmissions brings little benefit. However, as $H_0$ increases to 2 normalized-units, the success probability of a single attempt decreases. Nodes under regular DIVBAR increase the chance of successful transmission only by making more attempts, while DIVBAR-FMIA and DIVBAR-RMIA accumulate information in each attempt, which facilitates future transmissions. Thus the throughput difference between DIVBAR and DIVBAR-RMIA/FMIA becomes evident.

2.6 Conclusions

In this chapter, we proposed two distributed routing algorithms, DIVBAR-RMIA and DIVBAR-FMIA, which exploit the MIA technique for routing in multi-hop, multi-commodity wireless ad-hoc networks with unreliable and not precisely predictable links. After setting up a proper network model, including the queue structure of each network node needed to implement the two proposed transmission schemes, RMIA and FMIA, and the working diagram within each timeslot, we analyzed the throughput potential of the network with RMIA by characterizing the RMIA network capacity region. We proved that, under certain mild assumptions consistent with practical wireless scenarios, it covers and extends the network capacity region with the traditionally used REP transmission scheme. Moreover, under the same assumptions, the proposed DIVBAR-RMIA algorithm is proven to be throughput optimal among all the policies with RMIA, and the proposed DIVBAR-FMIA algorithm has throughput performance at least as good as DIVBAR-RMIA. Therefore, the two proposed algorithms have superior throughput performance compared to the original DIVBAR with REP. This fact is confirmed by simulations.
Chapter 3

Linearization-based Cross-Layer Design for Throughput Maximization in OFDMA Wireless Ad-hoc Networks

In this chapter, joint routing, scheduling and power allocation (JRSPA) for throughput maximization in Orthogonal Frequency Division Multiple Access (OFDMA) wireless ad-hoc networks is explored. OFDMA allows the efficient use of spectrum in frequency-selective fading environments, and is therefore promising for wireless ad-hoc networking. Several papers have explored the cross-layer design of OFDMA-based wireless networks. For instance, resource allocation has been studied in [39] for multi-source, single-destination, two-hop OFDMA relay networks subject to the restriction of exclusive (orthogonal) subchannel access, i.e., each subchannel can only be used once in the whole network. With the same subchannel access restriction, a heuristic approach has been proposed in [40] to maximize the throughput of OFDMA wireless mesh backhaul networks. Ref. [41] extends the framework to OFDMA ad-hoc networks; it allows all links in the network to access each subchannel through Time Division Multiple Access (TDMA). However, the model still assumes that each time-frequency resource can be used only once in the whole network. The original problem is finally transformed into a convex optimization problem.

Similar to [41], we focus on throughput maximization in OFDMA wireless ad-hoc networks, using a similar network model and assuming that the outgoing links of each node are scheduled orthogonally. However, in contrast to [41], the outgoing links from different nodes need not be orthogonal, and thus may cause interference to one another. The motivation for this change in setting is to expand the policy space for choosing the scheduling variables; correspondingly, we can trade off spatial frequency reuse against interference. The main contribution of this chapter is a linearization-based heuristic approach to tackle the new JRSPA problem. Simulation results demonstrate a significant throughput performance enhancement compared to the results of [41], in particular for large networks.

The remainder of this chapter is organized as follows. Section 3.1 describes the system model. Section 3.2 formulates the optimization problem and describes the linearization-based heuristic approach. Simulation results are provided in Section 3.3. The chapter is concluded in Section 3.4.

3.1 System Model

3.1.1 Network Settings

We consider an OFDMA wireless ad-hoc network with $N$ nodes, denoted as set $\mathcal{N}$. The network operates over a frequency band equally divided into $K$ subchannels, denoted as set $\mathcal{K}$. Each subchannel has bandwidth $W$. Let $P_i$ represent the power budget of the $i$th node. We assume that the signal transmitted from any node $i$ can be overheard by all other nodes in the network at various strengths, depending on the channel gains; the network is thus fully connected. Let node pair $(i,j)$ represent the link from node $i$ to node $j$; $g_{ij}^k$ is the channel (power) gain of link $(i,j)$ on subchannel $k$. We assume the $g_{ij}^k$ to be constant during multiple frames of transmission, which is approximately true if the duration of communication is less than the coherence time of the wireless medium. We assume full channel state information (CSI) is known by a centralized coordinator. Each node in the network can serve as source, relay, and/or destination simultaneously; we allow each node to work in full-duplex mode¹, i.e., transmissions and receptions of a node can overlap in time without causing self-interference. Multiple data streams ("commodities"), indexed by $d \in \mathcal{D}$ and intended for different destination nodes, may flow through the network concurrently.

3.1.2 System Constraints

In order to maximize the throughput of the network, we jointly schedule the subchannel usage of each link, route the data streams, and allocate power to each transmission, all subject to the constraints of the different layers.
Scheduling Constraints The scheduling of subchannels is characterized by the set of variablesfc k ij g, wherei andj are the transmitting and receiving node, respectively, andk is the subchannel index. First consider the case where thec k ij are restricted to take binary values: c k ij takes value 1 to indicate the exclusive usage of subchannelk by link (i;j), and 0 otherwise. In the OFDMA ad-hoc network, the constraints on thec k ij are X j:j6=i c k ij 1;8i2N; k2K; (3.1) c k ij 2f0; 1g;8i;j2N; j6=i; k2K: (3.2) Eq. (3.1) restricts each subchannel to be used by at most one of the outgoing links of each node. With the binaryc k ij , the JRSPA problem for a network with a general topology is an integer programming problem that falls into the category of NP-hard problems [39]. To make the problem tractable, we relax the c k ij to take values in the continuous interval [0; 1]. Technologically, this models the application of multiple access techniques among the outgoing links of each transmitting node on each subchannel. Retaining (3.1), 1 Note that recent researches [42] [43] have made full-duplex implementation practically feasible as well as theoretically interesting. The treatment of the half-duplex mode will be subject of future work. 38 constraints (3.2) can then be replaced by 0c k ij 1;8i;j2N; j6=i; k2K: (3.3) Then (3.1) and (3.3) imply that the outgoing links of each node access each subchannel orthogonally. Correspondingly,c k ij represents the fraction that the amount of resource used by link (i;j) on subchannelk is divided by the total amount of resource under the multiple access scheme, e.g., the fraction of the time used under TDMA or the fraction of the spreading codes used under Code Division Multiple Access (CDMA). In later parts of the chapter, we adopt CDMA as the multiple access scheme to analyze, and briefly call the use of CDMA with the constraints (3.1), (3.3) the “local CDMA” scheme. 
The motivation is twofold: a) As will be shown later, the interference can be characterized by the $c_{ij}^k$ and the power allocation variables in closed form, cf. (3.9), which makes our problem tractable. b) For the transmission over each link $(i,j)$, all the signals overheard by node $j$ from nodes other than node $i$ on the same subchannel are whitened as noise. In contrast, under other local multiple access schemes, some interferers can possibly be avoided because they are allocated resources, e.g., timeslots under "local TDMA", orthogonal to those allocated to link $(i,j)$. Therefore, applying local CDMA on each subchannel achieves a lower bound on the throughput of the OFDMA network. We assume that the total number of orthogonal spreading codes generated is sufficiently large that the $c_{ij}^k$ can approximately take any value in the interval $[0, 1]$.

Power Allocation Constraints

Under the local CDMA scheme, let $p_{ij}^k$ represent the amount of power allocated to link $(i,j)$ for the transmission on subchannel $k$ if link $(i,j)$ used all the spreading codes. Then

$$\sum_{k} \sum_{j:\,j \neq i} c_{ij}^k p_{ij}^k \leq P_i, \quad \forall i \in \mathcal{N}, \qquad (3.4)$$

$$p_{ij}^k \geq 0, \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}, \qquad (3.5)$$

where $c_{ij}^k p_{ij}^k$ is the actual amount of power allocated to link $(i,j)$ for the transmission on subchannel $k$. Eq. (3.4) describes the constraint on the power consumed by node $i$, summed over all outgoing links and all subchannels.

Routing Constraints

Let $s_i^{(d)}$ represent the exogenous input rate for commodity $d$ arriving at node $i$; let $x_{ij}^{k(d)}$ represent the data rate for commodity $d$ that flows through link $(i,j)$ on subchannel $k$. Then the $s_i^{(d)}$ and the $x_{ij}^{k(d)}$ satisfy

$$x_{ij}^{k(d)} \geq 0, \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K},\ d \in \mathcal{D}, \qquad (3.6)$$

$$s_i^{(d)} \geq 0, \quad \forall i \in \mathcal{N},\ d \in \mathcal{D},\ i \neq d. \qquad (3.7)$$

Eq. (3.7) does not hold for $i = d$ since $s_d^{(d)}$ can be negative, which represents data flowing out of the network.
Moreover, flow conservation requires that

$$\sum_{j:\,j \neq i} \sum_{k} x_{ij}^{k(d)} - \sum_{j:\,j \neq i} \sum_{k} x_{ji}^{k(d)} - s_i^{(d)} = 0, \quad \forall i \in \mathcal{N},\ d \in \mathcal{D}. \qquad (3.8)$$

Link Capacity Constraints

Consistent with the scheduling constraints shown in (3.1) and (3.3), interference may exist among transmissions over links on the same subchannel that emanate from different nodes. Under the local CDMA scheme, different transmitting nodes use different sets of orthogonal spreading codes created, e.g., by multiplying a set of Walsh-Hadamard codes by m-sequence "scrambling" codes [1]. The result is that orthogonality among the spreading codes used on the outgoing links of each node can be guaranteed, while the spreading codes used by different transmitting nodes are non-orthogonal but distinguishable. The aggregate interference to each link $(i,j)$ on each subchannel $k$ is

$$\sum_{n:\,n \neq i} \sum_{z:\,z \neq n} c_{nz}^k p_{nz}^k g_{nj}^k, \quad \forall i, j \in \mathcal{N},\ k \in \mathcal{K}. \qquad (3.9)$$

Treating interference as noise, it follows that the link capacities satisfy the following non-convex constraints:

$$\sum_{d} x_{ij}^{k(d)} \leq W c_{ij}^k \log_2\!\left(1 + \frac{p_{ij}^k g_{ij}^k}{N_0 W + \sum_{n:\,n \neq i} \sum_{z:\,z \neq n} c_{nz}^k p_{nz}^k g_{nj}^k}\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}, \qquad (3.10)$$

where $N_0 W$ is the power of the additive white Gaussian noise.

3.2 Linearization-based JRSPA Optimization

The goal of this section is to determine the routing, scheduling, and resource allocation variables, i.e., $x_{ij}^{k(d)}$, $c_{ij}^k$, and $p_{ij}^k$, respectively, to maximize the throughput of the network by formulating and approximately solving the JRSPA problem.

3.2.1 Problem Formulation

Throughput maximization can be expressed as maximizing the weighted summation of the supportable exogenous input rates $s_i^{(d)}$ subject to the constraints (3.1), (3.3)-(3.8), and (3.10). The resulting optimization problem can be formulated as

$$\max_{\{s_i^{(d)}\},\,\{c_{ij}^k\},\,\{x_{ij}^{k(d)}\},\,\{p_{ij}^k\}} \ \sum_{d} \sum_{i:\,i \neq d} \omega_i^{(d)} s_i^{(d)} \qquad (3.11a)$$

$$\text{subject to: Eq. (3.1), Eq. (3.3)-(3.8), and Eq. (3.10),} \qquad (3.11b)$$

where the $\omega_i^{(d)}$ are non-negative constant weights.
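To make the interference and rate expressions concrete, the sketch below (my own, not code from the thesis; variable names mirror the notation, the data layout is an assumption) computes the aggregate interference (3.9) and the right-hand side of the capacity bound (3.10) for one link, treating interference as noise exactly as in the constraint.

```python
import math

# Illustrative sketch of Eqs. (3.9)-(3.10) under local CDMA.

def interference(i, j, k, c, p, g, N):
    """Eq. (3.9): power from transmitters n != i seen at receiver j on subchannel k."""
    return sum(c[n][z][k] * p[n][z][k] * g[n][j][k]
               for n in range(N) if n != i
               for z in range(N) if z != n)

def link_capacity(i, j, k, c, p, g, N, W=1.0, N0=1.0):
    """Right-hand side of Eq. (3.10) for link (i, j) on subchannel k."""
    I = interference(i, j, k, c, p, g, N)
    sinr = p[i][j][k] * g[i][j][k] / (N0 * W + I)
    return W * c[i][j][k] * math.log2(1.0 + sinr)
```

For instance, with unit noise power, unit gain, exclusive use of the subchannel ($c_{ij}^k = 1$), and $p_{ij}^k = 3$, the bound evaluates to $\log_2 4 = 2$ flow units.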
The constraints (3.4) and (3.10) are non-convex, and therefore achieving the global optimum of the resulting problem (3.11) cannot be theoretically guaranteed. In this chapter, we instead develop a heuristic approach to (3.11).

3.2.2 Linearization-based Approach

Our approach is to iterate a sequence of subproblems, each of which has a simpler form. We keep iterating, updating the parameters of each subproblem according to the solutions of the previous one, until they converge to a local optimum. We aim to transform (3.4) and (3.10) into linear forms. To do this, we set $\{c_{ij}^k\}$ and $\{p_{ij}^k\}$ as two alternately updated sets of parameters of the subproblems. We introduce the $\Delta c_{ij}^k$ and $\Delta p_{ij}^k$ as new variables, defined respectively as the changes in the $c_{ij}^k$ and $p_{ij}^k$ between two successive iterations, i.e.,

$$c_{ij}^k[l+1] = c_{ij}^k[l] + \Delta c_{ij}^k[l], \quad \forall i, j \in \mathcal{N},\ i \neq j,\ k \in \mathcal{K},\ l \geq 0, \qquad (3.12)$$

$$p_{ij}^k[l+1] = p_{ij}^k[l] + \Delta p_{ij}^k[l], \quad \forall i, j \in \mathcal{N},\ i \neq j,\ k \in \mathcal{K},\ l \geq 0, \qquad (3.13)$$

where $l$ is the index of the iteration. Updating the two sets of parameters $\{c_{ij}^k[l]\}$ and $\{p_{ij}^k[l]\}$ alternately forms two consecutive sub-steps in each iteration $l$: the "scheduling update" and the "power allocation update", denoted as sub-steps $l_c$ and $l_p$, respectively. The update sequence of each iteration followed in this chapter is: starting with $\{c_{ij}^k[l]\}$ and $\{p_{ij}^k[l]\}$, sub-step $l_c$ updates $\{c_{ij}^k[l]\}$ to $\{c_{ij}^k[l+1]\}$; then, continuing with $\{c_{ij}^k[l+1]\}$ and $\{p_{ij}^k[l]\}$, sub-step $l_p$ updates $\{p_{ij}^k[l]\}$ to $\{p_{ij}^k[l+1]\}$. Having clarified the update sequence, in the remainder of this subsection and for notational convenience, we omit the iteration indices in reformulating certain constraints. By replacing the $c_{ij}^k$ in (3.4) by $c_{ij}^k + \Delta c_{ij}^k$, the power constraint in sub-step $l_c$ becomes linear:

$$\sum_{k} \sum_{j:\,j \neq i} \left(c_{ij}^k + \Delta c_{ij}^k\right) p_{ij}^k \leq P_i, \quad \forall i \in \mathcal{N}. \qquad (3.14)$$

Define $I_{ij}^k = \sum_{n:\,n \neq i} \sum_{z:\,z \neq n} c_{nz}^k p_{nz}^k g_{nj}^k$.
We perform a Taylor expansion of (3.10) with the $c_{ij}^k$ replaced by the $c_{ij}^k + \Delta c_{ij}^k$, and keep the constant and linear terms. The link capacity constraints in sub-step $l_c$ are approximately

$$\sum_{d} x_{ij}^{k(d)} \leq W \Delta c_{ij}^k \log_2\!\left(1 + \frac{p_{ij}^k g_{ij}^k}{N_0 W + I_{ij}^k}\right) - \frac{W c_{ij}^k p_{ij}^k g_{ij}^k \sum_{n:\,n \neq i} \sum_{z:\,z \neq n} \Delta c_{nz}^k p_{nz}^k g_{nj}^k}{(\ln 2)\left(N_0 W + I_{ij}^k + p_{ij}^k g_{ij}^k\right)\left(N_0 W + I_{ij}^k\right)} + W c_{ij}^k \log_2\!\left(1 + \frac{p_{ij}^k g_{ij}^k}{N_0 W + I_{ij}^k}\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}. \qquad (3.15)$$

Additionally, we use a parameter $\eta$ to control the approximation accuracy by restricting the ranges of the $\Delta c_{ij}^k$ relative to the $c_{ij}^k$ according to the following relations:

$$\Bigl|\sum_{n:\,n \neq i} \sum_{z:\,z \neq n} \Delta c_{nz}^k p_{nz}^k g_{nj}^k\Bigr| \leq \eta\left(N_0 W + I_{ij}^k\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}, \qquad (3.16a)$$

$$\bigl|\Delta c_{ij}^k\bigr|\, p_{ij}^k g_{ij}^k \leq \eta\left(N_0 W + I_{ij}^k + p_{ij}^k g_{ij}^k\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}. \qquad (3.16b)$$

Similarly, the power budget constraints in sub-step $l_p$ are

$$\sum_{k} \sum_{j:\,j \neq i} c_{ij}^k \left(p_{ij}^k + \Delta p_{ij}^k\right) \leq P_i, \quad \forall i \in \mathcal{N}. \qquad (3.17)$$

The link capacity constraints in sub-step $l_p$ are approximately

$$\sum_{d} x_{ij}^{k(d)} \leq \frac{W c_{ij}^k \Delta p_{ij}^k g_{ij}^k}{(\ln 2)\left(N_0 W + I_{ij}^k + p_{ij}^k g_{ij}^k\right)} - \frac{W c_{ij}^k p_{ij}^k g_{ij}^k \sum_{n:\,n \neq i} \sum_{z:\,z \neq n} c_{nz}^k \Delta p_{nz}^k g_{nj}^k}{(\ln 2)\left(N_0 W + I_{ij}^k + p_{ij}^k g_{ij}^k\right)\left(N_0 W + I_{ij}^k\right)} + W c_{ij}^k \log_2\!\left(1 + \frac{p_{ij}^k g_{ij}^k}{N_0 W + I_{ij}^k}\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}, \qquad (3.18)$$

and the corresponding range restrictions on the $\Delta p_{ij}^k$ are

$$\Bigl|\sum_{n:\,n \neq i} \sum_{z:\,z \neq n} c_{nz}^k \Delta p_{nz}^k g_{nj}^k\Bigr| \leq \eta\left(N_0 W + I_{ij}^k\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}, \qquad (3.19a)$$

$$\Bigl|\Delta p_{ij}^k g_{ij}^k + \sum_{n:\,n \neq i} \sum_{z:\,z \neq n} c_{nz}^k \Delta p_{nz}^k g_{nj}^k\Bigr| \leq \eta\left(N_0 W + I_{ij}^k + p_{ij}^k g_{ij}^k\right), \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}. \qquad (3.19b)$$

Based on (3.14)-(3.19), the two subproblems corresponding to sub-steps $l_c$ and $l_p$, respectively, are formulated as follows.

Optimization Subproblem of Sub-step $l_c$:

$$\max_{\{s_i^{(d)}\},\,\{x_{ij}^{k(d)}\},\,\{\Delta c_{ij}^k\}} \ \sum_{d} \sum_{i:\,i \neq d} \omega_i^{(d)} s_i^{(d)} \qquad (3.20a)$$

subject to: Eq. (3.6)-(3.8) and Eq. (3.14)-(3.16), (3.20b)

$$\sum_{j:\,j \neq i} \left(c_{ij}^k + \Delta c_{ij}^k\right) \leq 1, \quad \forall i \in \mathcal{N},\ k \in \mathcal{K}, \qquad (3.20c)$$

$$c_{ij}^k + \Delta c_{ij}^k \geq 0, \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}. \qquad (3.20d)$$

Optimization Subproblem of Sub-step $l_p$:

$$\max_{\{s_i^{(d)}\},\,\{x_{ij}^{k(d)}\},\,\{\Delta p_{ij}^k\}} \ \sum_{d} \sum_{i:\,i \neq d} \omega_i^{(d)} s_i^{(d)} \qquad (3.21a)$$

subject to: Eq.
(3.6)-(3.8) and Eq. (3.17)-(3.19), (3.21b)

$$p_{ij}^k + \Delta p_{ij}^k \geq 0, \quad \forall i, j \in \mathcal{N},\ j \neq i,\ k \in \mathcal{K}. \qquad (3.21c)$$

It can be seen that both (3.20) and (3.21) are linear programs. We conclude that the original problem (3.11) can be tackled by iteratively solving these linear programming subproblems with controllable approximations. Since the original problem is non-convex, convergence to a global optimum cannot be guaranteed. However, the simulations in Sec. 3.3 show that for some special cases where an optimum solution can be found analytically, the solution of our iterative approach is close to the global optimum.

3.2.3 Implementation of the Iterative Approach

In this section, we describe certain considerations that arise in the algorithmic implementation.

Setting the Initial Values of the $c_{ij}^k$ and $p_{ij}^k$

We must avoid setting inappropriate initial values of the $c_{ij}^k$ and $p_{ij}^k$, which would increase the likelihood of convergence to a "small value" local maximum. We set the initial values of the $c_{ij}^k$ and $p_{ij}^k$ to zeros so as not to bias the optimization process.

Confirming the Stopping Criterion

Because of the approximation error arising from linearization, the results achieved by the subproblems can keep fluctuating as the iteration evolves, and no theoretical convergence is guaranteed. Therefore, we choose a threshold value $T$ and a positive integer $L$ to parameterize the termination of the iterations. Specifically, as the algorithm iterates, we keep monitoring and updating the maximum peak-to-peak variation of the throughput values achieved in the latest $L$ consecutive iterations, denoted as $V(l, \cdots, l-L+1)$, where $l$ is the index of the latest iteration. Whenever $V(l, \cdots, l-L+1)$ falls below the threshold $T$, we stop the iterations and output the throughput value of iteration $l$ as the desired one. By setting $L > 2$, we avoid falsely terminating the iteration when the two latest achieved throughput values are accidentally similar.
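The stopping test just described can be sketched in a few lines. This is my own illustration (function name and signature are assumptions), not code from the thesis:

```python
# Sketch of the stopping criterion: V(l, ..., l-L+1) is the peak-to-peak
# variation of the throughput over the latest L iterations; iteration
# stops once it falls below the threshold T.

def should_stop(history, L, T):
    """history: throughput values, one per iteration, oldest first."""
    if len(history) < L:
        return False          # not enough iterations observed yet
    window = history[-L:]
    return (max(window) - min(window)) < T
```

With $L > 2$, two accidentally similar consecutive values do not trigger termination, since the whole length-$L$ window must be flat.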
Reducing the Iteration Count

A small value of $\eta$ in (3.16) and (3.19) guarantees a high linear approximation accuracy in each iteration but increases the iteration count, i.e., the number of required iterations. Therefore, we use a set of monotonically decreasing parameters $\{\eta_m\}_{m=1}^M$ matched with a set of monotonically decreasing threshold values $\{T_m\}_{m=1}^M$. The $\eta_m$ and $T_m$ are then traversed as the iterations evolve. Together, the use of the $\eta_m$ and $T_m$ reduces the overall iteration count and controls the final approximation accuracy. The pseudocode of the linearization-based approach, denoted as L-JRSPA, can be summarized as follows:

Set $l = 0$; set $\{c_{ij}^k[0]\}$ and $\{p_{ij}^k[0]\}$ to zeros; set the values of $\{\eta_m\}_{m=1}^M$, $\{T_m\}_{m=1}^M$, and $L$;
for $1 \leq m \leq M$ do
    while $V(l, \cdots, l-L+1) \geq T_m$ do
        Optimize (3.20) with $\eta_m$, $\{c_{ij}^k[l]\}$, and $\{p_{ij}^k[l]\}$; update $\{c_{ij}^k[l]\}$ to $\{c_{ij}^k[l+1]\}$;
        Optimize (3.21) with $\eta_m$, $\{c_{ij}^k[l+1]\}$, and $\{p_{ij}^k[l]\}$; update $\{p_{ij}^k[l]\}$ to $\{p_{ij}^k[l+1]\}$;
        $l \leftarrow l + 1$;
    end while
end for
Output $\{s_i^{(d)}[l]\}$, $\{c_{ij}^k[l]\}$, $\{x_{ij}^{k(d)}[l]\}$, $\{p_{ij}^k[l]\}$.

Figure 3.1: Line network topology, $K = 8$, $d = N$.

3.3 Simulation Results

The JRSPA proposed in [41] assumes each subchannel to be accessed at a given time by only a single link in the network, with sharing of the channel among links through TDMA (we call this setting the global TDMA scheme). In contrast, our proposed L-JRSPA assumes the use of the local CDMA scheme and treats interference as noise. In this section, we compare the two approaches through numerical simulations.

3.3.1 Throughput of Static Line Networks

This subsection describes the simulations for line networks with a variable number $N$ of nodes, as shown in Fig. 3.1. The distance between any two adjacent nodes is a constant unit distance $D_0$; the source is the leftmost node; the destination is the rightmost node. There are 8 subchannels available, each of which has bandwidth $W$ normalized to 1.
Assume that there is no small-scale fading in the transmissions, but that the channel gain is completely determined by power-law pathloss with coefficient 3.5.² The transmission power is normalized by the additive white Gaussian noise power on each subchannel. We assume the antenna used at each node is isotropic and require that the SNR on each subchannel at distance $D_0$ from the transmitter is no more than 10 dB, which determines the power budget of each node. Although routing in the above line network is obviously from left to right, scheduling and resource allocation are still challenging issues in maximizing the throughput.

² This setting implies that all the subchannels over each link have the same power gain, which can happen when a line of sight is available.

Figure 3.2: Throughput of a line network as a function of the number of nodes, $N$. (Curves: JRSPA with global TDMA; L-JRSPA with local CDMA, $[\eta_1, \eta_2] = [0.1, 0.01]$; L-JRSPA with local CDMA, $[\eta_1, \eta_2, \eta_3] = [0.2, 0.1, 0.01]$.)

Figure 3.3: An optimum set of reuse patterns for 8 subchannels in an infinite-node line network.

With the network ranging in size from $N = 2$ to $N = 20$ nodes, the throughput achieved under L-JRSPA with local CDMA and under JRSPA with global TDMA are compared in Fig. 3.2. In the figure, the throughput under JRSPA decreases as the network size grows. This is consistent with the following intuition: more nodes sharing the communication time resource means that each node has less time to transmit. In contrast, in each L-JRSPA curve, the throughput first decreases as the network size grows (from 2 nodes to 4 nodes) but then gradually approaches a constant (more than 4 nodes). This can be understood as a result of subchannel allocation compensated by spatial reuse.
As $N$ increases from 2 to 4, the throughput decreases significantly due to the allocation of subchannels among more links. After $N$ increases beyond 4, two transmissions far apart can use the same subchannel without causing significant interference to each other, which gradually becomes the dominant factor preventing further throughput decrease as $N$ increases. In Fig. 3.2, we empirically set the parameter $L = 10$ for both curves plotted under L-JRSPA, but they have different sets of controlling parameters $\{\eta_m\}_{m=1}^M$ and $\{T_m\}_{m=1}^M$: one has $[\eta_1, \eta_2] = [0.1, 0.01]$ and $[T_1, T_2] = [0.02, 0.005]$; the other has $[\eta_1, \eta_2, \eta_3] = [0.2, 0.1, 0.01]$ and $[T_1, T_2, T_3] = [0.2, 0.02, 0.005]$. Note that the two curves do not completely overlap. This can be explained by the fact that iterations with different $\{\eta_m\}_{m=1}^M$ evolve along different paths in the policy space and finally converge to distinct local optima. However, the solutions are very close numerically. Since the throughput achieved under L-JRSPA approaches a constant as the network size grows, it is interesting to explore the difference between our (numerically obtained) stable throughput and the maximum throughput (under the local CDMA assumption) that can be achieved as the number of nodes goes to infinity.

Figure 3.4: The network with an irregular topology, $N = 12$, $K = 4$, $d = 12$.

Figure 3.5: CDFs of the throughput achieved in the network with the topology shown in Fig. 3.4.

The analytically computed throughput of an infinite-node line network with the settings of Sec. 3.3.1 is 6.225, which is shown as the height of the horizontal line in Fig. 3.2; the corresponding optimal set of reuse patterns for the 8 subchannels is shown in Fig. 3.3, where the numbers represent subchannel indices. It can be seen in Fig.
3.2 that the obtained stable throughput is very close to the maximum throughput of the infinite-node line network.

3.3.2 Throughput of Networks with Irregular Topologies

In this subsection, we first simulate a network with the topology shown in Fig. 3.4. The network has 12 nodes, with node 1 being the source and node 12 being the destination; the position information ($(X, Y)$ coordinates) of all the nodes is listed in the following set $G$:

$$G = \{(1.0, 0.5), (1.0, 2.0), (3.0, 1.0), (2.0, 3.0), (4.5, 1.0), (3.3, 4.2), (4.7, 4.0), (5.3, 1.8), (6.0, 2.7), (7.0, 1.7), (8.0, 3.0), (7.0, 4.1)\}. \qquad (3.22)$$

Additionally, assume the network has 4 subchannels to use, each of which experiences pathloss with coefficient 3.5 and independent Rayleigh fading. The mean SNR at unit distance on each subchannel is no more than 20 dB. We assume that the communication duration is smaller than the coherence time of the wireless medium in the network. For the iterations, we set $L = 10$, $[\eta_1, \eta_2, \eta_3] = [0.2, 0.1, 0.01]$, and $[T_1, T_2, T_3] = [0.2, 0.02, 0.005]$. We again compare the throughput performances of L-JRSPA and JRSPA. Fig. 3.5 plots the Cumulative Distribution Function (CDF) curves of the throughput under L-JRSPA and JRSPA, respectively. Each curve is formed from 1000 throughput values, independently generated from 1000 channel realizations. The CDF of L-JRSPA is located to the right of the CDF of JRSPA; computing the sample averages, the mean throughput of the former is approximately 6.91 and that of the latter approximately 5.10. However, L-JRSPA does not always outperform JRSPA. In certain small-scale networks, where the number of hops on the flow paths from the source to the destination is small, simulations show that the throughput under L-JRSPA is not necessarily higher than that under JRSPA.
The reason is that interference among links, rather than spatial reuse, dominates the network throughput under L-JRSPA in such situations; the throughput performance under JRSPA benefits from the orthogonality among the small number of links.

3.4 Conclusion

We developed the L-JRSPA algorithm for throughput maximization in OFDMA-based wireless ad-hoc networks. The networks allow the outgoing links of each node to orthogonally access each subchannel, e.g., through CDMA, and to spatially reuse the subchannel while treating the arising interference among the transmissions from different nodes as noise. A local optimum is computed based on a novel iterative approach with linearization in each iteration. After simulating line networks and networks with irregular topologies, we find that the throughput performance can be enhanced significantly when compared with the strategy of accessing each subchannel orthogonally among all links, particularly in large networks. Moreover, simulations on line networks with increasing sizes show throughput convergence to a constant value close to the theoretical maximum throughput of a line network with an infinite number of nodes. L-JRSPA implicitly assumes centralized knowledge of the channel states and centralized control. This can be realized when the channel shows little or no time variance, since then the relative overhead for feedback and control signaling is negligible. Such situations occur, e.g., in fixed-terminal mesh networks and industrial wireless sensor networks. Distributed versions of our algorithm are topics for future research.

Chapter 4
Dynamic Network Service Optimization in Distributed Cloud Networks

As described in Section 1.4, recent works, such as Refs. [34] and [35], have addressed the network service distribution problem (NSDP) from a static global optimization point of view.
These studies, however, focus on the design of centralized solutions that assume global knowledge of service demands and network conditions. With the increasing scale, heterogeneity, and dynamics inherent to both service demands and the underlying cloud network system, we argue that proactive centralized solutions must be complemented with distributed online algorithms that enable rapid adaptation to changes in network conditions and service demands, while providing global system objective guarantees. In this chapter, we address the service distribution problem in a dynamic cloud network setting, where service demands are unknown and time-varying. We provide the first characterization of a cloud network’s capacity region and design throughput-optimal dynamic cloud network control (DCNC) algorithms that drive local transmission, processing, and resource allocation decisions with global performance guarantees. The proposed algorithms are based on applying the Lyapunov drift-plus-penalty (LDP) control methodology [44]-[45] to a cloud network queuing system that captures both the transmission and processing of service flows, consuming network and cloud resources. We first propose DCNC-L, a control algorithm based on the minimization of a linear metric extracted from an upper bound of a quadratic LDP function of the underlying queuing system. DCNC-L is a distributed joint flow scheduling and resource allocation algorithm that guarantees overall cloud network stability, while achieving arbitrarily close to the minimum average cost with a tradeoff in network delay. We then design DCNC-Q, an extension of DCNC-L that uses a quadratic metric derived from the same upper bound expression of the LDP function. DCNC-Q preserves the throughput optimality of DCNC-L, and can significantly improve the cost-delay tradeoff at the expense of increased computational complexity. 
Finally, we show that network delay can be further reduced by introducing a shortest transmission-plus-processing distance (STPD) bias into the optimization metric. The generalizations of DCNC-L and DCNC-Q obtained by introducing the STPD bias are referred to as EDCNC-L and EDCNC-Q, respectively. Our contributions can be summarized as follows:

- We introduce a queuing system for a general class of multi-commodity-chain (MCC) flow problems that include the distribution of network services over cloud networks. In our MCC queuing model, the queue backlog of a given commodity builds up not only from receiving packets of the same commodity, but also from processing packets of the preceding commodity in the service chain.
- For a given set of services, we characterize the capacity region of a cloud network in terms of the set of exogenous input flow rates that can be processed by the required service functions and delivered to the required destinations, while maintaining overall cloud network stability. Importantly, the cloud network capacity region depends on both the cloud network topology and the service structure.
- We design a family of throughput-optimal DCNC algorithms that jointly schedule computation and communication resources for flow processing and transmission without knowledge of service demands. The proposed algorithms allow pushing the total resource cost arbitrarily close to its minimum with a $[O(\epsilon), O(1/\epsilon)]$ cost-delay tradeoff, and they converge to within $O(\epsilon)$ deviation from the optimal solution in time $O(1/\epsilon^2)$.
- Our DCNC algorithms make local decisions via the online minimization of linear and quadratic metrics extracted from an upper bound of the cloud network LDP function. Using a quadratic vs. a linear metric is shown to improve the cost-delay tradeoff at the expense of increased computational complexity.
In addition, the use of an STPD bias yields enhanced algorithms that can further reduce average delay without compromising throughput or cost performance.

The rest of this chapter is organized as follows. We review related work in Section 4.1. Section 4.2 describes the system model and problem formulation. Section 4.3 is devoted to the characterization of the cloud network capacity region. We present the proposed DCNC algorithms in Section 4.4 and analyze their performance in Section 4.5. Numerical experiments are presented in Section 4.6, and possible extensions are discussed in Section 4.7. Finally, we summarize the main conclusions in Section 4.8.

4.1 Related Work

The problem of dynamically adjusting network resources in response to unknown changes in traffic demands has been extensively studied in the literature in the context of stochastic network optimization. In particular, Lyapunov drift control theory is well suited to studying the stability properties of queuing networks and similar stochastic systems. The first celebrated application of Lyapunov drift control in multi-hop networks is the backpressure (BP) routing algorithm [46]. The BP algorithm achieves throughput optimality without ever designing an explicit route or having knowledge of traffic arrival rates, and is hence able to adapt to time-varying network conditions. By further adding a penalty term (e.g., related to network resource allocation cost) to the Lyapunov drift expression, [44]-[45] developed the Lyapunov drift-plus-penalty control methodology. LDP control preserves the throughput optimality of the BP algorithm while also minimizing the average network cost. LDP control strategies have been shown to be effective in optimizing traditional multi-hop communication networks (as opposed to computation networks). Different versions of LDP-based algorithms have been developed.
Most of them are based on the minimization of a linear metric obtained from an upper bound expression of the queuing system LDP function [44]-[45]. Subsequently, the inclusion of a bias term, indicative of network distance, into this linear metric was shown to reduce network delay (especially in lightly congested scenarios) [47], [14]. Furthermore, [48] proposed a control algorithm for single-commodity multi-hop networks based on the minimization of a quadratic metric from the LDP upper bound, shown to improve delay performance in the scenarios explored in [48]. In contrast to these prior works, this chapter extends the LDP methodology to the dynamic control of network service chains over distributed cloud networks. The proposed family of LDP-based algorithms is suitable for a general class of MCC flow problems that exhibit the following key features: (i) flow chaining: a commodity, representing the flow of packets at a given stage of a service chain, can be processed into the next commodity in the service chain via the corresponding service function; (ii) flow scaling: the flow size of a commodity can differ from the flow size of the next commodity in the service chain after service function processing; (iii) joint computation/communication scheduling: different commodities share and compete for both processing and transmission resources, which need to be jointly scheduled. To the best of our knowledge, this is the first attempt to address the service chain control problem in a dynamic cloud network setting.

4.2 Model and Problem Formulation

4.2.1 Cloud Network Model

We consider a cloud network modeled as a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with $|\mathcal{V}| = N$ vertices and $|\mathcal{E}| = E$ edges representing the set of network nodes and links, respectively.
In the context of a cloud network, a node represents a distributed cloud location, in which virtual network functions (VNFs) can be instantiated in the form of, e.g., virtual machines (VMs) over general-purpose servers, while an edge represents a logical link (e.g., an IP link) between two cloud locations. We denote by $\delta^+(i) \subset \mathcal{V}$ and $\delta^-(i) \subset \mathcal{V}$ the sets of outgoing and incoming neighbors of $i \in \mathcal{V}$ in $\mathcal{G}$, respectively. We remark that in our model, cloud network nodes may represent large datacenters at the core network level, smaller edge datacenters at the metro and/or aggregation networks, or even fog [49] or cloudlet [50] nodes at the access network. We consider a time-slotted system with slots normalized to integer units $t \in \{0, 1, 2, \dots\}$, and characterize the cloud network resource capacities and costs as follows:

- $\mathcal{K}_i = \{0, 1, \dots, K_i\}$: the set of processing resource allocation choices at node $i$
- $\mathcal{K}_{ij} = \{0, 1, \dots, K_{ij}\}$: the set of transmission resource allocation choices at link $(i,j)$
- $C_{i,k}$: the capacity, in processing flow units (e.g., operations per timeslot), resulting from the allocation of $k$ processing resource units (e.g., CPUs) at node $i$
- $C_{ij,k}$: the capacity, in transmission flow units (e.g., packets per timeslot), resulting from the allocation of $k$ transmission resource units (e.g., bandwidth blocks) at link $(i,j)$
- $w_{i,k}$: the cost of allocating $k$ processing resource units at node $i$

Figure 4.1: A network service chain $\phi$. Service $\phi$ takes the source commodity $(d, \phi, 0)$ and delivers the final commodity $(d, \phi, M_\phi)$ after going through the sequence of functions $(\phi, 1), \dots, (\phi, M_\phi)$. VNF $(\phi, m)$ takes commodity $(d, \phi, m-1)$ and generates commodity $(d, \phi, m)$.
- $w_{ij,k}$: the cost of allocating $k$ transmission resource units at link $(i,j)$
- $e_i$: the cost per processing flow unit at node $i$
- $e_{ij}$: the cost per transmission flow unit at link $(i,j)$

4.2.2 Service Model

A network service $\phi \in \Phi$ is described by a chain of VNFs. We denote by $\mathcal{M}_\phi = \{1, 2, \dots, M_\phi\}$ the ordered set of VNFs of service $\phi$. Hence, the pair $(\phi, m)$, with $\phi \in \Phi$ and $m \in \mathcal{M}_\phi$, identifies the $m$-th function of service $\phi$. We refer to a client as a source-destination pair $(s, d)$, with $s, d \in \mathcal{V}$. A client requesting service $\phi \in \Phi$ implies the request for the packets originating at the source node $s$ to go through the sequence of VNFs specified by $\mathcal{M}_\phi$ before exiting the network at the destination node $d$. We adopt a multi-commodity-chain (MCC) flow model, in which a commodity identifies the packets at a given stage of a service chain for a particular destination. Specifically, we use the triplet $(d, \phi, m)$ to identify the packets that are output of the $m$-th function of service $\phi$ for destination $d$. The source commodity of service $\phi$ for destination $d$ is denoted by $(d, \phi, 0)$, and the final commodity delivered to destination $d$ by $(d, \phi, M_\phi)$, as illustrated in Fig. 4.1. Each VNF has (possibly) different processing requirements. We denote by $r^{(\phi,m)}$ the processing-transmission flow ratio of VNF $(\phi, m)$ in processing flow units per transmission flow unit (e.g., operations per packet). We assume that VNFs are fully parallelizable, in the sense that if the total processing capacity allocated at node $i$, $C_{i,k}$, is used for VNF $(\phi, m)$, then $C_{i,k}/r^{(\phi,m)}$ packets can be processed in one timeslot. In addition, our service model also captures the possibility of flow scaling. We denote by $\xi^{(\phi,m)} > 0$ the scaling factor of VNF $(\phi, m)$, in output flow units per input flow unit. That is, the size of the output flow of VNF $(\phi, m)$ is $\xi^{(\phi,m)}$ times as large as its input flow. We refer to a VNF with $\xi^{(\phi,m)} > 1$ as an expansion function, and to a VNF with $\xi^{(\phi,m)} < 1$ as a compression function.¹
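As a small illustration of the flow scaling just described (my own sketch, not thesis code; the list layout and function name are assumptions), the size of a flow after the first $m$ functions of a chain is the source rate multiplied by the product of the scaling factors encountered so far:

```python
# Sketch: cumulative flow scaling along a service chain.
# xi[m-1] holds the scaling factor of VNF (phi, m); an input flow of
# `rate` source-commodity units leaves function m scaled by prod(xi[:m]).

def scaled_rate(rate, xi, m):
    """Flow size of commodity (d, phi, m) given the source rate."""
    out = rate
    for factor in xi[:m]:
        out *= factor
    return out
```

For example, a chain with factors 2.0 (expansion), 0.5 (compression), and 3.0 turns a source flow of 4 units into 12 units at the final commodity.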
We remark that our service model applies to a wide range of services that go beyond NFV services, including, for example, Internet of Things (IoT) services, expected to largely benefit from the proximity and elasticity of distributed cloud networks [51, 52].

¹ We assume arbitrarily small packet granularity, such that arbitrary positive scaling factors can be defined.

4.2.3 Queuing Model

We denote by $a_i^{(d,\phi,m)}(t)$ the exogenous arrival rate of commodity $(d, \phi, m)$ at node $i$ during timeslot $t$, and by $\lambda_i^{(d,\phi,m)}$ its expected value. We assume that $a_i^{(d,\phi,m)}(t)$ is independently and identically distributed (i.i.d.) across timeslots, and that $a_i^{(d,\phi,m)}(t) = 0$ for $0 < m \leq M_\phi$, i.e., there are no exogenous arrivals of intermediate commodities in a service chain.²

² The setting in which $a_i^{(d,\phi,m)}(t) \neq 0$ for $0 < m \leq M_\phi$, while of little practical relevance, does not affect the mathematical analysis in this chapter.

At each timeslot $t$, every node buffers packets according to their commodities and makes transmission and processing flow scheduling decisions on its output interfaces. Cloud network queues build up from the transmission of packets from incoming neighbors and from the local processing of packets via network service functions. We define:

- $Q_i^{(d,\phi,m)}(t)$: the number of commodity $(d, \phi, m)$ packets in the queue of node $i$ at the beginning of timeslot $t$
- $\mu_{ij}^{(d,\phi,m)}(t)$: the assigned flow rate at link $(i,j)$ for commodity $(d, \phi, m)$ at time $t$
- $\mu_{i,\mathrm{pr}}^{(d,\phi,m)}(t)$: the assigned flow rate from node $i$ to its processing unit for commodity $(d, \phi, m)$ at time $t$
- $\mu_{\mathrm{pr},i}^{(d,\phi,m)}(t)$: the assigned flow rate from node $i$'s processing unit to node $i$ for commodity $(d, \phi, m)$ at time $t$

The resulting queuing dynamics satisfy

$$Q_i^{(d,\phi,m)}(t+1) \leq \left[Q_i^{(d,\phi,m)}(t) - \sum_{j \in \delta^+(i)} \mu_{ij}^{(d,\phi,m)}(t) - \mu_{i,\mathrm{pr}}^{(d,\phi,m)}(t)\right]^+ + \sum_{j \in \delta^-(i)} \mu_{ji}^{(d,\phi,m)}(t) + \mu_{\mathrm{pr},i}^{(d,\phi,m)}(t) + a_i^{(d,\phi,m)}(t), \qquad (4.1)$$

where $[x]^+$ denotes $\max\{x, 0\}$, and $Q_d^{(d,\phi,M_\phi)}(t) = 0$, $\forall d, \phi, t$. The inequality in (4.1) is due to the fact that the actual number of packets transmitted/processed is the minimum of the locally available packets and the assigned flow rate.
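The one-slot queue evolution in (4.1) can be sketched as follows (my own illustration with assumed names; scalar inputs for a single commodity queue). The `max(..., 0)` implements the $[\,\cdot\,]^+$ operator, and the function computes the right-hand-side bound of (4.1):

```python
def queue_bound_next(q, mu_out_links, mu_to_proc,
                     mu_in_links, mu_from_proc, arrivals):
    """Right-hand side of (4.1) for one commodity queue at one node."""
    # Packets left after assigned outgoing transmission and processing,
    # floored at zero: the [ ]^+ term.
    drained = max(q - sum(mu_out_links) - mu_to_proc, 0.0)
    # Plus incoming transmissions, processing output, and exogenous arrivals.
    return drained + sum(mu_in_links) + mu_from_proc + arrivals
```

The actual backlog can only be smaller, since a node cannot forward or process more packets than it actually holds.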
We assume that the processing resources of node i are co-located with node i, and hence the packets of commodity (d,φ,m−1) processed during timeslot t are available at the queue of commodity (d,φ,m) at the beginning of timeslot t+1. We can then describe the service chaining dynamics at node i as follows:

μ_{pr,i}^{(d,φ,m)}(t) = ξ^{(φ,m)} μ_{i,pr}^{(d,φ,m−1)}(t),  ∀d,φ,m > 0.   (4.2)

The service chaining constraints in (4.2) state that, at time t, the rate of commodity (d,φ,m) arriving at node i from its processing unit is equal to the rate of commodity (d,φ,m−1) leaving node i for its processing unit, scaled by the scaling factor ξ^{(φ,m)}; together, Eqs. (4.1) and (4.2) capture the one-timeslot processing delay described above.

[Footnote: The setting in which a_i^{(d,φ,m)}(t) ≠ 0 for 0 < m ≤ M_φ, while of little practical relevance, does not affect the mathematical analysis in this paper.]

Figure 4.2: Cloud network queuing model for the delivery of a single-function service φ for a client with source node 2 and destination node 4. Packets of both commodity (4,φ,0) and commodity (4,φ,1) can be forwarded across the network, where they are buffered in separate commodity queues. In addition, cloud network nodes can process packets of commodity (4,φ,0) into packets of commodity (4,φ,1), which can exit the network at node 4.

As an example, the cloud network queuing system of an illustrative 4-node cloud network is shown in Fig. 4.2. In addition to processing/transmission flow scheduling decisions, at each timeslot t, cloud network nodes can also make transmission and processing resource allocation decisions.
We use the following binary variables to denote the resource allocation decisions at time t:

y_{i,k}(t) = 1 if k processing resource units are allocated at node i at time t; y_{i,k}(t) = 0 otherwise
y_{ij,k}(t) = 1 if k transmission resource units are allocated at link (i,j) at time t; y_{ij,k}(t) = 0 otherwise

4.2.4 Problem Formulation

The goal is to design a dynamic control policy, defined by a flow scheduling and resource allocation action vector {μ(t), y(t)}, that supports all average input rate matrices λ = (λ_i^{(d,φ,m)}) that are interior to the cloud network capacity region (as defined in Section 4.3), while minimizing the total average cloud network cost. Specifically, we require the cloud network to be rate stable (see Ref. [44]), i.e.,

lim_{t→∞} Q_i^{(d,φ,m)}(t)/t = 0 with prob. 1,  ∀i,d,φ,m.   (4.3)

The dynamic cloud network control problem can then be formulated as follows:

min  lim sup_{t→∞} (1/t) Σ_{τ=0}^{t−1} E{h(τ)}   (4.4a)
s.t.  the cloud network is rate stable,   (4.4b)
      μ_{pr,i}^{(d,φ,m)}(τ) = ξ^{(φ,m)} μ_{i,pr}^{(d,φ,m−1)}(τ),  ∀i,d,φ,m,τ,   (4.4c)
      Σ_{(d,φ,m)} μ_{i,pr}^{(d,φ,m)}(τ) r^{(φ,m+1)} ≤ Σ_{k∈K_i} C_{i,k} y_{i,k}(τ),  ∀i,τ,   (4.4d)
      Σ_{(d,φ,m)} μ_{ij}^{(d,φ,m)}(τ) ≤ Σ_{k∈K_{ij}} C_{ij,k} y_{ij,k}(τ),  ∀(i,j),τ,   (4.4e)
      μ_{i,pr}^{(d,φ,m)}(τ), μ_{pr,i}^{(d,φ,m)}(τ), μ_{ij}^{(d,φ,m)}(τ) ∈ ℝ⁺,  ∀i,(i,j),d,φ,m,τ,   (4.4f)
      y_{i,k}(τ), y_{ij,k}(τ) ∈ {0,1},  ∀i,(i,j),d,φ,m,τ,   (4.4g)

where h(τ) ≜ Σ_{i∈V} h_i(τ), with

h_i(τ) = Σ_{k∈K_i} w_{i,k} y_{i,k}(τ) + e_i Σ_{(d,φ,m)} μ_{i,pr}^{(d,φ,m)}(τ) r^{(φ,m+1)} + Σ_{j∈δ⁺(i)} [ Σ_{k∈K_{ij}} w_{ij,k} y_{ij,k}(τ) + e_{ij} Σ_{(d,φ,m)} μ_{ij}^{(d,φ,m)}(τ) ],   (4.5)

denoting the cloud network operational cost at time τ. In (4.4), Eqs. (4.4c), (4.4d), and (4.4e) describe instantaneous service chaining, processing capacity, and transmission capacity constraints, respectively.

Remark 1. As in Eqs. (4.4c), (4.4d), and (4.4e), throughout this paper, it shall be useful to establish relationships between consecutive commodities and/or functions in a service chain.
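The per-node cost of Eq. (4.5) combines a resource set-up term and a per-flow term, for both processing and transmission. A minimal Python sketch (illustrative names, one resource level chosen per interface, as the binary y variables enforce):

```python
def node_cost(y_pr, w_pr, e_i, mu_pr, r_next, out_links):
    """Instantaneous cost h_i(tau) of Eq. (4.5) at one node (sketch).
    y_pr: chosen processing level k (index into w_pr); w_pr: set-up cost per
    level; mu_pr: commodity -> processing flow rate; r_next: commodity ->
    r^(phi,m+1); out_links: list of (k, w_tx, e_ij, mu_tx) per outgoing link,
    with w_tx the per-level transmission set-up costs and mu_tx a dict of
    commodity -> transmission flow rate."""
    cost = w_pr[y_pr] + e_i * sum(mu_pr[c] * r_next[c] for c in mu_pr)
    for k, w_tx, e_ij, mu_tx in out_links:
        cost += w_tx[k] + e_ij * sum(mu_tx.values())
    return cost
```

With the ON/OFF setting of Sec. 4.6 (set-up cost 440 at full capacity, unit flow costs, r = 1), a node processing 10 packets and transmitting 20 on one active link incurs 440 + 10 + 440 + 20 = 910 cost units in that timeslot.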
For ease of notation, unless otherwise specified, we shall assume that any expression containing a reference to m−1 is only applicable for m > 0, and any expression containing a reference to m+1 is only applicable for m < M_φ.

In the following section, we characterize the cloud network capacity region in terms of the average input rates that can be stabilized by any control algorithm satisfying constraints (4.4b)-(4.4g), as well as the minimum average cost required for cloud network stability.

4.3 Cloud Network Capacity Region

The cloud network capacity region Λ(G,Φ) is defined as the closure of all average input rates λ that can be stabilized by a cloud network control algorithm whose decisions conform to the cloud network and service structure {G,Φ}.

Theorem 7. The cloud network capacity region Λ(G,Φ) consists of all average input rates λ for which, for all i,j,k,d,φ,m, there exist MCC flow variables f_{ij}^{(d,φ,m)}, f_{pr,i}^{(d,φ,m)}, f_{i,pr}^{(d,φ,m)}, together with probability values α_{ij,k}, α_{i,k}, β_{ij,k}^{(d,φ,m)}, β_{i,k}^{(d,φ,m)}, such that

Σ_{j∈δ⁻(i)} f_{ji}^{(d,φ,m)} + f_{pr,i}^{(d,φ,m)} + λ_i^{(d,φ,m)} ≤ Σ_{j∈δ⁺(i)} f_{ij}^{(d,φ,m)} + f_{i,pr}^{(d,φ,m)},   (4.6a)
f_{pr,i}^{(d,φ,m)} = ξ^{(φ,m)} f_{i,pr}^{(d,φ,m−1)},   (4.6b)
f_{i,pr}^{(d,φ,m)} ≤ (1/r^{(φ,m+1)}) Σ_{k∈K_i} α_{i,k} β_{i,k}^{(d,φ,m)} C_{i,k},   (4.6c)
f_{ij}^{(d,φ,m)} ≤ Σ_{k∈K_{ij}} α_{ij,k} β_{ij,k}^{(d,φ,m)} C_{ij,k},   (4.6d)
f_{i,pr}^{(d,φ,M_φ)} = 0,  f_{pr,i}^{(d,φ,0)} = 0,  f_{dj}^{(d,φ,M_φ)} = 0,   (4.6e)
f_{i,pr}^{(d,φ,m)} ≥ 0,  f_{ij}^{(d,φ,m)} ≥ 0,   (4.6f)
Σ_{k∈K_{ij}} α_{ij,k} ≤ 1,  Σ_{k∈K_i} α_{i,k} ≤ 1,   (4.6g)
Σ_{(d,φ,m)} β_{ij,k}^{(d,φ,m)} ≤ 1,  Σ_{(d,φ,m)} β_{i,k}^{(d,φ,m)} ≤ 1.   (4.6h)

Furthermore, the minimum average cloud network cost required for network stability is given by

h* = min_{ {α_{ij,k}, α_{i,k}, β_{ij,k}^{(d,φ,m)}, β_{i,k}^{(d,φ,m)}} } h̄,   (4.7)

where

h̄ = Σ_i Σ_{k∈K_i} α_{i,k} ( w_{i,k} + e_i C_{i,k} Σ_{(d,φ,m)} β_{i,k}^{(d,φ,m)} ) + Σ_{(i,j)} Σ_{k∈K_{ij}} α_{ij,k} ( w_{ij,k} + e_{ij} C_{ij,k} Σ_{(d,φ,m)} β_{ij,k}^{(d,φ,m)} ).   (4.8)

Proof. The proof of Theorem 7 is given in Appendix B.1 in the supplementary material.
In Theorem 7, (4.6a) and (4.6b) describe generalized computation/communication flow conservation constraints and service chaining constraints, essential for cloud network stability, while (4.6c) and (4.6d) describe processing and transmission capacity constraints. The probability values α_{i,k}, α_{ij,k}, β_{i,k}^{(d,φ,m)}, β_{ij,k}^{(d,φ,m)} define a stationary randomized policy as follows:

α_{i,k}: the probability that k processing resource units are allocated at node i;
α_{ij,k}: the probability that k transmission resource units are allocated at link (i,j);
β_{i,k}^{(d,φ,m)}: the probability that node i processes commodity (d,φ,m), conditioned on the allocation of k processing resource units at node i;
β_{ij,k}^{(d,φ,m)}: the probability that link (i,j) transmits commodity (d,φ,m), conditioned on the allocation of k transmission resource units at link (i,j).

Hence, Theorem 7 demonstrates that, for any input rate λ ∈ Λ(G,Φ), there exists a stationary randomized policy that uses fixed probabilities to make transmission and processing decisions at each timeslot, which can support the given λ while minimizing the overall average cloud network cost. However, the difficulty of directly solving for the parameters that characterize such a stationary randomized policy, and the requirement of knowledge of λ, motivate the design of online dynamic cloud network control solutions with matching performance guarantees.

4.4 Dynamic Cloud Network Control Algorithms

In this section, we describe distributed DCNC strategies that account for both processing and transmission flow scheduling and resource allocation decisions. We first propose DCNC-L, an algorithm based on minimizing a linear metric obtained from an upper bound of the quadratic LDP function, where only linear complexity is required for making local decisions at each timeslot. We then propose DCNC-Q, derived from the minimization of a quadratic metric obtained from the LDP bound.
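One way to read the stationary randomized policy of Theorem 7 is as a two-stage sampling at each interface and timeslot: first draw a resource level k with probability α_k, then draw the commodity to serve with the conditional probability β_k. The sketch below is an illustrative Python rendering of this interpretation (hypothetical function name), not the dissertation's implementation.

```python
import random

def randomized_link_decision(alpha, beta, capacities, rng=None):
    """One timeslot of the stationary randomized policy of Theorem 7 on a
    single link (i,j): sample k resource units w.p. alpha[k], then a
    commodity w.p. beta[k][commodity], conditioned on k. Returns the chosen
    level, commodity, and allocated capacity C_{ij,k}."""
    rng = rng or random.Random(0)
    levels = list(alpha.keys())
    k = rng.choices(levels, weights=[alpha[kk] for kk in levels])[0]
    commodities = list(beta[k].keys())
    c = rng.choices(commodities, weights=[beta[k][cc] for cc in commodities])[0]
    return k, c, capacities[k]
```

With degenerate (0/1) probabilities the policy becomes deterministic, e.g., always allocating full capacity to one commodity; in general, the long-run fraction of slots spent on each (k, commodity) pair matches α_k · β_k, which is what yields the average flows f in (4.6c)-(4.6d).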
DCNC-Q allows simultaneously scheduling multiple commodities on a given transmission or processing interface at each timeslot, leading to a more balanced system evolution that can improve the cost-delay tradeoff at the expense of quadratic computational complexity. Finally, enhanced versions of the aforementioned algorithms, referred to as EDCNC-L and EDCNC-Q, are constructed by adding a shortest transmission-plus-processing distance (STPD) bias extension that is shown to further reduce network delay in lightly congested scenarios.

4.4.1 Cloud Network Lyapunov Drift-plus-Penalty

Let Q(t) represent the vector of queue backlog values of all the commodities at all the cloud network nodes. The cloud network Lyapunov drift is defined as

Δ(Q(t)) ≜ ½ E{ ‖Q(t+1)‖² − ‖Q(t)‖²  |  Q(t) },   (4.9)

where ‖·‖ indicates the Euclidean norm, and the expectation is taken over the ensemble of all exogenous source commodity arrival realizations. The one-step Lyapunov drift-plus-penalty (LDP) is then defined as

Δ(Q(t)) + V E{h(t) | Q(t)},   (4.10)

where V is a non-negative control parameter that determines the degree to which resource cost minimization is emphasized. After squaring both sides of (4.1) and following standard LDP manipulations (see Ref. [45]), the LDP can be upper bounded as

Δ(Q(t)) + V E{h(t) | Q(t)} ≤ V E{h(t) | Q(t)} + E{Ψ(t) + Z(t) | Q(t)} + Σ_i Σ_{(d,φ,m)} λ_i^{(d,φ,m)} Q_i^{(d,φ,m)}(t),   (4.11)

where

Ψ(t) ≜ ½ Σ_i Σ_{(d,φ,m)} { [ Σ_{j∈δ⁺(i)} μ_{ij}^{(d,φ,m)}(t) + μ_{i,pr}^{(d,φ,m)}(t) ]² + [ Σ_{j∈δ⁻(i)} μ_{ji}^{(d,φ,m)}(t) + μ_{pr,i}^{(d,φ,m)}(t) + a_i^{(d,φ,m)}(t) ]² },

Z(t) ≜ Σ_i Σ_{(d,φ,m)} Q_i^{(d,φ,m)}(t) [ Σ_{j∈δ⁻(i)} μ_{ji}^{(d,φ,m)}(t) + μ_{pr,i}^{(d,φ,m)}(t) − Σ_{j∈δ⁺(i)} μ_{ij}^{(d,φ,m)}(t) − μ_{i,pr}^{(d,φ,m)}(t) ].

Our DCNC algorithms extract different metrics from the right-hand side of (4.11), whose minimization leads to a family of throughput-optimal flow scheduling and resource allocation policies with different cost-delay tradeoff performance.
4.4.2 Linear Dynamic Cloud Network Control (DCNC-L)

DCNC-L is designed to minimize, at each timeslot, the linear metric Z(t) + Vh(t) obtained from the right-hand side of (4.11), equivalently expressed as

min  Σ_{i∈V} [ V h_i(t) − Σ_{(d,φ,m)} ( Σ_{j∈δ⁺(i)} Z_{ij,tr}^{(d,φ,m)}(t) + Z_{i,pr}^{(d,φ,m)}(t) ) ]   (4.12a)
s.t.  (4.4d)-(4.4g),   (4.12b)

where

Z_{ij,tr}^{(d,φ,m)}(t) ≜ μ_{ij}^{(d,φ,m)}(t) [ Q_i^{(d,φ,m)}(t) − Q_j^{(d,φ,m)}(t) ],
Z_{i,pr}^{(d,φ,m)}(t) ≜ μ_{i,pr}^{(d,φ,m)}(t) [ Q_i^{(d,φ,m)}(t) − ξ^{(φ,m+1)} Q_i^{(d,φ,m+1)}(t) ].

The goal of minimizing (4.12a) at each timeslot is to greedily push the cloud network queues towards a lightly congested state, while minimizing the cloud network resource usage regulated by the control parameter V. Observe that (4.12a) is a linear metric with respect to μ_{i,pr}^{(d,φ,m)}(t) and μ_{ij}^{(d,φ,m)}(t); hence, (4.12) can be decomposed into the implementation of Max-Weight-Matching [2] at each node, leading to the following distributed flow scheduling and resource allocation policy:

Local processing decisions: At the beginning of each timeslot t, each node i observes its local queue backlogs and performs the following operations:

1. Compute the processing utility weight of each processable commodity (d,φ,m), m < M_φ:

W_i^{(d,φ,m)}(t) = [ ( Q_i^{(d,φ,m)}(t) − ξ^{(φ,m+1)} Q_i^{(d,φ,m+1)}(t) ) / r^{(φ,m+1)} − V e_i ]⁺,

and set W_i^{(d,φ,M_φ)}(t) = 0, ∀d,φ. W_i^{(d,φ,m)}(t) is indicative of the potential benefit of processing commodity (d,φ,m) into commodity (d,φ,m+1) at time t, in terms of the difference between local congestion reduction and processing cost per unit flow.

2. Compute the max-weight commodity: (d,φ,m)* = arg max_{(d,φ,m)} { W_i^{(d,φ,m)}(t) }.

3. If W_i^{(d,φ,m)*}(t) = 0, set k* = 0. Otherwise,

k* = arg max_k { C_{i,k} W_i^{(d,φ,m)*}(t) − V w_{i,k} }.

4. Make the following resource allocation and flow assignment decisions: y_{i,k*}(t) = 1; y_{i,k}(t) = 0, ∀k ≠ k*; μ_{i,pr}^{(d,φ,m)*}(t) = C_{i,k*}/r^{(φ,m*+1)}; μ_{i,pr}^{(d,φ,m)}(t) = 0, ∀(d,φ,m) ≠ (d,φ,m)*.

Local transmission decisions: At the beginning of each timeslot t, each node i observes its local queue backlogs and those of its neighbors, and performs the following operations for each of its outgoing links (i,j), j ∈ δ⁺(i):

1. Compute the transmission utility weight of each commodity (d,φ,m):

W_{ij}^{(d,φ,m)}(t) = [ Q_i^{(d,φ,m)}(t) − Q_j^{(d,φ,m)}(t) − V e_{ij} ]⁺.

2. Compute the max-weight commodity: (d,φ,m)* = arg max_{(d,φ,m)} { W_{ij}^{(d,φ,m)}(t) }.

3. If W_{ij}^{(d,φ,m)*}(t) = 0, set k* = 0. Otherwise,

k* = arg max_k { C_{ij,k} W_{ij}^{(d,φ,m)*}(t) − V w_{ij,k} }.

4. Make the following resource allocation and flow assignment decisions: y_{ij,k*}(t) = 1; y_{ij,k}(t) = 0, ∀k ≠ k*; μ_{ij}^{(d,φ,m)*}(t) = C_{ij,k*}; μ_{ij}^{(d,φ,m)}(t) = 0, ∀(d,φ,m) ≠ (d,φ,m)*.

Implementing the above algorithm imposes low complexity on each node. Let J denote the total number of commodities; we have J ≤ N Σ_{φ∈Φ} (M_φ + 1). The total complexity associated with the processing and transmission decisions of node i at each timeslot is then O(J + K_i + Σ_{j∈δ⁺(i)} K_{ij}), which is linear with respect to the number of commodities and the number of resource allocation choices.

Remark 2. Recall that, while assigned flow values can be larger than the corresponding queue lengths, a practical algorithm will only send those packets available for transmission/processing. However, as in [44]-[48], in our analysis, we assume a policy that meets the assigned flow values with null packets (e.g., filled with idle bits) when necessary. Null packets consume resources, but do not build up in the network.
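The four steps of the DCNC-L local transmission decision can be sketched in a few lines of Python. This is an illustrative rendering with hypothetical names; queue backlogs are passed in as plain dicts keyed by commodity.

```python
def dcnc_l_link_decision(q_i, q_j, e_ij, w, C, V):
    """DCNC-L max-weight decision on one link (i,j) for one timeslot.
    q_i, q_j: commodity -> local / next-hop queue backlog; e_ij: per-unit
    transmission cost; w[k], C[k]: set-up cost and capacity of level k;
    V: cost-delay control parameter. Returns (commodity, k*, flow)."""
    # Step 1: transmission utility weight per commodity, [Q_i - Q_j - V*e]^+.
    W = {c: max(q_i[c] - q_j[c] - V * e_ij, 0.0) for c in q_i}
    # Step 2: max-weight commodity.
    c_star = max(W, key=W.get)
    if W[c_star] == 0.0:
        return None, 0, 0.0  # idle: no commodity worth serving, k* = 0
    # Step 3: resource level maximizing C_k * W - V * w_k.
    k_star = max(range(len(C)), key=lambda k: C[k] * W[c_star] - V * w[k])
    # Step 4: allocate full capacity C_{k*} to the max-weight commodity.
    return c_star, k_star, C[k_star]
```

For example, with backlogs Q_i = {a: 100, b: 10}, Q_j = {a: 20, b: 5}, e_ij = 1, V = 10, and the ON/OFF setting C = [0, 440], w = [0, 440]: commodity "a" has weight 70, commodity "b" has weight 0, and level k = 1 wins since 440·70 − 10·440 > 0, so the full 440 units are assigned to "a".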
4.4.3 Quadratic Dynamic Cloud Network Control (DCNC-Q)

DCNC-Q is designed to minimize, at each timeslot, the metric formed by the sum of the quadratic terms (μ_{ij}^{(d,φ,m)}(t))², (μ_{i,pr}^{(d,φ,m)}(t))², and (μ_{pr,i}^{(d,φ,m)}(t))², extracted from Ψ(t), and Z(t) + Vh(t), on the right-hand side of (4.11), equivalently expressed as

min  Σ_{i∈V} { Σ_{(d,φ,m)} Σ_{j∈δ⁺(i)} [ (μ_{ij}^{(d,φ,m)}(t))² − Z_{ij,tr}^{(d,φ,m)}(t) ] + Σ_{(d,φ,m)} [ (1+(ξ^{(φ,m+1)})²)/2 · (μ_{i,pr}^{(d,φ,m)}(t))² − Z_{i,pr}^{(d,φ,m)}(t) ] + V h_i(t) }   (4.13a)
s.t.  (4.4d)-(4.4g).   (4.13b)

The purpose of (4.13) is also to reduce the congestion level while minimizing resource cost. However, by introducing the quadratic terms (μ_{i,pr}^{(d,φ,m)}(t))² and (μ_{ij}^{(d,φ,m)}(t))², minimizing (4.13a) results in a "smoother" and more "balanced" flow and resource allocation solution, which has the potential of improving the cost-delay tradeoff with respect to the max-weight solution of DCNC-L, which allocates either zero or full capacity to a single commodity at each timeslot. Note that (4.13) can also be decomposed into subproblems at each cloud network node. Using the KKT conditions [53], the solution to each subproblem admits a simple waterfilling-type interpretation. We first describe the resulting local flow scheduling and resource allocation policy and then provide its graphical interpretation.

Local processing decisions: At the beginning of each timeslot t, each node i observes its local queue backlogs and performs the following operations:

1. Compute the processing utility weight of each commodity. Sort the resulting set of weights in non-increasing order and form the list {W_i^{(c)}(t)}, where c identifies the c-th commodity in the sorted list.

2. For each resource allocation choice k ∈ K_i:

2.1) Compute the following waterfilling rate threshold:

G_{i,k}(t) ≜ [ ( Σ_{s=1}^{p_k} (r^{(s)})²/(1+(ξ^{(s)})²) · W_i^{(s)}(t) − C_{i,k} ) / ( Σ_{s=1}^{p_k} (r^{(s)})²/(1+(ξ^{(s)})²) ) ]⁺,

where p_k is the smallest commodity index that satisfies H_i^{(p_k)}(t) > C_{i,k}, with p_k = J if C_{i,k} ≥ H_i^{(J)}(t), and

H_i^{(c)}(t) ≜ Σ_{s=1}^{c} [ W_i^{(s)}(t) − W_i^{(c+1)}(t) ] (r^{(s)})²/(1+(ξ^{(s)})²),

with r^{(s)} and ξ^{(s)} denoting the processing-transmission flow ratio and the scaling factor of the function that processes commodity s, respectively.

2.2) Compute the candidate processing flow rate for each commodity, 1 ≤ c ≤ J:

μ̂_{i,pr}^{(c)}(k,t) = r^{(c)}/(1+(ξ^{(c)})²) · [ W_i^{(c)}(t) − G_{i,k}(t) ]⁺.

2.3) Compute the following optimization metric:

Δ_i(k,t) ≜ Σ_{c=1}^{J} [ (1+(ξ^{(c)})²)/2 · (μ̂_{i,pr}^{(c)}(k,t))² − μ̂_{i,pr}^{(c)}(k,t) r^{(c)} W_i^{(c)}(t) ] + V w_{i,k}.

3. Compute the processing resource allocation choice: k* = arg min_{k∈K_i} { Δ_i(k,t) }.

4. Make the following resource allocation and flow assignment decisions: y_{i,k*}(t) = 1; y_{i,k}(t) = 0 for k ≠ k*; μ_{i,pr}^{(c)}(t) = μ̂_{i,pr}^{(c)}(k*,t).

Local transmission decisions: At the beginning of each timeslot t, each node i observes its local queue backlogs and those of its neighbors, and performs the following operations for each of its outgoing links (i,j), j ∈ δ⁺(i):

1. Compute the transmission utility weight of each commodity. Sort the resulting set of weights in non-increasing order and form the list {W_{ij}^{(c)}(t)}, where c identifies the c-th commodity in the sorted list.

2. For each resource allocation choice k ∈ K_{ij}:

2.1) Compute the following waterfilling rate threshold:

G_{ij,k}(t) ≜ (1/p_k) [ Σ_{s=1}^{p_k} W_{ij}^{(s)}(t) − 2 C_{ij,k} ]⁺,

where p_k is the smallest commodity index that satisfies H_{ij}^{(p_k)}(t) > C_{ij,k}, with p_k = J if C_{ij,k} ≥ H_{ij}^{(J)}(t), and

H_{ij}^{(c)}(t) ≜ ½ Σ_{s=1}^{c} [ W_{ij}^{(s)}(t) − W_{ij}^{(c+1)}(t) ].

2.2) Compute the candidate transmission flow rate for each commodity, 1 ≤ c ≤ J:

μ̂_{ij}^{(c)}(k,t) = ½ [ W_{ij}^{(c)}(t) − G_{ij,k}(t) ]⁺.

2.3) Compute the following optimization metric:

Δ_{ij}(k,t) ≜ V w_{ij,k} + Σ_{c=1}^{J} [ (μ̂_{ij}^{(c)}(k,t))² − μ̂_{ij}^{(c)}(k,t) W_{ij}^{(c)}(t) ].

3. Compute the transmission resource allocation choice: k* = arg min_{k∈K_{ij}} { Δ_{ij}(k,t) }.

4. Make the following resource allocation and flow assignment decisions: y_{ij,k*}(t) = 1; y_{ij,k}(t) = 0, ∀k ≠ k*; μ_{ij}^{(c)}(t) = μ̂_{ij}^{(c)}(k*,t).

Figure 4.3: Waterfilling interpretation of the local processing decisions of DCNC-Q at time t.

The total complexity is O(J[log₂ J + K_i + Σ_{j∈δ⁺(i)} K_{ij}]), which is quadratic with respect to the number of commodities and the number of resource allocation choices.

As stated earlier, DCNC-Q admits a waterfilling-type interpretation, illustrated in Fig. 4.3. We focus on the local processing decisions. Define a two-dimensional vessel for each commodity. The height of vessel c is given by the processing utility weight of commodity c, W_i^{(c)}(t), and its width by (r^{(c)})²/(1+(ξ^{(c)})²). For each resource allocation choice k ∈ K_i, pour mercury into each vessel up to the height G_{i,k}(t) given in step 2.1 (indicated in yellow in the figure). If available, fill the remainder of each vessel with water (blue in the figure). The candidate assigned flow rate of each commodity is given by the amount of water in its vessel (step 2.2), while the total amount of water is equal to the available capacity C_{i,k}.
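The transmission-side waterfilling of steps 2.1-2.2 can be sketched compactly. The helper below (hypothetical name) follows the formulas above for a single resource level of capacity C: it finds the smallest saturating index p, computes the threshold G, and returns the candidate flows ½[W − G]⁺; when the threshold is active, the flows sum exactly to C.

```python
def dcnc_q_waterfill_tx(weights, C):
    """Waterfilling step of DCNC-Q on a transmission interface (sketch).
    weights: per-commodity utility weights W^(c) (any order); C: capacity
    C_{ij,k} of the considered level. Returns candidate flow rates, listed
    in non-increasing weight order; their sum never exceeds C."""
    W = sorted(weights, reverse=True)  # step 1: non-increasing order
    J = len(W)
    p = J  # p_k = J when capacity exceeds the total "water" H^(J)
    for c in range(1, J + 1):
        w_next = W[c] if c < J else 0.0  # W^(c+1), taken as 0 past the end
        H = 0.5 * sum(W[s] - w_next for s in range(c))
        if H > C:
            p = c
            break
    G = max((sum(W[:p]) - 2.0 * C) / p, 0.0)  # waterfilling threshold
    return [0.5 * max(w - G, 0.0) for w in W]  # candidate flows (step 2.2)
```

For instance, weights [10, 6, 2] with C = 3 give p = 2, threshold G = 5, and flows [2.5, 0.5, 0], which saturate the capacity; with C large (say 100) the threshold is 0 and each commodity simply gets half its weight.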
Finally, step 3 is the result of choosing the resource allocation choice k* that minimizes (4.13a) given the corresponding assigned flow rate values. The local transmission decisions follow a similar interpretation, which is omitted here for brevity.

4.4.4 Dynamic Cloud Network Control with Shortest Transmission-plus-Processing Distance Bias

DCNC algorithms determine packet routes and processing locations according to the evolution of the cloud network commodity queues. However, queue backlogs have to build up before yielding efficient processing and routing configurations, which can result in degraded delay performance, especially in lightly congested scenarios. In order to reduce the average cloud network delay, we extend the approach used in [47], [14] for traditional communication networks, which consists of incorporating a bias term into the metrics that drive scheduling decisions. In a cloud network setting, this bias is designed to capture the delay penalty incurred by each forwarding and processing operation. Let Q̂_i^{(d,φ,m)}(t) denote the biased backlog of commodity (d,φ,m) at node i:

Q̂_i^{(d,φ,m)}(t) ≜ Q_i^{(d,φ,m)}(t) + η Y_i^{(d,φ,m)},   (4.14)

where Y_i^{(d,φ,m)} denotes the shortest transmission-plus-processing distance (STPD) bias, and η is a control parameter used to balance the effect of the bias and the queue backlog. The bias term in (4.14) is defined as

Y_i^{(d,φ,m)} ≜ { 1,      if m < M_φ;
               H_{i,d}, if m = M_φ;    ∀i,d,φ,   (4.15)

where H_{i,j} denotes the shortest distance (in number of hops) from node i to node j. We note that Y_i^{(d,φ,m)} = 1 for all processable commodities because, throughout this paper, we have assumed that every function is available at all cloud network nodes. In Sec. 4.7.1, we discuss a straightforward generalization of our model, in which each service function is available at a subset of cloud network nodes; in that case, Y_i^{(d,φ,m)} for each processable commodity is defined as the shortest distance to the closest node that can process commodity (d,φ,m).
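The hop distances H_{i,d} needed for the bias in (4.15), and the biased backlog of (4.14), can be sketched as follows. This is an illustrative Python helper (hypothetical names) that computes H_{i,d} by breadth-first search on the reversed graph, so one BFS per destination yields the distance from every node.

```python
from collections import deque

def hop_distances_to(adj, d):
    """H_{i,d}: shortest hop distance from each node i to destination d,
    via BFS on the reversed graph. adj: node -> list of out-neighbors."""
    rev = {u: [] for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            rev.setdefault(v, []).append(u)
    dist, frontier = {d: 0}, deque([d])
    while frontier:
        u = frontier.popleft()
        for v in rev.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                frontier.append(v)
    return dist  # unreachable nodes are simply absent

def biased_backlog(Q, Y, eta):
    """Q-hat of Eq. (4.14): queue backlog plus eta times the STPD bias Y."""
    return Q + eta * Y
```

For a final-stage commodity (m = M_φ) at a node two hops from its destination, with η = 50, an actual backlog of 100 packets becomes a biased backlog of 200, steering the max-weight decisions toward shorter routes even before queues differentiate.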
The enhanced EDCNC-L and EDCNC-Q algorithms work just like their DCNC-L and DCNC-Q counterparts, but use Q̂_i^{(d,φ,m)}(t) in place of Q_i^{(d,φ,m)}(t) to make local processing and transmission scheduling decisions.

4.5 Performance Analysis

In this section, we analyze the performance of the proposed DCNC algorithms. To facilitate the analysis, we define the following parameters:

A_max: the constant that bounds the aggregate input rate at all the cloud network nodes; specifically, max_{i∈V} E{ [Σ_{(d,φ,m)} a_i^{(d,φ,m)}(t)]⁴ } ≤ (A_max)⁴.
C_pr^max: the maximum processing capacity among all cloud network nodes, i.e., C_pr^max ≜ max_{i∈V} {C_{i,K_i}}.
C_tr^max: the maximum transmission capacity among all cloud network links, i.e., C_tr^max ≜ max_{(i,j)∈E} {C_{ij,K_ij}}.
ξ_max: the maximum flow scaling factor among all service functions, i.e., ξ_max ≜ max_{(φ,m)} {ξ^{(φ,m)}}.
r_min: the minimum processing-transmission flow ratio among all service functions, i.e., r_min ≜ min_{(φ,m)} {r^{(φ,m)}}.
δ_max: the maximum degree among all cloud network nodes, i.e., δ_max ≜ max_{i∈V} {|δ⁺(i)| + |δ⁻(i)|}.

4.5.1 Average Cost and Network Stability

Theorem 8. If the average input rate matrix λ = (λ_i^{(d,φ,m)}) is interior to the cloud network capacity region Λ(G,Φ), then the DCNC algorithms stabilize the cloud network while achieving arbitrarily close to the minimum average cost h*(λ) with probability 1 (w.p.1), i.e.,

lim sup_{t→∞} (1/t) Σ_{τ=0}^{t−1} h(τ) ≤ h*(λ) + NB/V,  (w.p.1)   (4.16)

lim sup_{t→∞} (1/t) Σ_{τ=0}^{t−1} Σ_{(d,φ,m),i} Q_i^{(d,φ,m)}(τ) ≤ [ NB + V ( h*(λ+ε1) − h*(λ) ) ] / ε,  (w.p.1)   (4.17)

where

B = { B₀, under DCNC-L and DCNC-Q;
      B₁, under EDCNC-L and EDCNC-Q,   (4.18)

with B₀ and B₁ being positive constants determined by the system parameters C_pr^max, C_tr^max, A_max, ξ_max, and r_min; and ε is a positive constant satisfying (λ + ε1) ∈ Λ.

Proof. The proof of Theorem 8 is given in Appendix B.2.

Theorem 8 shows that the proposed DCNC algorithms achieve the average cost-delay tradeoff [O(1/V), O(V)] with probability 1.
[Footnote: By setting ε = 1/V, where ε denotes the deviation from the optimal solution (see Theorem 9), the cost-delay tradeoff can be written as [O(ε), O(1/ε)].]

Moreover, (4.17) holds for any λ interior to Λ, which demonstrates the throughput-optimality of the DCNC algorithms.

4.5.2 Convergence Time

The convergence time of a DCNC algorithm indicates how fast its running time average solution approaches the optimal solution. This criterion is particularly important for online scheduling in settings where the arrival process is non-homogeneous, i.e., the average input rate is time-varying. In this case, it is important to make sure that the time average solution evolves close enough to the optimal solution well before the average input rate undergoes significant changes. We remark that studying the convergence time of a DCNC algorithm involves studying how fast the average cost approaches the optimal value, as well as how fast the flow conservation violation at each node approaches zero.

Let μ̃_{i,pr}^{(d,φ,m)}(t), μ̃_{pr,i}^{(d,φ,m)}(t), and μ̃_{ij}^{(d,φ,m)}(t) denote the actual flow rates obtained by removing all null packets that may have been assigned when queues do not have enough packets to meet the corresponding assigned flow rates. Define, for all i, (d,φ,m), t,

Δf_i^{(d,φ,m)}(t) ≜ Σ_{j∈δ⁻(i)} μ̃_{ji}^{(d,φ,m)}(t) + μ̃_{pr,i}^{(d,φ,m)}(t) + a_i^{(d,φ,m)}(t) − Σ_{j∈δ⁺(i)} μ̃_{ij}^{(d,φ,m)}(t) − μ̃_{i,pr}^{(d,φ,m)}(t).   (4.19)

The queuing dynamics is then given by

Q_i^{(d,φ,m)}(t+1) = Q_i^{(d,φ,m)}(t) + Δf_i^{(d,φ,m)}(t).   (4.20)

The convergence time performance of the proposed DCNC algorithms is summarized by the following theorem.

Theorem 9. If the average input rate matrix λ = (λ_i^{(d,φ,m)}) is interior to the cloud network capacity region Λ(G,Φ), then, for all ε > 0, whenever t ≥ 1/ε², the mean time average cost and mean time average actual flow rates achieved by the DCNC algorithms during the first t timeslots satisfy:

(1/t) Σ_{τ=0}^{t−1} E{h(τ)} ≤ h*(λ) + O(ε),   (4.21)

(1/t) Σ_{τ=0}^{t−1} E{ Δf_i^{(d,φ,m)}(τ) } ≤ O(ε),  ∀i, (d,φ,m).   (4.22)
[Footnote: We assume that the local decisions performed by the DCNC algorithms at each timeslot can be accomplished within a reserved computation time within each timeslot; therefore, their different computational complexities are not taken into account in the convergence time analysis.]
[Footnote: Note that the convergence of the flow conservation violation at each node to zero is equivalent to strong stability (see (4.17)), if λ is interior to Λ(G,Φ).]

Proof. The proof of Theorem 9 is given in Appendix B.3.

Theorem 9 establishes that, under the DCNC algorithms, both the average cost and the average flow conservation violation at each node exhibit O(1/ε²) convergence time to O(ε) deviations from the minimum average cost and from zero, respectively.

4.6 Numerical Results

In this section, we evaluate the performance of the proposed DCNC algorithms via numerical simulations in a number of illustrative settings. We assume a cloud network based on the continental US Abilene topology shown in Fig. 4.4. The 14 cloud network links exhibit homogeneous transmission capacities and costs, while the 11 cloud network nodes only differ in their processing resource set-up costs. Specifically, the following two resource settings are considered:

1) ON/OFF resource levels: each node and link can either allocate zero capacity or the maximum available capacity; i.e., K_i = K_ij = 1, ∀i∈V, (i,j)∈E. To simplify notation, we define K ≜ K_i + 1 = K_ij + 1, ∀i∈V, (i,j)∈E. The processing resource costs and capacities are e_i = 1, ∀i∈V; w_{i,0} = 0, ∀i∈V; w_{i,1} = 440, ∀i∈V\{5,6}; w_{5,1} = w_{6,1} = 110; C_{i,0} = 0, C_{i,1} = 440, ∀i∈V. The transmission resource costs and capacities are e_{ij} = 1, w_{ij,0} = 0, w_{ij,1} = 440, ∀(i,j)∈E; C_{ij,0} = 0, C_{ij,1} = 440, ∀(i,j)∈E.

2) Multiple resource levels: the available capacity at each node and link is split into 10 resource units; i.e., K = 11, ∀i∈V, (i,j)∈E.
The processing resource costs and capacities are e_i = 1, ∀i∈V; [w_{i,0}, w_{i,1}, …, w_{i,10}] = [0, 11, …, 99, 110] for i = 5, 6; [w_{i,0}, w_{i,1}, …, w_{i,10}] = [0, 44, …, 396, 440], ∀i∈V\{5,6}; [C_{i,0}, C_{i,1}, …, C_{i,10}] = [0, 44, …, 396, 440], ∀i. The transmission resource costs and capacities are e_{ij} = 1, ∀(i,j)∈E; [w_{ij,0}, w_{ij,1}, …, w_{ij,10}] = [0, 44, …, 396, 440], ∀(i,j)∈E; [C_{ij,0}, C_{ij,1}, …, C_{ij,10}] = [0, 44, …, 396, 440], ∀(i,j)∈E.

[Footnote: The maximum capacity is set to 440 in order to guarantee that there is no congestion in any part of the network for the service setting considered in the following.]

Figure 4.4: Abilene US Continental Network. Nodes are indexed as: 1) Seattle, 2) Sunnyvale, 3) Denver, 4) Los Angeles, 5) Houston, 6) Kansas City, 7) Atlanta, 8) Indianapolis, 9) Chicago, 10) Washington, 11) New York.

Note that, for both the ON/OFF and multi-level resource settings, the processing resource set-up costs at nodes 5 and 6 are 4 times cheaper than at the other cloud network nodes. We consider 2 service chains, each composed of 2 virtual network functions: VNF (1,1) (Service 1, Function 1) with flow scaling factor ξ^{(1,1)} = 1; VNF (1,2) with ξ^{(1,2)} = 3 (expansion function); VNF (2,1) with ξ^{(2,1)} = 0.25 (compression function); and VNF (2,2) with ξ^{(2,2)} = 1. All functions have processing-transmission flow ratio r^{(φ,m)} = 1 and can be implemented at all cloud network nodes. Finally, we assume 110 clients per service, corresponding to all the source-destination pairs in the Abilene network.

4.6.1 Cost-Delay Tradeoff

Figs. 4.5(a)-4.5(c) show the tradeoff between the time average cost and the time average end-to-end delay (represented by the total time average occupancy, or queue backlog) under the different DCNC algorithms. The input rate of all source commodities is set to 1, and the cost/delay values are obtained after simulating each algorithm for 10⁶ timeslots.
Each tradeoff curve is obtained by varying the control parameter V between 0 and 1000 for each algorithm. Small values of V favor low delay at the expense of high cost, while large values of V lead to points on the tradeoff curves with lower cost and higher delay. It is important to note that, since the two resource settings considered, i.e., ON/OFF (K = 2) vs. multi-level (K = 11), are characterized by the same maximum capacity and the same constant ratios C_{i,k}/w_{i,k} and C_{ij,k}/w_{ij,k}, the performance of the linear DCNC algorithms (DCNC-L and EDCNC-L) does not change under the two resource settings. On the other hand, the quadratic algorithms (DCNC-Q and EDCNC-Q) can exploit the finer resource granularity of the multi-level resource setting to improve the cost-delay tradeoff. We also note that, for the enhanced versions of the algorithms that use the STPD bias (EDCNC-L and EDCNC-Q), we choose the bias coefficient η, among multiples of 10, as the value that leads to the best performance for each algorithm.

[Footnote: Simulation results for different values of η can be found in [54].]

Figure 4.5: Performance of DCNC algorithms. a) Time average occupancy vs. time average cost: a general view; b) Time average occupancy vs. time average cost: given a target average cost; c) Time average occupancy vs. time average cost: given a target average occupancy; d) Total flow conservation violation evolution over time: effect of the V value; e) Time average cost evolution over time: effect of the V value; f) Time average occupancies with varying service input rate: throughput optimality.

Fig. 4.5(a) shows how the average cost under all DCNC algorithms reduces at the expense of network delay and converges to the same minimum value. While all the tradeoff curves follow the same [O(1/V), O(V)] relationship established in Theorem 8, the specific tradeoff ratios can be significantly different. The general trends observed in Fig. 4.5(a) are as follows. DCNC-L exhibits the worst cost-delay tradeoff. Recall that DCNC-L assigns either zero or full capacity to a single commodity in each timeslot, and hence the finer resource granularity of K = 11 does not improve its performance. However, adding the STPD bias results in a substantial performance improvement, as shown by the EDCNC-L curve. Now let us focus on the quadratic algorithms.
DCNC-Q withK = 2 further improves the cost delay-tradeoff, at the expense of increased computational complexity. In this case, adding the SDTP bias provides a much smaller improvement (see EDCNC-Q curve), showing the advantage of the more “balanced" scheduling decisions of DCNC-Q. Finally, DCNC-Q withK = 11 exhibits the best cost-delay tradeoff, illustrating the ability of DCNC-Q to exploit the finer resource granularity to make “smoother" resource allocation decisions. In this setting, adding the SDTP bias does not provide further improvement and it is not shown in the figure. While Fig. 4.5(a) illustrates the general trends in improvements obtained using the quadratic metric and the SDTP bias, there are regimes in which the lower complexity DCNC-L and EDCNC-L algorithms can outperform their quadratic counterparts. We illustrate these regimes In Figs. 4.5(b) and 4.5(c), by zooming into the lower left of Fig. 4.5(a). As shown in Fig. 4.5(b), for the case ofK = 2, the cost-delay curves of DCNC-L and EDCNC-L cross with the curves of DCNC-Q and EDCNC-Q. For example, for a target cost of 1380, DCNC-L and EDCNC-L result in lower average occupancies (8:52 10 5 and 6:43 10 5 ) than DCNC-Q (1:26 10 6 ) and EDCNC-Q (1:21 10 6 ). On the other hand, if we increase the target cost to 1600, DCNC-Q and EDCNC-Q achieve lower occupancy values (4:58 10 5 and 4:11 10 5 ) than DCNC-L 67 (8:52 10 5 ) and EDCNC-L (6:43 10 5 ). Hence, depending on the cost budget, there may be a regime in which the simpler DCNC-L and EDCNC-L algorithms become a better choice. However, this regime does not exist forK = 11, where the average occupancies under DCNC-Q (1342 and 1433 respectively for the two target costs) are much lower than (E)DCNC-L. In Fig. 4.5(c), we compare cost values for given target occupancies. WithK = 2 and a target average occupancy of 9 10 5 , the average costs achieved by DCNC-L (1317) and EDCNC-L (1319) are lower than those achieved by DCNC-Q (1437) and EDCNC-Q (1432). 
In contrast, if we reduce the target occupancy to 3 × 10^5, DCNC-Q and EDCNC-Q (achieving average costs 1754 and 1764) outperform DCNC-L and EDCNC-L (with cost values 2.64 × 10^4 and 6879, beyond the scope of Fig. 4.5(c)). With K = 11, DCNC-Q achieves average costs of 1286 and 1271 for the two target occupancies, outperforming all other algorithms.

Figure 4.6: Average processing flow rate distribution. a) Service 1, Function 1; b) Service 1, Function 2; c) Service 2, Function 1; d) Service 2, Function 2.

4.6.2 Convergence Time
In Figs. 4.5(d) and 4.5(e), we show the time evolution of the total flow conservation violation (obtained by summing, over all nodes and commodities, the absolute value of the flow conservation violation) and the total time average cost, respectively. The average input rate of each source commodity is again set to 1. As expected, observe how decreasing the value of V speeds up the convergence of all DCNC algorithms. However, note from Fig. 4.5(e) that the converged time average cost is higher with a smaller value of V, consistent with the tradeoff established in Theorem 8. Note that the slower convergence of DCNC-Q with respect to DCNC-L with the same value of V does not necessarily imply a disadvantage of DCNC-Q. In fact, due to its more balanced scheduling decisions, DCNC-Q can be designed with a smaller V than DCNC-L, in order to enhance convergence speed while achieving no worse cost/delay performance. This effect is evident in the case of K = 11. As shown in Fig. 4.5(d) and Fig.
4.5(e), with K = 11, DCNC-Q with V = 100 achieves faster convergence than DCNC-L with V = 400, while their converged average cost values are similar.

4.6.3 Capacity Region
Fig. 4.5(f) illustrates the throughput performance of the DCNC algorithms by showing the time average occupancy as a function of the input rate (kept the same for all source commodities). The simulation time is 10^6 timeslots, and the values of V used for each algorithm are chosen according to Fig. 4.5(b) in order to guarantee that the average cost stays below the target value 1600. As the average input rate increases to 13.5, the average occupancy under all the DCNC algorithms exhibits a sharp rise, illustrating the boundary of the cloud network capacity region (see (4.17) and let ε → 0). Observe, once more, the improved delay performance achieved via the use of the STPD bias and the quadratic metric in the proposed control algorithms.

4.6.4 Processing Distribution
Fig. 4.6 shows the optimal average processing rate distribution across the cloud network nodes for each service function under the ON/OFF resource setting (K = 2). We obtain this solution, for example, by running DCNC-L with V = 1000 for 10^6 timeslots. The processing rate of function (φ, m) refers to the processing rate of its input commodity (d, φ, m−1). Observe how the implementation of VNF (1, 1) mostly concentrates at nodes 5 and 6, which are the cheapest processing locations. However, note that part of VNF (1, 1) for destinations on the west coast (nodes 1 through 4) takes place at the west coast nodes, illustrating the fact that, while processing is cheaper at nodes 5 and 6, shorter routes can compensate for the extra processing cost at the more expensive nodes. A similar effect can be observed for destinations on the east coast, where part of VNF (1, 1) takes place at east coast nodes. Fig. 4.6(b) shows the average processing rate distribution for VNF (1, 2). Note that VNF (1, 2) is an expansion function.
This results in the processing of commodity (d, 1, 1) concentrating entirely at the destination nodes, in order to minimize the impact of the extra cost incurred by the transmission of the larger-size commodity (d, 1, 2) resulting from the execution of VNF (1, 2). For Service 2, note that VNF (2, 1) is a compression function. As expected, the implementation of VNF (2, 1) takes place at the source nodes, in order to reduce the transmission cost of Service 2 by compressing commodity (d, 2, 0) into the smaller-size commodity (d, 2, 1) before commodity (d, 2, 0) flows into the network. As a result, as shown in Fig. 4.6(c), for all 1 ≤ d ≤ 11, commodity (d, 2, 0) is processed at all nodes except node d, and the average processing rate of commodity (d, 2, 0) at each node i ≠ d is equal to 1, which is the average input rate per client. Fig. 4.6(d) shows the average processing rate distribution for VNF (2, 2), which exhibits a distribution similar to that of VNF (1, 1), except for different rate values due to the compression effect of VNF (2, 1).

4.7 Extensions
In this section, we discuss interesting extensions of the DCNC algorithms presented in this paper that can be easily captured via simple modifications to our model.

4.7.1 Location-Dependent Service Functions
For ease of notation and presentation, throughout this paper we have implicitly assumed that every cloud network node can implement all network functions. In practice, each cloud network node i may only host a subset of functions \tilde{\mathcal{M}}_{\phi,i} \subseteq \mathcal{M}_{\phi}, \forall \phi \in \Phi. In this case, the local processing decisions at each node would be made by considering only those commodities that can be processed by the locally available functions. In addition, the STPD bias Y_i^{(d,\phi,m)} would need to be updated as, for all i, d, \phi,

Y_i^{(d,\phi,m)} \triangleq \begin{cases} \min_{j \in \mathcal{V} :\, (m+1) \in \tilde{\mathcal{M}}_{\phi,j}} \{ H_{i,j} + 1 \}, & \text{if } m < M_{\phi}, \\ H_{i,d}, & \text{if } m = M_{\phi}. \end{cases}

4.7.2 Propagation Delay
In this work, we have assumed that network delay is dominated by queueing delay, and have ignored propagation delay.
However, in large-scale cloud networks, where communication links can span large distances, the propagation of data between two neighboring nodes may incur non-negligible delays. In addition, while much smaller, the propagation delay incurred when forwarding packets for processing in a large data center may also be non-negligible. In order to capture propagation delays, let D^{pg}_i and D^{pg}_{ij} denote the propagation delay (in timeslots) for reaching the processing unit at node i and for reaching neighbor j from node i, respectively. We then have the following queuing dynamics and service chaining constraints:

Q_i^{(d,\phi,m)}(t+1) \le \Big[ Q_i^{(d,\phi,m)}(t) - \sum_{j \in \delta^+(i)} \mu_{ij}^{(d,\phi,m)}(t) - \mu_{i,pr}^{(d,\phi,m)}(t) \Big]^+ + \sum_{j \in \delta^-(i)} \mu_{ji}^{(d,\phi,m)}(t - D^{pg}_{ji}) + \mu_{pr,i}^{(d,\phi,m)}(t) + a_i^{(d,\phi,m)}(t),   (4.23)

\mu_{pr,i}^{(d,\phi,m)}(t) = \xi^{(\phi,m)} \mu_{i,pr}^{(d,\phi,m-1)}(t - D^{pg}_i).   (4.24)

Moreover, due to propagation delay, queue backlog observations become outdated. Specifically, the queue backlog of commodity (d, \phi, m) at node j \in \delta(i) observed by node i at time t is Q_j^{(d,\phi,m)}(t - D^{pg}_{ji}). Furthermore, for EDCNC-L and EDCNC-Q, the STPD bias Y_i^{(d,\phi,m)}, for all i, d, \phi, would be updated as

Y_i^{(d,\phi,m)} \triangleq \begin{cases} \min_{j \in \mathcal{V}} \{ \tilde{H}_{i,j} + D^{pg}_j \}, & \text{if } m < M_{\phi}, \\ \tilde{H}_{i,d}, & \text{if } m = M_{\phi}, \end{cases}

where \tilde{H}_{i,j} is the length of the shortest path from node i to node j, with link (u,v) \in \mathcal{E} having length D^{pg}_{uv}. With (4.23), (4.24), and the outdated backlog state observations, the proposed DCNC algorithms can still be applied and can be proven to retain the established throughput, average cost, and convergence performance guarantees, while suffering from increased average delay.

4.7.3 Service Tree Structure
While most of today's network services can be described via a chain of network functions, next-generation digital services may contain functions with multiple inputs. Such services can be described via a service tree, as shown in Fig. 4.7. In order to capture this type of service, we let \mathcal{I}(\phi,m) denote the set of commodities that act as input to function (\phi,m), generating commodity (d,\phi,m).
The service chaining constraints are then updated as

\mu_{pr,i}^{(d,\phi,m)}(t) = \xi^{(\phi,n)} \mu_{i,pr}^{(d,\phi,n)}(t),   \forall t, i, d, \phi, m, n \in \mathcal{I}(\phi,m),

where \xi^{(\phi,n)}, \forall n \in \mathcal{I}(\phi,m), denotes the flow size ratio between the output commodity (d,\phi,m) and each of its input commodities n \in \mathcal{I}(\phi,m).

Figure 4.7: A network service tree. VNF (\phi,m) takes input commodities (d,\phi,n), n \in \mathcal{I}(\phi,m), and generates commodity (d,\phi,m).

In addition, the processing capacity constraints are updated as

\sum_{(d,\phi,n)} \mu_{i,pr}^{(d,\phi,n)}(t)\, r^{(\phi,n)} \le \sum_{k \in \mathcal{K}_i} C_{i,k}\, y_{i,k}(t),   \forall t, i,

where r^{(\phi,n)} now denotes the computation requirement of processing a unit flow of commodity (d,\phi,n). Using the above updated constraints in the LDP bound minimizations performed by the DCNC algorithms, we can provide analogous throughput, cost, and convergence time guarantees for the dynamic control of service trees in cloud networks.

4.8 Conclusions
We addressed the problem of dynamic control of network service chains in distributed cloud networks, in which demands are unknown and time-varying. For a given set of services, we characterized the cloud network capacity region and designed online dynamic control algorithms that jointly schedule flow processing and transmission decisions, along with the corresponding allocation of network and cloud resources. The proposed algorithms stabilize the underlying cloud network queuing system as long as the average input rates are within the cloud network capacity region. The achieved average cloud network cost can be pushed arbitrarily close to the minimum with probability 1, while trading off average network delay. Our algorithms converge to within O(ε) of the optimal solution in time O(1/ε²). DCNC-L makes local transmission and processing decisions with linear complexity with respect to the number of commodities and resource allocation choices.
In comparison, DCNC-Q makes local decisions by minimizing a quadratic metric obtained from an upper-bound expression of the LDP function, and we show via simulations that this significantly improves the cost-delay tradeoff. Furthermore, both DCNC-L and DCNC-Q are enhanced by introducing an STPD bias into the scheduling decisions, yielding the EDCNC-L and EDCNC-Q algorithms, which exhibit further improved delay performance.

Chapter 5
Optimal Control of Wireless Computing Networks

Internet traffic will soon be dominated by the consumption of what we refer to as augmented information (AgI) services. While today's AgI services are mostly implemented in the form of software-defined functions instantiated over servers equipped with general-purpose hardware at centralized cloud data centers [55], the increasingly low cost and low latency requirements of next-generation real-time AgI services are driving cloud resources closer to the end users in the form of small cloud nodes at the edge of the network, resulting in what is referred to as a distributed cloud network. The service distribution problem for wireline cloud networks has been addressed in the previous chapter. However, a key missing aspect in all these works is the exploration of wireless computing networks. AgI services are increasingly sourced and accessed from wireless devices, and with the advent of mobile and fog computing [49], service functions can also be hosted at wireless computing nodes (i.e., computing devices with wireless networking capabilities) such as mobile handsets, connected vehicles, compute-enabled access points, or cloudlets [50]. When introducing the wireless network into the computing infrastructure, the often unpredictable nature of the wireless channel further complicates flow scheduling, routing, and resource allocation.
In the context of traditional wireless communication networks, the Lyapunov drift-plus-penalty (LDP) control methodology (see [45] and references therein) has been shown to be a promising approach to tackle these intricate stochastic network optimization problems. Ref. [47] extends the LDP approach to multi-hop, multi-commodity wireless ad-hoc networks, leading to the Diversity Backpressure (DIVBAR) algorithm. DIVBAR exploits the broadcast nature of the wireless medium without precise channel state information (CSI) at the transmitter, and is shown to be throughput-optimal under the assumptions that at most one packet can be transmitted in each transmission attempt and that no advanced coding scheme is used. Chapter 2 extends DIVBAR by incorporating rateless coding in the transmission of a single packet, further enhancing throughput performance. Motivated by the important role of wireless networks in the delivery of AgI services, in this chapter we address the problem of optimal distribution of AgI services over a multi-hop wireless computing network, which is composed of nodes with communication and computing capabilities. We extend the multi-commodity-chain (MCC) flow model of [35], [56], [57] for the delivery of AgI services over wireless multi-hop computing networks, enabling the characterization of the flow chaining and scaling aspects of AgI services. In addition, we adopt the broadcast approach coding scheme [58], [59], where information is encoded into superposition layers according to the channel conditions. We characterize the capacity region of a wireless computing network and design a fully distributed scheduling and resource allocation algorithm that adaptively stabilizes the underlying queuing system while achieving a total cost arbitrarily close to the minimum network cost, with a tradeoff in network delay. Our contributions can be summarized as follows: 1.
We jointly schedule computing and wireless transmission by extending the queuing model for general MCC flow problems to the AgI service distribution problem. In this queuing model, the queue backlog of a given commodity is updated from the wireless transmissions of the same commodity, as well as from the processing of the same commodity and of the preceding commodity. 2. By incorporating the broadcast approach coding scheme into the scheduling for wireless computing networks, routing diversity is exploited and transmission efficiency is enhanced. Correspondingly, given the set of services and using the broadcast approach with only statistical CSI, we have the following two highlights: We characterize the capacity region of a wireless computing network in terms of the set of exogenous input rates of the source commodity that can be processed through the required service functions and delivered to the required destinations. Unlike the capacity region of traditional networks, which depends only on the network topology, the capacity region of a computing network also depends on the AgI service structure. We design the dynamic wireless computing network control (DWCNC) algorithm, which makes local routing, processing, and resource allocation decisions without knowledge of service demands or their statistics, and allows pushing the total resource cost arbitrarily close to the minimum with a tradeoff in network delay. In particular, our algorithms exhibit an [O(1/V), O(V)] cost-delay tradeoff (where V is a control parameter). The remainder of this chapter is organized as follows: Section II presents the system model. Section III characterizes the network capacity region of a wireless computing network. Section IV constructs the DWCNC algorithm, and Section V proves the optimal performance of DWCNC. The paper is concluded in Section VI.
5.1 System Model

5.1.1 Network Model
We consider a wireless computing network composed of N = |\mathcal{N}| distributed computing nodes that communicate over wireless links labeled according to node pairs (i,j), for i,j \in \mathcal{N}. Node i \in \mathcal{N} is equipped with K^{tr}_i transmission resource units (e.g., transmission power) that it can use to transmit information over the wireless channel.

Figure 5.1: Illustration of the AgI service chain for destination d \in \mathcal{D}. There are M functions and M+1 commodities. The AgI service takes source commodity (d,0) and delivers final commodity (d,M) after going through the sequence of functions {1, 2, ..., M}. Function m takes commodity (d,m-1) and generates commodity (d,m).

In addition, node i is equipped with K^{pr}_i processing resource units (e.g., central processing units or CPUs) that it can use to process information as part of an AgI service (see Sec. 5.1.2). Time is slotted, with slots normalized to integer units t \in {0, 1, 2, ...}. We use the binary variable y^{tr}_{i,k}(t) \in {0, 1} to indicate the allocation or activation of k \in {0, ..., K^{tr}_i} transmission resource units at node i at time t, which incurs w^{tr}_{i,k} cost units. Analogously, y^{pr}_{i,k}(t) \in {0, 1} indicates the allocation of k \in {0, ..., K^{pr}_i} processing resource units at node i at time t, which incurs w^{pr}_{i,k} cost units. Notice that the binary resource allocation variables y^{tr}_{i,k}(t), y^{pr}_{i,k}(t) must satisfy \sum_{k \in K^{tr}_i} y^{tr}_{i,k}(t) \le 1 and \sum_{k \in K^{pr}_i} y^{pr}_{i,k}(t) \le 1.

5.1.2 Augmented Information Service Model
We consider the distribution of an augmented information service described by a chain of functions \mathcal{M} = {1, 2, ..., M}. A service request is described by a source-destination pair (s,d) \in \mathcal{N} \times \mathcal{N}, indicating the request for source flows originating at node s to go through the sequence of functions \mathcal{M} before exiting the network at destination node d.
We adopt the MCC flow model, in which commodity (d,m) \in \mathcal{N} \times {0, ..., M} identifies the information flow output by function m \in \mathcal{M} for destination d \in \mathcal{N}. Commodity (d,0) denotes the source commodity for destination d, which identifies the flows that arrive exogenously at each source node s (see Fig. 5.1). Each service function has (possibly) different processing requirements. We denote by r^{(m)} the processing complexity factor of function m, which represents the number of operations required to process one information unit (e.g., bit) through function m. Another key aspect of AgI services is the fact that information flows can change size as they go through service functions. Let \xi^{(m)} > 0 denote the scaling factor of function m. Then, the size of the function's output flow is \xi^{(m)} times as large as its input flow.

5.1.3 Computing Model
As shown in Fig. 5.2, we assume that a processing unit (e.g., the CPU in a cloudlet node) is co-located with each node i and implements the service functions by processing the commodities input from node i. We model a static dedicated computing channel, in which the achievable processing rate at node i with the allocation of k processing resource units is given by R_{i,k} operations per timeslot. We use \mu^{(d,m)}_{i,pr}(t) to denote the assigned flow rate of commodity (d,m) (0 \le m < M) from node i to its processing unit at time t, and \mu^{(d,m)}_{pr,i}(t) to denote the flow rate of commodity (d,m) (0 < m \le M) from the processing unit back to node i

Figure 5.2: A computing node that can process commodity (d,m-1) into commodity (d,m).
Then, we have the following MCC and maximum processing rate constraints: (d;m) pr;i (t) = (m) (d;m 1) i;pr (t); 8i;d;m>0;t; (5.1) X (d;m> 0) (d;m1) i;pr (t)r (m) X K pr i k=0 R i;k y pr i;k (t); 8i;t: (5.2) 5.1.4 Wireless Transmission Model Due to the broadcast nature of the wireless medium, multiple receivers (RXs) may overhear the transmission of a given transmitter (TX). Multiple TXs may transmit simultaneously to overlapping RXs, due to the use of orthogonal broadcast channels of fixed bandwidth, a priori allocated by a given policy, whose design is outside the scope of this paper. We model the channel between nodei and all other nodes in the network as a physically degraded Gaussian broadcast channel, where the network state process (the vector of all channel gains), denoted byS(t),fs ij (t);8i;j2Ng, evolves according to a Markov process with state spaceS and whose steady-state probability exists. We assume that the statistical CSI is known at the TX, while the instantaneous CSI can only be learned after the transmission has taken place and is thereby outdated (delayed). It is well-known that superposition coding is optimal (capacity achieving) for the physically degraded broadcast channel with independent messages [60]. In particular, in this work we adopt the broadcast approach (see [58, 59] and references therein) coding scheme, which consists of sending incremental information using superposition layers, such that the number of decoded layers at any RX depends on its own channel state, and the information decoded by a given RX is a subset of the information decoded by any other RX with no worse channel gain. That is, for a given transmitting nodei, if we sort theN 1 potential receiving nodes in non-decreasing order of their channel gainsfq i;1 ;:::;q i;N1 g, such thatq i;n withn2f1;:::;N 1g denotes the receiver with then-th lowest channel gain, then the information decoded by receiverq i;n is also decoded by receiverq i;u , foru>n. 
Moreover, let \Pi_{i,n} \triangleq {q_{i,n}, ..., q_{i,N-1}} be the set of receivers with the N-n highest channel gains. Then, we can partition the information transmitted by node i during a given timeslot into N-1 disjoint groups, with the n-th partition being the information whose successful receiver set is exactly \Pi_{i,n}, i.e., the information is decoded by the nodes in \Pi_{i,n} but not by the nodes in \mathcal{N} \setminus {i} \setminus \Pi_{i,n}. An example of using the broadcast approach in the transmission is shown in Fig. 5.3. Let p_{i,k}(a) denote the optimal power density function over the continuum of superposition layers resulting from the allocation of k units of transmission resource at node i. Then, based on the broadcast approach [59], when allocating k units of transmission resource, the maximum achievable rate over link (i,j) at time t is

Figure 5.3: Illustration of using the broadcast approach in the transmission to exploit multi-receiver diversity. The information decoded by the receiver with the "bad" channel is a subset of the information decoded by the receiver with the "medium" channel, which is in turn a subset of the information decoded by the receiver with the "good" channel. The transmitted information can therefore be grouped into three partitions.

given by

R_{ij,k}(t) = \int_0^{g_{ij}(t)} \frac{a\, p_{i,k}(a)}{1 + a \int_a^{\infty} p_{i,k}(s)\, ds}\, da,   (5.3)

where g_{ij}(t) is the channel gain over link (i,j) at time t. In the case of adopting only L_i discrete code layers, the channel gains of each outgoing link of node i in each timeslot t can be discretized into L_i + 1 states, denoted by \mathcal{S}_i \triangleq {\bar{s}_{i,0}, ..., \bar{s}_{i,L_i}}, via L_i channel gain thresholds {\bar{g}_{i,l}, 1 \le l \le L_i : \bar{g}_{i,1} \le ... \le \bar{g}_{i,L_i}}.
Then we have

s_{ij}(t) = \begin{cases} \bar{s}_{i,0}, & \text{if } g_{ij}(t) < \bar{g}_{i,1}, \\ \bar{s}_{i,l}, & \text{if } \bar{g}_{i,l} \le g_{ij}(t) < \bar{g}_{i,l+1},\ 1 \le l \le L_i - 1, \\ \bar{s}_{i,L_i}, & \text{if } g_{ij}(t) \ge \bar{g}_{i,L_i}. \end{cases}

Given k resource units allocated for transmission at node i, denote by P_{i,k}(l) the power allocated to code layer l, 1 \le l \le L_i, and denote the maximum achievable transmission rates corresponding to the L_i + 1 channel states by \mathcal{R}_{i,k} \triangleq {\bar{R}^0_{i,k}, ..., \bar{R}^{L_i}_{i,k}}, where \bar{R}^0_{i,k} = 0 and

\bar{R}^l_{i,k} = \sum_{l' \le l} \log\Big( 1 + \frac{P_{i,k}(l')\, \bar{g}_{i,l'}}{1 + \bar{g}_{i,l'} \sum_{l'' > l'} P_{i,k}(l'')} \Big),   for 1 \le l \le L_i.   (5.4)

The maximum achievable transmission rate over each outgoing link in each timeslot takes a value from \mathcal{R}_{i,k}, i.e., R_{ij,k}(t) \in \mathcal{R}_{i,k}.

5.1.5 Communication Protocol
The communication protocol between each TX-RX pair is illustrated in Fig. 5.4. At the beginning of each timeslot, the TX and RXs exchange all necessary control signals, including queue backlog state information (see Sec. 5.1.6). Then, the TX decides how many transmission resource units to allocate for the given timeslot and how to allocate the given bandwidth among the commodities for transmission. (Note that we require, by definition, a uniform power spectral density.)

Figure 5.4: Timing diagram of the communication protocol over a wireless link for one timeslot.

Afterwards, the transmission
The TX then makes a forwarding decision and send it through a final instruction to all the RXs, instructing each RX which portion of its decoded information to keep for processing/forwarding in the future (assigning the processing/forwarding responsibility). The control information, feedbacks and final instruction are sent though a stable control channel, whose overhead is neglected. We use (d;m) ij (t) to denote the amount of information of commodity (d;m) retained by nodej after the transmission from nodei during timeslott. In addition, it shall be useful to denote by (d;m) iq i;u ;n (t) the information retained by nodeq i;u belonging to then-th partition of nodei’s transmitted information at timet. Then, due toq i;u 2 i;n for alln satisfyingnu, we have (d;m) iq i;u (t) = X u n=1 (d;m) iq i;u ;n (t); 8i;u;d;m;t: (5.5) Moreover, according to the broadcast approach, the maximum achievable rate of then-th partition at time t isR iq i;n ;k (t)R iq i;n1 ;k (t), givenk resource units allocated, then we have X (d;m) (d;m) i;q i;u ;n (t) X K tr i k=0 R iq i;n ;k (t)R iq i;n1 ;k (t) y tr i;k (t); 8i;t;un; (5.6) whereR iq i;0 ;k (t) = 0, for alli;k;t. Note that Eq. (5.5) and (5.6) can lead to the rate constraint over link (i;j) for allt: P (d;m) (d;m) ij (t) P K tr i k=0 R ij;k (t)y tr i;k (t). 5.1.6 Queuing Model We denote bya (d;m) i (t) the exogenous arrival rate of commodity (d;m) at nodei at timet, and by (d;m) i its expected value. We assume thata (d;m) i (t) is independently and identically distributed (i.i.d.) across timeslots and its second moment is upper bounded:Ef( P (d;m) a (d;m) i (t)) 2 gA 2 max . Recall that, in an AgI service, only the source commodity (d; 0) enters the network exogenously, while all other commodities are created inside the network as the output of a service function. Hence,a (d;m) i (t) = 0, for alli;t whenm> 0. During the AgI service delivery, internal network queues buffer the data according to their commodities. 
We define the queue backlog of commodity (d,m) at node i, Q^{(d,m)}_i(t), as the amount of commodity (d,m) in the queue of node i at the beginning of timeslot t, which evolves over time as follows:

Q^{(d,m)}_i(t+1) \le \Big[ Q^{(d,m)}_i(t) - \sum_{j: j \ne i} \mu^{(d,m)}_{ij}(t) - \mu^{(d,m)}_{i,pr}(t) \Big]^+ + \sum_{j: j \ne i} \mu^{(d,m)}_{ji}(t) + \mu^{(d,m)}_{pr,i}(t) + a^{(d,m)}_i(t).   (5.7)

Note that, in an AgI service, only the final commodity (d,M) is allowed to exit the network once it arrives at its destination d \in \mathcal{D}, while any other commodity (d,m), m < M, can only be consumed by being processed into the next commodity (d,m+1) on the service chain. The final commodity (d,M) is assumed to leave the network immediately upon arrival/decoding, i.e., Q^{(d,M)}_d(t) = 0, for all d, t.

5.1.7 Network Objective
The goal is to design a control algorithm that dynamically routes and schedules service flows over the wireless computing network with minimum total average resource cost,

\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{ h(\tau) \},   (5.8)

where h(t) is the total cost of the network at time t:

h(t) \triangleq \sum_{i \in \mathcal{N}} \Big( \sum_{k=0}^{K^{pr}_i} w^{pr}_{i,k}\, y^{pr}_{i,k}(t) + \sum_{k=0}^{K^{tr}_i} w^{tr}_{i,k}\, y^{tr}_{i,k}(t) \Big),   (5.9)

while ensuring that the network is rate stable [45], i.e.,

\lim_{t \to \infty} \frac{1}{t} Q^{(d,m)}_i(t) = 0 with prob. 1,   \forall i, d, m.   (5.10)

5.2 Wireless Computing Network Capacity Region
The wireless computing network capacity region is defined as the closure of the set of all input rate matrices {\lambda^{(d,m)}_i} that can be stabilized by the network under some control algorithm, given a set of AgI services.

Theorem 10.
The wireless computing network capacity region consists of all average exogenous input rates {\lambda^{(d,m)}_i} for which there exist multi-commodity flow variables f^{(d,m)}_{ij}, f^{(d,m)}_{pr,i}, f^{(d,m)}_{i,pr}, together with probability values \alpha^{pr}_{i,k}, \alpha^{tr}_{i,k}(s), \beta^{(d,m)}_{i,pr}(k), \beta^{(d,m)}_{i,tr}(s,k), \gamma^{(d,m)}_{ij}(s,k,n), for all i, j \ne i, k, d, m, and all network states s \in \mathcal{S}, such that:

\sum_j f^{(d,m)}_{ji} + f^{(d,m)}_{pr,i} + \lambda^{(d,m)}_i \le \sum_j f^{(d,m)}_{ij} + f^{(d,m)}_{i,pr},   \forall i \ne d, m; or i = d, m < M,   (5.11)

f^{(d,m+1)}_{pr,i} = \xi^{(m+1)} f^{(d,m)}_{i,pr},   \forall i, d, m < M,   (5.12)

f^{(d,m)}_{i,pr} \le \frac{1}{r^{(m+1)}} \sum_{k=0}^{K^{pr}_i} \alpha^{pr}_{i,k}\, \beta^{(d,m)}_{i,pr}(k)\, R_{i,k},   \forall i, d, m < M,   (5.13)

f^{(d,m)}_{ij} \le \sum_{s \in \mathcal{S}} \pi_s \sum_{k=0}^{K^{tr}_i} \alpha^{tr}_{i,k}(s)\, \beta^{(d,m)}_{i,tr}(s,k) \sum_{n=1}^{q^{-1}_{i,s}(j)} \big[ R_{iq_{i,n},k}(s) - R_{iq_{i,n-1},k}(s) \big] \gamma^{(d,m)}_{ij}(s,k,n),   \forall i, j, d, m,   (5.14)

f^{(d,M)}_{i,pr} = 0;  f^{(d,0)}_{pr,i} = 0;  f^{(d,M)}_{dj} = 0;  f^{(d,m)}_{i,pr} \ge 0;  f^{(d,m)}_{ij} \ge 0,   \forall i, j, d, m,   (5.15)

\sum_{k=0}^{K^{pr}_i} \alpha^{pr}_{i,k} \le 1;  \sum_{k=0}^{K^{tr}_i} \alpha^{tr}_{i,k}(s) \le 1,   \forall i, s,   (5.16)

\sum_{(d,m)} \beta^{(d,m)}_{i,pr}(k) \le 1;  \sum_{(d,m)} \beta^{(d,m)}_{i,tr}(s,k) \le 1,   \forall i, s, k,   (5.17)

\sum_j \gamma^{(d,m)}_{ij}(s,k,n) \le 1,   \forall i, s, k, n,   (5.18)

where s denotes the network state, whose (i,j)-th element (s)_{ij} indicates the channel state of link (i,j), \pi_s denotes the steady-state probability of state s under the network state process S(t), and q^{-1}_{i,s}(j) in (5.14) is the index of node j in the sequence {q_{i,1}, ..., q_{i,N-1}}, given the network state s. Finally, with an abuse of notation, in (5.14), R_{ij,k}(s) denotes the maximum achievable rate over link (i,j), given the network state s and the allocation of k units of transmission resources. Furthermore, the minimum average network cost required for network stability is given by

\bar{h}^* = \min \bar{h},   (5.19)

where

\bar{h} = \sum_{i \in \mathcal{N}} \Big( \sum_{k=0}^{K^{pr}_i} \alpha^{pr}_{i,k}\, w^{pr}_{i,k} + \sum_{k=0}^{K^{tr}_i} w^{tr}_{i,k} \sum_{s \in \mathcal{S}} \pi_s\, \alpha^{tr}_{i,k}(s) \Big),   (5.20)

and the minimization is over all \alpha^{pr}_{i,k}, \alpha^{tr}_{i,k}(s), \beta^{(d,m)}_{i,pr}(k), \beta^{(d,m)}_{i,tr}(s,k), and \gamma^{(d,m)}_{ij}(s,k,n) satisfying (5.11)-(5.18).

Proof. See Appendix C.1.

In the above theorem, Eq. (5.11) represents the flow conservation constraints, Eqs. (5.13) and (5.14) are rate constraints, and Eq.
(5.15) shows the non-negativity and flow efficiency constraints. The probability values \alpha^{pr}_{i,k}, \alpha^{tr}_{i,k}(s), \beta^{(d,m)}_{i,pr}(k), \beta^{(d,m)}_{i,tr}(s,k), and \gamma^{(d,m)}_{ij}(s,k,n) define a stationary randomized policy using single-copy routing (only one copy of each elementary unit of information, with arbitrarily fine granularity, flows through the network) that is optimal among all stabilizing algorithms (including algorithms using multi-copy routing), where:
\alpha^{pr}_{i,k}: the probability that k processing resource units are allocated at node i;
\alpha^{tr}_{i,k}(s): the conditional probability that k transmission resource units are allocated at node i, given the network state s;
\beta^{(d,m)}_{i,pr}(k): the conditional probability that node i processes commodity (d,m), given the allocation of k processing resource units;
\beta^{(d,m)}_{i,tr}(s,k): the conditional probability that node i transmits commodity (d,m), given the network state s and the allocation of k transmission resource units;
\gamma^{(d,m)}_{ij}(s,k,n): the conditional probability that node i forwards the information of commodity (d,m) in the n-th partition to node j, given the network state s and the allocation of k transmission resource units.
It is important to note that this optimal stationary randomized policy is hard to obtain in practice, as it requires knowledge of {\lambda^{(d,m)}_i} and the solution of a complex nonlinear program. However, its existence is essential for proving the performance of our proposed algorithm.

5.3 Dynamic Wireless Computing Network Control Algorithm
Defining a non-negative control parameter V that represents the degree to which we emphasize resource cost minimization, we propose a dynamic wireless computing network control strategy that accounts for both transmission- and processing-related flow and resource allocation decisions in a fully distributed manner.
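The cost-emphasis role of V described above can be illustrated with a minimal sketch of the generic drift-plus-penalty decision rule that the algorithm of the next subsection instantiates: a node picks the resource level and commodity maximizing a backlog-based utility minus V times the resource cost. All names, utilities, and cost values below are illustrative placeholders, not values from the text.

```python
# Illustrative drift-plus-penalty decision rule: pick the (resource level k,
# commodity) pair maximizing utility(k, c) - V * cost(k).
# All numbers are made-up placeholders, not values from the text.

def ldp_decision(utilities, costs, V):
    """utilities: dict mapping (k, commodity) -> utility weight.
    costs: dict mapping k -> resource cost w_{i,k}.
    Returns the maximizing (k, commodity) pair."""
    return max(utilities, key=lambda kc: utilities[kc] - V * costs[kc[0]])

# Two resource levels (k = 0 means staying idle at zero cost).
costs = {0: 0.0, 1: 2.0}
# Utility of serving commodity "a" or "b" at each resource level.
utilities = {(0, None): 0.0, (1, "a"): 5.0, (1, "b"): 3.0}

# Small V emphasizes backlog reduction: serve commodity "a".
print(ldp_decision(utilities, costs, V=1.0))   # -> (1, 'a')
# Large V emphasizes cost: the zero-cost idle choice wins.
print(ldp_decision(utilities, costs, V=10.0))  # -> (0, None)
```

With a small V the backlog term dominates and the node serves the most congested commodity; with a large V the idle choice wins, mirroring the [O(1/V), O(V)] cost-delay tradeoff.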
5.3.1 The DWCNC Algorithm
Dynamic Wireless Computing Network Control (DWCNC):

Local processing decisions: In timeslot t, each node i observes its local queue backlogs and performs the following operations:
1. For each commodity (d,m), compute the processing utility weight

W^{(d,m)}_i(t) = \frac{1}{r^{(m+1)}} \Big[ Q^{(d,m)}_i(t) - \xi^{(m+1)} Q^{(d,m+1)}_i(t) \Big]^+,

where we denote [x]^+ = max{x, 0}. Specifically, W^{(d,m)}_i(t) indicates the "potential benefit" of executing function (m+1) to process commodity (d,m) into commodity (d,m+1) at time t, in terms of local congestion reduction per processing operation.
2. Compute the optimal number of resource units k^{\dagger}_{pr} to allocate and the optimal commodity (d,m)^{\dagger}_{pr} to process:

[ k^{\dagger}_{pr}, (d,m)^{\dagger}_{pr} ] = \arg\max_{k,(d,m)} \big\{ R_{i,k}\, W^{(d,m)}_i(t) - V w^{pr}_{i,k} \big\},   (5.21)

where V is a non-negative control parameter that determines the degree to which cost is emphasized.
3. Make the following flow rate assignment decisions:

\mu^{(d,m)^{\dagger}_{pr}}_{i,pr}(t) = R_{i,k^{\dagger}_{pr}} / r^{(m^{\dagger}_{pr}+1)};   \mu^{(d,m)}_{i,pr}(t) = 0,   \forall (d,m) \ne (d,m)^{\dagger}_{pr}.

Local wireless transmission decisions: In timeslot t, each node i observes its local queue backlogs, the queue backlogs of its potential RXs, and the associated statistical CSI, and performs the following operations:
1.
Compute the optimal number of resource units $k^\dagger_{tr}$ to allocate and the optimal commodity $(d,m)^\dagger_{tr}$ to transmit:
$$\left[k^\dagger_{tr},\, (d,m)^\dagger_{tr}\right] = \arg\max_{k,(d,m)} \left\{W^{(d,m)}_{i,k,tr}(t) - V w^{tr}_{i,k}\right\}. \qquad (5.23)$$
If $k^\dagger_{tr} = 0$, node $i$ keeps silent in timeslot $t$.

4. After receiving the CSI feedback, node $i$ identifies the information decoded by each RX and assigns the processing/forwarding responsibility of the $n$-th partition of the transmitted information to the RX in $\Omega_{i,n}(S(t))$ with the largest positive $W^{(d,m)}_{ij}(t)$, while all other RXs in $\Omega_{i,n}(S(t))$ and node $i$ discard the information. If no receiver in $\Omega_{i,n}(S(t))$ has positive $W^{(d,m)}_{ij}(t)$, node $i$ retains the information of partition $n$, while all receivers in $\Omega_{i,n}(S(t))$ discard it.

Remark: For the local processing decisions at node $i$ in each timeslot, maximizing the metric in (5.21) over $[k, (d,m)]$ can be decomposed into first maximizing $W^{(d,m)}_i(t)$ over $(d,m)$ and then maximizing the metric over $k$ given the maximized $W^{(d,m)}_i(t)$. The computational complexity is $O(K^{pr}_i + NM)$. In Step 2) of the local transmission decisions of DWCNC, $W^{(d,m)}_{i,k,tr}(t)$ is computed according to (5.22) using the transition probabilities $\Pr(S(t)=s \mid S(t-1)=\tilde{s})$, known as the statistical CSI; the complexity can be high because the network state space may grow exponentially with the number of links. However, the computation is significantly simplified when the channel realizations of the wireless links are mutually independent, as described in the next subsection. In Step 4) of the local transmission decisions of DWCNC, discarding the decoded information at the nodes that do not get the processing/forwarding responsibility implies that DWCNC is a single-copy routing algorithm.
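A minimal per-timeslot sketch of the local decisions above, under illustrative assumptions: queue backlogs are dictionaries keyed by commodity, the statistical CSI is enumerated explicitly over a small state set, and `R[(s, k)]` holds per-partition (rate increment, eligible receivers) pairs. All names and data layouts are hypothetical, and the cheaper independent-link computation of the next subsection is omitted.

```python
def processing_decision(Q, K_pr, R, w_pr, r, xi, V):
    """DWCNC local processing decision (Steps 1-3). Q[(d, m)]: queue
    backlogs; R[k]: processing capacity with k resource units; w_pr[k]:
    cost of k units; r[m]: complexity of function m (operations per flow
    unit); xi[m]: flow scaling factor of function m."""
    # Step 1: utility weight = congestion reduction per processing operation.
    W = {dm: max(Q[dm] - xi[dm[1] + 1] * Q[(dm[0], dm[1] + 1)], 0.0) / r[dm[1] + 1]
         for dm in Q if (dm[0], dm[1] + 1) in Q}
    # Step 2: argmax over (d, m) first, then over k (see the Remark).
    dm = max(W, key=W.get)
    k = max(range(K_pr + 1), key=lambda kk: R[kk] * W[dm] - V * w_pr[kk])
    if R[k] * W[dm] - V * w_pr[k] <= 0:
        return 0, None, 0.0                        # no positive utility: idle
    # Step 3: the full processing capacity goes to the chosen commodity.
    return k, dm, R[k] / r[dm[1] + 1]

def transmission_decision(Q_i, Q_nbrs, states, P, s_prev, R, K_tr, w_tr, V):
    """DWCNC local transmission decision (Steps 1-3), Eqs. (5.22)-(5.23).
    P[(s_prev, s)]: statistical CSI; R[(s, k)]: per-partition list of
    (rate increment, eligible receiver set) under state s with k units."""
    # Step 1: differential backlog weights.
    W = {(j, c): max(Q_i[c] - Q_nbrs[j][c], 0.0) for j in Q_nbrs for c in Q_i}
    best = (0, None, 0.0)                          # k = 0: stay silent
    for k in range(1, K_tr + 1):
        for c in Q_i:
            # Step 2: expected utility of Eq. (5.22) over next states.
            u = sum(P[(s_prev, s)]
                    * sum(dr * max((W[(j, c)] for j in elig), default=0.0)
                          for dr, elig in R[(s, k)])
                    for s in states)
            if u - V * w_tr[k] > best[2]:          # Step 3: Eq. (5.23)
                best = (k, c, u - V * w_tr[k])
    return best
```

As in the Remark, the processing argmax decomposes into a pass over commodities followed by a pass over resource choices; the transmission sketch instead enumerates the whole state space, which is exactly the cost the independent-link simplification avoids.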
5.3.2 Computing Transmission Utility Weight with Independent Links and Discrete Code Layers

With the $L_i$ discrete code layers used in the broadcast approach, we have $R_{ij,k}(t) \in \mathcal{R}_{i,k}$, given $k$ resource units allocated for transmission at node $i$ (see Sec. 5.1.4). In addition, we define $\bar{\Omega}_{i,l}(S(t))$ as the set of all receivers in which each receiver $j$ has channel gain no smaller than $\bar{g}_{i,l}$ at time $t$, i.e., $g_{ij}(t) \geq \bar{g}_{i,l}$ for all $j \in \bar{\Omega}_{i,l}(S(t))$, and $g_{ij}(t) < \bar{g}_{i,l}$ for all $j \notin \bar{\Omega}_{i,l}(S(t))$.

Given $S(t)=s$ and $k$, we have the following two possible cases for the maximum achievable transmission rate of the $n$-th partition: i) $R_{iq_{i,n},k}(s) - R_{iq_{i,n-1},k}(s) = 0$; ii) $R_{iq_{i,n},k}(s) - R_{iq_{i,n-1},k}(s) = \sum_{l=l_0}^{l_1} \left[\bar{R}^l_{i,k} - \bar{R}^{l-1}_{i,k}\right]$, for some $l_0$ and $l_1$ satisfying $1 \leq l_0 \leq l_1 \leq L_i$, with $\bar{\Omega}_{i,l}(s) = \Omega_{i,n}(s)$ for $l_0 \leq l \leq l_1$. Then we have, for all $i$, $s$, $k$, $(d,m)$, and $t$,
$$\sum_{n=1}^{N-1} \left[R_{iq_{i,n},k}(s) - R_{iq_{i,n-1},k}(s)\right] \max_{j\in\Omega_{i,n}(s)} \left\{W^{(d,m)}_{ij}(t)\right\} = \sum_{l=1}^{L_i} \left[\bar{R}^l_{i,k} - \bar{R}^{l-1}_{i,k}\right] \max_{j\in\bar{\Omega}_{i,l}(s)} \left\{W^{(d,m)}_{ij}(t)\right\},$$
based on which we can rewrite Eq. (5.22) as follows:
$$W^{(d,m)}_{i,k,tr}(t) = \sum_{l=1}^{L_i} \left[\bar{R}^l_{i,k} - \bar{R}^{l-1}_{i,k}\right] \sum_{s\in\mathcal{S}} \Pr\left(S(t)=s \mid S(t-1)\right) \max_{j\in\bar{\Omega}_{i,l}(s)} \left\{W^{(d,m)}_{ij}(t)\right\} = \sum_{l=1}^{L_i} \left[\bar{R}^l_{i,k} - \bar{R}^{l-1}_{i,k}\right] \mathbb{E}\left\{\max_{j\in\bar{\Omega}_{i,l}(S(t))} \left\{W^{(d,m)}_{ij}(t)\right\} \,\Big|\, \mathcal{H}(t)\right\}, \qquad (5.24)$$
where $\mathcal{H}(t) \triangleq \{Q(t), S(t-1)\}$ is the ensemble of the queue backlog observations at time $t$ and the CSI feedback for time $t-1$.

Denote by $1^{(d,m)}_{ij,l}(t)$ the variable that takes value 1 if receiver $j$ has the largest differential backlog $W^{(d,m)}_{ij}(t)$ among the receivers in $\bar{\Omega}_{i,l}(S(t))$, and 0 otherwise. Then we have
$$\mathbb{E}\left\{\max_{j\in\bar{\Omega}_{i,l}(S(t))} \left\{W^{(d,m)}_{ij}(t)\right\} \,\Big|\, \mathcal{H}(t)\right\} = \mathbb{E}\left\{\sum_j W^{(d,m)}_{ij}(t)\, 1^{(d,m)}_{ij,l}(S(t)) \,\Big|\, \mathcal{H}(t)\right\} = \sum_j W^{(d,m)}_{ij}(t)\, \varphi^{(d,m)}_{ij,l}(\mathcal{H}(t)), \qquad (5.25)$$
where $\varphi^{(d,m)}_{ij,l}(\mathcal{H}(t))$ is the conditional probability that $1^{(d,m)}_{ij,l}(t)$ takes value 1 given $\mathcal{H}(t)$. Plugging (5.25) into (5.24) to compute $W^{(d,m)}_{i,k,tr}(t)$, we generate the following two steps to replace Step 2) of the local transmission decisions of DWCNC described in Sec.
5.3.1:

2a) Sort the differential backlog weights $W^{(d,m)}_{ij}(t)$ for each commodity $(d,m)$ in non-increasing order. Denote by $\Phi^{(d,m)}_{ij}(t)$ the set of receivers of node $i$ with smaller indices than the index of receiver $j$ in the sorted list at time $t$. In this case, each receiver in $\Phi^{(d,m)}_{ij}(t)$ has a differential backlog weight no smaller than that of receiver $j$.

2b) For each transmission resource allocation choice $k \in \{0,\dots,K^{tr}_i\}$, compute the transmission utility weight for each commodity $(d,m)$:
$$W^{(d,m)}_{i,k,tr}(t) = \sum_{l=1}^{L_i} \left[\bar{R}^l_{i,k} - \bar{R}^{l-1}_{i,k}\right] \sum_j W^{(d,m)}_{ij}(t)\, \varphi^{(d,m)}_{ij,l}(\mathcal{H}(t)). \qquad (5.26)$$

Computing $\varphi^{(d,m)}_{ij,l}(\mathcal{H}(t))$ involves the statistical CSI, and the computation is significantly simplified if the channel realizations of the links are mutually independent. In this case, the value of $\varphi^{(d,m)}_{ij,l}(\mathcal{H}(t))$ can be obtained from a simple multiplication as follows:
$$\varphi^{(d,m)}_{ij,l}(\mathcal{H}(t)) = \Pr\left(g_{ij}(t) \geq \bar{g}_{i,l} \mid s_{ij}(t-1)\right) \prod_{v\in\Phi^{(d,m)}_{ij}(t)} \Pr\left(g_{iv}(t) < \bar{g}_{i,l} \mid s_{iv}(t-1)\right) = \sum_{l'=l}^{L_i} \Pr\left(s_{ij}(t)=\bar{s}_{i,l'} \mid s_{ij}(t-1)\right) \prod_{v\in\Phi^{(d,m)}_{ij}(t)} \sum_{l'=0}^{l-1} \Pr\left(s_{iv}(t)=\bar{s}_{i,l'} \mid s_{iv}(t-1)\right), \qquad (5.27)$$
where $\Pr(s_{ij}(t)=\bar{s}_{i,l'} \mid s_{ij}(t-1)=\bar{s}_{i,l''})$ is the statistical CSI of link $(i,j)$. The computational complexity associated with the transmission decision made at node $i$ in each timeslot is $O(MN(\log_2 N + L_i K^{tr}_i))$, which is dominated by sorting the receivers according to their differential backlogs for all commodities and computing the transmission utility weights for all commodities and resource allocation choices.

5.4 Performance Analysis

This section analyzes the throughput optimality and average cost performance of DWCNC, as provided by the following theorem.

Theorem 11.
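Under the independent-links assumption, Eq. (5.27) reduces to per-link sums and products of the statistical CSI. A small sketch of that computation follows; the dictionary layout for the per-link transition probabilities and the function name are illustrative assumptions.

```python
def phi(j, higher, l, P_link, s_prev, L):
    """Eq. (5.27): probability that receiver j decodes layer l (its gain
    reaches threshold l) while every receiver ranked above it in the
    sorted differential-backlog list does not, for mutually independent
    links. P_link[v][(s_old, s_new)]: statistical CSI of link (i, v);
    s_prev[v]: CSI feedback for link (i, v); states 0..L index the L+1
    discrete channel levels."""
    # Pr(g_ij >= threshold l | feedback): the link state lands in level l..L.
    p = sum(P_link[j][(s_prev[j], lp)] for lp in range(l, L + 1))
    # Each better-ranked receiver must fail to reach threshold l.
    for v in higher:
        p *= sum(P_link[v][(s_prev[v], lp)] for lp in range(l))
    return p
```

Each factor touches one link only, which is what brings the per-timeslot complexity down to the $O(MN(\log_2 N + L_i K^{tr}_i))$ figure quoted above.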
For any exogenous input rate matrix $\lambda = \{\lambda^{(d,m)}_i\}$ strictly interior to the capacity region $\Lambda$, DWCNC stabilizes the wireless computing network, while achieving an average total resource cost arbitrarily close to the minimum average cost $h^*(\lambda)$ with probability 1; i.e.,
$$\limsup_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} h(\tau) \leq h^*(\lambda) + \frac{NB}{V}, \quad \text{with prob. 1}, \qquad (5.28)$$
$$\limsup_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} \sum_{i,d,m} Q^{(d,m)}_i(\tau) \leq \frac{NB + V\left[h^*(\lambda + \epsilon\mathbf{1}) - h^*(\lambda)\right]}{\epsilon}, \quad \text{with prob. 1},$$
where $B$ is a constant that depends on the system parameters $R_{i,K^{tr}_i}(s)$, $R_{i,K^{pr}_i}$, $A_{\max}$, $\xi^{(m)}$, and $r^{(m)}$; $\epsilon$ is a positive constant satisfying $(\lambda + \epsilon\mathbf{1}) \in \Lambda$; and $h^*(\lambda)$ denotes the average cost obtained by the optimal solution given input rates $\lambda$.

Proof. See Appendix C.2.

The finite bound on the expected total queue length in Theorem 11 implies that the computing network is strongly stable. The parameter $V$ can be increased to drive the average cost arbitrarily close to the minimum cost $h^*(\lambda)$ required for network stability, at the price of a linear increase in average network congestion, which in turn indicates an increase in delay. Thus, Theorem 11 demonstrates an $[O(1/V), O(V)]$ cost-delay tradeoff.

Figure 5.5: A wireless computing network providing two augmented information services. Nodes 1, 6, and 7 are APs and the remaining nodes are UEs; links between APs undergo Rician fading, all other links Rayleigh fading. Service 1: $\xi^{(1,1)} = 1$, $\xi^{(1,2)} = 4$; Service 2: $\xi^{(2,1)} = 0.25$, $\xi^{(2,2)} = 1$.

Table 5.1: Coordinates of the Computing Nodes

Node Index:  1        2        3        4        5        6
(X, Y):      (0,10)   (10,0)   (-5,20)  (22,0)   (27,5)   (24,10)

Node Index:  7        8        9        10       11
(X, Y):      (13,22)  (5,30)   (27,23)  (35,21)  (30,33)

5.5 Numerical Experiments

In this section, we present numerical results obtained from simulating the DWCNC algorithm over $10^6$ timeslots. The numerical values presented below for the resource allocation costs, communication flow rates, processing flow rates, and queue backlogs are all measured in normalized units.

5.5.1 Network Structure

We simulate the wireless computing network shown in Fig. 5.5.
All 11 nodes represent computing locations: nodes 1, 6, and 7 represent access points (APs); the other nodes represent user-end (UE) devices. We list the $(X,Y)$ coordinates of all the nodes' positions in Table 5.1 in normalized distance units. For processing, each AP has five resource allocation choices: $w^{pr}_{i,k} = k$ for $i = 1, 6, 7$ and $k = 0, 1, \dots, 4$, while each UE has two resource allocation choices: $w^{pr}_{i,0} = 0$ and $w^{pr}_{i,1} = 2$, for $i = 2, \dots, 5, 8, \dots, 11$. Note that the processing cost per capacity unit is lower at the APs than at the UEs, i.e., the APs are cheaper in processing than the UEs. For transmission, each node has two resource allocation choices: $w^{tr}_{i,0} = 0$ and $w^{tr}_{i,1} = 1$.

The edges in Fig. 5.5 represent the active wireless links, whose channel realizations are mutually independent. In addition, the channel realizations of each link are independently and identically distributed (i.i.d.) across timeslots.² We assume there is line of sight (LOS) between APs, and the links between APs have Rician fading (see Ref. [1]) with a Rice factor of 5 dB. We model the remaining links, where no LOS exists, as having Rayleigh fading (see Ref. [1]). The path loss exponent of each link is 3.

² Note that an i.i.d. process is a special Markov process, which is the class assumed in Section 5.1.4 for the evolution of the network state. The i.i.d. evolution is approximately fulfilled when the timeslot length is equal to the coherence time of the wireless medium.

5.5.2 Service Setups

The computing network provides two services (see Fig. 5.5), each of which consists of two functions. For notational convenience with multiple services, we denote by $(\phi, m)$, $\phi = 1, 2$ and $m = 1, 2$, the $m$-th function of service $\phi$, and by $(d, \phi, m)$, $d \in \{1,\dots,11\}$, $\phi = 1, 2$ and $m = 0, 1$, the commodity processed through function $(\phi, m+1)$ for destination $d$. All four functions have the same complexity factor, equal to 1 (number of operations per unit flow).
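The simulated fading statistics can be reproduced with a short generator. The unit-mean Rician/Rayleigh normalization and the distance-based path loss used below are standard textbook forms and an assumption about the dissertation's exact link budget; the function and parameter names are illustrative.

```python
import math
import random

def channel_gain(dist, fading, rng, K_dB=5.0, ple=3.0):
    """One i.i.d. per-timeslot power gain: dist^-3 path loss times a
    unit-mean small-scale fading term, Rician with a 5 dB Rice factor
    (AP-AP links) or Rayleigh (all other links)."""
    if fading == 'rician':
        K = 10 ** (K_dB / 10)                   # Rice factor, linear scale
        los = math.sqrt(K / (K + 1))            # deterministic LOS amplitude
        sigma = math.sqrt(1.0 / (2 * (K + 1)))  # per-dimension scatter std
        re, im = los + rng.gauss(0, sigma), rng.gauss(0, sigma)
        small = re * re + im * im               # E[small] = 1
    else:
        small = rng.expovariate(1.0)            # Rayleigh: |CN(0,1)|^2 ~ Exp(1)
    return dist ** (-ple) * small
```

Drawing a fresh gain per link per timeslot realizes the i.i.d. evolution assumed in the footnote above (timeslot length equal to the coherence time).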
In terms of the flow scaling of each function, as shown in Fig. 5.5, functions $(1,1)$ and $(1,2)$ have scaling factors of 1 and 4, respectively, i.e., $\xi^{(1,1)} = 1$ and $\xi^{(1,2)} = 4$; functions $(2,1)$ and $(2,2)$ have scaling factors of 0.25 and 1, respectively, i.e., $\xi^{(2,1)} = 0.25$ and $\xi^{(2,2)} = 1$. Note that function $(1,2)$ is an expansion function, while function $(2,1)$ is a compression function.

5.5.3 Communication Setups

We simulate the case in which each node $i$ uses the broadcast approach with three additive layered codes for its transmissions, and the three corresponding channel gain thresholds $[\bar{g}_{i,1}, \bar{g}_{i,2}, \bar{g}_{i,3}]$ for the outgoing links are $[-46.82, -40.80, -39.03]$ dB for Rician fading channels and $[-43.02, -38.37, -36.41]$ dB for Rayleigh fading channels. By allocating the power among the coding layers in the ratios $1:4:6$ and $1:2.9:4.6$ of the total transmission power with cost $w^{tr}_{i,1}$ for Rician fading and Rayleigh fading channels, respectively,³ according to (5.4), we obtain the maximum achievable transmission rates $[\bar{R}^0_{i,1}, \bar{R}^1_{i,1}, \bar{R}^2_{i,1}, \bar{R}^3_{i,1}]$ corresponding to the four discrete channel states as follows: $[0, 12.1, 20.6, 48.3]$ for Rician fading channels and $[0, 7.8, 13.1, 27.8]$ for Rayleigh fading channels.

To demonstrate the efficiency of adopting the broadcast approach, we also simulate the case of adopting the traditional outage approach coding scheme for the transmissions under DWCNC scheduling. With the outage approach, the transmission rate is fixed, and the information is reliably decoded when the instantaneous channel gain exceeds a threshold; otherwise, no information is decoded. The transmission efficiency enhancement of the broadcast approach compared with the outage approach has been proven for one-hop broadcast channels in [59]. This set of simulations demonstrates the corresponding performance enhancement over wireless multi-hop networks with broadcast effect.
For each node $i$ using the outage approach, we set $\bar{g}^{out}_i = -40.80$ dB and $-38.37$ dB as the outage thresholds of the channel gain for the outgoing links with Rician fading and Rayleigh fading, respectively, such that if $g_{ij}(t) \geq \bar{g}^{out}_i$ and $k = 1$, the maximum achievable rate $R_{ij,1}(t)$ is equal to the outage rate, denoted by $\bar{R}^{out}_{i,1}$ (flow units per timeslot); otherwise, $R_{ij,1}(t)$ is zero. Note that the values of $\bar{g}^{out}_i$ and $\bar{R}^{out}_{i,1}$ are equal to the values of $\bar{g}_{i,2}$ and $\bar{R}^2_{i,1}$, respectively, in the broadcast approach.

³ The optimization of the power allocation among different code layers at each transmitting node is beyond the scope of this paper; we treat the power values allocated among the code layers as given parameters.

5.5.4 Performance of DWCNC with the Broadcast Approach and the Outage Approach

We present simulation results in Fig. 5.6 for the performance of DWCNC providing the two services on the wireless computing network given in Fig. 5.5. In the simulation scenario, each service has 56 clients (source-destination pairs): any pair of two UE nodes is a source-destination pair requesting the delivery of both services through the network. Each source node receives the exogenous arrivals of the source commodity of each service for each destination at a rate that is i.i.d. Poisson distributed across timeslots with mean equal to 0.7.

Figure 5.6: Performance of DWCNC with the broadcast approach and the outage approach in the large scale scenario: a) average cost vs. average occupancy; b) average occupancy evolving with varying exogenous input rate: throughput optimality.

Fig.
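The broadcast-vs-outage comparison of Sec. 5.5.3 amounts to mapping each instantaneous channel gain to the cumulative layer rate it can decode, with the outage approach as the one-threshold special case. A sketch using the Rician-link numbers above; the function and variable names are illustrative.

```python
def decoded_rate(g, thresholds, rates):
    """Cumulative layered rate decodable at gain g: rates[l], where l is
    the number of thresholds that g reaches (rates[0] = 0). The outage
    approach is the special case of a single threshold."""
    l = sum(1 for th in thresholds if g >= th)  # number of decoded layers
    return rates[l]

# Sec. 5.5.3, Rician links: thresholds (converted from dB) and rates.
g_th = [10 ** (x / 10) for x in (-46.82, -40.80, -39.03)]
rates_bc = [0, 12.1, 20.6, 48.3]   # broadcast approach, 3 layers
rates_out = [0, 20.6]              # outage approach: threshold g_th[1]
```

A gain between the first and second thresholds still yields 12.1 rate units under the broadcast approach but nothing under the outage approach, which is the transmission-ability gap driving the results in Fig. 5.6.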
5.6(a) shows the tradeoff between the evolution of the average cost and that of the time average occupancy (total queue backlog) as the control parameter $V$ varies between 0 and $10^4$, when running DWCNC with the broadcast approach and the outage approach, given that the average input rate of the source commodity for each client and each service is equal to 1. It can be seen from Fig. 5.6(a) that, with either coding scheme, the average cost under the DWCNC algorithm decreases as the average occupancy increases. In general, both evolutions follow the $[O(1/V), O(V)]$ cost-delay tradeoff shown by Theorem 11, although the corresponding tradeoff ratios differ. Comparing the two curves in Fig. 5.6(a), it can be seen that using the broadcast approach achieves both lower occupancy, which implies enhanced delay performance, and lower average cost. On the one hand, to keep the average cost below a given threshold (assuming it is feasible to achieve this average cost threshold with both coding schemes), e.g., 19.80, the corresponding occupancy needed when using the broadcast approach ($6.84 \times 10^4$) is lower than the occupancy needed when using the outage approach ($1.97 \times 10^5$); the $1.29 \times 10^5$ difference in occupancy implies a significant delay reduction. On the other hand, given an occupancy threshold, e.g., $1.98 \times 10^5$, above which no occupancy is accepted, the average cost achieved by using the broadcast approach (18.29) is lower than that achieved by using the outage approach (19.80). These results are caused by the enhanced transmission ability of the broadcast approach compared to the outage approach, which in turn results in lower delay and lower transmission cost. The curves showing the throughput performance under the DWCNC algorithm with both the broadcast approach and the outage approach are plotted in Fig.
5.6(b), where we simulate the time average occupancy evolving over increasing average exogenous input rates of the source commodities, while setting the control parameter $V$ equal to 500. We keep the average exogenous input rates of the source commodities the same among all the clients of both services 1 and 2; the average occupancies with the outage approach and the broadcast approach exhibit sharp increases when the common average exogenous input rate increases up to approximately 0.83 and 1.11, respectively. According to Theorem 11, and considering $\epsilon \to 0$, these sharp increases imply that the average input rates have reached the boundary of the computing network capacity region, which is the throughput limit. It can be seen from Fig. 5.6(b) that the throughput limit when using the broadcast approach in DWCNC is larger than that when using the outage approach. This throughput difference is also an effect of the enhanced transmission ability of the broadcast approach.

5.5.5 Processing Distribution Across the Network under DWCNC

We further simulate the average processing input rate distribution of the 4 functions and 56 clients across the computing network nodes under DWCNC with the outage approach and the broadcast approach, shown in Fig. 5.7 and Fig. 5.8, respectively, where the average input rate for each client and each service is equal to 1, and we set the control parameter $V = 10^4$. Observe from Fig. 5.7(a) and Fig. 5.8(a) that, on the one hand, the implementation of function $(1,1)$ mostly concentrates at the APs (nodes 1, 6, 7), which is motivated by the fact that the APs are cheaper in processing than the UEs. On the other hand, part of the processing of function $(1,1)$ still takes place at the UEs, even though, on average, the APs have available processing capacity to spare.
This results from the fact that, for certain clients $s \to d$, there exist short paths connecting nodes $s$ and $d$ that do not pass through any AP, such that commodity $(d,1,0)$ steadily chooses routes along these paths and gets processed at the UEs along them, instead of routing along longer paths that pass through the APs and getting processed there; the overall cost of the former choice is lower than that of the latter. Comparing Fig. 5.7(a) and Fig. 5.8(a), it can be seen that the processing of function $(1,1)$ concentrates more at the APs in Fig. 5.8(a); this is also due to the enhanced transmission ability when using the broadcast approach, which lowers the cost of taking longer paths that pass through the APs and having the commodities $(d,1,0)$ processed there.

Fig. 5.7(b) and Fig. 5.8(b) show the average processing input rate distribution of function $(1,2)$, which is an expansion function. In these two figures, the processing of each commodity $(d,1,1)$ concentrates at its final destination $d$ under both the broadcast approach and the outage approach. This processing distribution is the result of DWCNC minimizing the transmission cost impact of the expanded-size commodities created by the execution of function $(1,2)$. For service 2, observe that the average processing input rate distributions of function $(2,1)$ differ between Fig. 5.7(c) and Fig. 5.8(c). With the outage approach used for the transmissions, Fig. 5.7(c) shows that function $(2,1)$, a compression function, is implemented at all the UEs except at node $d$ and the APs.
Figure 5.7: Average processing input rate distribution of DWCNC with the outage approach. a) Service 1, Function 1; b) Service 1, Function 2; c) Service 2, Function 1; d) Service 2, Function 2.

Figure 5.8: Average processing input rate distribution of DWCNC with the broadcast approach. a) Service 1, Function 1; b) Service 1, Function 2; c) Service 2, Function 1; d) Service 2, Function 2.

This is because, for each client $s \to d$, the implementation of function $(2,1)$ takes place at the node $s$, in order to reduce the transmission cost of service 2 by compressing the source commodity $(d,2,0)$ before it flows into the network. In contrast, as shown in Fig. 5.8(c), the implementation of function $(2,1)$ with the broadcast approach mostly concentrates at the APs. This is because transmission is cheaper when adopting the broadcast approach, and the overall cost can be even lower when the commodities $(d,2,0)$ of most clients route along paths passing through the APs and get compressed there, even though the routing paths can be longer and the flowing commodity remains uncompressed before reaching the APs. The processing distributions of function $(2,2)$ shown in Fig. 5.7(d) and Fig.
5.8(d) display similar distribution trends to those of function $(1,1)$ shown in Fig. 5.7(a) and Fig. 5.8(a), except for different rate values. Comparing Fig. 5.7(d) and Fig. 5.8(d), the processing again concentrates more at the APs when adopting the broadcast approach in DWCNC, due to the cheaper transmissions.

5.6 Conclusion

We considered the problem of optimally distributing augmented information services over wireless computing networks. We characterized the capacity region of a wireless computing network and designed a dynamic wireless computing network control (DWCNC) algorithm that drives local transmission-plus-processing flow scheduling and resource allocation decisions, shown to achieve an average network cost arbitrarily close to the minimum, subject to a network delay increase that follows the general tradeoff order $[O(1/V), O(V)]$. Our solution captures the unique chaining and flow scaling aspects of AgI services, while exploiting the broadcast approach coding scheme over the wireless channel.

Chapter 6

Approximation Algorithms for the NFV Service Distribution Problem

As described in Section 1.4, distributed cloud networking builds on network functions virtualization (NFV) and software defined networking (SDN) to enable the deployment of network services in the form of elastic virtual network functions (VNFs) instantiated over general purpose servers at multiple cloud locations and interconnected by a programmable network fabric that allows dynamically steering client flows. Chapters 4 and 5 proposed dynamic solutions to the service distribution problem in wired and wireless distributed cloud networks. In contrast, this chapter aims at providing static solutions that assume global knowledge of service demands and network conditions is available at each local cloud node.
This type of design is useful in scenarios of physical computing and network resource placement and virtual function distribution, which typically operate on relatively large timescales, e.g., days or hours. After the static solution is obtained, the computing and network flows follow the fixed routing and scheduling decisions instructed by the solution. In contrast to the dynamic NSDP, which faces a service delivery delay issue due to its dynamic nature, delay is no longer a critical issue in this scenario. Instead, how fast the algorithm produces an efficient static solution with sufficient accuracy becomes the main challenge.

The contributions in this chapter can be summarized as follows:

We formulate the NSDP as a minimum cost multi-commodity-chain network design (MCCND) problem on a cloud-augmented graph, where the goal is to find the placement of service functions, the routing of client flows through the appropriate service functions, and the corresponding allocation of cloud and network resources that minimize the cloud network operational cost.

We first address the case of load-proportional resource costs. We show that the fractional NSDP becomes a min-cost multi-commodity-chain flow (MCCF) problem that admits optimal polynomial-time solutions. We design a queue-length based algorithm, named QNSD, that is shown to provide an $O(\epsilon)$ approximation to the fractional NSDP in time $O(1/\epsilon)$ with a suitably chosen control factor. We further conjecture that QNSD exhibits an improved $O(1/\sqrt{\epsilon})$ convergence, a result that, while not formally proved in this paper, is illustrated via simulations.

We then address the case in which resource costs are a function of the integer number of allocated resources. We design a new algorithm, C-QNSD, which constrains the evolution of QNSD to effectively drive service flows to consolidate on a limited number of active resources, yielding good practical solutions to the integer NSDP.

The rest of this chapter is organized as follows.
We review related work in Section 6.1. Section 6.2 introduces the system model. Section 6.3 describes the network flow based formulation of the NSDP. Sections 6.4 and 6.5 present the proposed approximation algorithms for the fractional and integer NSDP, respectively. We present simulation results in Section 6.6, discuss possible extensions in Section 6.7, and conclude the paper in Section 6.8.

6.1 Related Work

To the best of our knowledge, the algorithms presented in this paper are the first approximation algorithms for the NSDP. While the NSDP can be seen as a special case of the CSDP introduced in [61], the authors only provided a network flow based formulation, without addressing the design of efficient approximation algorithms.

As shown in Section 6.4, the fractional NSDP is a generalization of the minimum cost multi-commodity flow (MCF) problem. A large body of work has addressed the design of fast fully polynomial time approximation schemes (FPTAS) for MCF. The work in [62] summarizes the best known FPTAS for MCF and fractional packing problems. Based on [62], the fastest schemes use shortest-path computations in each iteration to provide $O(\epsilon)$ approximations in time $O(1/\epsilon^2)$. The scheme in [63] runs in time $O(1/\epsilon)$, but requires solving a convex quadratic program in each iteration, yielding slower running times for moderately small $\epsilon$. Our proposed QNSD algorithm for the fractional NSDP is shown to provide an $O(\epsilon)$ approximation in time $O(1/\epsilon)$, while simply solving a set of linear time max-weight problems in each iteration. We further conjecture that the running time of our algorithm is in fact $O(1/\sqrt{\epsilon})$. QNSD is hence also an improved FPTAS for MCF.

In [64], the authors extended the shortest-path based algorithms in [62] to design policy-aware routing algorithms that steer network flows through pre-defined sequences of network functions in order to maximize the total served flow.
The problem addressed in [64] can be thought of as a maximum flow version of the fractional NSDP, but their algorithms still run in time $O(1/\epsilon^2)$. With respect to the integer NSDP, we show in Section 6.5 that it is a generalization of the well known NP-hard multi-commodity network design (MCND) problem [65]. Our proposed C-QNSD algorithm extends QNSD with knapsack relaxation techniques similar to those used for MCND [65].

6.2 System Model

6.2.1 Cloud network model

We consider a cloud network modeled as a directed graph $G=(V,E)$ with $n=|V|$ vertices and $m=|E|$ edges representing the set of nodes and links, respectively. A cloud network node represents a distributed cloud location, in which virtual network functions (VNFs) can be instantiated in the form of, e.g., virtual machines (VMs) over general purpose servers [66]. When service flows go through VNFs at a cloud node, they consume cloud resources (e.g., CPU, memory). We denote by $w_u$ the cost per cloud resource unit (e.g., server) at node $u$ and by $c_u$ the maximum number of cloud resource units that can be allocated at node $u$. A cloud network link represents a network connection between two cloud locations. When service flows go through a cloud network link, they consume network resources (e.g., bandwidth). We denote by $w_{uv}$ the cost per network resource unit (e.g., a 1 Gbps link) on link $(u,v)$ and by $c_{uv}$ the maximum number of network resource units that can be allocated on link $(u,v)$.

Figure 6.1: Cloud-augmented graph for node $u$, where $p(u)$ represents the processing unit that hosts flow processing functions, $s(u)$ the source unit from which flows enter the cloud network, and $q(u)$ the demand unit via which flows exit the cloud network.

6.2.2 Service model

A service $\phi \in \Phi$ is described by a chain of $M_\phi$ VNFs. We use the pair $(\phi,i)$, with $\phi \in \Phi$, $i \in \{1,\dots,M_\phi\}$, to denote the $i$-th function of service $\phi$, and $L$ to denote the total number of available VNFs.
Each VNF is characterized by its cloud resource requirement, which may also depend on the specific cloud location. We denote by $r^{(\phi,i)}_u$ the cloud resource requirement (in cloud resource units per flow unit) of function $(\phi,i)$ at cloud node $u$. That is, when one flow unit goes through function $(\phi,i)$ at cloud node $u$, it consumes $r^{(\phi,i)}_u$ cloud resource units. In addition, when one flow unit goes through the fundamental transport function of link $(u,v)$, it consumes $r^{tr}_{uv}$ network resource units.

A client requesting service $\phi \in \Phi$ is represented by a destination node $d \in D(\phi) \subseteq V$, where $D(\phi)$ denotes the set of clients requesting service $\phi$. The demand of client $d$ for service $\phi$ is described by a set of source nodes $S(d,\phi) \subseteq V$ and demands $\lambda^{(d,\phi)}_s$, $\forall s \in S(d,\phi)$, indicating that a set of source flows, each of size $\lambda^{(d,\phi)}_s$ flow units and entering the network at $s \in S(d,\phi)$, must go through the sequence of VNFs of service $\phi$ before exiting the network at destination node $d \in D(\phi)$. We set $\lambda^{(d,\phi)}_u = 0$, $\forall u \notin S(d,\phi)$, to indicate that only nodes in $S(d,\phi)$ have source flows for the request of client $d$ for service $\phi$. We note that the adopted destination-based client model allows the total number of clients to scale linearly with the size of the network, as opposed to the quadratic scaling of the source-destination client model.

6.3 The NFV Service Distribution Problem

The goal of the NFV service distribution problem (NSDP) is to find the placement of service functions, the routing of service flows, and the associated allocation of cloud and network resources that meet client demands with minimum overall resource cost. In the following, we show how the NSDP can be solved by computing a chained network flow on a properly constructed graph. We adopt a multi-commodity-chain flow (MCCF) model, in which a commodity is uniquely identified by the triplet $(d,\phi,i)$, indicating that commodity $(d,\phi,i)$ is the output of the $i$-th function of service $\phi$ for client $d$ (see Fig. 5.1).
We formulate the NSDP as an MCCF problem on the cloud-augmented graph that results from augmenting each node in $G$ with the gadget in Fig. 6.1, where $p(u)$, $s(u)$, and $q(u)$ denote the processing unit, source unit, and demand unit at cloud node $u$, respectively. The resulting graph is denoted by $G_a = (V_a, E_a)$, with $V_a = V \cup V'$, $E_a = E \cup E'$, where $V'$ and $E'$ denote the sets of processing, source, and demand unit nodes and edges, respectively. We denote by $\delta^-(u)$ and $\delta^+(u)$ the sets of incoming and outgoing neighbors of node $u$ in $G_a$.

In the cloud-augmented graph $G_a$, each edge $(u,v) \in E_a$ has an associated capacity, unit resource cost, and per-function resource requirement, as described in the following. For the set of edges $\{(u,p(u)), (p(u),u) : u \in V\}$ representing the set of compute resources, we have:
$$c_{u,p(u)} = c_u, \quad c_{p(u),u} = c^{\max}_u,$$
$$w_{u,p(u)} = w_u, \quad w_{p(u),u} = 0,$$
$$r^{(\phi,i)}_{u,p(u)} = r^{(\phi,i)}_u, \;\; \forall (\phi,i),$$
where $c^{\max}_u = \sum_{(d,\phi,i)} \sum_{s\in S(d,\phi)} \lambda^{(d,\phi)}_s r^{(\phi,i)}_u$. Note that we model the processing of network flows and the associated allocation of compute resources using link $(u,p(u))$, and let edge $(p(u),u)$ be a free-cost edge of sufficiently high capacity that carries the processed flow back to node $u$.

For the set of edges $\{(s(u),u), (u,q(u)) : u \in V\}$ representing the resources via which client flows enter and exit the cloud network, we have:
$$c_{s(u),u} = c^{\max}_u, \quad c_{u,q(u)} = c^{\max}_u,$$
$$w_{s(u),u} = 0, \quad w_{u,q(u)} = 0,$$
$$r^{(\phi,i)}_{s(u),u} = r^{(\phi,i)}_{u,q(u)} = 0, \;\; \forall (\phi,i).$$
Note that we model the ingress and egress of network flows via free-cost edges of sufficiently high capacity. In addition, the irrelevant per-function resource requirement of these edges is set to zero.

Given that the set of network resources only performs the fundamental transport function of moving bits between cloud network nodes, the per-function resource requirement of the edges in the original graph, $(u,v) \in E$, is set to $r^{tr}_{uv}$ for all $(\phi,i)$, i.e., $r^{(\phi,i)}_{uv} = r^{tr}_{uv}, \forall (\phi,i)$. The capacity and unit resource cost of network edge $(u,v) \in E$ are given by $c_{uv}$ and $w_{uv}$, respectively.
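The gadget construction above can be expressed directly as an edge-attribute map. The capacities and costs follow the definitions in this section, while the function name and tuple layout are illustrative assumptions.

```python
def augment(nodes, edges, c_u, w_u, r_tr, c_max):
    """Build the edge set of the cloud-augmented graph G_a (Fig. 6.1):
    for each node u, add processing unit p(u), source unit s(u), demand
    unit q(u) and the gadget edges. Edge record: (capacity, unit_cost,
    transport_requirement), with None on compute edges whose requirement
    is the per-function r_u^{(phi,i)} instead."""
    # Original network edges keep their capacity/cost; their per-function
    # requirement is the transport requirement r_tr for every (phi, i).
    Ea = {(u, v): (cap, cost, r_tr) for (u, v, cap, cost) in edges}
    for u in nodes:
        Ea[(u, ('p', u))] = (c_u[u], w_u[u], None)   # compute edge: pays w_u
        Ea[(('p', u), u)] = (c_max[u], 0.0, None)    # processed flow returns free
        Ea[(('s', u), u)] = (c_max[u], 0.0, 0.0)     # free ingress
        Ea[(u, ('q', u))] = (c_max[u], 0.0, 0.0)     # free egress
    return Ea
```

Running the min-cost MCCF computation on this augmented edge set turns function placement into ordinary flow routing: sending flow across $(u, p(u))$ is what "processing at $u$" means in the formulation.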
The capacity and unit resource cost of a network edge $(u,v)\in\mathcal{E}$ are given by $c_{uv}$ and $w_{uv}$, respectively.

We now define the following MCCF flow and resource allocation variables on the cloud-augmented graph $\mathcal{G}^a$:

Flow variables: $f_{uv}^{(d,\phi,i)}$ indicates the fraction of commodity $(d,\phi,i)$ on edge $(u,v)\in\mathcal{E}^a$, i.e., the fraction of the flow output of function $(\phi,i)$ for destination $d$ carried/processed by edge $(u,v)$.

Resource variables: $y_{uv}$ indicates the total amount of resource units (e.g., cloud or network resource units) allocated on edge $(u,v)\in\mathcal{E}^a$.

The NSDP can then be formulated via the following compact linear program:

min $\sum_{(u,v)\in\mathcal{E}^a} w_{uv}\,y_{uv}$ (6.1a)
s.t. $\sum_{v\in\delta^-(u)} f_{vu}^{(d,\phi,i)}=\sum_{v\in\delta^+(u)} f_{uv}^{(d,\phi,i)}$, $\forall u,d,\phi,i$ (6.1b)
$f_{p(u),u}^{(d,\phi,i)}=f_{u,p(u)}^{(d,\phi,i-1)}$, $\forall u,d,\phi,i\neq 0$ (6.1c)
$\sum_{(d,\phi,i)} f_{uv}^{(d,\phi,i)}\,r_{uv}^{(\phi,i+1)}\le y_{uv}\le c_{uv}$, $\forall(u,v)$ (6.1d)
$f_{s(u),u}^{(d,\phi,0)}=\lambda_u^{(d,\phi)}$, $\forall u,d,\phi$ (6.1e)
$f_{u,q(u)}^{(d,\phi,M_\phi)}=0$, $\forall d,\phi,u\neq d$ (6.1f)
$f_{uv}^{(d,\phi,i)}\ge 0$, $y_{uv}\in\mathbb{Z}^+$, $\forall(u,v),d,\phi,i$ (6.1g)

where, when not specified for compactness, $u\in\mathcal{V}$, $d\in\mathcal{V}$, $\phi\in\Phi$, $i\in\{1,\dots,M_\phi\}$, and $(u,v)\in\mathcal{E}^a$. The objective is to minimize the overall cloud network resource cost, described by (6.1a). Recall that the set $\mathcal{E}^a$ contains all edges in the augmented graph $\mathcal{G}^a$, representing both cloud and network resources. Eq. (6.1b) describes standard flow conservation constraints applied to all nodes in $\mathcal{V}$. An especially critical set of constraints are the service chaining constraints described by (6.1c). These constraints establish that, in order to have flow of a given commodity $(d,\phi,i)$ coming out of a processing unit, the input commodity $(d,\phi,i-1)$ must be entering the processing unit. Constraints (6.1d) make sure that the total flow at a given cloud network resource is covered by enough resource units without violating capacity.¹ Eqs. (6.1e) and (6.1f) establish the source and demand constraints. Note from (6.1f) that no flows of final commodity $(d,\phi,M_\phi)$ are allowed to exit the network other than at the destination node $d$.
Finally, (6.1g) describes the fractional and integer nature of the flow and resource allocation variables, respectively. In this work, we always assume fractional flow variables in order to capture the ability to split client flows to improve overall cloud network resource utilization. With respect to the resource allocation variables, however, we are interested in the following two main versions of the NSDP:

Integer NSDP: The use of integer resource variables allows accurately capturing the allocation of an integer number of general-purpose resource units (e.g., servers). In this case, the NSDP becomes a generalization of multi-commodity network design (MCND), where the network is augmented with compute edges that model the processing of service flows and where there are additional service chaining constraints that make sure flows follow service functions in the appropriate order. In fact, for the special case in which each service is composed of a single commodity, the NSDP is equivalent to the MCND. We refer to the resulting MCCND problem as the integer NSDP.

Fractional NSDP: The use of fractional resource variables becomes an especially good compromise between accuracy and tractability when the size of the cloud network resource units is much smaller than the total flow served at a given location. This is indeed the case for services that serve a large number of large-size flows (e.g., telco services [66]) and/or services deployed using small-size resource units (e.g., micro-services [67]). In this case, the NSDP becomes a generalization of min-cost MCF, with the exact equivalence holding in the case of single-commodity services. We refer to the resulting MCCF problem as the fractional NSDP.

¹ Recall from the MCCF model that commodity $(d,\phi,i)$ gets processed by function $(\phi,i+1)$, and that $r_{uv}^{(\phi,i)}=r_{uv}^{\mathrm{tr}}$, $\forall(u,v)\in\mathcal{E},\phi,i$.
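To make the fractional relaxation concrete, the following is a minimal sketch that solves a tiny instance of formulation (6.1) with the resource variables relaxed to be continuous, using SciPy's LP solver. The instance is entirely hypothetical (two network nodes, one client, one single-function service, unit resource requirements, hand-picked costs); ingress and egress are folded into node supplies rather than explicit $s(u)$/$q(u)$ edges for brevity:

```python
# Minimal fractional-NSDP sketch on a hypothetical 2-node instance.
# Network nodes a, b; one client at a; one service with a single function,
# processable at a (cost 5/unit) or b (cost 1/unit); transport cost 1/unit.
# Commodity 0 = unprocessed flow, commodity 1 = processed flow.
import numpy as np
from scipy.optimize import linprog

lam = 2.0  # source flow entering at node a (flow units)
edges = [("a", "b"), ("b", "a"), ("a", "pa"), ("pa", "a"), ("b", "pb"), ("pb", "b")]
w = [1, 1, 5, 0, 1, 0]          # unit resource cost per edge
cap = [10, 10, 10, 10, 10, 10]  # edge capacities
nE = len(edges)
nvar = 2 * nE + nE              # f[e, i] for i in {0, 1}, then y[e]
fidx = lambda e, i: i * nE + e
yidx = lambda e: 2 * nE + e

A_eq, b_eq = [], []
def eq(coeffs, rhs):
    row = np.zeros(nvar)
    for j, coef in coeffs:
        row[j] = coef
    A_eq.append(row)
    b_eq.append(rhs)

# Flow conservation (6.1b) for commodity 0 (supply lam at a):
eq([(fidx(1, 0), 1), (fidx(0, 0), -1), (fidx(2, 0), -1)], -lam)  # node a
eq([(fidx(0, 0), 1), (fidx(1, 0), -1), (fidx(4, 0), -1)], 0)     # node b
# Flow conservation for commodity 1 (demand lam at destination a):
eq([(fidx(1, 1), 1), (fidx(3, 1), 1), (fidx(0, 1), -1)], lam)    # node a
eq([(fidx(0, 1), 1), (fidx(5, 1), 1), (fidx(1, 1), -1)], 0)      # node b
# Service chaining (6.1c): commodity 1 out of p(u) equals commodity 0 into p(u).
eq([(fidx(3, 1), 1), (fidx(2, 0), -1)], 0)
eq([(fidx(5, 1), 1), (fidx(4, 0), -1)], 0)

# Resource coupling (6.1d): sum_i f[e, i] * r <= y[e]  (r = 1 everywhere here).
A_ub, b_ub = [], []
for e in range(nE):
    row = np.zeros(nvar)
    row[fidx(e, 0)] = row[fidx(e, 1)] = 1.0
    row[yidx(e)] = -1.0
    A_ub.append(row)
    b_ub.append(0.0)

bounds = [(0, None)] * nvar
for e, i in [(3, 0), (5, 0), (2, 1), (4, 1)]:
    bounds[fidx(e, i)] = (0, 0)  # processors emit only i > 0; no second function
for e in range(nE):
    bounds[yidx(e)] = (0, cap[e])  # relaxed y_uv >= 0, capped by capacity

c = np.zeros(nvar)
for e in range(nE):
    c[yidx(e)] = w[e]  # objective (6.1a): sum_e w_e * y_e

res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=np.array(A_eq), b_eq=b_eq, bounds=bounds)
# Optimal plan: route a->b (cost 2), process at b (cost 2), return b->a (cost 2).
assert res.status == 0 and abs(res.fun - 6.0) < 1e-6
```

The LP correctly prefers paying the transport detour (total cost 6) over processing at the expensive local node (cost 10), which is exactly the placement-vs-routing tradeoff the NSDP captures.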
6.4 Fractional NSDP

As discussed in the previous section, the fractional NSDP can be formulated as the MCCF problem that results from the linear relaxation of (6.1), i.e., by replacing $y_{uv}\in\mathbb{Z}^+$ with $y_{uv}\ge 0$ in (6.1g). While the resulting linear programming formulation admits optimal polynomial-time solutions, it requires solving a linear program with a large number of constraints. In this work, we are interested in designing a fast fully polynomial-time approximation scheme (FPTAS) for the fractional NSDP. A natural direction would be to build on state-of-the-art FPTAS for MCF that rely on shortest-path computations [62, 64]. However, these techniques have only been shown to provide $O(\epsilon)$ approximations in time $O(1/\epsilon^2)$. Here, we target the design of faster approximations with order improvements in running time, i.e., $O(1/\epsilon)$ and $O(1/\sqrt{\epsilon})$, by redesigning queue-length based algorithms that have been shown to be very effective for stochastic network optimization, in order to construct fast iterative algorithms for static MCCF problems such as the fractional NSDP.

6.4.1 QNSD algorithm

In the following, we describe the proposed queue-length based network service distribution (QNSD) algorithm. QNSD is an iterative algorithm that mimics the time evolution of an underlying cloud network queueing system. QNSD exhibits the following key features:

Inspired by a recent result that characterizes the transient and steady-state phases in queue-length based algorithms [68], QNSD computes the solution to the fractional NSDP by averaging over a limited iteration horizon, yielding an $O(\epsilon)$ approximation in time $O(1/\epsilon)$. In the algorithm description, we use $j\in\mathbb{Z}^+$ to index the iteration frame over which averages are computed.

QNSD further exploits the speed-up shown in gradient methods for convex optimization when combining gradient directions over consecutive iterations [69], [70], leading to a conjectured $O(1/\sqrt{\epsilon})$ convergence.
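The frame-restart bookkeeping behind the first feature (running averages that reset whenever the iteration count hits a power of two, as specified in the algorithm below) can be sketched as follows; the iterate sequence used here is a hypothetical scalar stand-in for the flow variables:

```python
# Sketch of QNSD's truncated averaging: the running average of the iterate
# sequence x(t) is restarted whenever t hits a power of two (t = 2^j), so the
# reported solution averages only over the current iteration frame, forgetting
# the transient phase.

def truncated_average(xs):
    """Return the frame-averaged sequence for iterates xs[0] = x(1), xs[1] = x(2), ..."""
    out, j, t_start, acc = [], 0, 1, 0.0
    for t in range(1, len(xs) + 1):
        if t == 2 ** j:          # new frame: reset the average
            t_start, acc, j = t, 0.0, j + 1
        acc += xs[t - 1]
        out.append(acc / (t - t_start + 1))
    return out

xs = [4.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0]
avgs = truncated_average(xs)
# Frames restart at t = 1, 2, 4, 8: the late averages forget the early transient.
assert avgs[-1] == 1.0
```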
Before describing the algorithm, we introduce the following queue variables:

Actual queues: $Q_u^{(d,\phi,i)}(t)$ denotes the queue backlog of commodity $(d,\phi,i)$ at node $u\in\mathcal{V}$ in iteration $t$. These queues represent the actual packet build-up that would take place in an equivalent dynamic cloud network system in which iterations correspond to time instants.

Virtual queues: $U_u^{(d,\phi,i)}(t)$ denotes the virtual queue backlog of commodity $(d,\phi,i)$ at node $u\in\mathcal{V}$ in iteration $t$. These virtual queues are used to capture the momentum generated when combining differential queue backlogs (acting as gradients in our algorithm) over consecutive iterations [70].

The QNSD algorithm works as follows:

Initialization:
$f_{uv}^{(d,\phi,i)}(0)=y_{uv}(0)=0$, $\forall(u,v)$
$Q_u^{(d,\phi,i)}(0)=0$, $\forall u,d,\phi,i$
$Q_d^{(d,\phi,M_\phi)}(t)=0$, $\forall d,\phi,t$
$U_u^{(d,\phi,i)}(0)=U_u^{(d,\phi,i)}(1)=0$, $\forall u,d,\phi,i$
$U_d^{(d,\phi,M_\phi)}(t)=0$, $\forall d,\phi,t$
$f_{s(u),u}^{(d,\phi,i)}(t)=\lambda_u^{(d,\phi)}$ if $u\in\mathcal{S}(d,\phi)$, $i=0$, $\forall t$; $0$ otherwise
$j=0$

Note that the queues associated with the final commodities at their respective destinations are set to zero for all $t$ to model the egress of flows from the cloud network.

Main procedure. In each iteration $t>0$:

Queue updates: For all $u,(d,\phi,i)$ with $(u,i)\neq(d,M_\phi)$:
$Q_u^{(d,\phi,i)}(t)=\Big[Q_u^{(d,\phi,i)}(t-1)-\sum_{v\in\delta^+(u)}f_{uv}^{(d,\phi,i)}(t-1)+\sum_{v\in\delta^-(u)}f_{vu}^{(d,\phi,i)}(t-1)\Big]^+$ (6.2)
$\Delta Q_u^{(d,\phi,i)}(t)=Q_u^{(d,\phi,i)}(t)-Q_u^{(d,\phi,i)}(t-1)$ (6.3)
$U_u^{(d,\phi,i)}(t)=U_u^{(d,\phi,i)}(t-1)+\Delta Q_u^{(d,\phi,i)}(t)+\theta\big[U_u^{(d,\phi,i)}(t-1)-U_u^{(d,\phi,i)}(t-2)\big]$ (6.4)
where $\theta\in[0,1)$ is a control parameter that drives the differential queue backlog momentum. Note that actual queues are updated according to standard queueing dynamics, while virtual queues are updated as a combination of the actual differential queue backlog and the virtual differential queue backlog in the previous iteration.

Transport decisions: For each link $(u,v)\in\mathcal{E}$:
– Compute the transport utility weight of each commodity $(d,\phi,i)$:
$W_{uv}^{(d,\phi,i)}(t)=\frac{1}{r_{uv}^{\mathrm{tr}}}\big[U_u^{(d,\phi,i)}(t)-U_v^{(d,\phi,i)}(t)\big]$
where $V$ is a control parameter that governs the optimality vs.
running-time tradeoff.
– Compute the max-weight commodity $(d,\phi,i)^*$:
$(d,\phi,i)^*=\arg\max_{(d,\phi,i)}\big\{W_{uv}^{(d,\phi,i)}(t)\big\}$
– Allocate network resources and assign flow rates:
$y_{uv}(t)=c_{uv}$ if $W_{uv}^{(d,\phi,i)^*}(t)-Vw_{uv}>0$, and $y_{uv}(t)=0$ otherwise;
$f_{uv}^{(d,\phi,i)^*}=y_{uv}/r_{uv}^{\mathrm{tr}}$;
$f_{uv}^{(d,\phi,i)}(t)=0$, $\forall(d,\phi,i)\neq(d,\phi,i)^*$.

Processing decisions: For each node $u\in\mathcal{V}$:
– Compute the processing utility weight of each commodity $(d,\phi,i)$:
$W_u^{(d,\phi,i)}(t)=\frac{1}{r_u^{(\phi,i+1)}}\big[U_u^{(d,\phi,i)}(t)-U_u^{(d,\phi,i+1)}(t)\big]$
This key step in QNSD computes the benefit of processing commodity $(d,\phi,i)$ via function $(\phi,i+1)$ at node $u$ in iteration $t$, taking into account the difference between the (virtual) queue backlog of commodity $(d,\phi,i)$ and that of the next commodity in the service chain, $(d,\phi,i+1)$. Note also how a high cloud resource requirement $r_u^{(\phi,i+1)}$ reduces the benefit of processing commodity $(d,\phi,i)$.
– Compute the max-weight commodity $(d,\phi,i)^*$:
$(d,\phi,i)^*=\arg\max_{(d,\phi,i)}\big\{W_u^{(d,\phi,i)}(t)\big\}$
– Allocate cloud resources and assign flow rates:
$y_{u,p(u)}(t)=c_u$ if $W_u^{(d,\phi,i)^*}(t)-Vw_u>0$, and $y_{u,p(u)}(t)=0$ otherwise;
$y_{p(u),u}=y_{u,p(u)}$;
$f_{u,p(u)}^{(d,\phi,i)^*}(t)=y_{u,p(u)}/r_u^{(\phi,i^*+1)}$;
$f_{p(u),u}^{(d,\phi,i^*+1)}(t)=f_{u,p(u)}^{(d,\phi,i)^*}(t)$;
$f_{u,p(u)}^{(d,\phi,i)}(t)=f_{p(u),u}^{(d,\phi,i)}(t)=0$, $\forall(d,\phi,i)\neq(d,\phi,i)^*$.

Solution construction:
– If $t=2^j$, then $t_{\mathrm{start}}=t$ and $j=j+1$.
– Flow solution:
$\bar f_{uv}^{(d,\phi,i)}=\frac{\sum_{\tau=t_{\mathrm{start}}}^{t} f_{uv}^{(d,\phi,i)}(\tau)}{t-t_{\mathrm{start}}+1}$, $\forall(u,v),d,\phi,i$
– Resource allocation solution:
$\bar y_{uv}=\frac{\sum_{\tau=t_{\mathrm{start}}}^{t} y_{uv}(\tau)}{t-t_{\mathrm{start}}+1}$, $\forall(u,v)$

Remark 3. Note that QNSD solves $m+n$ max-weight problems in each iteration, leading to a running time per iteration of $O((m+n)nL)=O(mnL)$.²

6.4.2 Performance of QNSD

Let
$\mathbf{Q}(t+1)=[\mathbf{Q}(t)+\mathbf{A}\mathbf{f}(t)]^+$ (6.5)
denote the matrix form of the QNSD queueing dynamics given by (6.2), where $\mathbf{f}(t)$, $\mathbf{Q}(t)$, and $\mathbf{A}$ denote the flow vector in iteration $t$, the queue-length vector in iteration $t$, and the matrix of flow coefficients, respectively.
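The per-link decision logic of the algorithm above can be sketched compactly. The numbers, commodity names, and single-link setup below are hypothetical, and this is a simplified sketch rather than the dissertation's implementation:

```python
# Sketch of one QNSD step for a single link (u, v): virtual-queue update with
# momentum (6.4), then the max-weight transport decision with activation
# threshold V * w_uv.

def virtual_queue_update(U_prev, U_prev2, dQ, theta):
    """U(t) = U(t-1) + dQ(t) + theta * [U(t-1) - U(t-2)]  -- eq. (6.4)."""
    return U_prev + dQ + theta * (U_prev - U_prev2)

def transport_decision(U_u, U_v, r_tr, w_uv, cap_uv, V):
    """Pick the max-weight commodity on link (u, v); return (commodity, y, f)."""
    weights = {k: (U_u[k] - U_v[k]) / r_tr for k in U_u}
    best = max(weights, key=weights.get)
    if weights[best] - V * w_uv > 0:   # worth activating the link at full rate
        y = cap_uv
        return best, y, y / r_tr
    return best, 0.0, 0.0

# Hypothetical virtual backlogs for two commodities at nodes u and v:
U_u = {("d1", 1, 0): 40.0, ("d2", 1, 0): 10.0}
U_v = {("d1", 1, 0): 5.0, ("d2", 1, 0): 9.0}
k, y, f = transport_decision(U_u, U_v, r_tr=1.0, w_uv=1.0, cap_uv=10.0, V=20.0)
assert k == ("d1", 1, 0) and y == 10.0 and f == 10.0  # weight 35 > V * w = 20
```

Note the role of $V$: with a larger $V$, fewer links clear the activation threshold, trading slower convergence for lower resource cost, which is the optimality vs. running-time tradeoff discussed above.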
Let
$Z(\mathbf{Q})\triangleq\inf_{[\mathbf{y};\mathbf{f}]\in\mathcal{X}}\big\{V\mathbf{w}^\top\mathbf{y}+(\mathbf{A}\mathbf{f})^\top\mathbf{Q}\big\}$ (6.6)
denote the dual function of the fractional NSDP weighted by the control parameter $V$, where $\mathbf{w}$ is the resource cost vector and $\mathcal{X}$ is the set of feasible solutions to the fractional NSDP. The performance of the general form of the QNSD algorithm has not yet been proven. However, when the control parameter $\theta$ is set to zero, i.e., when no "history" factor is involved in the decision making in each iteration, the convergence time performance is given by Theorem 12, whose proof follows the proof strategy in [68]. In addition, the proof of Theorem 12 requires the following assumption:

Assumption 2. There exists a unique backlog state $\mathbf{Q}^*$ for the network system, such that $Z(\mathbf{Q}^*)>Z(\mathbf{Q})$ for all $\mathbf{Q}\neq\mathbf{Q}^*$.

Theorem 12. Let the input service demand $\boldsymbol{\lambda}=\{\lambda_u^{(d,\phi)}\}$ be such that the fractional NSDP is feasible with Assumption 2 satisfied, and the Slater condition is satisfied. Then, letting $V=1/\epsilon$, the QNSD algorithm with $\theta=0$ provides an $O(\epsilon)$ approximation to the fractional NSDP in time $O(1/\epsilon)$. Specifically, for all $t\ge T(\epsilon)$, the QNSD solution $\{\bar f_{uv}^{(d,\phi,i)},\bar y_{uv}\}$ satisfies:
$\sum_{(u,v)\in\mathcal{E}^a} w_{uv}\,\bar y_{uv}\le h^{\mathrm{opt}}+O(\epsilon)$ (6.7)
$\sum_{v\in\delta^-(u)}\bar f_{vu}^{(d,\phi,i)}-\sum_{v\in\delta^+(u)}\bar f_{uv}^{(d,\phi,i)}\le O(\epsilon)$, $\forall u,d,\phi,i$ (6.8)
(6.1c)–(6.1g) (6.9)
where $h^{\mathrm{opt}}$ denotes the optimal objective function value and $T(\epsilon)$ is an $O(1/\epsilon)$ function, whose expression is derived in the theorem's proof given in Section D.1.

Proof. See the Appendix, Section D.1.

Remark 4. While the claim of Theorem 12 does not specify the dependence of the approximation on the size of the cloud network $(m,n)$, in Section D.1 we show that, in time $O(m/\epsilon)$, the total cost approximation is $O(m\epsilon)$ and the flow conservation violation is $O(\epsilon)$. The total running time of QNSD is then $O(\frac{1}{\epsilon}m^2 nL)$.

Conjecture 1. With a properly chosen $\theta\in[0,1)$, QNSD provides an $O(\epsilon)$ approximation to the fractional NSDP in time $O(1/\sqrt{\epsilon})$.

² Recall that the number of clients scales as $O(n)$ and $L$ is the total number of functions. Hence, the number of commodities scales as $O(nL)$.
Our conjecture is based on the following facts: i) as shown in the proof of Theorem 12, $\theta=0$ is sufficient for QNSD to achieve $O(1/\epsilon)$ convergence; ii) recent results have shown $O(1/\sqrt{\epsilon})$ convergence of queue-length based algorithms for stochastic optimization when including a first-order memory or momentum of the differential queue backlog [70]; iii) simulation results in Section 6.6 show a significant improvement in the running time of QNSD with nonzero $\theta$.

6.5 Integer NSDP

It is immediate to show that the integer NSDP is NP-hard by reduction from MCND. Recall that the integer NSDP is equivalent to MCND for the special case of single-commodity services. Hence, no $O(\epsilon)$ approximation can in general be computed in sub-exponential time. Having recognized the difficulty of approximating the integer NSDP, we now establish key observations on the behavior of QNSD that allow us to add a simple, yet effective, condition on the evolution of QNSD that enables constructing a solution to the integer NSDP of good practical performance.

We first observe that the QNSD algorithm evolves by allocating an integer number of resources in each iteration. In fact, QNSD solves a max-weight problem in each iteration and allocates either zero or the maximum number of resource units to a single commodity at each cloud network location. However, the
C-QNSD works just as QNSD, but with the max-weight problems solved in each iteration replaced by the fractional knapsak problems that result from adding the conditional flow conservation constraints: X v2 + (u) f (d;;i) uv (t) X v2 (u) f (d;;i) vu (t1) 8d;;i (6.10) Specifically, in each iteration of the main procedure, after the queue updates described by (6.2)-(6.4), the transport and processing decisions are jointly determined as follows: Transport and processing decisions: For eachu2V: max X v2 + (u) X (d;;i) W (d;;i) uv f (d;;i) uv (t)Vy uv (t)w uv s.t. (6.10); (6.1d); (6.1g) where W (d;;i) uv (t) = 1 r tr uv U (d;;i) u (t)U (d;;i) v (t) ; 8v2 + (u)nfp(u)g W (d;;i) u;p(u) (t) = 1 r (;i + 1) u U (d;;i) u (t)U (d;;i + 1) u (t) Observe that without the conditional flow conservation constraints (6.10), the problem above can be decoupled into the set of max-weight problems whose solutions drive the resource allocation and flow rate assignment of QNSD. When including (6.10), the solution to the above maximization problem is forced to fill-up the cloud network resource units with multiple commodities, smoothing the evolution towards a feasible integer solution. In C-QNSD, the above maximization problem is solved via a linear-time heuristic that decouples the problem into a set of fractional knapsacks, one for each neighbor nodev2 + (u) and resource allocation choicey uv 2f0; 1;:::;c uv g. As shown in the following section, C-QNSD effectively consolidates service flows into a limited number of active resources. Providing some form of performance guarantee is of interest to the authors, but out of the scope of this paper. 6.6 Simulation Results In this section, we evaluate the performance of QNSD and C-QNSD in the context of Abilene US continental network, composed of 11 nodes and 28 directed links, as illustrated in Fig. 4.4. We assume each node and link is equipped with 10 cloud resource units and 10 network resource units, respectively. 
The cost per cloud resource unit is set to 1 at nodes 5 and 6, and to 3 at all other nodes. The cost per network resource unit is set to 1 for all 28 links.

Figure 6.2: Performance of QNSD. a) Evolution of total cost over algorithm iterations; b) Evolution of flow conservation violation over algorithm iterations; c) Processing resource allocation distribution across cloud network nodes after running QNSD with $V=300$ and $\theta=0.9$ for 15000 iterations.

6.6.1 QNSD

We first test the performance of QNSD. We consider a scenario with 2 services, each composed of 2 functions. Function (1,1) (service 1, function 1) has a resource requirement of 1 resource unit per flow unit, while functions (1,2), (2,1), and (2,2) require 3, 2, and 2 resource units per flow unit, respectively. We assume resource requirements do not change across cloud locations, and that all links require 1 network resource unit per flow unit. There are 6 clients, represented by destination nodes {1, 2, 4, 7, 10, 11} in Fig. 4.4. Each destination node on the west coast {1, 2, 4} has each of the east coast nodes {11, 10, 7} as source nodes, and vice versa, resulting in a total of 18 source-destination pairs. We assume that each east coast client requests service 1 and each west coast client requests service 2, and that all input flows have size 1 flow unit. Fig. 6.2a shows the evolution of the cost function over the algorithm iterations.
Recall that QNSD exhibits 3 main features: queue-length driven decisions, truncated average computation, and first-order memory. In Fig. 6.2, we refer to QNSD without truncation and without memory as DCNC (see Chapter 4), as it resembles the evolution of the dynamic cloud network control algorithm. We use QNSD with $\theta=0$ to refer to QNSD with truncation, but without memory. Finally, QNSD with $\theta>0$ refers to the QNSD algorithm with both truncation and memory. It is interesting to observe the improved convergence obtained when progressively including QNSD's key features. Observe how DCNC evolves the slowest and is not able to reach the optimal objective function value within the 16000 iterations shown. Decreasing the control parameter $V$ from 40 to 20 speeds up the convergence of DCNC, but yields a further-from-optimal approximation, not appreciable in the plot due to the slow convergence of DCNC. In fact, QNSD without truncation and memory can only guarantee an $O(1/\epsilon^2)$ convergence, which is also the convergence speed of the FPTAS for MCF in [62]. On the other hand, when including truncation, QNSD is able to reach the optimal cost value of 149 in around 6000 iterations, clearly illustrating the faster $O(1/\epsilon)$ convergence. The peaks exhibited by the curves of QNSD with truncation illustrate the reset of the average computations at the beginning of each iteration frame. Note again that reducing $V$ further speeds up convergence at the expense of slightly increasing the optimality gap. Finally, when including the memory feature with a value of $\theta=0.9$, QNSD is able to converge to the optimal solution even faster, illustrating the conjectured $O(1/\sqrt{\epsilon})$ speed-up from the momentum generated when combining gradient directions (see Section 6.4). Decreasing $V$ again illustrates the speed-up vs.
optimality tradeoff.

Figure 6.3: Cost evolution and flow distribution of the C-QNSD solution with input rate 1 flow unit per client and control parameters $V=1000$, $\theta=0.9$.

Figure 6.4: Cost evolution and flow distribution of the C-QNSD solution with input rate 0.5 flow units per client and control parameters $V=100$, $\theta=0.9$.

Fig. 6.2b shows the convergence of the violation of the flow conservation constraints. We observe a behavior similar to the cost convergence, with significant speed-ups when progressively adding truncation and memory to QNSD. Finally, Fig. 6.2c shows the processing resource allocation distribution across cloud network nodes. As expected, most of the flow processing concentrates on the cheapest nodes, 5 and 6. Note how function (1,1), which has the lowest processing requirement (1 resource unit per flow unit), gets exclusively implemented at nodes 5 and 6, as it gets higher priority in QNSD's scheduling decisions. Functions (2,1) and (2,2), which require 2 resource units per flow unit, share the remaining processing capacity at nodes 5 and 6. Finally, function (1,2), with a resource requirement of 3 resource units per flow unit, and following function (1,1) in the service chain, gets distributed closer to the east coast nodes, the destinations for service 1.

6.6.2 C-QNSD

In order to test the performance of C-QNSD, we set up a scenario with 2 s-d pairs, (1,11) and (2,7), both requesting one service composed of one function with a resource requirement of 1 cloud resource unit per flow unit. We simulate the performance of C-QNSD for input rates of 1 flow unit and 0.5 flow units per client.
Observe from Fig. 6.3 that, for input rate 1, C-QNSD is able to converge to a solution of total cost 10, in which each client flow follows the shortest path and the flow processing happens at nodes 5 and 6, respectively. The 3D bar plot shows the flow distribution over the links (non-diagonal entries, in blue) and nodes (diagonal entries, in red) in the network. Observe now the solution for input rate 0.5 in Fig. 6.4. The flow of s-d pair (1,11) is now routed along the longer path {1, 2, 4, 5, 7, 10, 11} in order to consolidate as much flow as possible on the resources activated for s-d pair (2,7). The flow processing of both services is now consolidated at node 5, yielding an overall cost of 7, which is in fact the optimal solution to the integer NSDP in this setting. Note that if the two client flows were to follow the shortest path and get processed separately at nodes 5 and 6, without exploiting the available resource consolidation opportunities, the total cost under integer resources would be 10. While preliminary, our results show promise for the combined use of momentum information and conditional flow conservation constraints in providing efficient solutions to the integer NSDP in practical network settings.

6.7 Extensions

While not included in this paper for ease of exposition, our model and algorithms can be extended to include:

Function availability: Limiting the availability of certain functions to a subset of cloud nodes can be modeled by setting the flow variables associated with function $(\phi,i)$ to zero, $f_{p(u),u}^{(d,\phi,i)}=0$, for all $d$ and for all cloud nodes $u$ at which function $(\phi,i)$ is not available.

Function flow scaling: Capturing the fact that flows can change size as they go through certain functions can be modeled by letting $\gamma^{(\phi,i)}$ denote the number of output flow units per input flow unit of function $(\phi,i)$, and modifying the service chaining constraints (6.1c) as $f_{p(u),u}^{(d,\phi,i)}=\gamma^{(\phi,i)}f_{u,p(u)}^{(d,\phi,i-1)}$.
Per-function resource costs: If the cost function depends on the number of virtual resource units (e.g., VMs) instead of the number of physical resource units (e.g., servers), then we can use $y_{uv}^{(\phi,i)}$ and $w_{uv}^{(\phi,i)}$ to denote the number of allocated resource units of function $(\phi,i)$ and the cost per resource unit of function $(\phi,i)$, respectively.

Nonlinear cost function: In order to capture nonlinear cost effects, such as economies of scale in which the cost per resource unit decreases with the number of allocated resource units, the cost function can be modified as $\sum_{uv}\sum_k y_{uv,k}\,w_{uv,k}$, with $y_{uv,k}\in\{0,1\}$ indicating the allocation of $k$ resource units, and $w_{uv,k}$ denoting the cost associated with the allocation of $k$ resource units. The capacity constraints in (6.1d) become $\sum_{(d,\phi,i)} f_{uv}^{(d,\phi,i)}\,r_{uv}^{(\phi,i+1)}\le\sum_k k\,y_{uv,k}\le c_{uv}$.

6.8 Conclusions

We have formulated the NFV service distribution problem (NSDP) as a multi-commodity-chain network design problem on a cloud-augmented graph. We have shown that under load-proportional costs, the resulting fractional NSDP becomes a multi-commodity-chain network flow problem that admits optimal polynomial-time solutions, and have designed QNSD, a queue-length based iterative algorithm that provides an $O(\epsilon)$ approximation in time $O(1/\epsilon)$. We further conjectured that, by exploiting the momentum obtained when combining differential queue backlogs across consecutive iterations, QNSD converges in time $O(1/\sqrt{\epsilon})$, and illustrated this via simulations. We then addressed the case in which resource costs are a function of the integer number of allocated resources. We showed that the integer NSDP is NP-hard by reduction from MCND and designed C-QNSD, a heuristic algorithm that constrains the evolution of QNSD to effectively consolidate flows into a limited number of active resources.
Appendix A

Proofs in Chapter 2

A.1 Proof of Lemma 1

After taking the expectation over $\mathbf{Q}(t_0)$ on both sides of (2.4) and summing over $t_0=0,1,2,\dots,t-1$, it follows that
$\frac{1}{dt}\sum_{\tau=t}^{t+d-1}\sum_{n,c}\mathbb{E}\big\{\big[Q_n^{(c)}(\tau)\big]^2\big\}-\frac{1}{dt}\sum_{\tau=0}^{d-1}\sum_{n,c}\mathbb{E}\big\{\big[Q_n^{(c)}(\tau)\big]^2\big\}\le B_0(d)-\varepsilon\,\frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{n,c}\mathbb{E}\big\{Q_n^{(c)}(\tau)\big\}.$ (A.1)
Letting $t\to\infty$ yields
$0\le B_0(d)-\varepsilon\limsup_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{n,c}\mathbb{E}\big\{Q_n^{(c)}(\tau)\big\},$ (A.2)
and strong stability is then achieved, as shown in (2.5).

A.2 Proof of Lemma 2

Let $f_{R_{nk}}(x)$ represent the pdf (probability density function) of $R_{nk}(\tau)$. With Assumption 1, we have
$F_{R_{nk}}^{(m)}(H_0)=\int_0^{H_0}F_{R_{nk}}^{(m-1)}(H_0-x)\,f_{R_{nk}}(x)\,dx<F_{R_{nk}}^{(m-1)}(H_0)\,F_{R_{nk}}(H_0),\quad\text{for }m\ge 2,$ (A.3)
and recursively applying the above inequality yields (2.7). By including the case $F_{R_{nk}}(H_0)<F_{R_{nk}}^{(0)}(H_0)=1$, we further obtain (2.8).

A.3 Proof of the necessity part of Theorem 2

Consider a network satisfying Assumption 1 with input rate matrix $(\lambda_n^{(c)})$. Suppose there is a policy with RMIA that stably supports $(\lambda_n^{(c)})$.

Define a unit as a copy of a packet. Two units are said to be distinct if they are copies of different original packets. When a packet is successfully transmitted from one node to another, we say that the original unit is retained in the transmitting node while a copy of the unit is created in the receiving node. After the forwarding decision is made, only one of the non-distinct units is kept, either at the transmitting node or at one of the successful receiving nodes. Let $A_n^{(c)}(t)$ represent the total number of distinct units of commodity $c$ that exogenously arrive at node $n$ during the first $t$ timeslots. Define $Y_n^{(c)}(t)$ as the total number of distinct units with source node $n$ and commodity $c$ that are delivered to the destination up to time $t$.
Because of the assumption that the policy is rate stable, for any node $n$ and commodity $c$, the time-average delivery rate is equal to the time-average input rate:
$\lim_{t\to\infty}\frac{Y_n^{(c)}(t)}{t}=\lim_{t\to\infty}\frac{A_n^{(c)}(t)}{t}=\lambda_n^{(c)}\quad\text{with prob. }1.$ (A.4)
Let $\mathcal{U}_j^{(c)}(t)$ be the set of distinct units that reach their destination $c$ from the source node $j$ during the first $t$ timeslots. Define $G_{nk}^{(c)}(t)$ to be the total number of units of commodity $c$ within the set $\bigcup_{j\in\mathcal{N}}\mathcal{U}_j^{(c)}(t)$ that are forwarded from node $n$ to node $k$ within the first $t$ timeslots. Then, for node $n$ and commodity $c$, it follows that
$Y_n^{(c)}(t)+\sum_{k\in\mathcal{K}_n}G_{kn}^{(c)}(t)=\sum_{k\in\mathcal{K}_n}G_{nk}^{(c)}(t),\quad\text{for }n\neq c.$ (A.5)
Now define the following variables for all nodes $n\in\mathcal{N}$, $k\in\mathcal{K}_n$, and all commodities $c\in\mathcal{N}$:
$\beta_n^{(c)}(t)$: the number of times node $n$ decides to transmit units of commodity $c$ in the first $t$ timeslots.
$\eta_n^{(c)}(t)$: the number of epochs for transmitting units of commodity $c$ from node $n$ with RMIA in the first $t$ timeslots.
$q_{n,\Omega_n}^{(c)}(t)$: the number of times units of commodity $c$ sent by node $n$ with RMIA are first-decoded by the set of nodes $\Omega_n\subseteq\mathcal{K}_n$ ($\Omega_n\neq\emptyset$) in the first $t$ timeslots.
$\varphi_{nk}^{(c)}(\Omega_n,t)$: the number of times the units of commodity $c$ in $\bigcup_{j\in\mathcal{N}}\mathcal{U}_j^{(c)}(t)$ are forwarded from node $n$ to node $k$ in the first $t$ timeslots, given that the first successful receiver set is $\Omega_n$.
Then we have
$\frac{G_{nk}^{(c)}(t)}{t}=\frac{\beta_n^{(c)}(t)}{t}\cdot\frac{\eta_n^{(c)}(t)}{\beta_n^{(c)}(t)}\sum_{\Omega_n\subseteq\mathcal{K}_n,\,\Omega_n\neq\emptyset}\frac{q_{n,\Omega_n}^{(c)}(t)}{\eta_n^{(c)}(t)}\cdot\frac{\varphi_{nk}^{(c)}(\Omega_n,t)}{q_{n,\Omega_n}^{(c)}(t)},$ (A.6)
where we define $0/0=0$ for the terms in the above equation. Note that for all $t$ we have
$0\le\frac{\beta_n^{(c)}(t)}{t}\le 1,\quad 0\le\frac{\varphi_{nk}^{(c)}(\Omega_n,t)}{q_{n,\Omega_n}^{(c)}(t)}\le 1,\quad\Omega_n\neq\emptyset;$ (A.7)
$0\le\frac{G_{nk}^{(c)}(t)}{t}\le 1,\quad G_{cn}^{(c)}(t)=G_{nn}^{(c)}(t)=0.$ (A.8)
Since the constraints defined in (A.7) and (A.8) describe closed and bounded regions of finite dimension, a subsequence of timeslots $\{t_l\}$ must exist over which the individual terms in (A.7) and (A.8) converge to constant values $\beta_n^{(c)}$, $\varphi_{nk}^{(c)}(\Omega_n)$, and $b_{nk}^{(c)}$, respectively.
Moreover, let $T_n^{(c)}(i)$ represent the length of the $i$-th epoch for node $n$ to transmit units of commodity $c$ with RMIA. First note that, with Assumption 1, the expectation of $T_n^{(c)}(i)$ exists, because
$\mathbb{E}\big\{T_n^{(c)}(i)\big\}=\sum_{m=1}^{\infty}\prod_{j\in\mathcal{K}_n}F_{R_{nj}}^{(m-1)}(H_0)<\frac{1}{1-\prod_{j\in\mathcal{K}_n}F_{R_{nj}}(H_0)}<\infty,$ (A.9)
where the inequality holds true due to Lemma 2. Additionally, $\beta_n^{(c)}(t)$ and $\eta_n^{(c)}(t)$ satisfy the following relation:
$\sum_{i=1}^{\eta_n^{(c)}(t)}T_n^{(c)}(i)\le\beta_n^{(c)}(t)<\sum_{i=1}^{\eta_n^{(c)}(t)+1}T_n^{(c)}(i).$ (A.10)
With RMIA, $T_n^{(c)}(i)$ is i.i.d. across epochs. If $\beta_n^{(c)}>0$, then with (A.10) and according to the law of large numbers, we have
$\lim_{t_l\to\infty}\frac{\eta_n^{(c)}(t_l)}{\beta_n^{(c)}(t_l)}=\frac{1}{\mathbb{E}\big\{T_n^{(c)}(i)\big\}}=\mu_n^{\mathrm{rmia}}\quad\text{with prob. }1.$ (A.11)
Here, the notation $\mu_n^{\mathrm{rmia}}$ carries the superscript "rmia" but no superscript "$c$" because its value is determined only by the RMIA transmission scheme and the channel states. Furthermore, with RMIA, the first successful receiver sets for each node $n$ are i.i.d. across epochs. By the law of large numbers, if $\beta_n^{(c)}>0$,
$\lim_{t_l\to\infty}\frac{q_{n,\Omega_n}^{(c)}(t_l)}{\eta_n^{(c)}(t_l)}=q_{n,\Omega_n}^{\mathrm{rmia}}\quad\text{with prob. }1,\quad\Omega_n\neq\emptyset,$ (A.12)
where $q_{n,\Omega_n}^{\mathrm{rmia}}$ is the probability that $\Omega_n$ is the first successful receiver set with RMIA. The value of $q_{n,\Omega_n}^{\mathrm{rmia}}$ is likewise determined only by the RMIA transmission scheme and the channel states.

Suppose that under a stationary randomized policy, denoted as Policy*, each node $n$ decides to transmit a unit of commodity $c$ in every timeslot with a fixed probability $\beta_n^{(c)}$ and chooses node $k\in\mathcal{K}_n$ to get the forwarding responsibility with a fixed conditional probability $\varphi_{nk}^{(c)}(\Omega_n)$, given that the set of nodes $\Omega_n$ first decodes the unit (if $k\notin\Omega_n$, $\varphi_{nk}^{(c)}(\Omega_n)$ has to be set to 0).
According to the law of large numbers and (A.11), (A.12), the values $\beta_n^{(c)}$, $\varphi_{nk}^{(c)}(\Omega_n)$, and $b_{nk}^{(c)}$ are the limit values over the whole timeslot sequence $\{t\}$, i.e., the converging subsequence $\{t_l\}$ becomes $\{t\}$, and therefore it follows that
$\lim_{t\to\infty}\frac{\beta_n^{(c)}(t)}{t}=\beta_n^{(c)}\quad\text{with prob. }1,$ (A.13)
$\lim_{t\to\infty}\frac{\varphi_{nk}^{(c)}(\Omega_n,t)}{q_{n,\Omega_n}^{(c)}(t)}=\varphi_{nk}^{(c)}(\Omega_n)\quad\text{with prob. }1,\quad\Omega_n\neq\emptyset,$ (A.14)
$\lim_{t\to\infty}\frac{G_{nk}^{(c)}(t)}{t}=b_{nk}^{(c)}\quad\text{with prob. }1.$ (A.15)
With (A.8) and (A.15), we have
$b_{nk}^{(c)}\ge 0,\quad b_{cn}^{(c)}=0,\quad b_{nn}^{(c)}=0,\quad\text{for }n\neq c.$
Furthermore, dividing both sides of (A.5) by $t$ and using the results of (A.4) and (A.15) yields
$\lambda_n^{(c)}+\sum_{k\in\mathcal{K}_n}b_{kn}^{(c)}=\sum_{k\in\mathcal{K}_n}b_{nk}^{(c)},\quad\text{for }n\neq c.$
Likewise, if $\beta_n^{(c)}>0$, then according to (A.11)-(A.15), taking the limit $t\to\infty$ in (A.6) yields
$b_{nk}^{(c)}=\beta_n^{(c)}\,\mu_n^{\mathrm{rmia}}\sum_{\Omega_n\subseteq\mathcal{K}_n,\,\Omega_n\neq\emptyset}q_{n,\Omega_n}^{\mathrm{rmia}}\,\varphi_{nk}^{(c)}(\Omega_n).$
Note that the above equation also holds true, trivially, if $\beta_n^{(c)}=0$; $\mu_n^{\mathrm{rmia}}$ and $q_{n,\Omega_n}^{\mathrm{rmia}}$ carry no policy-specifying mark because their values depend only on the average channel state. Thus, for any $(\lambda_n^{(c)})\in\Lambda_{\mathrm{RMIA}}$, a stabilizing stationary randomized policy satisfies (2.12)-(2.14).

A.4 Proof of the sufficiency part of Theorem 2

For a network satisfying Assumption 1 with input rate matrix $(\lambda_n^{(c)})$, suppose there exists a stationary randomized policy, denoted as Policy*, and a constant $\varepsilon>0$, such that Policy* and $(\lambda_n^{(c)}+\varepsilon)$ satisfy (2.12)-(2.14), which yields
$\sum_{k\in\mathcal{K}_n}b_{nk}^{(c)}-\sum_{k\in\mathcal{K}_n}b_{kn}^{(c)}-\lambda_n^{(c)}\ge\varepsilon,\quad\text{for }n\neq c.$ (A.16)
We start by extending the queueing dynamic (2.1) to a $t$-timeslot queueing relation under Policy*:
$Q_n^{(c)}(t_0+t)\le\max\Big\{Q_n^{(c)}(t_0)-\sum_{\tau=t_0}^{t_0+t-1}\sum_{k\in\mathcal{K}_n}b_{nk}^{(c)}(\tau),\,0\Big\}+\sum_{\tau=t_0}^{t_0+t-1}\sum_{k\in\mathcal{K}_n}b_{kn}^{(c)}(\tau)+\sum_{\tau=t_0}^{t_0+t-1}a_n^{(c)}(\tau),$ (A.17)
where $t_0\ge 0$ and $t\ge 1$.
By squaring both sides of (A.17) and taking expectations of each term given $Q(t_0)$, we upper bound the $t$-timeslot Lyapunov drift as follows:

$$\Delta_t(Q(t_0)) \le B(t) - 2\sum_{n,c} Q_n^{(c)}(t_0)\, E\Big\{ \sum_{k\in\mathcal{K}_n} \frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} b_{nk}^{(c)}(\tau) - \sum_{k\in\mathcal{K}_n} \frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} b_{kn}^{(c)}(\tau) - \frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} a_n^{(c)}(\tau) \,\Big|\, Q(t_0) \Big\}$$
$$= B(t) - 2\sum_{n,c} Q_n^{(c)}(t_0) \bigg\{ \Big[\lambda_n^{(c)} - \frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} E\{a_n^{(c)}(\tau)\}\Big] + \sum_{k\in\mathcal{K}_n}\Big[\frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} E\{b_{nk}^{(c)}(\tau)\} - b_{nk}^{(c)}\Big] - \sum_{k\in\mathcal{K}_n}\Big[\frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} E\{b_{kn}^{(c)}(\tau)\} - b_{kn}^{(c)}\Big] + \Big[\sum_{k\in\mathcal{K}_n} b_{nk}^{(c)} - \sum_{k\in\mathcal{K}_n} b_{kn}^{(c)} - \lambda_n^{(c)}\Big] \bigg\}, \qquad (A.18)$$

where the sum of squared terms formed by the flow rates and input rates has been replaced by a constant $B(t)$:

$$\frac{1}{t}\sum_{n,c}\Bigg[ \Big(\sum_{\tau=t_0}^{t_0+t-1}\sum_{k\in\mathcal{K}_n} b_{nk}^{(c)}(\tau)\Big)^2 + \Big(\sum_{\tau=t_0}^{t_0+t-1}\sum_{k\in\mathcal{K}_n} b_{kn}^{(c)}(\tau) + \sum_{\tau=t_0}^{t_0+t-1} a_n^{(c)}(\tau)\Big)^2 \Bigg] \le N^2 t\,\big[1 + (N + A_{\max})^2\big] = B(t); \qquad (A.19)$$

$Q(t_0)$ is dropped from the expectation condition because $Policy^*$ makes its decisions independently of it. To prepare for the later proof, we propose the following lemma:

Lemma 4. For link $(n,k)$ in a network satisfying Assumption 1, under a stationary randomized policy with RMIA, for any given $\varepsilon > 0$ there exists an integer $D_{nk}^{(c)} > 0$ such that, for every integer $t_0 \ge 0$, whenever $t \ge D_{nk}^{(c)}$, the mean time average of $b_{nk}^{(c)}(\tau)$ over the interval from timeslot $t_0$ to timeslot $t_0 + t - 1$ satisfies

$$\Big| \frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} E\{b_{nk}^{(c)}(\tau)\} - b_{nk}^{(c)} \Big| \le \varepsilon. \qquad (A.20)$$

The proof of Lemma 4 is given in Appendix A.11; it is non-trivial because $b_{nk}^{(c)}(\tau)$ is not i.i.d. across timeslots with RMIA and because the value of $D_{nk}^{(c)}$ must not depend on $t_0$.
Based on Lemma 4, there exists an integer $D_{nk}^{(c)} > 0$ such that, for all $t_0 \ge 0$, whenever $t \ge D_{nk}^{(c)}$,

$$\Big| \frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} E\{b_{nk}^{(c)}(\tau)\} - b_{nk}^{(c)} \Big| \le \frac{\varepsilon}{4N}. \qquad (A.21)$$

Choosing $t = D^* = \max\{ D_{nk}^{(c)} : n, c \in \mathcal{N},\, k \in \mathcal{K}_n \}$, using the fact that $\frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1} E\{a_n^{(c)}(\tau)\} = \lambda_n^{(c)}$, and plugging (A.16) and (A.21) into (A.18), it follows that

$$\Delta_{D^*}(Q(t_0)) \le B(D^*) - \varepsilon \sum_{n,c} Q_n^{(c)}(t_0). \qquad (A.22)$$

Note that (A.22) satisfies the condition required by Lemma 1, and therefore strong stability is achieved:

$$\limsup_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{n,c} E\{Q_n^{(c)}(\tau)\} \le \frac{B(D^*)}{\varepsilon}. \qquad (A.23)$$

A.5 Proof of Theorem 3

For a network with an input rate matrix $(\lambda_n^{(c)}) \in \Lambda_{\mathrm{REP}}$, according to Theorem 1 there exists a stationary randomized policy $Policy^*$ that stably supports $(\lambda_n^{(c)})$ by forming a flow rate matrix $(b_{nk}^{(c)})$ with REP satisfying (2.9)-(2.11). If the network satisfies Assumption 1, an intuitive proof is to construct another stationary randomized policy, denoted $Policy^{*1}$, that forms the same flow rate matrix $(b_{nk}^{(c)})$ but with RMIA.

Note that, when a node $n$ transmits a unit with RMIA, $\Omega_n$ is the first successful receiver set in the first-decoding timeslot, while in the same timeslot with REP the successful receiver set would instead be $\Psi_n$ ($\Psi_n$ may be empty, indicating no successful decoding). Due to mutual information accumulation, we must have $\Psi_n \subseteq \Omega_n$. Moreover, for node $n$, the decoding timeslots with REP form a subset of the first-decoding timeslots with RMIA. Based on these facts, define the following variables for all nodes $n \in \mathcal{N}$ and all commodities $c \in \mathcal{N}$:

$q_{n,\Psi_n}^{\mathrm{rep},(c)}(t)$: the number of times that units of commodity $c$ transmitted by node $n$ with REP are decoded by the set of nodes $\Psi_n \subseteq \mathcal{K}_n$ in the first $t$ timeslots.

$q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia},(c)}(t)$: the number of times that units of commodity $c$ transmitted by node $n$ with RMIA are first decoded by the set $\Omega_n$ ($\Omega_n \ne \emptyset$, $\Omega_n \subseteq \mathcal{K}_n$) in the first $t$ timeslots, while in those same timeslots the units would be decoded by the set of nodes $\Psi_n$ with REP.
Then we have

$$q_{n,\Psi_n}^{\mathrm{rep},(c)}(t) = \sum_{\Omega_n : \Omega_n \supseteq \Psi_n} q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia},(c)}(t), \quad \text{if } \Psi_n \ne \emptyset. \qquad (A.24)$$

According to the law of large numbers, letting $\phi_n^{(c)}(t) \to \infty$, we have

$$\lim_{\phi_n^{(c)}(t)\to\infty} \frac{q_{n,\Psi_n}^{\mathrm{rep},(c)}(t)}{\phi_n^{(c)}(t)} = q_{n,\Psi_n}^{\mathrm{rep}} \quad \text{with prob. } 1, \quad \text{if } \Psi_n \ne \emptyset. \qquad (A.25)$$

Likewise, since the occurrences of $\Psi_n$ (with REP) and $\Omega_n$ (with RMIA) in the first-decoding timeslots of node $n$ are i.i.d. across epochs (with RMIA), the law of large numbers gives

$$\lim_{\phi_n^{(c)}(t)\to\infty} \sum_{\Omega_n \supseteq \Psi_n} \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia},(c)}(t)}{\phi_n^{(c)}(t)} = \lim_{\phi_n^{(c)}(t)\to\infty} \sum_{\Omega_n \supseteq \Psi_n} \frac{\beta_n^{\mathrm{rmia},(c)}(t)}{\phi_n^{(c)}(t)} \cdot \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia},(c)}(t)}{\beta_n^{\mathrm{rmia},(c)}(t)} = \sum_{\Omega_n \supseteq \Psi_n} \mu_n^{\mathrm{rmia}}\, q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}} \quad \text{with prob. } 1, \quad \text{if } \Psi_n \ne \emptyset. \qquad (A.26)$$

Dividing both sides of (A.24) by $\phi_n^{(c)}(t)$ and plugging in (A.25) and (A.26), we get

$$q_{n,\Psi_n}^{\mathrm{rep}} = \sum_{\Omega_n \supseteq \Psi_n} \mu_n^{\mathrm{rmia}}\, q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}, \quad \text{if } \Psi_n \ne \emptyset. \qquad (A.27)$$

Consider the flow rate under $Policy^*$ shown as (2.11) in Theorem 1. Plugging (A.27) into (2.11) yields

$$b_{nk}^{(c)} = \alpha_n^{(c)} \sum_{\Psi_n : k \in \Psi_n}\; \sum_{\Omega_n \supseteq \Psi_n} \mu_n^{\mathrm{rmia}}\, q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}\, \theta_{nk}^{(c)}(\Psi_n) = \alpha_n^{(c)} \mu_n^{\mathrm{rmia}} \sum_{\Omega_n : k \in \Omega_n} q_{n,\Omega_n}^{\mathrm{rmia}}\, \theta_{nk}^{1(c)}(\Omega_n), \qquad (A.28)$$

where we define

$$\theta_{nk}^{1(c)}(\Omega_n) = \sum_{\Psi_n : \Psi_n \subseteq \Omega_n,\, k \in \Psi_n} \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}}\, \theta_{nk}^{(c)}(\Psi_n). \qquad (A.29)$$

In (A.29), $q_{n,\Omega_n}^{\mathrm{rmia}}$ is positive because of Assumption 1 and Lemma 2:

$$q_{n,\Omega_n}^{\mathrm{rmia}} = \sum_{m=1}^{\infty} \prod_{k \in \Omega_n} \big[ F_{R_{nk}}^{(m-1)}(H_0) - F_{R_{nk}}^{(m)}(H_0) \big] \prod_{k \notin \Omega_n,\, k \in \mathcal{K}_n} F_{R_{nk}}^{(m)}(H_0) > 0. \qquad (A.30)$$

Comparing (A.28) with (2.14) in Theorem 2: if there is a stationary randomized policy $Policy^{*1}$ with RMIA under which each node $n$ transmits a unit of commodity $c$ with probability $\alpha_n^{(c)}$ and forwards the decoded unit to node $k \in \mathcal{K}_n$ with probability $\theta_{nk}^{1(c)}(\Omega_n)$, given that the first successful receiver set is $\Omega_n$, then the same flow rate matrix $(b_{nk}^{(c)})$ is formed. The remaining part of the proof is to show that the $\theta_{nk}^{1(c)}(\Omega_n)$ in (A.29) are valid probability values.
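The set inclusion $\Psi_n \subseteq \Omega_n$ used above (the REP decoders are always a subset of the RMIA first-decoders, since accumulated mutual information dominates single-slot mutual information) can be sanity-checked with a toy simulation. The rate model below (i.i.d. uniform per-slot rates, a unit decoding threshold) is our illustrative assumption, not the dissertation's channel model:

```python
import random

def decode_sets(rates, threshold):
    """One transmission epoch.  `rates[k]` lists receiver k's per-slot rates.
    At the first slot m where some receiver's *accumulated* rate reaches
    `threshold` (the RMIA first-decoding timeslot), return:
      rep  -- receivers whose slot-m rate alone reaches the threshold (REP),
      rmia -- receivers whose accumulated rate reaches it (RMIA)."""
    acc = {k: 0.0 for k in rates}
    num_slots = len(next(iter(rates.values())))
    for m in range(num_slots):
        for k in rates:
            acc[k] += rates[k][m]
        rmia = {k for k in rates if acc[k] >= threshold}
        if rmia:
            rep = {k for k in rates if rates[k][m] >= threshold}
            return rep, rmia
    return set(), set()

rng = random.Random(7)
trials = [
    decode_sets({k: [rng.uniform(0.0, 1.5) for _ in range(40)] for k in range(4)},
                threshold=1.0)
    for _ in range(500)
]
subset_always = all(rep <= rmia for rep, rmia in trials)
```

Because per-slot rates are nonnegative, a receiver that decodes from the current slot alone has necessarily accumulated at least as much, which is the whole content of the inclusion.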
To validate the $\theta_{nk}^{1(c)}(\Omega_n)$, first consider the definitions of $q_{n,\Omega_n}^{\mathrm{rmia},(c)}(t)$ and $q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia},(c)}(t)$, which give

$$q_{n,\Omega_n}^{\mathrm{rmia},(c)}(t) = \sum_{\Psi_n : \Psi_n \subseteq \Omega_n} q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia},(c)}(t), \quad \Omega_n \ne \emptyset. \qquad (A.31)$$

Dividing both sides of (A.31) by $\beta_n^{\mathrm{rmia},(c)}(t)$, letting $\beta_n^{\mathrm{rmia},(c)}(t) \to \infty$, and applying the law of large numbers, we have

$$q_{n,\Omega_n}^{\mathrm{rmia}} = \sum_{\Psi_n : \Psi_n \subseteq \Omega_n} q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}, \quad \Omega_n \ne \emptyset. \qquad (A.32)$$

By plugging (A.32) into (A.29), we check the validity of $\{\theta_{nk}^{1(c)}(\Omega_n) : k \in \mathcal{K}_n\}$ as follows:

$$\sum_{k \in \Omega_n} \theta_{nk}^{1(c)}(\Omega_n) = \sum_{\Psi_n \subseteq \Omega_n,\, \Psi_n \ne \emptyset} \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}} \sum_{k \in \Psi_n} \theta_{nk}^{(c)}(\Psi_n) \le \frac{\displaystyle\sum_{\Psi_n \subseteq \Omega_n,\, \Psi_n \ne \emptyset} q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}}{\displaystyle\sum_{\Psi_n \subseteq \Omega_n,\, \Psi_n \ne \emptyset} q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}} + q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}}} \le 1, \quad \Omega_n \ne \emptyset. \qquad (A.33)$$

Thus, for any $(\lambda_n^{(c)}) \in \Lambda_{\mathrm{REP}}$ that can be supported by $Policy^*$ with REP, there also exists a $Policy^{*1}$ with RMIA that forms the same flow rate matrix $(b_{nk}^{(c)})$ and stably supports $(\lambda_n^{(c)})$, i.e., $\Lambda_{\mathrm{REP}} \subseteq \Lambda_{\mathrm{RMIA}}$.

A.6 Proof of Theorem 4

Suppose $(\lambda_n^{(c)}) \in \Lambda_{\mathrm{REP}}$ has a positive entry $\lambda_{n_0}^{(c_0)}$ and can be stably supported by a stationary randomized policy with REP, $Policy^*$, which forms a flow path $l_{n_0,c_0}$ on which each link has a positive time-average flow rate. The goal of the proof is to construct a policy with RMIA that can stably support the input rate matrix $(\lambda_n^{\prime(c)}(l_{n_0,c_0}))$. Based on the proof of Theorem 3, there exists a stationary randomized policy $Policy^{*1}$ with RMIA that also stably supports $(\lambda_n^{(c)})$ and forms the flow rate matrix $(b_{nk}^{(c)})$.

For a link $(n,k)$ on path $l_{n_0,c_0}$, setting aside flow conservation constraints, the time-average flow rate of commodity $c_0$ over link $(n,k)$ has an increase potential relative to the $b_{nk}^{(c_0)}$ formed by $Policy^{*1}$ with RMIA, obtained by raising the forwarding probability from $\theta_{nk}^{1(c_0)}(\Omega_n)$ to $\theta_{nk}^{1\prime(c_0)}(\Omega_n)$ if $k \in \Omega_n$:

$$\theta_{nk}^{1\prime(c_0)}(\Omega_n) = \begin{cases} \theta_{nk}^{1(c_0)}(\Omega_n) + \Big( q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}} + q_{n,\{p_{l_{n_0,c_0}}(n)\},\Omega_n}^{\mathrm{rep,rmia}} \Big) \Big/ q_{n,\Omega_n}^{\mathrm{rmia}}, & \text{if } n \ne n_0, \\[4pt] \theta_{n_0 k}^{1(c_0)}(\Omega_{n_0}) + q_{n_0,\emptyset,\Omega_{n_0}}^{\mathrm{rep,rmia}} \Big/ q_{n_0,\Omega_{n_0}}^{\mathrm{rmia}}, & \text{if } n = n_0, \end{cases} \qquad (A.34)$$

where $p_{l_{n_0,c_0}}(n)$ denotes the predecessor of node $n$ on path $l_{n_0,c_0}$. Consequently, if we keep the forwarding probabilities to the nodes in $\Omega_n$ other than node $k$ unchanged, i.e., $\theta_{nj}^{1\prime(c_0)}(\Omega_n) = \theta_{nj}^{1(c_0)}(\Omega_n)$ for every $j \in \Omega_n$ with $j \ne k$, the potential time-average flow increase on link $(n,k)$, denoted $\delta_{nk}^{(c_0)}$, can be obtained, as shown in (2.16) in the theorem statement.

Here we check the validity of the forwarding probabilities $\{\theta_{nj}^{1\prime(c_0)}(\Omega_n) : j \in \mathcal{K}_n\}$ as follows. If $k \notin \Omega_n$, we have $\sum_{j \in \mathcal{K}_n} \theta_{nj}^{1\prime(c_0)}(\Omega_n) = \sum_{j \in \mathcal{K}_n} \theta_{nj}^{1(c_0)}(\Omega_n) \le 1$ according to (A.33). If $n \ne n_0$ and $\{k, p_{l_{n_0,c_0}}(n)\} \subseteq \Omega_n$, note that the time-average flow on link $(p_{l_{n_0,c_0}}(n), n)$ is positive under $Policy^*$, and we can assume that the time-average flow rate on the reverse link $(n, p_{l_{n_0,c_0}}(n))$ is zero under $Policy^*$ (see footnote 6), i.e., $\theta_{n,p_{l_{n_0,c_0}}(n)}^{(c_0)}(\Psi_n) = 0$ if $p_{l_{n_0,c_0}}(n) \in \Psi_n$. It then follows that

$$\sum_{j \in \Omega_n} \theta_{nj}^{1\prime(c_0)}(\Omega_n) = \sum_{j \in \Omega_n,\, j \ne p_{l_{n_0,c_0}}(n)}\; \sum_{\Psi_n \subseteq \Omega_n,\, j \in \Psi_n} \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}}\, \theta_{nj}^{(c_0)}(\Psi_n) + \frac{q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}} + q_{n,\{p_{l_{n_0,c_0}}(n)\},\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}}$$
$$= \sum_{\Psi_n \subseteq \Omega_n,\, \Psi_n \ne \emptyset,\, \Psi_n \ne \{p_{l_{n_0,c_0}}(n)\}} \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}} \sum_{j \in \Psi_n,\, j \ne p_{l_{n_0,c_0}}(n)} \theta_{nj}^{(c_0)}(\Psi_n) + \frac{q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}} + q_{n,\{p_{l_{n_0,c_0}}(n)\},\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}} \le \sum_{\Psi_n \subseteq \Omega_n} \frac{q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}}}{q_{n,\Omega_n}^{\mathrm{rmia}}} = 1. \qquad (A.35)$$

If $n = n_0$ and $k \in \Omega_n$, or if $n \ne n_0$ and $p_{l_{n_0,c_0}}(n) \notin \Omega_n$, we likewise guarantee that $\sum_{j \in \mathcal{K}_n} \theta_{nj}^{1\prime(c_0)}(\Omega_n) \le 1$ by a derivation similar to (A.35) but without the term $q_{n,\{p_{l_{n_0,c_0}}(n)\},\Omega_n}^{\mathrm{rep,rmia}}$.

With the time-average flow rate increase potential $\delta_{nk}^{(c_0)}$ on each link $(n,k) \in l_{n_0,c_0}$, let $\delta_{l_{n_0,c_0}}^{(c_0)}$ represent the minimum flow increase potential among the links on path $l_{n_0,c_0}$, i.e., $\delta_{l_{n_0,c_0}}^{(c_0)} = \min_{(n,k) \in l_{n_0,c_0}} \{ \delta_{nk}^{(c_0)} \}$.
Therefore, for commodity $c_0$, each link $(n,k)$ along path $l_{n_0,c_0}$ can support a flow rate increase of $\delta_{l_{n_0,c_0}}^{(c_0)}$ simply by assigning a new forwarding probability $\theta_{nk}^{2(c_0)}(\Omega_n)$ such that there exists $\xi_{nk}^{(c_0)} \in [0,1]$ with

$$\theta_{nk}^{2(c_0)}(\Omega_n) = \begin{cases} \theta_{nk}^{1(c_0)}(\Omega_n) + \Big( q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}} + q_{n,\{p_{l_{n_0,c_0}}(n)\},\Omega_n}^{\mathrm{rep,rmia}} \Big)\, \xi_{nk}^{(c_0)} \Big/ q_{n,\Omega_n}^{\mathrm{rmia}}, & \text{if } n \ne n_0, \\[4pt] \theta_{n_0 k}^{1(c_0)}(\Omega_{n_0}) + q_{n_0,\emptyset,\Omega_{n_0}}^{\mathrm{rep,rmia}}\, \xi_{n_0 k}^{(c_0)} \Big/ q_{n_0,\Omega_{n_0}}^{\mathrm{rmia}}, & \text{if } n = n_0, \end{cases} \qquad (A.36)$$

$$\delta_{l_{n_0,c_0}}^{(c_0)} = \delta_{nk}^{(c_0)}\, \xi_{nk}^{(c_0)}. \qquad (A.37)$$

We then construct a stationary randomized policy with RMIA, denoted $Policy^{*2}$, under which each node $n$ chooses to transmit a unit of commodity $c$ with probability $\alpha_n^{(c)}$ in each timeslot and forwards the decoded unit to node $k \in \Omega_n$ with probability $\theta_{nk}^{2(c)}(\Omega_n)$. The $\theta_{nk}^{2(c)}(\Omega_n)$ satisfy (A.36) for $(n,k) \in l_{n_0,c_0}$ and $c = c_0$, and $\theta_{nk}^{2(c)}(\Omega_n) = \theta_{nk}^{1(c)}(\Omega_n)$ for $(n,k) \notin l_{n_0,c_0}$ or $c \ne c_0$. Correspondingly, under $Policy^{*2}$, the time-average flow rate over each link can be expressed as

$$b_{nk}^{2(c_0)} = b_{nk}^{(c_0)} + \delta_{l_{n_0,c_0}}^{(c_0)}, \quad \text{for } (n,k) \in l_{n_0,c_0}, \qquad (A.38a)$$
$$b_{nk}^{2(c)} = b_{nk}^{(c)}, \quad \text{for } (n,k) \notin l_{n_0,c_0} \text{ or } c \ne c_0. \qquad (A.38b)$$

Following from (A.38), the flow rate matrix $(b_{nk}^{2(c)})$ under $Policy^{*2}$ and the input rate matrix $(\lambda_n^{\prime(c)}(l_{n_0,c_0}))$ satisfy (2.12)-(2.14) in Theorem 2, and therefore $(\lambda_n^{\prime(c)}(l_{n_0,c_0})) \in \Lambda_{\mathrm{RMIA}}$.

A.7 Proof of Corollary 1

Given that $\Lambda_{\mathrm{REP}} \ne \{O_{N\times N}\}$, consider an arbitrary input rate matrix $(\lambda_n^{(c)})$ within $\Lambda_{\mathrm{REP}}$ having a positive entry $\lambda_{n_0}^{(c_0)}$. According to Theorem 4, there exists a simple path $l_{n_0,c_0}$ with positive time-average flow such that the corresponding $(\lambda_n^{\prime(c)}(l_{n_0,c_0}))$ belongs to $\Lambda_{\mathrm{RMIA}}$.
For each link $(n,k)$ on path $l_{n_0,c_0}$, we have

$$\delta_{nk}^{(c_0)} \ge \gamma_{nk}^{\mathrm{rep,rmia}}\, \alpha_n^{(c_0)} \mu_n^{\mathrm{rmia}} \sum_{\Omega_n \subseteq \mathcal{K}_n,\, \Omega_n \ne \emptyset} q_{n,\Omega_n}^{\mathrm{rmia}}\, \theta_{nk}^{1(c_0)}(\Omega_n) = \gamma_{nk}^{\mathrm{rep,rmia}}\, b_{nk}^{(c_0)}, \qquad (A.39)$$

where

$$\gamma_{nk}^{\mathrm{rep,rmia}} = \sum_{\Omega_n : k \in \Omega_n} q_{n,\emptyset,\Omega_n}^{\mathrm{rep,rmia}} \Big/ \sum_{\Omega_n \subseteq \mathcal{K}_n,\, \Omega_n \ne \emptyset} q_{n,\Omega_n}^{\mathrm{rmia}}. \qquad (A.40)$$

The value of $\gamma_{nk}^{\mathrm{rep,rmia}}$ depends only on the average channel state and is positive, since

$$q_{n,\Psi_n,\Omega_n}^{\mathrm{rep,rmia}} = \sum_{m=1}^{\infty} \prod_{k \in \Psi_n} F_{R_{nk}}^{(m-1)}(H_0)\,\big[1 - F_{R_{nk}}(H_0)\big] \prod_{k \notin \Omega_n} F_{R_{nk}}^{(m)}(H_0) \prod_{k \in \Omega_n \setminus \Psi_n} \big[ F_{R_{nk}}^{(m-1)}(H_0)\, F_{R_{nk}}(H_0) - F_{R_{nk}}^{(m)}(H_0) \big] > 0, \quad \text{for } \Psi_n \subseteq \Omega_n. \qquad (A.41)$$

For each path $l_{n_0,c_0}$, define $b_{l_{n_0,c_0}} = \min\{ b_{nk}^{(c_0)} : (n,k) \in l_{n_0,c_0} \}$; let $\mathcal{L}_{n_0,c_0}$ represent the set of simple paths with positive flow from node $n_0$ to node $c_0$ under $Policy^*$; define $l_{n_0,c_0}^{\max}$ as the simple path with the maximum $b_{l_{n_0,c_0}}$ among the paths in $\mathcal{L}_{n_0,c_0}$; and let $L$ represent the number of geometric simple paths from node $n_0$ to node $c_0$. Then, for each link $(n,k) \in l_{n_0,c_0}^{\max}$, we have

$$b_{l_{n_0,c_0}^{\max}} \ge \frac{1}{L} \sum_{l_{n_0,c_0} \in \mathcal{L}_{n_0,c_0}} b_{l_{n_0,c_0}} \ge \frac{1}{L}\, \lambda_{n_0}^{(c_0)}. \qquad (A.42)$$

Defining $\gamma_{\min}^{\mathrm{rep,rmia}} = \min\{ \gamma_{nk}^{\mathrm{rep,rmia}} : n \in \mathcal{N},\, k \in \mathcal{K}_n \}$, it follows from (A.39) and (A.42) that, for path $l_{n_0,c_0}^{\max}$,

$$\delta_{l_{n_0,c_0}^{\max}}^{(c_0)} \ge \gamma_{\min}^{\mathrm{rep,rmia}}\, b_{l_{n_0,c_0}^{\max}} \ge \frac{1}{L}\, \gamma_{\min}^{\mathrm{rep,rmia}}\, \lambda_{n_0}^{(c_0)}, \qquad (A.43)$$

where $\gamma_{\min}^{\mathrm{rep,rmia}}/L$ is a positive constant that depends only on the average channel state and the geometric topology of the network. Thus, according to (A.43), $\Lambda_{\mathrm{RMIA}}$ extends beyond $\Lambda_{\mathrm{REP}}$ by at least a factor of $\gamma_{\min}^{\mathrm{rep,rmia}}/L$ in the $(n_0, c_0)$th dimension. Combining this with Theorem 3, $\Lambda_{\mathrm{RMIA}}$ is strictly larger than $\Lambda_{\mathrm{REP}}$.
A.8 Proof of Lemma 3

First, according to (2.21), the expectation of $Z_n(i, \hat{Q}(u_{n,i}))$, given the backlog state $\hat{Q}(u_{n,i})$, can be upper bounded as follows:

$$E\big\{ Z_n\big(i, \hat{Q}(u_{n,i})\big) \,\big|\, \hat{Q}(u_{n,i}) \big\} \overset{(a)}{\le} \sum_c E\Big\{ \sum_{\tau=u_{n,i}}^{u_{n,i+1}-1} \sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}(\tau)\, \hat{W}_{nk}^{(c)}(u_{n,i}) \,\Big|\, \hat{Q}(u_{n,i}) \Big\} \overset{(b)}{=} \sum_c E\Big\{ \sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}(u_{n,i+1}-1)\, \hat{W}_{nk}^{(c)}(u_{n,i}) \,\Big|\, \hat{Q}(u_{n,i}) \Big\}. \qquad (A.44)$$

In (A.44), the upper bound in (a) is achieved by the following activity: $b_{nk}^{(c)}(\tau) = 0$ whenever $\hat{W}_{nk}^{(c)}(u_{n,i}) = 0$, i.e., node $n$ never forwards a unit of commodity $c$ to node $k \in \mathcal{K}_n$ if node $k$ has non-positive differential backlog (zero differential backlog coefficient) of commodity $c$, which is consistent with the description in step 5) of the algorithm summary of DIVBAR-RMIA; the equality (b) in (A.44) holds because, for any policy in $\mathcal{P}$, $b_{nk}^{(c)}(\tau) = 0$ when $u_{n,i} \le \tau < u_{n,i+1}-1$.

Then define the following variables for the policies with RMIA in $\mathcal{P}$:

$\chi_n^{(c)}(i)$: the variable that takes value 1 if node $n$ decides to transmit a unit of commodity $c$ in the $i$th epoch, and takes value 0 otherwise.

$\chi_n(i)$: the variable that takes value 1 if node $n$ decides to transmit a unit having a commodity (the unit is not null) in the $i$th epoch, and takes value 0 if node $n$ decides to transmit a null packet.

$X_{nk}^{\mathcal{P}}(i)$: the random variable that takes value 1 if node $k \in \mathcal{K}_n$ is in the first successful receiver set of the $i$th epoch for node $n$ under a policy in $\mathcal{P}$, and takes value 0 otherwise. Given the policy set $\mathcal{P}$, the value of $X_{nk}^{\mathcal{P}}(i)$ depends only on the channel realizations in epoch $i$ for node $n$.

$\hat{1}_{nk}^{(c)}(i)$: the indicator variable that takes value 1 if and only if $X_{nk}^{\mathcal{P}}(i) = 1$ and $X_{nj}^{\mathcal{P}}(i) = 0$ for all $j \in \hat{\mathcal{R}}_{nk}^{\mathrm{high},(c)}(u_{n,i})$.
Considering that $X_{nk}^{\mathcal{P}}(i)\,\chi_n^{(c)}(i) \in \{0,1\}$ and $b_{nk}^{(c)}(u_{n,i+1}-1) = b_{nk}^{(c)}(u_{n,i+1}-1)\, X_{nk}^{\mathcal{P}}(i)\, \chi_n^{(c)}(i)$, it follows from (A.44) that

$$E\big\{ Z_n\big(i, \hat{Q}(u_{n,i})\big) \,\big|\, \hat{Q}(u_{n,i}) \big\} \le \sum_c E\Big\{ \chi_n^{(c)}(i) \sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}(u_{n,i+1}-1)\, X_{nk}^{\mathcal{P}}(i)\, \hat{W}_{nk}^{(c)}(u_{n,i}) \,\Big|\, \hat{Q}(u_{n,i}) \Big\}$$
$$\overset{(a)}{\le} \sum_c E\Big\{ \max_{k \in \mathcal{K}_n} \big\{ X_{nk}^{\mathcal{P}}(i)\, \hat{W}_{nk}^{(c)}(u_{n,i}) \big\} \,\Big|\, \hat{Q}(u_{n,i}),\, \chi_n^{(c)}(i) = 1 \Big\}\, E\big\{ \chi_n^{(c)}(i) \,\big|\, \hat{Q}(u_{n,i}) \big\}. \qquad (A.45)$$

The inequality (a) in (A.45) holds because $\sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}(u_{n,i+1}-1) \le 1$; (a) becomes an equality under the following activity: $b_{nk}^{(c)}(u_{n,i+1}-1) = 1$ when node $k$ attains the largest positive term $X_{nk}^{\mathcal{P}}(i)\,\hat{W}_{nk}^{(c)}(u_{n,i})$, i.e., node $n$ forwards a unit to node $k$ only if node $k$ is the successful receiver with the largest positive differential backlog of commodity $c$, which is consistent with step 5) of the algorithm summary of DIVBAR-RMIA. Moreover, note that $\hat{1}_{nk}^{(c)}(i)$ takes value 1 with probability $\hat{\varphi}_{nk}^{(c)}(i)$, given the backlog state $\hat{Q}(u_{n,i})$ and that node $n$ decides to transmit a unit of commodity $c$ in epoch $i$. Therefore, with the definition of $\hat{1}_{nk}^{(c)}(i)$,

$$E\Big\{ \max_{k \in \mathcal{K}_n} \big\{ X_{nk}^{\mathcal{P}}(i)\, \hat{W}_{nk}^{(c)}(u_{n,i}) \big\} \,\Big|\, \hat{Q}(u_{n,i}),\, \chi_n^{(c)}(i)=1 \Big\} = E\Big\{ \sum_{k \in \mathcal{K}_n} \hat{W}_{nk}^{(c)}(u_{n,i})\, \hat{1}_{nk}^{(c)}(i) \,\Big|\, \hat{Q}(u_{n,i}),\, \chi_n^{(c)}(i)=1 \Big\} = \sum_{k \in \mathcal{K}_n} \hat{W}_{nk}^{(c)}(u_{n,i})\, \hat{\varphi}_{nk}^{(c)}(i), \qquad (A.46)$$

which is the backpressure metric (2.18) in step 3) of the algorithm summary.
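Equation (A.46) rewrites the expected maximum of the successful receivers' weights as a sum of weights times "ranked-first success" probabilities. A minimal sketch (ours, not the dissertation's; independent Bernoulli successes and distinct positive weights are assumed) checks the identity by exhaustive enumeration:

```python
import itertools

def ranked_first_expectation(weights, probs):
    """sum_k W_k * P(X_k = 1 and X_j = 0 for every j with W_j > W_k),
    mirroring the ranking indicator used around (A.46);
    X_k are independent Bernoulli(probs[k])."""
    total = 0.0
    for k, wk in enumerate(weights):
        p = probs[k]
        for j, wj in enumerate(weights):
            if wj > wk:
                p *= 1.0 - probs[j]  # every higher-weight receiver fails
        total += wk * p
    return total

def direct_expectation(weights, probs):
    """E{ max_k X_k * W_k }, with the max over an empty success set taken
    as 0, computed by enumerating all 2^K success outcomes."""
    total = 0.0
    for outcome in itertools.product((0, 1), repeat=len(weights)):
        p = 1.0
        for x, q in zip(outcome, probs):
            p *= q if x else 1.0 - q
        best = max((w for x, w in zip(outcome, weights) if x), default=0.0)
        total += p * best
    return total

weights, probs = [3.0, 1.0, 2.0], [0.5, 0.8, 0.3]
lhs = direct_expectation(weights, probs)
rhs = ranked_first_expectation(weights, probs)
```

The two computations agree because the events "receiver $k$ succeeds and every higher-weight receiver fails" partition the outcomes on which the maximum is attained at $k$.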
Plugging (A.46) into (A.45) then yields

$$E\big\{ Z_n\big(i, \hat{Q}(u_{n,i})\big) \,\big|\, \hat{Q}(u_{n,i}) \big\} \overset{(a)}{\le} \sum_{k \in \mathcal{K}_n} \hat{W}_{nk}^{(\hat{c}_n(i))}(u_{n,i})\, \hat{\varphi}_{nk}^{(\hat{c}_n(i))}(i) \sum_c E\big\{ \chi_n^{(c)}(i) \,\big|\, \hat{Q}(u_{n,i}) \big\} = \hat{w}_n(i)\, E\big\{ \chi_n(i) \,\big|\, \hat{Q}(u_{n,i}) \big\} \overset{(b)}{\le} \hat{w}_n(i), \qquad (A.47)$$

where $\hat{c}_n(i)$ denotes the commodity that maximizes the metric (2.18) for node $n$ in epoch $i$ and $\hat{w}_n(i)$ the corresponding maximized value. The upper bound in (a) of (A.47) is achieved by the following activity: node $n$ transmits only a unit whose commodity maximizes the metric (2.18), given that it decides to transmit a packet having a commodity; the upper bound in (b) of (A.47) is achieved by the following activity: node $n$ transmits a unit having a commodity in the $i$th epoch if and only if $\hat{w}_n(i) > 0$. These two upper-bound-achieving activities are consistent with step 4) in the algorithm description of DIVBAR-RMIA. In summary, the upper-bound-achieving conditions of (A.44), (A.45), and (A.47) prove that DIVBAR-RMIA maximizes $E\{ Z_n(i, \hat{Q}(u_{n,i})) \mid \hat{Q}(u_{n,i}) \}$ among the policies in $\mathcal{P}$.

A.9 Proof of Theorem 5

Reviewing Theorem 2: for a network satisfying Assumption 1 with an input rate matrix $(\lambda_n^{(c)})$ interior to $\Lambda_{\mathrm{RMIA}}$, there exists a stationary randomized policy with RMIA, $Policy^*$, under which the $t$-timeslot average Lyapunov drift satisfies the condition (2.4) given in Lemma 1, and therefore strong stability is achieved. In this proof, the goal is to show that the $t$-timeslot Lyapunov drift under DIVBAR-RMIA, denoted $\widehat{Policy}$, satisfies a similar condition. Correspondingly, given the $\varepsilon$ satisfying $(\lambda_n^{(c)} + \varepsilon) \in \Lambda_{\mathrm{RMIA}}$, the main proof strategy is to compare the upper bounds of the $t$-timeslot Lyapunov drifts under $\widehat{Policy}$ and under a policy that is a "modified version" of $Policy^*$, denoted $Policy^{*\prime}$. Here $Policy^{*\prime}$ is defined as follows: it is the same as $\widehat{Policy}$ from timeslot 0 to timeslot $t_0 - 1$; starting from timeslot $t_0$, it makes the stationary randomized transmitting and forwarding decisions with the same probabilities as $Policy^*$, while the transmissions with RMIA do not use the partial information accumulated before timeslot $t_0$.
Start by performing manipulations on the $t$-timeslot queueing relation similar to (A.17)-(A.18); under an arbitrary policy, the $t$-timeslot average Lyapunov drift starting at any timeslot $t_0$ is upper bounded as

$$\Delta_t(Q(t_0)) \le B(t) + \frac{2}{t} \sum_{n,c} Q_n^{(c)}(t_0) \sum_{\tau=t_0}^{t_0+t-1} E\big\{ a_n^{(c)}(\tau) \big\} - 2 \sum_n E\big\{ Z_n(Q(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, Q(t_0) \big\}, \qquad (A.48)$$

where the summation metric $Z_n(Q(t_0))\big|_{t_0}^{t_0+t-1}$ represents the expression

$$Z_n(Q(t_0))\big|_{t_0}^{t_0+t-1} = \frac{1}{t} \sum_{\tau=t_0}^{t_0+t-1} \sum_c \sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}(\tau)\, \big[ Q_n^{(c)}(t_0) - Q_k^{(c)}(t_0) \big]. \qquad (A.49)$$

The comparison between $\widehat{Policy}$ and $Policy^{*\prime}$ amounts to comparing the term $\sum_n E\{ Z_n(Q(t_0))|_{t_0}^{t_0+t-1} \mid Q(t_0) \}$ on the right-hand side of (A.48). In Chapter 2, we refer to this term as the key metric.

[Footnote 7: Note that $Policy^{*\prime}$ in principle does not belong to the RMIA policy space discussed in Chapter 2, because using no partial information accumulated before timeslot $t_0$ is equivalent to making forwarding decisions and renewal operations for all commodities in timeslot $t_0 - 1$ and starting a new epoch in timeslot $t_0$, while timeslot $t_0 - 1$ may not be a first-decoding timeslot for all commodities. It is nevertheless convenient to introduce $Policy^{*\prime}$ as an intermediate policy in the proof, because, starting from timeslot $t_0$, it is statistically the same as $Policy^*$ starting from timeslot 0 but with initial backlog state $\hat{Q}(t_0)$. Likewise, the intermediate policies $\widetilde{Policy}$ and $\widetilde{Policy}^*$, which will be introduced later, do not belong to the RMIA policy space for the same reason but are introduced to facilitate the proof.]
To facilitate the comparison between $\widehat{Policy}$ and $Policy^{*\prime}$, an intermediate policy $\widetilde{Policy}$ is introduced. It is the same as $\widehat{Policy}$ from timeslot 0 to timeslot $t_0 - 1$. In timeslot $t_0$, each node $n$ makes a transmitting decision based on $\hat{Q}(t_0)$ using the same strategy as under $\widehat{Policy}$ and, according to this transmitting decision, either keeps transmitting units of the chosen commodity with RMIA or keeps silent (by transmitting null units) from then on, without using the partial information accumulated before timeslot $t_0$. In each first-decoding timeslot for node $n$ from timeslot $t_0$ onward, node $n$ makes the forwarding decision based on $\hat{Q}(t_0)$ using the same strategy as $\widehat{Policy}$.

The proof proceeds in two steps: comparing $\widetilde{Policy}$ and $Policy^{*\prime}$, as shown in Subsection A.9.1, and comparing $\widehat{Policy}$ and $\widetilde{Policy}$, as shown in Subsection A.9.2.

A.9.1 Comparison on the key metric between $\widetilde{Policy}$ and $Policy^{*\prime}$

In this part of the proof, the key metrics under $\widetilde{Policy}$ and $Policy^{*\prime}$ are analyzed on the interval from timeslot $t_0$ to timeslot $t_0 + t - 1$. To facilitate the comparison, we further introduce an intermediate policy $\widetilde{Policy}^*$, defined as follows: it is the same as $\widehat{Policy}$ from timeslot 0 to timeslot $t_0 - 1$; starting from timeslot $t_0$, each node $n$ makes the same transmitting decision as under $Policy^{*\prime}$ in each timeslot, according to which node $n$ chooses units to transmit with RMIA but without using the partial information accumulated before timeslot $t_0$; in each first-decoding timeslot for node $n$ from timeslot $t_0$ onward, node $n$ makes the forwarding decision based on $\hat{Q}(t_0)$ using the same strategy as $\widetilde{Policy}$.

To prepare for the later proof, we count the first epoch of commodity $c$ that ends in or after timeslot $t_0$ as epoch 1 of commodity $c$. We then define the following variables:

$\phi_n^{(c)}(t_0, t)$: the number of times node $n$ decides to transmit units of commodity $c$ from timeslot $t_0$ to timeslot $t_0 + t - 1$.
$\beta_n^{(c)}(t_0, t)$: the number of epochs for transmitting units of commodity $c$ from node $n$ with RMIA that end within the interval from timeslot $t_0$ to timeslot $t_0 + t - 1$.

$\tau_j^{(c)}$: the index of the subsequence of timeslots used to transmit units of commodity $c$.

$u_{n,i}^{(c)}$: the index of the starting timeslot of epoch $i$ of commodity $c$ in $\{\tau_j^{(c)}\}$.

$X_{nk}^{(c)}(i)$: the random variable that takes value 1 if node $k \in \mathcal{K}_n$ is in the first successful receiver set of epoch $i$ of commodity $c$ with RMIA, and takes value 0 otherwise.

[Footnote 8: The definitions of $\phi_n^{(c)}(t_0,t)$ and $\beta_n^{(c)}(t_0,t)$ also appear in the proof of Lemma 4 shown in Appendix A.11.]

$Z_n^{(c)}(i, Q(\tau))$: the metric over epoch $i$ of commodity $c$ for node $n$ under a policy, based on a CPQ backlog state in timeslot $\tau$, given by

$$Z_n^{(c)}\big(i, Q(\tau)\big) = \sum_{j=u_{n,i}^{(c)}}^{u_{n,i+1}^{(c)}-1} \sum_{k \in \mathcal{K}_n} b_{nk}^{(c)}\big(\tau_j^{(c)}\big)\, \big[ Q_n^{(c)}(\tau) - Q_k^{(c)}(\tau) \big]. \qquad (A.50)$$

Comparing the key metrics under $\widetilde{Policy}^*$ and $Policy^{*\prime}$. Because $\widetilde{Policy}^*$ uses the same forwarding strategy as $\widehat{Policy}$ from timeslot $t_0$ onward, and $Policy^{*\prime}$ has epochs synchronous with those of $\widetilde{Policy}^*$, derivations resembling those in the proof of Lemma 3 (see Appendix A.8) give

$$\sum_n E\big\{ \widetilde{Z}_n^*(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} = \sum_n \sum_c E\Big\{ \frac{1}{t} \sum_{i=1}^{\beta_n^{\prime(c)}(t_0,t)} \max_{k \in \mathcal{K}_n}\big\{ X_{nk}^{\prime(c)}(i)\, \hat{W}_{nk}^{(c)}(t_0) \big\} \,\Big|\, \hat{Q}(t_0) \Big\} \ge \sum_n \sum_c E\Big\{ \frac{1}{t} \sum_{i=1}^{\beta_n^{\prime(c)}(t_0,t)} Z_n^{\prime(c)}\big(i, \hat{Q}(t_0)\big) \,\Big|\, \hat{Q}(t_0) \Big\} = \sum_n E\big\{ Z_n^{\prime}(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\}, \qquad (A.51)$$

where we use the facts that $\widetilde{\beta}_n^{*(c)}(t_0,t) = \beta_n^{\prime(c)}(t_0,t)$, $\widetilde{u}_{n,i}^{*(c)} = u_{n,i}^{\prime(c)}$ and $\widetilde{X}_{nk}^{*(c)}(i) = X_{nk}^{\prime(c)}(i)$.
Comparing the key metrics under $\widetilde{Policy}$ and $\widetilde{Policy}^*$. To facilitate the later proof, first consider the policy set, denoted $\mathcal{Y}$, consisting of policies defined as follows: each is the same as $\widehat{Policy}$ from timeslot 0 to timeslot $t_0 - 1$; from timeslot $t_0$, each node $n$ uses fixed probabilities to choose commodities to transmit with RMIA, without using the partial information accumulated before timeslot $t_0$; in each first-decoding timeslot for node $n$ from timeslot $t_0$ onward, node $n$ makes the forwarding decisions based on $\hat{Q}_n(t_0)$ using the same strategy as $\widehat{Policy}$. Note that both $\widetilde{Policy}$ and $\widetilde{Policy}^*$ belong to $\mathcal{Y}$. Under a policy in $\mathcal{Y}$, the key metric can be expressed as

$$\sum_n E\big\{ Z_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} = \sum_n \sum_c E\Bigg\{ \frac{\phi_n^{(c)}(t_0,t)}{t} \cdot \frac{\beta_n^{(c)}(t_0,t)}{\phi_n^{(c)}(t_0,t)} \cdot \frac{1}{\beta_n^{(c)}(t_0,t)} \sum_{i=1}^{\beta_n^{(c)}(t_0,t)} Z_n^{(c)}\big(i, \hat{Q}(t_0)\big) \,\Bigg|\, \hat{Q}(t_0) \Bigg\}, \qquad (A.52)$$

where we define $0/0 = 0$ for the terms in the above equation. Since the transmission decisions under a policy in $\mathcal{Y}$ are i.i.d. over timeslots, the law of large numbers gives

$$\lim_{t\to\infty} \frac{\phi_n^{(c)}(t_0,t)}{t} = \alpha_n^{(c)} \quad \text{with prob. } 1. \qquad (A.53)$$

Additionally, since $\sum_{i=1}^{\beta_n^{(c)}(t_0,t)} T_n^{(c)}(i) \le \phi_n^{(c)}(t_0,t) < \sum_{i=1}^{\beta_n^{(c)}(t_0,t)+1} T_n^{(c)}(i)$, the law of large numbers gives, if $\alpha_n^{(c)} > 0$,

$$\lim_{t\to\infty} \frac{\phi_n^{(c)}(t_0,t)}{\beta_n^{(c)}(t_0,t)} = E\big\{ T_n^{(c)}(i) \big\} = \frac{1}{\mu_n^{\mathrm{rmia}}} \quad \text{with prob. } 1. \qquad (A.54)$$

Moreover, the value of $Z_n^{(c)}(i, \hat{Q}(t_0))$ under a policy in $\mathcal{Y}$ depends only on $\hat{Q}(t_0)$ and the channel realizations in epoch $i$; therefore $Z_n^{(c)}(i, \hat{Q}(t_0))$ is i.i.d. across epochs given $\hat{Q}(t_0)$, and we have, if $\alpha_n^{(c)} > 0$,

$$\lim_{t\to\infty} \frac{1}{\beta_n^{(c)}(t_0,t)} \sum_{i=1}^{\beta_n^{(c)}(t_0,t)} Z_n^{(c)}\big(i, \hat{Q}(t_0)\big) = z_n^{(c)}\big(\hat{Q}(t_0)\big) \quad \text{with prob. } 1, \qquad (A.55)$$

where $z_n^{(c)}(\hat{Q}(t_0)) = E\{ Z_n^{(c)}(i, \hat{Q}(t_0)) \mid \hat{Q}(t_0) \}$.
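The limits (A.53)-(A.55) combine into a rate of the product form $\alpha \cdot \mu \cdot z$: (fraction of transmitting slots) times (epochs per transmitting slot) times (mean per-epoch metric). A toy Monte Carlo (ours; geometric epoch lengths and uniform rewards stand in for the epoch metric, purely as an assumption for illustration) exhibits this product form:

```python
import random

def renewal_reward_rate(alpha, p, horizon, seed=0):
    """Each slot transmits w.p. `alpha`; in a transmit slot the ongoing
    epoch completes w.p. `p` (geometric lengths, so E{T} = 1/p transmit
    slots per epoch).  A completed epoch pays reward Z ~ U[0, 2], E{Z} = 1.
    Returns total collected reward divided by `horizon`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(horizon):
        if rng.random() < alpha:      # this slot transmits
            if rng.random() < p:      # the epoch completes this slot
                total += rng.uniform(0.0, 2.0)
    return total / horizon

# Expected long-run rate: alpha * (1/E{T}) * E{Z} = alpha * p * 1
rate = renewal_reward_rate(alpha=0.6, p=0.25, horizon=400_000)
```

With $\alpha = 0.6$, $p = 0.25$ and $E\{Z\} = 1$, the empirical rate concentrates near $0.6 \times 0.25 \times 1 = 0.15$, mirroring how $\alpha_n^{(c)} \mu_n^{\mathrm{rmia}} z_n^{(c)}$ arises in (A.56).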
According to (A.52)-(A.55), and incorporating the trivial case $\alpha_n^{(c)} = 0$, it follows from (A.52) that

$$\lim_{t\to\infty} \sum_n E\big\{ Z_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} = \sum_n \sum_c \alpha_n^{(c)} \mu_n^{\mathrm{rmia}}\, z_n^{(c)}\big(\hat{Q}(t_0)\big). \qquad (A.56)$$

With (A.56), given the $\varepsilon$ satisfying $(\lambda_n^{(c)} + \varepsilon) \in \Lambda_{\mathrm{RMIA}}$, under a policy in $\mathcal{Y}$ there exists an integer $D_y$ such that, for all $t_0 \ge 0$, whenever $t \ge D_y$,

$$\bigg| \sum_n E\big\{ Z_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} - \sum_n \sum_c \alpha_n^{(c)} \mu_n^{\mathrm{rmia}}\, z_n^{(c)}\big(\hat{Q}(t_0)\big) \bigg| \le \frac{\varepsilon}{16} \sum_{n,c} \hat{Q}_n^{(c)}(t_0). \qquad (A.57)$$

Since $\sum_c \alpha_n^{(c)} \le 1$, we have

$$\sum_n \sum_c \alpha_n^{(c)} \mu_n^{\mathrm{rmia}}\, z_n^{(c)}\big(\hat{Q}(t_0)\big) \le \sum_n \mu_n^{\mathrm{rmia}} \max_c \big\{ z_n^{(c)}\big(\hat{Q}(t_0)\big) \big\}. \qquad (A.58)$$

Defining $\tilde{c}_n = \arg\max_c \{ z_n^{(c)}(\hat{Q}(t_0)) \}$, (A.58) becomes an equality when node $n$ chooses commodity $\tilde{c}_n$ to transmit from timeslot $t_0$ onward, which is consistent with the strategy of making transmitting decisions under $\widetilde{Policy}$. Under $\widetilde{Policy}$ and $\widetilde{Policy}^*$, define $\widetilde{D}_1 = D_y$ and $\widetilde{D}^* = D_y$ as the respective threshold integers. Choosing $\widetilde{D} = \max\{ \widetilde{D}^*, \widetilde{D}_1 \}$, based on (A.58) and (A.57) we get, for all $t_0 \ge 0$, whenever $t \ge \widetilde{D}$,

$$\sum_n E\big\{ \widetilde{Z}_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} \ge \sum_n E\big\{ \widetilde{Z}_n^*(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} - \frac{\varepsilon}{8} \sum_{n,c} \hat{Q}_n^{(c)}(t_0). \qquad (A.59)$$

Comparing the key metrics under $\widetilde{Policy}$ and $Policy^{*\prime}$. In summary, plugging (A.51) into (A.59), it follows that, for all $t_0 \ge 0$, whenever $t \ge \widetilde{D}$,

$$\sum_n E\big\{ \widetilde{Z}_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} \ge \sum_n E\big\{ Z_n^{\prime}(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} - \frac{\varepsilon}{8} \sum_{n,c} \hat{Q}_n^{(c)}(t_0). \qquad (A.60)$$

[Footnote 9: For any policy in $\mathcal{Y}$, we assume that a forwarding decision is made at each node $n$ in timeslot $t_0 - 1$, and that epoch 1 of each commodity for node $n$ starts in or after timeslot $t_0$.]

A.9.2 Comparison on the key metric between $\widehat{Policy}$ and $\widetilde{Policy}$

Note that each epoch under either $\widehat{Policy}$ or $\widetilde{Policy}$ consists of contiguous timeslots.
In this subsection, for a policy under which each epoch consists of contiguous timeslots, we count the first epoch for node $n$ that ends in or after timeslot $t_0$ as epoch 1 (without specifying the commodity) and denote the starting timeslot of epoch $i$ by $u_{n,i}$. The later proof is facilitated by introducing another intermediate, non-causal policy, $\widetilde{\widetilde{Policy}}$, defined as follows: it is the same as $\widehat{Policy}$ from timeslot 0 to timeslot $\hat{u}_{n,1} - 1$ at each node $n$; starting from timeslot $\hat{u}_{n,1}$, node $n$ keeps transmitting the units of commodity $\tilde{c}_n$ with RMIA; in each first-decoding timeslot for node $n$ from timeslot $t_0$ onward, node $n$ makes the forwarding decision based on $\hat{Q}_n(t_0)$ using the same strategy as $\widehat{Policy}$ (or $\widetilde{Policy}$). Here, note that $\tilde{c}_n$ is decided based on the value $\hat{Q}(t_0)$, which is formed by $\widehat{Policy}$, while $u_{n,1} \le t_0$. With this non-causality feature, $\widetilde{\widetilde{Policy}}$ is non-realizable but is used to facilitate the theoretical analysis.

Comparison between $\widehat{Policy}$ and $\widetilde{\widetilde{Policy}}$. For a policy under which each epoch consists of contiguous timeslots, define $M_n(t_0, t)$ as the minimum number of epochs that covers the time interval $[t_0, t_0 + t - 1]$, i.e.,

$$M_n(t_0, t) = \min\Big\{ m : u_{n,1} + \sum_{i=1}^m T_n(i) - 1 \ge t_0 + t - 1 \Big\}, \qquad (A.61)$$

where $T_n(i)$ is defined as the number of timeslots in the $i$th epoch for node $n$. Additionally, as defined in (2.21), given $\hat{Q}_n(t_0)$, $\widetilde{\widetilde{Z}}_n(i, \hat{Q}(t_0))$ is i.i.d. across epochs under $\widetilde{\widetilde{Policy}}$, and we denote its conditional expectation by

$$E\big\{ \widetilde{\widetilde{Z}}_n\big(i, \hat{Q}(t_0)\big) \,\big|\, \hat{Q}(t_0) \big\} = \widetilde{\widetilde{z}}_n\big(\hat{Q}(t_0)\big). \qquad (A.62)$$

Incorporating $M_n(t_0,t)$ and $\widetilde{\widetilde{z}}_n(\hat{Q}(t_0))$, we propose the following lemma comparing $\widehat{Policy}$ and $\widetilde{\widetilde{Policy}}$:
Lemma 5. For a network satisfying Assumption 1, there exists a positive integer $\hat{D}$ such that, for all $t_0 \ge 0$, whenever $t \ge \hat{D}$, $\widehat{Policy}$ and $\widetilde{\widetilde{Policy}}$ satisfy the following relationship given the backlog state $\hat{Q}(t_0)$:

$$\sum_n E\big\{ \hat{Z}_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} \ge \frac{1}{t} \sum_n \widetilde{\widetilde{z}}_n\big(\hat{Q}(t_0)\big)\, E\big\{ \widetilde{\widetilde{M}}_n(t_0,t) \big\} - \Big[ N C_2(t) + C_1(t) + \frac{\varepsilon}{8} \sum_{n,c} \hat{Q}_n^{(c)}(t_0) \Big], \qquad (A.63)$$

where $C_1(t) = Nt\,(N + A_{\max} + 1)$ and $C_2(t) = t\,(N + A_{\max} + 1)$.

The detailed proof of Lemma 5 is shown in Appendix A.13. Note that the metric under $\widetilde{\widetilde{Policy}}$ on the right-hand side of (A.63) is not exactly the key metric defined earlier, but it is more convenient for the later derivations.

Comparison between $\widetilde{\widetilde{Policy}}$ and $\widetilde{Policy}$. For epoch $i$ under $\widetilde{\widetilde{Policy}}$ and epoch $j$ under $\widetilde{Policy}$, where $1 \le i \le \widetilde{\widetilde{M}}_n(t_0,t)$ and $1 \le j \le \widetilde{M}_n(t_0,t)$, note that $\widetilde{\widetilde{Z}}_n(i, \hat{Q}(t_0))$ and $\widetilde{Z}_n(j, \hat{Q}(t_0))$ are identically distributed, because the channel states are i.i.d. across timeslots and $\widetilde{\widetilde{Policy}}$ and $\widetilde{Policy}$ choose the same commodity to transmit in these two epochs, respectively. Therefore, we get

$$\widetilde{\widetilde{z}}_n\big(\hat{Q}(t_0)\big) = \widetilde{z}_n\big(\hat{Q}(t_0)\big). \qquad (A.64)$$

On the other hand, under the two policies, since $\widetilde{\widetilde{u}}_{n,1} \le t_0 = \widetilde{u}_{n,1}$, with any (common) channel realization we have

$$\widetilde{\widetilde{M}}_n(t_0,t) \le \widetilde{M}_n(t_0,t). \qquad (A.65)$$

Resembling part of the proof of Lemma 5 (see Appendix A.13), and considering that $\widetilde{M}_n(t_0,t) \le t$ because $\widetilde{T}_n(i) \ge 1$, define the following indicator function of the integer $i = 1, 2, \ldots, t$ under $\widetilde{Policy}$:

$$\widetilde{1}_n(i) = \begin{cases} 1, & 1 \le i \le \widetilde{M}_n(t_0,t) \le t, \\ 0, & \widetilde{M}_n(t_0,t) < i \le t. \end{cases} \qquad (A.66)$$

Note that $\widetilde{Z}_n(i, \hat{Q}(t_0))$ and $\widetilde{1}_n(i)$ are independent, because $\{\widetilde{Z}_n(i, \hat{Q}(t_0)) : i \ge 1\}$ are i.i.d. and the value of $\widetilde{1}_n(i)$ depends only on $\widetilde{T}_n(1), \ldots, \widetilde{T}_n(i-1)$.
With (A.64) and (A.65), we have the following relationship:

$$E\Big\{ \sum_{i=1}^{\widetilde{M}_n(t_0,t)} \widetilde{Z}_n\big(i, \hat{Q}(t_0)\big) \,\Big|\, \hat{Q}(t_0) \Big\} = \sum_{i=1}^{t} E\big\{ \widetilde{Z}_n\big(i, \hat{Q}(t_0)\big) \,\big|\, \hat{Q}(t_0) \big\}\, E\big\{ \widetilde{1}_n(i) \big\} = \widetilde{z}_n\big(\hat{Q}(t_0)\big)\, E\big\{ \widetilde{M}_n(t_0,t) \big\} \ge \widetilde{\widetilde{z}}_n\big(\hat{Q}(t_0)\big)\, E\big\{ \widetilde{\widetilde{M}}_n(t_0,t) \big\}, \qquad (A.67)$$

which completes the comparison between $\widetilde{\widetilde{Policy}}$ and $\widetilde{Policy}$.

Comparison between $\widehat{Policy}$ and $\widetilde{Policy}$. Plugging (A.67) back into (A.63), it follows that, for all $t_0 \ge 0$, whenever $t \ge \hat{D}$,

$$\sum_n E\big\{ \hat{Z}_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} \ge \sum_n E\Big\{ \frac{1}{t} \sum_{i=1}^{\widetilde{M}_n(t_0,t)} \widetilde{Z}_n\big(i, \hat{Q}(t_0)\big) \,\Big|\, \hat{Q}(t_0) \Big\} - \Big[ N C_2(t) + C_1(t) + \frac{\varepsilon}{8} \sum_{n,c} \hat{Q}_n^{(c)}(t_0) \Big]$$
$$\ge \sum_n E\big\{ \widetilde{Z}_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} - \Big[ N C_2(t) + C_1(t) + \frac{\varepsilon}{8} \sum_{n,c} \hat{Q}_n^{(c)}(t_0) \Big], \qquad (A.68)$$

which completes the comparison between $\widehat{Policy}$ and $\widetilde{Policy}$.

A.9.3 Strong stability achieved under $\widehat{Policy}$

Combining (A.60) in Subsection A.9.1 and (A.68) in Subsection A.9.2, the comparison on the key backpressure metric between $\widehat{Policy}$ and $Policy^{*\prime}$ is as follows: for all $t_0 \ge 0$, whenever $t \ge \max\{\hat{D}, \widetilde{D}\}$,

$$\sum_n E\big\{ \hat{Z}_n(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} \ge \sum_n E\big\{ Z_n^{\prime}(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} - \Big[ N C_2(t) + C_1(t) + \frac{\varepsilon}{4} \sum_{n,c} \hat{Q}_n^{(c)}(t_0) \Big]. \qquad (A.69)$$

After plugging (A.69) back into (A.48), $\hat{\Delta}_t(\hat{Q}(t_0))$ can be further upper bounded as

$$\hat{\Delta}_t\big(\hat{Q}(t_0)\big) \le B(t) + 2\big[ C_1(t) + N C_2(t) \big] + \frac{\varepsilon}{2} \sum_{n,c} \hat{Q}_n^{(c)}(t_0) - 2\,\Phi\big(\hat{Q}(t_0)\big), \qquad (A.70)$$

where $\Phi(\hat{Q}(t_0))$ is given by

$$\Phi\big(\hat{Q}(t_0)\big) = \sum_n E\big\{ Z_n^{\prime}(\hat{Q}(t_0))\big|_{t_0}^{t_0+t-1} \,\big|\, \hat{Q}(t_0) \big\} - \frac{1}{t} \sum_{n,c} \hat{Q}_n^{(c)}(t_0) \sum_{\tau=t_0}^{t_0+t-1} E\big\{ a_n^{(c)}(\tau) \big\} = \sum_{n,c} \hat{Q}_n^{(c)}(t_0)\, \frac{1}{t} \sum_{\tau=t_0}^{t_0+t-1} E\Big\{ \sum_{k \in \mathcal{K}_n} b_{nk}^{\prime(c)}(\tau) - \sum_{k \in \mathcal{K}_n} b_{kn}^{\prime(c)}(\tau) - a_n^{(c)}(\tau) \Big\}. \qquad (A.71)$$

Since, from timeslot $t_0$ onward, $Policy^{*\prime}$ is the same as the stationary randomized policy $Policy^*$ starting from timeslot 0, according to the derivations from (A.18) to (A.22) in Appendix A.4 there exists a positive integer $D^*$ such that, for all $t_0 \ge 0$, whenever $t \ge D^*$,

$$\Phi\big(\hat{Q}(t_0)\big) = \sum_{n,c} \hat{Q}_n^{(c)}(t_0)\, \frac{1}{t} \sum_{\tau=0}^{t-1} E\Big\{ \sum_{k \in \mathcal{K}_n} b_{nk}^{*(c)}(\tau) - \sum_{k \in \mathcal{K}_n} b_{kn}^{*(c)}(\tau) - a_n^{(c)}(\tau) \Big\} \ge \frac{\varepsilon}{2} \sum_{n,c} \hat{Q}_n^{(c)}(t_0). \qquad (A.72)$$

Plugging (A.72) back into (A.70) and letting $D = t = \max\{ \hat{D}, \widetilde{D}, D^* \}$, we have, for all $t_0 \ge 0$,

$$\hat{\Delta}_D\big(\hat{Q}(t_0)\big) \le B(D) + C(D) - \frac{\varepsilon}{2} \sum_{n,c} \hat{Q}_n^{(c)}(t_0), \qquad (A.73)$$

where $C(D) = 2[C_1(D) + N C_2(D)] = 4ND(N + A_{\max} + 1)$. Thus, given the positive $\varepsilon$ satisfying $(\lambda_n^{(c)} + \varepsilon) \in \Lambda_{\mathrm{RMIA}}$, (A.73) is achieved under $\widehat{Policy}$. According to Lemma 1, we achieve (2.22), which completes the proof.

A.10 Proof of Theorem 6

Given $\varepsilon$ satisfying $(\lambda_n^{(c)} + \varepsilon) \in \Lambda_{\mathrm{RMIA}}$, the goal of the proof is to show that the $t$-timeslot Lyapunov drift under DIVBAR-FMIA, denoted $\widehat{\widehat{Policy}}$, has an upper bound satisfying the condition (2.4) required by Lemma 1. The proof procedure is similar to the proof of Theorem 5 (see Appendix A.9) except for minor modifications. To facilitate the proof, we again introduce the intermediate policies $\widetilde{Policy}$, $Policy^{*\prime}$, $\widetilde{\widetilde{Policy}}$ and $\widetilde{Policy}^*$, which are similar to the ones proposed in Appendix A.9 except for the following modifications: the $Policy^{*\prime}$ in this proof is the same as $\widehat{\widehat{Policy}}$ from timeslot 0 to timeslot $t_0 - 1$; the $\widetilde{Policy}$ in this proof is the same as $\widehat{\widehat{Policy}}$ from timeslot 0 to timeslot $t_0 - 1$, and from timeslot $t_0$ onward the transmitting and forwarding decisions of each epoch are made based on $\hat{\hat{Q}}(t_0)$; the $\widetilde{Policy}^*$ in this proof is the same as $\widehat{\widehat{Policy}}$ from timeslot 0 to timeslot $t_0 - 1$, and from timeslot $t_0$ onward the forwarding decisions of each epoch are made based on $\hat{\hat{Q}}(t_0)$; the $\widetilde{\widetilde{Policy}}$ in this proof is the same as $\widehat{\widehat{Policy}}$ from timeslot 0 to timeslot $\hat{\hat{u}}_{n,1} - 1$ at each node $n$, and from timeslot $\hat{\hat{u}}_{n,1}$ onward the transmitting and forwarding decisions of each epoch are made based on $\hat{\hat{Q}}(t_0)$. To achieve the proof goal, the strategy is to compare the key metric $\sum_n E\{ Z_n(\hat{\hat{Q}}(t_0))|_{t_0}^{t_0+t-1} \mid \hat{\hat{Q}}(t_0) \}$ under the introduced policies.
A.10.1 Comparison on the key metric between ~ Policy andPolicy 0 The comparison between ~ Policy andPolicy 0 on the key metric is the same as that in the proof shown in Appendix A.9.1, except that the backlog coefficient here is ^ ^ Q (t 0 ). The final comparison results is that, for all t 0 0, there exists an integer ~ D> 0, such that, whenevert ~ D, X n E ~ Z n ^ ^ Q (t 0 ) t 0 +t1 t 0 ^ ^ Q (t 0 ) X n E Z 0 n ^ ^ Q (t 0 ) t 0 +t1 t 0 ^ ^ Q (t 0 ) " 8 X n;c ^ ^ Q (c) n (t 0 ): (A.74) A.10.2 Comparison on the key metric between ^ ^ Policy and ~ Policy With the similar strategy as Appendix A.9.2, the comparison on the key backpressure metric between ^ ^ Policy and ~ Policy consists of two steps: compare ^ ^ Policy and ~ ~ Policy and then compare ~ ~ Policy and ~ Policy. In this part of proof, the comparison between ^ ^ Policy and ~ ~ Policy over a single epoch is summarized as Lemma 6 shown as follows: Lemma 6. For each noden in a network satisfying Assumption 1 and for allt 0 0, whenevertT max , we have E n ^ ^ Z n i; ^ ^ Q (u n;i ) ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o E n ~ ~ Z n i; ^ ^ Q (t 0 ) ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o C 2 (t); (A.75) where T max = max n:n2N fEfT n (i)gg; C 2 (t) = t (N +A max + 1); ^ ^ 1 n (i) is equal to ^ 1 n (i) because ^ ^ Policy and ^ Policy have synchronous epochs. 124 The proof of Lemma 6 is shown in Appendix A.14. With (A.75), the remaining proof is the same as in Appendix A.9.2 except changing backlog coefficients to ^ ^ Q (u n;i ) and ^ ^ Q (t 0 ), we have: there exists an integer ^ ^ DT max , such that, for allt 0 0, whenevert ^ ^ D, X n E ^ ^ Z n ^ ^ Q (t 0 ) t 0 +t1 t 0 ^ ^ Q (t 0 ) X n E ~ Z n ^ ^ Q (t 0 ) t 0 +t1 t 0 ^ ^ Q (t 0 ) " NC 2 (t) +C 1 (t) + " 8 X n;c ^ ^ Q (c) n (t 0 ) # ; (A.76) whereC 1 (t) =Nt (N +A max + 1). 
A.10.3 Strong stability achieved under ^ ^ Policy With the similar manipulations as in Appendix A.9.3, it follows from (A.74) and (A.76) that, for allt 0 0 and with the integerD = max n ^ D 2 ; ~ D 2 ;D o , we have ^ ^ D ^ ^ Q (t 0 ) B (D) +C (D) " 2 X n;c ^ ^ Q (c) n (t 0 ); (A.77) whereC (D) = 4ND (N +A max + 1). According to Lemma 1, we achieve (2.23), which completes the proof. A.11 Proof of Lemma 4 To facilitate the proof, we first define the extended-epoch of commodity c for link (n;k) as the interval consisting of contiguous timeslots between two timeslots, in each of which a unit of commodityc is forwarded from noden to nodek. Specifically, this interval starts from the timeslot right after the timeslot when a unit of commodityc is forwarded from noden to nodek, and ends at the timeslot when the next forwarding of a unit of commodityc from noden to nodek happens. Suppose from timeslot 0 up to an arbitrary timeslott 0 ,M 0 1 units of commodityc have been forwarded from noden to nodek, whereM 0 1. Therefore, if we definet (c) nk;i as the starting timeslot of theith extended epoch of commodityc for link (n;k), with which we further definet (c) nk;1 = 0, we have max n t (c) nk;M 0 1; 0 o t 0 <t (c) nk;M 0 +1 1; (A.78) where, ifM 0 > 1,t (c) nk;M 0 1 is the ending timeslot of the (M 0 1)th extended-epoch of commodityc; ifM 0 = 1, timeslott 0 must be located within the 1st extended-epoch of commodityc, and thereforet 0 is lower bounded by 0. Additionally, for each extended epochi of commodityc for link (n;k), we have b (c) nk t (c) nk;i+1 1 = 1. Under a stationary randomized policy with the RMIA transmission scheme, because of the the re- newal operations, and the stationarity of the decision makings and channel states, 1 t P t1 =0 b (c) nk () and 1 t Pt (c) nk;i +t1 =t (c) nk;i b (c) nk () are identically distributed. This property will be used in the following derivations. 
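The identically-distributed-window property used in this proof can be illustrated with a minimal numerical sketch. As a simplification, the forwarding indicator process $b^{(c)}_{nk}(\tau)$ is modeled below as an i.i.d. Bernoulli sequence (a hypothetical stand-in for the renewal structure induced by a stationary randomized policy; the value of $p$ and the window offsets are illustration choices): time averages taken over long windows starting at different slots concentrate around the same mean.

```python
import random

# Stand-in for the forwarding indicator b(tau) under a stationary randomized
# policy: i.i.d. Bernoulli(p) samples (p is a hypothetical illustration value).
random.seed(0)
p = 0.3
T = 200_000
b = [1 if random.random() < p else 0 for _ in range(2 * T)]

def window_average(start, length):
    """Time average of b over the window [start, start + length)."""
    return sum(b[start:start + length]) / length

# Averages over windows with different starting slots t0 stay close to the
# same limit p, mirroring the identically-distributed-window argument.
averages = [window_average(t0, T) for t0 in (0, 1_000, 50_000)]
print(averages)
```

The actual proof does not assume i.i.d. slots; it only uses the fact that windows aligned with renewal (extended-epoch) boundaries are identically distributed, which the i.i.d. toy model satisfies trivially.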
125 On the one hand, ift 0 t (c) nk;M 0 , then P t 0 1 =t (c) nk;M 0 b (c) nk () = 0 and P t 0 +t1 =t (c) nk;M 0 +t b (c) nk () 0, and we have t 0 +t1 X =t 0 b (c) nk () = t (c) nk;M 0 +t1 X =t (c) nk;M 0 b (c) nk () t 0 1 X =t (c) nk;M 0 b (c) nk () + t 0 +t1 X =t (c) nk;M 0 +t b (c) nk () t (c) nk;M 0 +t1 X =t (c) nk;M 0 b (c) nk (); (A.79) where, for any summation term P y =x f (), the summation value is defined as zero wheny<x. Additionally, ift 0 =t (c) nk;M 0 1, sinceb (c) nk t (c) nk;M 0 1 = 1 andb (c) nk t (c) nk;M 0 +t 1 1, we have t 0 +t1 X =t 0 b (c) nk () = t (c) nk;M 0 +t1 X =t (c) nk;M 0 b (c) nk () +b (c) nk t (c) nk;M 0 1 b (c) nk t (c) nk;M 0 +t 1 t (c) nk;M 0 +t1 X =t (c) nk;M 0 b (c) nk (): (A.80) Note that, given any"> 0, there exists an integerD (c) nk;0 > 0, such that, whenevertD (c) nk;0 , we have 1 t t1 X =0 E n b (c) nk () o b (c) nk " 2 : (A.81) Then according to (A.79)-(A.80) together with the fact that 1 t P t1 =0 b (c) nk () and 1 t Pt (c) nk;M 0 +t1 =t (c) nk;M 0 b (c) nk () are identically distributed, it follows that, for allt 0 0, whenevertD (c) nk;0 , 1 t t 0 +t1 X =t 0 E n b (c) nk () o 1 t E 8 > < > : t (c) nk;M 0 +t1 X =t (c) nk;M 0 b (c) nk () 9 > = > ; = 1 t t1 X =0 E n b (c) nk () o b (c) nk " 2 : (A.82) On the other hand, we have P t (c) nk;M 0 +1 1 =t 0 b (c) nk () 2 and P t (c) nk;M 0 +1 +t1 =t 0 +t b (c) nk () 0, and it follows that t 0 +t1 P =t 0 b (c) nk () = t (c) nk;M 0 +1 +t1 P =t (c) nk;M 0 +1 b (c) nk () + t (c) nk;M 0 +1 1 P =t 0 b (c) nk () t (c) nk;M 0 +1 +t1 P =t 0 +t b (c) nk () t (c) nk;M 0 +1 +t1 P =t (c) nk;M 0 +1 b (c) nk () + 2: (A.83) Since 1 t P t1 =0 b (c) nk () and 1 t Pt (c) nk;M 0 +1 +t1 =t (c) nk;M 0 +1 b (c) nk () are identically distributed, for all t 0 0, whenever t max n D (c) nk;0 ;d4/"e o , we have 1 t t 0 +t1 X =t 0 E n b (c) nk () o 1 t E 8 > < > : t (c) nk;M 0 +1 +t1 X =t (c) nk;M 0 +1 b (c) nk () 9 > = > ; + 2 t b (c) nk + " 2 + " 2 =b (c) nk +": (A.84) Combining (A.83) and (A.84), it 
follows that, given any $\varepsilon>0$, for all $t_0\ge 0$, whenever $t\ge\max\big\{D^{(c)}_{nk,0},\lceil 4/\varepsilon\rceil\big\}\triangleq D^{(c)}_{nk}$, we have
$$\left|\frac{1}{t}\sum_{\tau=t_0}^{t_0+t-1}E\big\{b^{(c)}_{nk}(\tau)\big\}-\bar{b}^{(c)}_{nk}\right|\le\varepsilon.\quad\text{(A.85)}$$

A.12 Proof of Lemma 1

Taking the expectation over $Q(t_0)$ on both sides of (2.4), we get
$$\frac{1}{d}\sum_{n,c}\Big[E\big\{Q^{(c)}_n(t_0+d)^2\big\}-E\big\{Q^{(c)}_n(t_0)^2\big\}\Big]\le B(d)-\varepsilon\sum_{n,c}E\big\{Q^{(c)}_n(t_0)\big\}.\quad\text{(A.86)}$$
Writing (A.86) for all timeslots $t_0=0,1,2,\dots,t-1$ and summing the resulting inequalities, it follows that
$$\frac{1}{dt}\sum_{n,c}\sum_{\tau=t}^{d+t-1}E\big\{Q^{(c)}_n(\tau)^2\big\}-\frac{1}{dt}\sum_{n,c}\sum_{\tau=0}^{d-1}E\big\{Q^{(c)}_n(\tau)^2\big\}\le B(d)-\varepsilon\,\frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{n,c}E\big\{Q^{(c)}_n(\tau)\big\}.\quad\text{(A.87)}$$
Dropping the non-negative term $\frac{1}{dt}\sum_{n,c}\sum_{\tau=t}^{d+t-1}E\big\{Q^{(c)}_n(\tau)^2\big\}$ on the left-hand side of the above inequality and letting $t\to\infty$, while noting that
$$\limsup_{t\to\infty}\frac{1}{dt}\sum_{n,c}\sum_{\tau=0}^{d-1}E\big\{Q^{(c)}_n(\tau)^2\big\}=0,\quad\text{(A.88)}$$
strong stability is achieved:
$$\limsup_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{n,c}E\big\{Q^{(c)}_n(\tau)\big\}\le\frac{B(d)}{\varepsilon}.\quad\text{(A.89)}$$

A.13 Proof of Lemma 5

The comparison procedure between $\widehat{\text{Policy}}$ and $\widetilde{\widetilde{\text{Policy}}}$ on their respective metrics involves two aspects: comparing the metrics over a single epoch, and then extending the comparison to multiple epochs. To facilitate the later comparisons, it is first necessary to transform the key metric under $\widehat{\text{Policy}}$ into a form that is easier to manipulate.
First, given the positive" satisfying (c) n +" 2 RMIA , for allt 0 0, whenever8td8/"e = ^ D 1 , we have X n E ^ Z n ^ Q (t 0 ) t 0 +t1 t 0 ^ Q (t 0 ) = X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 ^ Z n i; ^ Q (t 0 ) 1 t u ^ Mn (t 0 ;t)+1 1 X =t 0 +t X c X k:k2Kn ^ b (c) nk () h ^ Q (c) n (t 0 ) ^ Q (c) k (t 0 ) i ^ Q (t 0 ) 9 = ; X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 ^ Z n i; ^ Q (t 0 ) ^ Q (t 0 ) 9 = ; " 8 X n;c ^ Q (c) n (t 0 ) = X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 ^ Z n i; ^ Q (u i ) ^ Q (t 0 ) 9 = ; " 8 X n;c ^ Q (c) n (t 0 ) X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 h ^ Z n i; ^ Q (u i ) ^ Z n i; ^ Q (t 0 ) i ^ Q (t 0 ) 9 = ; ; (A.90) 127 where X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 h ^ Z n i; ^ Q (u i ) ^ Z n i; ^ Q (t 0 ) i ^ Q (t 0 ) 9 = ; = P n E ( 1 t ^ Mn(t 0 ;t) P i=1 u n;i+1 1 P =u n;i P c P k:k2Kn ^ b (c) nk () h ^ Q (c) n (u n;i ) ^ Q (c) n (t 0 ) + ^ Q (c) k (t 0 ) ^ Q (c) k (u n;i ) i ^ Q (t 0 ) ) : (A.91) In (A.91),t 0 <u n;2 <; ;<u n;Mn(t 0 ;t) t 0 +t 1, whileu n;1 t 0 andt 0 u n;1 + 1T n (1). 
Then we have the following relationship: ^ Q (c) n (u n;i ) ^ Q (c) n (t 0 ) u n;i 1 X =t 0 2 4 X k:k2Kn ^ b (c) kn () +a (c) n () 3 5 t (N +A max ); for 2iM n (t 0 ;t) ; (A.92) ^ Q (c) k (t 0 ) ^ Q (c) k (u n;i ) u n;i 1 X =t 0 X j:j2K k ^ b (c) kj ()t; for 2iM n (t 0 ;t); k2K n ; (A.93) ^ Q (c) n (u n;1 ) ^ Q (c) n (t 0 ) t 0 1 X =u n;1 X k:k2Kn ^ b (c) nk () = 0; (A.94) ^ Q (c) k (t 0 ) ^ Q (c) k (u n;1 ) t 0 1 X =u n;1 2 4 X j:j2K k ^ b (c) jk () +a (c) k () 3 5 (N +A max )T n (1); fork2K n : (A.95) SettingtT max = max n:n2N fEfT n (i)gg and incorporating the fact that ^ M n (t 0 ;t)t, we plug (A.92)- (A.95) into (A.91) and get X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 h ^ Z n i; ^ Q (u n;i ) ^ Z n i; ^ Q (t 0 ) i ^ Q (t 0 ) 9 = ; Nt (N +A max + 1) =C 1 (t): (A.96) If denoting ^ D = max n ^ D 1 ;T max o , and plugging (A.96) into (A.90) witht ^ D, we finally get, for allt 0 0, P n E ^ Z n ^ Q (t 0 ) t 0 +t1 t 0 ^ Q (t 0 ) P n E ( 1 t ^ Mn(t 0 ;t) P i=1 ^ Z n i; ^ Q (u n;i ) ^ Q (t 0 ) ) C 1 (t) " 8 P n;c Q (c) n (t 0 ): (A.97) A.13.1 Comparison between ^ Policy and ~ ~ Policy over a single epoch Define the following indicator function of integeri = 1; 2; 3; ;t: ^ 1 n (i) = ( 1; 1i ^ M n (t 0 ;t)t 0; ^ M n (t 0 ;t)<it: (A.98) 128 Since each noden under ^ Policy makes decisions based on the backlog state observation ^ Q (u n;i ) for each epochi, the value of ^ Z n i; ^ Q (u n;i ) is independent of ^ Q (t 0 ) given ^ Q (u n;i ). Consequently, we have E n ^ Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ 1 n (i) = 1 o =E n ^ Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ Q (t 0 ); ^ 1 n (i) = 1 o : (A.99) Additionally, according to Lemma 3, the metricE n Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ 1 n (i) = 1 o is maximized under ^ Policy among all policies within policy setP, to which ~ ~ Policy also belongs. 
Thus, given any ^ Q (t 0 ), we have E n ^ Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ Q (t 0 ); ^ 1 n (i) = 1 o E n ~ ~ Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ Q (t 0 ); ^ 1 n (i) = 1 o =E n ~ ~ Z n i; ^ Q (t 0 ) ^ Q (u n;i ); ^ Q (t 0 ); ^ 1 n (i) = 1 o E n ~ ~ Z n i; ^ Q (t 0 ) ~ ~ Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ Q (t 0 ); ^ 1 n (i) = 1 o : (A.100) where, similar to (A.91)-(A.95) but with the roles ofn andk switched, by settingt ^ D, we have E n ~ ~ Z n i; ^ Q (t 0 ) ~ ~ Z n i; ^ Q (u n;i ) ^ Q (u n;i ); ^ Q (t 0 ); ^ 1 n (i) = 1 o t (N +A max + 1) =C 2 (t) ; (A.101) Plugging (A.101) into (A.100) and then taking expectations on both sides over ^ Q (u n;i ), it follows that, whenevert ^ D, E n ^ Z n i; ^ Q (u n;i ) ^ Q (t 0 ); ^ 1 n (i) = 1 o E n ~ ~ Z n i; ^ Q (t 0 ) ^ Q (t 0 ); ^ 1 n (i) = 1 o C 2 (t); (A.102) which completes the comparison on the key metric over a single epoch between ^ Policy and ~ ~ Policy. A.13.2 Comparison between ^ Policy and ~ ~ Policy over ^ M n (t 0 ;t) (or ~ ~ M n (t 0 ;t)) epochs Starting from (A.97), we rewrite the first term on the right hand side as follows: X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 ^ Z n i; ^ Q (u n;i ) ^ Q (t 0 ) 9 = ; = 1 t X n t X i=1 E n ^ Z n i; ^ Q (u n;i ) ^ 1 n (i) ^ Q (t 0 ) o : (A.103) Considering that, if ^ 1 n (i) = 0, we have E n ^ Z n i; ^ Q (u n;i ) ^ 1 n (i) ^ Q (t 0 ); ^ 1 n (i) = 0 o =E n ~ ~ Z n i; ^ Q (t 0 ) ^ 1 n (i) ^ Q (t 0 ); 1 n (i) = 0 o = 0; (A.104) if 1 n (i) = 1, according to (A.102), we have, for allt 0 0, whenevert ^ D, E n ^ Z n i; ^ Q (u n;i ) ^ 1 n (i) ^ Q (t 0 ); ^ 1 n (i) = 1 o E n ~ ~ Z n i; ^ Q (t 0 ) ^ 1 n (i) ^ Q (t 0 ); 1 n (i) = 1 o C 2 (t): (A.105) In sum of (A.104) and (A.105), it follows that, for allt 0 0, whenevert ^ D, E n ^ Z n i; ^ Q (u n;i ) ^ 1 n (i) ^ Q (t 0 ) o E n ~ ~ Z n i; ^ Q (t 0 ) ^ 1 n (i) ^ Q (t 0 ) o C 2 (t): (A.106) 129 Plug (A.106) into right hand side of (A.103), and it follows that, for allt 0 0, whenevert ^ D, X n E 8 < : 1 t ^ Mn(t 0 ;t) X i=1 ^ Z n i; ^ Q (u 
n;i ) ^ Q (t 0 ) 9 = ; 1 t X n t X i=1 E n ~ ~ Z n i; ^ Q (t 0 ) ^ 1 n (i) ^ Q (t 0 ) o NC 2 (t): (A.107) Note that the value of ^ 1 n (i) only depends onT 0 n;t 0 (1);T n (2); ;T n (i 1), whereT 0 n;t 0 (1) =u n;2 t 0 , therefore ~ ~ Z n i; ^ Q (t 0 ) and ^ 1 n (i) are independent. Moreover, we have ^ M n (t 0 ;t) = ~ ~ M n (t 0 ;t) because ^ Policy and ~ ~ Policy have synchronized epochs. Then it follows from (A.107) that, for allt 0 0, whenever t ^ D, X n E 8 < : 1 t ^ Mn(u n;i ;t) X i=1 ^ Z n i; ^ Q (u n;i ) ^ Q (t 0 ) 9 = ; 1 t X n ~ ~ z n ^ Q (t 0 ) t X i=1 E ^ 1 n (i) NC 2 (t) = 1 t X n ~ ~ z n ^ Q (t 0 ) E n ~ ~ M n (t 0 ;t) o NC 2 (t): (A.108) Finally, going back to (A.97) and plugging (A.108) in, we can get (A.63) and complete the proof. A.14 Proof of Lemma 6 Define ^ ^ X (c) nk (i) under ^ ^ Policy as the random variable that takes value 1 if nodek2K n is in the successful receiver set of epochi for noden when a unit of commodityc is transmitted by noden in epochi with FMIA, and takes value 0 otherwise. Since each noden under ^ ^ Policy makes decisions based on the backlog state observation ^ ^ Q (u n;i ) for each epochi, the value of ^ ^ Z n i; ^ ^ Q (u n;i ) is independent of ^ Q (t 0 ) and ^ ^ 1 (i) given ^ ^ Q (u n;i ). Then, with the similar manipulations as in the proof of Lemma 3 (see Appendix A.8), we get a similar result for ^ ^ Policy shown as follows: E n ^ ^ Z n i; ^ ^ Q (u n;i ) ^ ^ Q (u n;i ); ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o =E max k:k2Kn ^ ^ X ( ^ ^ cn(i)) nk (i) ^ ^ W ( ^ ^ cn(i)) nk (u n;i ) ^ ^ Q (u n;i ); ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1; ^ ^ ( ^ ^ cn(i)) n (i) = 1 E n ^ ^ n (i) ^ ^ Q (u n;i ) o : (A.109) In (A.109), adding ^ Q (t 0 ) and ^ ^ 1 (i) = 1 as part of the given condition is for the convenience of later derivations. 
To facilitate the later proof, we introduce another intermediate policy, which is denoted as ^ Policy 0 (i), i 1, and is defined as follows: it is the same as ^ ^ Policy from timeslot 0 to timeslot ^ ^ u n;i 1 at each noden; starting from timeslot ^ ^ u n;i , noden makes the transmitting and forwarding decisions based on ^ ^ Q n (u n;i ) using the same strategies as under ^ Policy, and the transmissions use the RMIA transmission scheme. Starting from 130 timeslot ^ ^ u n;i , ^ Policy 0 (i) can be treated the same as ^ Policy but with initial CPQ backlog state ^ ^ Q n (u n;i ), and we have ^ c 0 n (i) = ^ ^ c n (i) ; ^ (^ c 0 n (i)) n (i) = ^ 0 n (i) = ^ ^ n (i) = ^ ^ ( ^ ^ cn(i)) n (i): (A.110) Additionally, in epochi for each noden, since FMIA is used in the transmissions under ^ ^ Policy, where the retained partial information is used in the decoding process, while RMIA is used in the transmissions under ^ Policy 0 (i), we have, for any commodityc, ^ ^ X (c) nk (i) ^ X 0 (c) nk (i): (A.111) With (A.110) and (A.111), it follows from (A.109) that E n ^ ^ Z n i; ^ ^ Q (u n;i ) ^ ^ Q (u n;i ); ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o E max k:k2Kn n ^ X 0 (^ c 0 n (i)) nk (i) ^ ^ W (^ c 0 (i)) nk (u n;i ) o ^ ^ Q (u n;i ); ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1; ^ 0 n (i) = 1 E n ^ 0 n (i)j ^ ^ Q (u n;i ) o =E n ^ Z 0 n i; ^ ^ Q (u n;i ) ^ ^ Q (u n;i ); ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o : (A.112) Taking expectations over ^ ^ Q (u n;i ) on both sides of (A.112) yields: E n ^ ^ Z n i; ^ ^ Q (u n;i ) ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o E n ^ Z 0 n i; ^ ^ Q (u n;i ) ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o (A.113) According to the definition of ^ Policy 0 (i), we follow the similar manipulations as in the proof of Lemma 5 (see Appendix A.13.1) and get, for allt 0 0, whenevertT max , E n ^ Z 0 n i; ^ ^ Q (u n;i ) ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o E n ~ ~ Z n i; ^ ^ Q (t 0 ) ^ ^ Q (t 0 ); ^ ^ 1 n (i) = 1 o C 2 (t): (A.114) Plugging (A.114) into (A.113) yields (A.75), which completes the proof. 
131 Appendix B Proofs in Chapter 4 B.1 Proof of Theorem 7 We prove Theorem 7 by separately proving necessary and sufficient conditions. B.1.1 Proof of Necessity We prove that constraints (5.11)-(5.17) are required for cloud network stability and thath given in (5.19) is the minimum achievable cost by any stabilizing policy. Consider an input rate matrix2 (G; ). Then, there exists a stabilizing policy that supports. We define the following quantities for this stabilizing policy: X (d;;m) i (t): the number of packets of commodity (d;;m) exogenously arriving at nodei, that got delivered within the firstt timeslots F (d;;m) i;pr (t) andF (d;;m) pr;i (t): the number of packets of commodity (d;;m) input to and output from the processing unit of nodei, that got delivered within the firstt timeslots, respectively; F (d;;m) ij (t): the number of packets of commodity (d;;m) transmitted through link (i;j), that got delivered within the firstt timeslots where we say that a packet of commodity (d;;m) got delivered within the firstt timeslots, if it got processed by functionsf(;m + 1);:::; (;M )g and the resulting packet of the final commodity (d;;M ) exited the network at destinationd within the firstt timeslots. The above quantities satisfy the following conservation law: X j2 (i) F (d;;m) ji (t) +F (d;;m) pr;i (t) +X (d;;m) i (t) = X j2 + (i) F (d;;m) ij (t) +F (d;;m) i;pr (t); (B.1) for all nodes and commodities, except for the final commodities at their respective destinations. 
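The conservation law (B.1) can be checked on a toy example. The sketch below (node names, commodity labels, and packet counts are all hypothetical) routes K delivered packets along the path 1→2→3, processes them through a single service function at node 2, and verifies that, at every node and commodity other than the final commodity at its destination, total arrivals equal total departures.

```python
from collections import defaultdict

# Counters indexed as in (B.1): F[(i, j, m)] link flows, Fin[(i, m)] packets
# entering node i's processing unit, Fout[(i, m)] packets exiting it, and
# X[(i, m)] exogenous arrivals. Commodity index m counts completed functions;
# the one-function service and the 1->2->3 path are hypothetical.
F, Fin, Fout, X = (defaultdict(int) for _ in range(4))
K, dest = 10, 3

for _ in range(K):                 # route each delivered packet
    X[(1, 0)] += 1                 # exogenous arrival of commodity 0 at node 1
    F[(1, 2, 0)] += 1              # transmit over link (1, 2)
    Fin[(2, 0)] += 1               # enter the processor at node 2 ...
    Fout[(2, 1)] += 1              # ... and exit as commodity 1
    F[(2, 3, 1)] += 1              # transmit the final commodity to node 3

def balanced(i, m):
    """Conservation law (B.1) at node i for commodity m."""
    inflow = sum(F[(j, i, m)] for j in (1, 2, 3)) + Fout[(i, m)] + X[(i, m)]
    outflow = sum(F[(i, j, m)] for j in (1, 2, 3)) + Fin[(i, m)]
    return inflow == outflow

checks = [balanced(i, m) for i in (1, 2, 3) for m in (0, 1)
          if not (i == dest and m == 1)]  # final commodity at dest is exempt
print(all(checks))
```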
Furthermore, we define: i;k (t): the number of timeslots within the firstt timelots, in whichk processing resource units were allocated at nodei 133 (d;;m) i;k (t): the number of packets of commodity (d;;m) processed by nodei during the i;k (t) timeslots in whichk processing resource units were allocated ij;k (t): the number of timeslots within the firstt timeslots, in whichk transmission resource units were allocated at link (i;j) (d;;m) ij;k (t): the number of packets of commodity (d;;m) transmitted over link (i;j) during the ij;k (t) timeslots in whichk transmission resource units were allocated It then follows that F (d;;m) i;pr (t) t i;k (t) t (d;;m) i;k (t)r (;m+1) i;k (t)C i;k C i;k r (;m+1) ; 8i;d;;m<M ; (B.2) F (d;;m) ij (t) t ij;k (t) t (d;;m) ij;k (t) ij;k (t)C ij;k C ij;k ; 8(i;j);d;;m; (B.3) where we define 0=0 = 1 in case of zero denominator terms. Note that, for allt, we have 0 i;k (t) t 1; 0 (d;;m) i;k (t)r (;m+1) i;k (t)C i;k 1; (B.4) 0 ij;k (t) t 1; 0 (d;;m) ij;k (t) ij;k (t)C ij;k 1: (B.5) In addition, leth represent the liminf of the average cost achieved by this policy: h, lim inf t!1 1 t X t1 =0 h (): (B.6) Then, due to Boltzano-Weierstrass theorem [71] on a compact set, there exists an infinite subsequence ft u gftg such that lim tu!1 1 t u X tu1 =0 h () =h; (B.7) the left hand of (B.2) and (B.3) converge tof (d;;m) i;pr andf (d;;m) ij : lim tu!1 F (d;;m) i;pr (t u ) t u =f (d;;m) i;pr ; lim tu!1 F (d;;m) ij (t u ) t u =f (d;;m) ij ; (B.8) and the terms in (B.4) and (B.5) converge to i;k , i;k , ij;k , and ij;k : lim tu!1 i;k (t u ) t u = i;k ; lim tu!1 (d;;m) i;k (t u )r (;m+1) i;k (t u )C i;k = i;k ; (B.9) lim tu!1 ij;k (t u ) t u = ij;k ; lim tu!1 (d;;m) ij;k (t u ) ij;k (t)C ij;k = ij;k : (B.10) from which (4.6g) and (4.6h) follow. 
134 Plugging (B.8), (B.9), and (B.10) respectively back into (B.2) and (B.3), lettingt u !1 yields f (d;;m) i;pr 1 r (;m+1) i;k (d;;m) i;k C i;k ; (B.11) f (d;;m) ij ij;k (d;;m) ij;k C ij;k ; (B.12) from which (4.6c) and (4.6d) follow. Furthermore, due to cloud network stability, we have lim t!1 P t =0 a (d;;m) i (t) t = lim t!1 X (d;;m) i (t) t = (d;;m) i ; w.p.1;8i;d;;m; (B.13) lim tu!1 F (d;;m) pr;i (t u ) t u = lim tu!1 (;m) F (d;;m1) i;pr (t u ) t u = (;m) f (d;;m1) i;pr ,f (d;;m) pr;i ; w.p.1;8i;d;;m; (B.14) from which (5.12) follows. Evaluating (B.1) inft u g, dividing byt u , sendingt u to1, and using (B.8), (B.13), and (B.14), then Eq. (4.6a) follows. Finally, from (6.1a), and using the quantities defined at the beginning of this section, we have 1 t u X tu1 =0 h () = X i X k2K i 2 4 i;k (t u )w i;k t u + X (d;;m) r (;m+1) (d;;m) i;k (t u )e i t u 3 5 + X (i;j) X k2K ij 2 4 ij;k (t u )w ij;k t u + X (d;;m) (d;;m) ij;k (t u )e ij t u 3 5 = X i X k2K i 2 4 i;k (t u )w i;k t u + i;k (t u ) t u X (d;;m) r (;m+1) (d;;m) i;k (t u )C i;k e i i;k (t u )C i;k 3 5 + X (i;j) X k2K ij 2 4 ij;k (t u )w ij;k t u + ij;k (t u ) t u X (d;;m) (d;;m) ij;k (t u )C ij;k e ij ij;k (t u )C ij;k 3 5 : (B.15) Lettingt u !1, we obtain (4.8). Finally, (4.7) follows from taking the minimum over all stabilizing policies. 
B.1.2 Proof of Sufficiency Given an input rate matrix,f (d;;m) i g, if there exits a constant> 0 such that input ratef (d;;m) i +g, together with probability values ij;k , i;k , (d;;m) ij;k , (d;;m) i;k , and flow variablesf (d;;m) ij ,f (d;;m) i;pr , satisfy (4.6a)-(4.6h), we can construct a stationary randomized policy that uses these probabilities to make scheduling decisions, which yields the mean rates: E n (d;;m) i;pr (t) o =f (d;;m) i;pr ; E n (d;;m) ij (t) o =f (d;;m) ij : (B.16) 135 Plugging (B.16) andf (d;;m) i +g in (4.6a), we have E 8 < : X j2 + (i) (d;;m) ij (t)+ (d;;m) i;pr (t) X j2 (i) (d;;m) ji (t) (d;;m+1) (d;;m) i;pr (t)a (d;;m)(t) i 9 = ; : (B.17) By applying standard Lyapunov drift manipulations [44], we upper bound the Lyapunov drift (Q (t)) (see Sec. 4.4.1) as (Q(t))NB 0 + P (d;;m);i Q (d;;m) i (t)E n P j2 (i) (d;;m) ji (t) + (d;;m) pr;i (t) P j2 + (i) (d;;m) ij (t)+ (d;;m) pr;i (t)+a (d;;m) i (t) o NB 0 P (d;;m);i Q (d;;m) i (t); (B.18) whereB 0 is a constant that depends on the system parameters. With some additional manipulations, it follows from (B.18) that the cloud network is strongly stable, i.e., the total mean average backlog is upper bounded. Therefore,f (d;;m) i g is interior to (G; ) (due to the existence of). B.2 Proof of Theorem 8 We prove Theorem 8 for each DCNC algorithm by manipulating the linear termZ(t) and the quadratic term (t) in the LDP upper bound expression given in (4.11). B.2.1 DCNC-L We upper bound (t) in (4.11) as follows: (t) 1 2 N h max C max tr + C max pr r min 2 + max C max tr + max C max pr r min +A max 2 i ,NB 0 : (B.19) Plugging (B.19) into (4.11) yields (Q (t)) +VEfh(t)jQ (t)gNB 0 EfVh(t)+Z(t)jQ (t)g+ X (d;;m);i (d;;m) i Q (d;;m) i (t): (B.20) Since,f (d;;m) i g is interior to (G; ), there exists a positive number such that +12 . According to (4.12), DCNC-L minimizesVh(t) +Z(t) among all policies subject to (4.4d)-(4.4g). 
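The role of the negative drift term in a bound of the form (B.18) can be illustrated numerically with a minimal single-queue sketch (the arrival and service probabilities are hypothetical illustration values, not parameters from the model): when the mean service rate exceeds the mean arrival rate by a margin ε, the simulated time-average backlog stays below the Lyapunov bound B/ε.

```python
import random

# Single discrete-time queue: Q(t+1) = max(Q(t) - s(t), 0) + a(t), with
# Bernoulli arrivals and services (rates are hypothetical).
random.seed(1)
lam, mu = 0.3, 0.6          # arrival and service probabilities
eps = mu - lam              # drift margin, as in Delta(Q) <= B - eps * Q
B = 0.5 * (1**2 + 1**2)     # bound on (a^2 + s^2)/2, since a, s <= 1 here

T, Q, total = 100_000, 0, 0
for _ in range(T):
    a = 1 if random.random() < lam else 0
    s = 1 if random.random() < mu else 0
    Q = max(Q - s, 0) + a
    total += Q

avg_backlog = total / T
print(avg_backlog, B / eps)  # time-average backlog vs. Lyapunov bound B/eps
```

The bound B/ε is loose for this toy queue, but the comparison mirrors the logic of the sufficiency argument: a uniformly negative conditional drift forces the time-average backlog to remain bounded.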
We use to identify the stationary randomized policy that supports +1 and achieves average costh ( +1), characterized by Theorem 7. The LDP function under DCNC-L can be further upper bounded as (Q (t)) +VEfh(t)jQ (t)gNB 0 +EfVh +Z (t)jQ(t)g+ P (d;;m);i (d;;m) i Q (d;;m) i (t) =NB 0 +Vh ( +1) + P (d;;m);i Q (d;;m) i (t) " P j2 (i) f (d;;m) ji +f (d;;m) pr;i P j2 + (i) f (d;;m) ij +f (d;;m) pr;i + (d;;m) i # NB 0 +Vh ( +1) P (d;;m);i Q (d;;m) i (t): (B.21) 136 where the last inequality holds true due to (4.4b). B.2.2 DCNC-Q We extract the quadratic terms ( (d;;m) ij (t)) 2 and ( (d;;m) i;pr (t)) 2 by decomposing (t) as follows: (t) = tr (t) + pr (t) + 0 (t); (B.22) where tr (t), X (i;j) X (d;;m) (d;;m) ij (t) 2 pr (t), 1 2 X (d;;m);i (d;;m) i;pr (t) 2 + (d;;m) pr;i (t) 2 ; 0 (t), X (d;;m);i (d;;m) i;pr (t) X j2(i) (d;;m) ij (t)+ X j;v:j;v2(i);v6=j (d;;m) ij (t) (d;;m) iv (t)+ (d;;m) pr;i (t) X j2(i) (d;;m) ji (t) + X j;v:j;v2(i);v6=j (d;;m) ji (t) (d;;m) vi (t) + 1 2 a (d;;m) i 2 + a (d;;m) i (t) X j2(i) (d;;m) ji (t) + (d;;m) pr;i (t) : According to (4.13), DCNC-Q minimizes the metric tr (t) + pr (t) +Z(t) +Vh(t) among all policies subject to (4.4d)-(4.4g). Hence, the LDP function under DCNC-Q can be further upper bounded as (Q (t)) +VEfh(t)jQ (t)gE 0 (t)+ tr + pr Q(t) +Vh ( +1) +EfZ (t)jQ(t)g+ P (d;;m);i (d;;m) i Q (d;;m) i (t): (B.23) On the other hand, note that 0 (t) + tr + pr N h (1 + max )C max pr max C max tr r min + ( max 1) max (C max tr ) 2 +A max max C max tr + max C max pr r min + 1 2 (A max ) 2 i +N max (C max tr ) 2 + 1 2(r min ) 2 N C max pr 2 h 1 + ( max ) 2 i = 1 2 N h max C max tr + C max pr r min 2 + max C max tr + max C max pr r min +A max 2 i =NB 0 : (B.24) Plugging (B.24) into (B.23) yield (Q (t)) +VEfh(t)jQ (t)gNB 0 +h ( +1) X (d;;m);i Q (d;;m) i (t): (B.25) B.2.3 EDCNC-L and EDCNC-Q Using (4.14) in (4.1), and following standard LDP manipulations (see Ref. 
[45]), the LDP function can be upper bounded as follows: (Q (t)) +VEfh(t)jQ (t)g(t) +NB 0 +h ( +1) X (d;;m);i ^ Q (d;;m) i (t); (B.26) 137 where (t), P (d;;m);i Y (d;;m) i E n P j2 (i) (d;;m) ji (t)+a (d;;m) i (t) + (d;;m) pr;i (t) P j2 + (i) (d;;m) ij (t) (d;;m) i;pr (t) Q (t) o : (B.27) DenoteY max , max i;(d;;m) Y (d;;m) i , which satisfiesY max max i;j fH i;j gN 1. Then, following (B.27), we lower bound (t) as (t)N max C max tr + C max pr r min ,NB : (B.28) Plugging (B.28) into (B.26) and using ^ Q (d;;m) i (t)Q (d;;m) i (t) yields (Q (t)) +VEfh(t)jQ (t)gNB 1 +h ( +1) X (d;;m);i Q (d;;m) i (t); (B.29) whereB 1 ,B 0 +B . B.2.4 Network Stability and Average Cost Convergence with Probability 1 We can use the theoretical result in [72] for the proof of network stability and average cost convergence with probability 1. Note that the following bounding conditions are satisfied in the cloud network system: The second momentEf(h(t)) 2 g is upper bounded by ( P ij w ij;K ij + P i w i;K i ) 2 and therefore satisfies X 1 =0 E n (h ()) 2 o. <1: (B.30) Efh(t)jQ(t)g is lower bounded as Efh(t)jQ(t)g 0: (B.31) For alli, (d;;m), andt, the conditional fourth moment of backlog dynamics satisfies E h Q (d;;m) i (t + 1)Q (d;;m) i (t) i 4 Q(t) max C max tr + max C max pr r min +A max 4 <1: (B.32) With (B.30)-(B.32), based on the derivations in [72], Eq. (B.21), (B.25), and (B.29) lead to network stability (4.17) and average cost (4.16) convergence with probability 1 under DCNC-L, DCNC-Q, EDCNC-L(Q), respectively. B.3 Proof of Theorem 9 Let’s first prove Eq. (4.21). To this end denoteh(t), 1 t P t1 =0 Efh()g. Then, under the DCNC policy and after some algebraic manipulations similar to the ones used for (B.21), we upper bound the LDP function as follows: (Q(t)) +VEfh(t)jQ(t)gNB +Vh (); (B.33) 138 whereh () is the minimum average cost given. 
Taking the expectation over $Q(t)$ on both sides of (B.33) and summing over $\tau=0,\dots,t-1$ further yields
$$\frac{1}{2t}\Big[E\big\{\|Q(t)\|^2\big\}-E\big\{\|Q(0)\|^2\big\}\Big]\le NB+V\big[\bar{h}^*(\lambda)-\bar{h}(t)\big].\quad\text{(B.34)}$$
Then it follows that, by setting $V=1/\epsilon$ and for all $t\ge 1$,
$$\bar{h}(t)-\bar{h}^*(\lambda)\le\frac{NB}{V}+\frac{1}{2Vt}E\big\{\|Q(0)\|^2\big\}\le\Big[NB+\tfrac{1}{2}E\big\{\|Q(0)\|^2\big\}\Big]\epsilon,\quad\text{(B.35)}$$
which proves (4.21).

In order to prove (4.22), we first introduce the following quantities for an arbitrary policy:
$y(t)$: the vector of elements $y_i(t)$ and $y_{ij}(t)$;
$\tilde{\mu}(t)$: the vector of actual flow rate elements $\tilde{\mu}^{(d,\phi,m)}_{i,\mathrm{pr}}(t)$, $\tilde{\mu}^{(d,\phi,m)}_{\mathrm{pr},i}(t)$, and $\tilde{\mu}^{(d,\phi,m)}_{ij}(t)$;
$\tilde{x}(t)$: the vector $[y(t);\tilde{\mu}(t)]$;
$f(t)$: the vector of elements $f^{(d,\phi,m)}_i(t)$ as in (4.20).

Summing both sides of (4.20) over $\tau=0,1,\dots,t-1$ and then dividing by $t$ yields, for all $i,d,\phi,m$ and $t\ge 1$,
$$\bar{f}(t)\triangleq\frac{1}{t}\sum_{\tau=0}^{t-1}E\{f(\tau)\}=\frac{1}{t}E\{Q(t)-Q(0)\}.\quad\text{(B.36)}$$

Lemma B.3.1. If $\lambda$ is interior to $\Lambda(G,\Phi)$, there exists a constant vector $\gamma$ such that
$$\bar{h}^*(\lambda)-\bar{h}(t)\le\gamma^{\dagger}\bar{f}(t).\quad\text{(B.37)}$$
Proof. See Appendix B.4 in the supplementary material.

Plugging (B.36) into (B.37) yields
$$\bar{h}^*(\lambda)-\bar{h}(t)\le\frac{1}{t}\gamma^{\dagger}E\{Q(t)-Q(0)\}\le\frac{1}{t}\|\gamma\|\big(\|E\{Q(t)\}\|+\|E\{Q(0)\}\|\big).\quad\text{(B.38)}$$
Under the DCNC policy, by further plugging (B.38) into the right-hand side of (B.34), we have
$$\frac{1}{2t}\Big[E\big\{\|Q(t)\|^2\big\}-E\big\{\|Q(0)\|^2\big\}\Big]\le NB+\frac{V\|\gamma\|}{t}\big(\|E\{Q(t)\}\|+\|E\{Q(0)\}\|\big).\quad\text{(B.39)}$$
By using the fact $E\{\|Q(t)\|^2\}\ge\|E\{Q(t)\}\|^2$ in (B.39), it follows that
$$\|E\{Q(t)\}\|^2-2V\|\gamma\|\,\|E\{Q(t)\}\|-2NBt-E\big\{\|Q(0)\|^2\big\}-2V\|\gamma\|\,\|E\{Q(0)\}\|\le 0.\quad\text{(B.40)}$$
The largest value of $\|E\{Q(t)\}\|$ that satisfies (B.40) is given by
$$\|E\{Q(t)\}\|\le V\|\gamma\|+\sqrt{V^2\|\gamma\|^2+2NBt+E\{\|Q(0)\|^2\}+2V\|\gamma\|\,\|E\{Q(0)\}\|}\le V\|\gamma\|+\sqrt{\big(V\|\gamma\|+E\{\|Q(0)\|\}\big)^2+2NBt+\mathrm{var}\{\|Q(0)\|\}}.\quad\text{(B.41)}$$
Finally, by setting $V=1/\epsilon$ and $t=1/\epsilon^2$, we plug (B.41) back into (B.36) and obtain
$$\frac{1}{t}\sum_{\tau=0}^{t-1}E\big\{f^{(d,\phi,m)}_i(\tau)\big\}\le\frac{1}{t}\big(\|E\{Q(t)\}\|+\|E\{Q(0)\}\|\big)\le\frac{V\|\gamma\|}{t}+\frac{\|E\{Q(0)\}\|}{t}+\frac{1}{t}\sqrt{\big(V\|\gamma\|+E\{\|Q(0)\|\}\big)^2+2NBt+\mathrm{var}\{\|Q(0)\|\}}\le\Big[2E\{\|Q(0)\|\}+\sqrt{\mathrm{var}\{\|Q(0)\|\}}\Big]\epsilon^2+\Big[\sqrt{2NB}+2\|\gamma\|\Big]\epsilon,$$
which proves (4.22).
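The step from (B.40) to (B.41) is simply the largest root of a quadratic in the scalar q = ‖E{Q(t)}‖. A small numeric sketch (the coefficients a and c standing in for V‖γ‖ and for 2NBt + E{‖Q(0)‖²} + 2V‖γ‖‖E{Q(0)}‖ are hypothetical values) confirms that q ≤ a + sqrt(a² + c) characterizes exactly the solutions of q² − 2aq − c ≤ 0.

```python
import math

# Quadratic inequality q^2 - 2*a*q - c <= 0 from (B.40), with
# a = V*||gamma|| and c = 2*N*B*t + E{||Q(0)||^2} + 2*V*||gamma||*||E{Q(0)}||.
# The values below are hypothetical illustration numbers.
a, c = 5.0, 40.0
q_max = a + math.sqrt(a * a + c)   # largest root, as in (B.41)

def g(q):
    """Left-hand side of the quadratic inequality."""
    return q * q - 2 * a * q - c

# The inequality holds up to q_max and fails just beyond it.
print(g(q_max), g(q_max + 1e-3) > 0, g(0.5 * q_max) < 0)
```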
B.4 Proof of Lemma B.3.1

Given an arbitrary policy, define:
$\mu(t)$: the vector of assigned flow rate elements $\mu^{(d,\phi,m)}_{i,\mathrm{pr}}(t)$, $\mu^{(d,\phi,m)}_{\mathrm{pr},i}(t)$, and $\mu^{(d,\phi,m)}_{ij}(t)$;
$x(t)$: the vector $[y(t);\mu(t)]$.

With a slight abuse of notation, denote $h(x(t))\triangleq h(t)$ and $f(\tilde{x}(t))\triangleq f(t)$. In addition, let $X$ represent the set of all possible vectors $x(t)$ that satisfy the constraints (4.4c)-(4.4g); note that $\tilde{x}(t)$ also belongs to $X$. Furthermore, let $\bar{X}$ represent the convex hull of $X$. Then, for all vectors $x\in\bar{X}$, the following conditions are satisfied:
1. $h(x)-\bar{h}^*(\lambda)$ and $f(x)$ are convex for all $x\in\bar{X}$;
2. $h(x)-\bar{h}^*(\lambda)\ge 0$ for all $x\in\bar{X}$ with $f(x)\preceq 0$;
3. there exists $\hat{x}\in\bar{X}$ with $f(\hat{x})\prec 0$, given $\lambda$ interior to $\Lambda(G,\Phi)$.

Item 2) above results immediately from Theorem 7, where any $x\in\bar{X}$ with $f(x)\preceq 0$ can be treated as the $E\{x(t)\}$ under a stabilizing stationary randomized policy. Hence, according to Farkas' Lemma [73], there exists a constant vector $\gamma\succeq 0$ such that
$$h(x)-\bar{h}^*(\lambda)+\gamma^{\dagger}f(x)\ge 0,\qquad\forall x\in\bar{X}.\quad\text{(B.42)}$$
Evaluating (B.42) at $\tilde{x}(\tau)$ for $\tau=0,\dots,t-1$, we have
$$\frac{1}{t}\sum_{\tau=0}^{t-1}h(\tilde{x}(\tau))-\bar{h}^*(\lambda)+\frac{1}{t}\gamma^{\dagger}\sum_{\tau=0}^{t-1}f(\tilde{x}(\tau))\ge 0,\quad\text{(B.43)}$$
from which it follows that
$$\gamma^{\dagger}\bar{f}(t)=\frac{1}{t}\gamma^{\dagger}\sum_{\tau=0}^{t-1}E\{f(\tilde{x}(\tau))\}\ge\bar{h}^*(\lambda)-\frac{1}{t}\sum_{\tau=0}^{t-1}E\{h(\tilde{x}(\tau))\}\overset{(a)}{\ge}\bar{h}^*(\lambda)-\frac{1}{t}\sum_{\tau=0}^{t-1}E\{h(x(\tau))\}=\bar{h}^*(\lambda)-\bar{h}(t),\quad\text{(B.44)}$$
where inequality (a) is due to $h(\tilde{x}(t))\le h(x(t))$, which results from the fact that $\tilde{\mu}(t)\preceq\mu(t)$.

Appendix C
Proofs in Chapter 5

C.1 Proof of Theorem 10

C.1.1 Proof of Necessity

We prove that (5.11)-(5.18) are necessary for the stability of the wireless computing network, and that the minimum average cost is characterized by (5.19) and (5.20). We first define an elementary unit for the information of each commodity, assuming arbitrarily fine granularity: letting $\rho^{(m)}$ denote the length (measured in information units) of an elementary unit of commodity $(d,m)$, we take the limit $\rho^{(m)}\to 0$. Each elementary unit is indivisible in transmission and processing, so that an integral number of elementary units is transmitted or processed in each transmission or processing attempt.
Due to the broadcast nature of the wireless channel, a generic policy may create multiple copies of a given elementary unit, i.e., by allowing multiple successful RXs to keep the decoded elementary units after each transmission attempt. Due to the nature of AgI services, an elementary unit of commodity (d;m) becomes an elementary unit of commodity (d;m + 1) as it gets processed through service functionm + 1 with the lengths scaled by (m+1) : (m+1) = (m+1) (m) , form<M. In the following, we say that two elementary units belong to the same family if they result from the transmission, duplication, and/or processing of the same source elementary unit. Let us assume that when an elementary unit of final commodity (d;M) gets delivered to destination node d, all other elementary units belonging to the same family are immediately discarded by the network nodes – an ideal assumption for traffic reduction of algorithms with multi-copy routing. In particular, if multiple elementary units of final commodity (d;M) belonging to the same family arrive at destination noded at the same time, then only one unit is delivered while all others are discarded. We defineI (d;m) (t) as the set of of elementary units of commodity (d;m) that, after going through the sequence of service functionsfm + 1;m + 2;:::;Mg, are delivered to destinationd within the firstt 141 timeslots. Suppose there exists an algorithm that stabilizes the wireless computing network, possibly allowing multiple copies of a given elementary unit flowing through the network. Under this algorithm, define I (d;m) i (t): the number of elementary units withinI (d;m) (t) that exogenously enter nodei; I (d;m) i;pr (t) andI (d;m) pr;i (t): the number of elementary units withinI (d;m) (t) that enter/exit the processing unit of nodei; I (d;m) ij (t): the number of times the elementary units withinI (d;m) (t) flow over link (i;j). 
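The per-function length scaling described above composes multiplicatively along the service chain: an elementary unit of commodity (d, 0) that passes through functions 1, …, M ends with length equal to its initial length times the product of the per-function scaling factors. A minimal sketch (the scaling factors and initial length are hypothetical illustration values):

```python
from math import prod

# Hypothetical per-function scaling factors xi(m) for a 3-function service:
# function m turns a unit of commodity (d, m-1) into one of commodity (d, m)
# with its length multiplied by xi[m].
xi = [2.0, 0.5, 3.0]
rho0 = 1.0                          # initial elementary-unit length (hypothetical)

lengths = [rho0]
for factor in xi:                   # rho(m) = xi(m) * rho(m-1)
    lengths.append(lengths[-1] * factor)

print(lengths[-1], rho0 * prod(xi))  # final length equals the product form
```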
Since the algorithm stabilizes the network, we have, with probability 1, lim t!1 P t =0 a (d;m) i (t) t = lim t!1 I (d;m) i (t) (m) t = (d;m) i ; 8i; (d;m): (C.1) Moreover, the total number of arrivals (both exogenously and endogenously) to nodei of elementary units withinI (d;m) (t) must be equal to the number of departures from nodei of elementary units withinI (d;m) (t). Therefore, we have, fori6=d orm<M, X j;j6=i I (d;m) ji (t) +I (d;m) pr;i (t) +I (d;m) i (t) = X j;j6=i I (d;m) ij (t) +I (d;m) i;pr (t); (C.2) and, form<M and for alli andd, I (d;m+1) pr;i (t) =I (d;m) i;pr (t): (C.3) Furthermore, on the one hand, define the following variables for transmission: T (s;t): the total number of timeslots when the network state iss during the firstt timeslots; ~ tr i;k (~ s;t): the total number of timeslots when nodei allocatesk resource units for transmission, while the previous network state—CSI feedback in the previous timeslot, is ~ s, during the firstt timeslots; ~ (d;m) i;tr (~ s;k;t): the accumulated time (in unit of timeslots) of transmitting the elementary units within I (d;m) (t), while nodei allocatesk resource units for transmission, and the previous network state is ~ s, during the firstt timeslots; (d;m) i;s (~ s;k;t): the accumulated time of transmitting elementary units inI (d;m) when the network state iss, while the network state in the previous timeslot is ~ s, and nodei allocatesk resource units for transmission, during the firstt timeslots; (d;m) i;n;tr (~ s;k;s;t): the total number of times an elementary unit withinI (d;m) (t) are transmitted by node i withk transmission resource units allocated and fall into then-th partition, while the network state is s and the previous network state is ~ s, during the firstt timeslots; ~ (d;m) ij (~ s;k;s;n;t): the total number of times an elementary unit withinI (d;m) (t), transmitted by nodei withk transmission resource units allocated, is retained by nodej while belonging to then-th partition, when the 
network state is $s$ and the previous network state is $\tilde s$, during the first $t$ timeslots.
Based on the above definitions and the transmission constraints, we have the following relations:
$\frac{\tilde\tau_{i,k}^{tr}(\tilde s,t)}{T_{\tilde s}(t)} \ge 0, \quad \sum_{k=0}^{K_i^{tr}} \frac{\tilde\tau_{i,k}^{tr}(\tilde s,t)}{T_{\tilde s}(t)} = 1, \quad \forall i,\tilde s,t;$ (C.4)
$\frac{\tilde\tau_{i,tr}^{(d,m)}(\tilde s,k,t)}{\tilde\tau_{i,k}^{tr}(\tilde s,t)} \ge 0, \quad \sum_{(d,m)} \frac{\tilde\tau_{i,tr}^{(d,m)}(\tilde s,k,t)}{\tilde\tau_{i,k}^{tr}(\tilde s,t)} \le 1, \quad \forall i,k,\tilde s,t;$ (C.5)
$\frac{\tilde\nu_{ij}^{(d,m)}(\tilde s,k,s,n,t)}{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t)} \ge 0, \quad \sum_{j\in\Delta_{i,n}} \frac{\tilde\nu_{ij}^{(d,m)}(\tilde s,k,s,n,t)}{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t)} \le 1, \quad \forall i,d,m,\tilde s,s,t,$ (C.6)
where we define $0/0 = 1$ whenever a denominator happens to be zero. For each link $(i,j)$, each commodity $(d,m)$, and all $t$, we then have
$\frac{I_{ij}^{(d,m)}(t)\,\delta^{(m)}}{t} = \sum_{\tilde s\in\mathcal S} \frac{T_{\tilde s}(t)}{t} \sum_{k=0}^{K_i^{tr}} \frac{\tilde\tau_{i,k}^{tr}(\tilde s,t)}{T_{\tilde s}(t)}\, \frac{\tilde\tau_{i,tr}^{(d,m)}(\tilde s,k,t)}{\tilde\tau_{i,k}^{tr}(\tilde s,t)} \sum_{s\in\mathcal S} \frac{\tau_{i,s}^{(d,m)}(\tilde s,k,t)}{\tilde\tau_{i,tr}^{(d,m)}(\tilde s,k,t)} \sum_{n=1}^{g_{i,s}^{-1}(j)} \frac{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t)\,\delta^{(m)}}{\tau_{i,s}^{(d,m)}(\tilde s,k,t)}\, \frac{\tilde\nu_{ij}^{(d,m)}(\tilde s,k,s,n,t)}{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t)}.$ (C.7)
The network state process yields, with probability 1,
$\lim_{t\to\infty} \frac{T_s(t)}{t} = \pi_s; \qquad \lim_{t\to\infty} \frac{\tau_{i,s}^{(d,m)}(\tilde s,k,t)}{\tilde\tau_{i,tr}^{(d,m)}(\tilde s,k,t)} = P_{\tilde s s}, \quad \forall i,d,m,$ (C.8)
where $P_{\tilde s s} \triangleq \Pr\{S(t)=s \mid S(t-1)=\tilde s\}$. In addition, we upper bound the average rate of the $n$-th partition as follows:
$0 \le \frac{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t)\,\delta^{(m)}}{\tau_{i,s}^{(d,m)}(\tilde s,k,t)} \le R_{i,g_{i,n},k}(s) - R_{i,g_{i,n-1},k}(s), \quad \forall i,k,d,m,\tilde s,s,t.$ (C.9)
On the other hand, define the following variables for processing:
- $\tau_{i,k}^{pr}(t)$: the total number of timeslots in which node $i$ allocates $k$ resource units for processing during the first $t$ timeslots;
- $\tau_{i,pr}^{(d,m)}(k,t)$: the accumulated time spent processing the elementary units within $\mathcal{I}^{(d,m)}(t)$ while node $i$ allocates $k$ resource units for processing, during the first $t$ timeslots;
- $\nu_{i,pr}^{(d,m)}(k,t)$: the total number of distinct elementary units within $\mathcal{I}^{(d,m)}(t)$ that are processed by node $i$ with $k$ processing resource units allocated, during the first $t$ timeslots.
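The identities (C.2)-(C.3) are pure counting statements, so they can be verified mechanically on any logged trace of unit movements; a minimal sketch, with an event schema of our own invention:

```python
def check_conservation(events, node, commodity):
    """Verify the counting identity (C.2) for one node and one commodity
    (d, m): total arrivals (link receptions + processing outputs + exogenous
    arrivals) must equal total departures (link transmissions + processing
    inputs). Each event is (kind, node, commodity) with kind in
    {'rx', 'tx', 'pr_in', 'pr_out', 'exo'} (hypothetical schema)."""
    counts = {'rx': 0, 'tx': 0, 'pr_in': 0, 'pr_out': 0, 'exo': 0}
    for kind, n, c in events:
        if n == node and c == commodity:
            counts[kind] += 1
    arrivals = counts['rx'] + counts['pr_out'] + counts['exo']
    departures = counts['tx'] + counts['pr_in']
    return arrivals == departures
```

The stage-coupling identity (C.3) would be checked analogously by matching each node's 'pr_in' count at stage $m$ against its 'pr_out' count at stage $m+1$.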
Based on the above definitions and the processing constraints, we have the following relations:
$\frac{\tau_{i,k}^{pr}(t)}{t} \ge 0, \quad \sum_{k=0}^{K_i^{pr}} \frac{\tau_{i,k}^{pr}(t)}{t} = 1, \quad \forall i,t;$ (C.10)
$\frac{\tau_{i,pr}^{(d,m)}(k,t)}{\tau_{i,k}^{pr}(t)} \ge 0, \quad \sum_{(d,m)} \frac{\tau_{i,pr}^{(d,m)}(k,t)}{\tau_{i,k}^{pr}(t)} \le 1, \quad \forall i,k,t;$ (C.11)
$0 \le \frac{\nu_{i,pr}^{(d,m)}(k,t)\,\delta^{(m)}}{\tau_{i,pr}^{(d,m)}(k,t)} \le \frac{R_{i,k}}{r^{(m+1)}}, \quad \forall i,t,k,d,m<M.$ (C.12)
For each node $i$, we then have, for all $i$, $(d,m)$ and $t$,
$\frac{I_{i,pr}^{(d,m)}(t)\,\delta^{(m)}}{t} = \sum_{k=0}^{K_i^{pr}} \frac{\tau_{i,k}^{pr}(t)}{t}\, \frac{\tau_{i,pr}^{(d,m)}(k,t)}{\tau_{i,k}^{pr}(t)}\, \frac{\nu_{i,pr}^{(d,m)}(k,t)\,\delta^{(m)}}{\tau_{i,pr}^{(d,m)}(k,t)}.$ (C.13)
Because the constraints in (C.4)-(C.6), (C.9) and (C.10)-(C.12) define bounded ratio sequences with finite dimensions, there exists an infinitely long subsequence $\{t_u\}$ of timeslots over which the time average cost achieves its $\liminf$ value $h$, and the ratio terms converge:
$\lim_{t_u\to\infty} \frac{1}{t_u}\sum_{\tau=0}^{t_u-1} h(\tau) = h;$
$\lim_{t_u\to\infty} \frac{\tilde\tau_{i,k}^{tr}(\tilde s,t_u)}{T_{\tilde s}(t_u)} = \tilde\alpha_{i,k}^{tr}(\tilde s); \quad \lim_{t_u\to\infty} \frac{\tilde\tau_{i,tr}^{(d,m)}(\tilde s,k,t_u)}{\tilde\tau_{i,k}^{tr}(\tilde s,t_u)} = \tilde\alpha_{i,tr}^{(d,m)}(\tilde s,k); \quad \lim_{t_u\to\infty} \frac{\tilde\nu_{ij}^{(d,m)}(\tilde s,k,s,n,t_u)}{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t_u)} = \tilde\alpha_{ij}^{(d,m)}(\tilde s,k,s,n);$
$\lim_{\delta^{(m)}\to 0}\lim_{t_u\to\infty} \frac{\nu_{i,n,tr}^{(d,m)}(\tilde s,k,s,t_u)\,\delta^{(m)}}{\tau_{i,s}^{(d,m)}(\tilde s,k,t_u)} = F_{ij}^{(d,m)}(k,s);$
$\lim_{t_u\to\infty} \frac{\tau_{i,k}^{pr}(t_u)}{t_u} = \alpha_{i,k}^{pr}; \quad \lim_{t_u\to\infty} \frac{\tau_{i,pr}^{(d,m)}(k,t_u)}{\tau_{i,k}^{pr}(t_u)} = \alpha_{i,pr}^{(d,m)}(k); \quad \lim_{\delta^{(m)}\to 0}\lim_{t_u\to\infty} \frac{\nu_{i,pr}^{(d,m)}(k,t_u)\,\delta^{(m)}}{\tau_{i,pr}^{(d,m)}(k,t_u)} = F_{i,pr}^{(d,m)}(k).$
Define $f_{ij}^{(d,m)}(t;\delta^{(m)}) \triangleq I_{ij}^{(d,m)}(t)\,\delta^{(m)}\big/\,t$.
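The decomposition (C.13) is a telescoping product of ratios, so summing the chained terms over $k$ must recover the left-hand side exactly; a quick numeric sanity check (arbitrary counts, names ours):

```python
def processing_rate_decomposition(t, tau_k, tau_dm, nu, delta):
    """Telescoping check for (C.13): summing the chained ratios over k
    recovers (total units processed) * delta / t.
    tau_k[k]: slots with k processing resource units allocated;
    tau_dm[k]: of those, slots spent on the tagged commodity;
    nu[k]: units of the tagged commodity processed with k units.
    Zero denominators use the 0/0 := 1 convention stated for (C.6)."""
    def ratio(a, b):
        return 1.0 if a == 0 and b == 0 else a / b
    return sum(
        (tau_k[k] / t)
        * ratio(tau_dm[k], tau_k[k])
        * ratio(nu[k] * delta, tau_dm[k])
        for k in range(len(tau_k))
    )
```

Each ratio in the product cancels the denominator of the next term, which is exactly why the identity holds for any realization of the counters.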
Then it follows from (C.7) that
$f_{ij}^{(d,m)} \triangleq \lim_{\delta^{(m)}\to 0}\lim_{t_u\to\infty} f_{ij}^{(d,m)}(t_u;\delta^{(m)}) \overset{(a)}{\le} \sum_{\tilde s\in\mathcal S} \pi_{\tilde s} \sum_{k=0}^{K_i^{tr}} \tilde\alpha_{i,k}^{tr}(\tilde s)\, \tilde\alpha_{i,tr}^{(d,m)}(\tilde s,k) \sum_{s\in\mathcal S} P_{\tilde s s} \sum_{n=1}^{g_{i,s}^{-1}(j)} \big[ R_{i,g_{i,n},k}(s) - R_{i,g_{i,n-1},k}(s) \big]\, \tilde\alpha_{ij}^{(d,m)}(\tilde s,k,s,n)$
$\overset{(b)}{=} \sum_{s\in\mathcal S} \pi_s \sum_{k=0}^{K_i^{tr}} \alpha_{i,k}^{tr}(s)\, \alpha_{i,tr}^{(d,m)}(s,k) \sum_{n=1}^{g_{i,s}^{-1}(j)} \big[ R_{i,g_{i,n},k}(s) - R_{i,g_{i,n-1},k}(s) \big]\, \alpha_{ij}^{(d,m)}(s,k,n),$ (C.14)
where inequality (a) holds true due to the above converging terms for transmission, the convergence terms in (C.8), and the fact that $F_{ij}^{(d,m)}(k,s) \le R_{i,g_{i,n},k}(s) - R_{i,g_{i,n-1},k}(s)$; equality (b) holds true due to the fact that $\pi_s = \sum_{\tilde s\in\mathcal S}\pi_{\tilde s}P_{\tilde s s}$ and the following definitions:
$\alpha_{i,k}^{tr}(s) \triangleq \sum_{\tilde s\in\mathcal S} \frac{\pi_{\tilde s}P_{\tilde s s}}{\pi_s}\, \tilde\alpha_{i,k}^{tr}(\tilde s); \qquad \alpha_{i,tr}^{(d,m)}(s,k) \triangleq \sum_{\tilde s\in\mathcal S} \frac{\pi_{\tilde s}P_{\tilde s s}\,\tilde\alpha_{i,k}^{tr}(\tilde s)}{\pi_s\,\alpha_{i,k}^{tr}(s)}\, \tilde\alpha_{i,tr}^{(d,m)}(\tilde s,k);$
$\alpha_{ij}^{(d,m)}(s,k,n) \triangleq \sum_{\tilde s\in\mathcal S} \frac{\pi_{\tilde s}P_{\tilde s s}\,\tilde\alpha_{i,k}^{tr}(\tilde s)\,\tilde\alpha_{i,tr}^{(d,m)}(\tilde s,k)}{\pi_s\,\alpha_{i,k}^{tr}(s)\,\alpha_{i,tr}^{(d,m)}(s,k)}\, \tilde\alpha_{ij}^{(d,m)}(\tilde s,k,s,n).$
In addition, define $f_{i,pr}^{(d,m)}(t;\delta^{(m)}) \triangleq I_{i,pr}^{(d,m)}(t)\,\delta^{(m)}\big/\,t$. With the converging terms for processing and the fact that $F_{i,pr}^{(d,m)}(k) \le R_{i,k}/r^{(m+1)}$, for $m<M$, it follows from (C.13) that
$f_{i,pr}^{(d,m)} \triangleq \lim_{\delta^{(m)}\to 0}\lim_{t_u\to\infty} f_{i,pr}^{(d,m)}(t_u;\delta^{(m)}) \le \sum_{k=0}^{K_i^{pr}} \alpha_{i,k}^{pr}\, \alpha_{i,pr}^{(d,m)}(k)\, \frac{R_{i,k}}{r^{(m+1)}}.$ (C.15)
Moreover, the flow efficiency and non-negativity constraints follow:
$f_{i,pr}^{(d,M)} = 0; \quad f_{pr,i}^{(d,0)} = 0; \quad f_{dj}^{(d,M)} = 0; \quad f_{i,pr}^{(d,m)} \ge 0; \quad f_{ij}^{(d,m)} \ge 0.$ (C.16)
Furthermore, multiplying both sides of (C.2) and (C.3) by $\delta^{(m)}/t_u$, and letting $t_u\to\infty$ and $\delta^{(m)}\to$
$0$, we have, for $i\neq d$ or $m<M$, with the result of (C.1),
$\sum_j f_{ji}^{(d,m)} + f_{pr,i}^{(d,m)} + \lambda_i^{(d,m)} = \sum_j f_{ij}^{(d,m)} + f_{i,pr}^{(d,m)},$ (C.17)
and, for $m<M$ and all $i$ and $d$, with the relation $\delta^{(m+1)} = \xi^{(m+1)}\delta^{(m)}$,
$f_{pr,i}^{(d,m+1)} = \xi^{(m+1)} f_{i,pr}^{(d,m)}.$ (C.18)
Finally, the time average cost satisfies
$h = \lim_{t_u\to\infty} \sum_{i\in\mathcal N} \Big[ \sum_{k=0}^{K_i^{pr}} \frac{\tau_{i,k}^{pr}(t_u)}{t_u}\, w_{i,k}^{pr} + \sum_{\tilde s\in\mathcal S} \frac{T_{\tilde s}(t_u)}{t_u} \sum_{k=0}^{K_i^{tr}} \frac{\tilde\tau_{i,k}^{tr}(\tilde s,t_u)}{T_{\tilde s}(t_u)}\, w_{i,k}^{tr} \Big] \overset{(a)}{=} \sum_{i\in\mathcal N} \Big[ \sum_{k=0}^{K_i^{pr}} \alpha_{i,k}^{pr}\, w_{i,k}^{pr} + \sum_{k=0}^{K_i^{tr}} w_{i,k}^{tr} \sum_{s\in\mathcal S} \pi_s\, \alpha_{i,k}^{tr}(s) \Big],$ (C.19)
where, for equality (a), we use the fact that $\sum_{\tilde s\in\mathcal S}\pi_{\tilde s}\,\tilde\alpha_{i,k}^{tr}(\tilde s) = \sum_{s\in\mathcal S}\pi_s\,\alpha_{i,k}^{tr}(s)$. In summary, given $\{\lambda_i^{(d,m)}\}\in\Lambda$, this proves that there exists a set of flow variables and probability values that satisfy the constraints in Theorem 7. The minimum average cost $h^*$ follows from taking the minimum of $h$ over all the variable sets that stabilize the network.

C.1.2 Proof of Sufficiency

Given an exogenous input rate matrix $\{\lambda_i^{(d,m)}+\epsilon\}$, $\epsilon>0$, probability values $\alpha_{i,k}^{tr}(s)$, $\alpha_{i,tr}^{(d,m)}(s,k)$, $\alpha_{ij}^{(d,m)}(s,k,n)$, $\alpha_{i,k}^{pr}$, $\alpha_{i,pr}^{(d,m)}(k)$, and multi-commodity flow variables $f_{ij}^{(d,m)}$, $f_{i,pr}^{(d,m)}$, $f_{pr,i}^{(d,m)}$ satisfying (5.11)-(5.20), we construct a stationary randomized policy using single-copy routing such that:
$\mathbb E\big\{\mu_{ij}^{(d,m)}(t)\big\} = f_{ij}^{(d,m)}; \quad \mathbb E\big\{\mu_{i,pr}^{(d,m)}(t)\big\} = f_{i,pr}^{(d,m)}; \quad \mathbb E\big\{\mu_{pr,i}^{(d,m)}(t)\big\} = f_{pr,i}^{(d,m)},$ (C.20)
where $\mu_{ij}^{(d,m)}(t)$, $\mu_{i,pr}^{(d,m)}(t)$, and $\mu_{pr,i}^{(d,m)}(t)$ respectively denote the flow rates assigned by the stationary randomized policy for transmission and processing. Plugging $\{\lambda_i^{(d,m)}+\epsilon\}$ and the terms in (C.20) into (5.11), after algebraic manipulations, we have
$\mathbb E\Big\{\sum_j \mu_{ij}^{(d,m)}(t) + \mu_{i,pr}^{(d,m)}(t) - \sum_j \mu_{ji}^{(d,m)}(t) - \mu_{pr,i}^{(d,m)}(t)\Big\} \ge \lambda_i^{(d,m)} + \epsilon.$ (C.21)
By applying standard LDP analysis [45], the strong network stability (i.e., $\{\lambda_i^{(d,m)}\}$ in the interior of the capacity region) follows.
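The flow-conservation structure just used, link conservation per stage plus the scaled coupling across processing stages, can be checked numerically for any candidate flow assignment; a sketch over hypothetical per-node aggregates (data layout ours):

```python
def check_flow_constraints(f_link, f_in_pr, f_out_pr, lam, xi, M, tol=1e-9):
    """Check the per-stage conservation law of (C.17) and the scaled
    processing coupling of (C.18) at one node:
      link inflow + processing output + exogenous rate
          == link outflow + processing input       (per stage m), and
      f_out_pr[m+1] == xi[m+1] * f_in_pr[m]        (for m < M).
    f_link[m] = (total link inflow, total link outflow) for stage m."""
    for m in range(M + 1):
        fin, fout = f_link[m]
        if abs(fin + f_out_pr[m] + lam[m] - (fout + f_in_pr[m])) > tol:
            return False
    for m in range(M):
        if abs(f_out_pr[m + 1] - xi[m + 1] * f_in_pr[m]) > tol:
            return False
    return True
```

Any flow set satisfying both checks at every node is a candidate for the stationary randomized construction in (C.20).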
C.2 Proof of Theorem 11

Let the Lyapunov drift [45] for the queue backlogs of the network be defined as
$\Delta(\mathcal H(t)) \triangleq \frac{1}{2}\sum_{i,(d,m)} \mathbb E\Big\{ Q_i^{(d,m)}(t+1)^2 - Q_i^{(d,m)}(t)^2 \,\Big|\, \mathcal H(t) \Big\}.$
After some standard LDP algebraic manipulations on (5.7) (see Ref. [45]), we have
$\Delta(\mathcal H(t)) + V\,\mathbb E\{h(t)\,|\,\mathcal H(t)\} \le NB + \sum_{i,(d,m)} \lambda_i^{(d,m)} Q_i^{(d,m)}(t) - \sum_i \mathbb E\Big\{ Z_i^{pr}(t) - V h_i^{pr}(t) + Z_i^{tr}(t) - V h_i^{tr}(t) \,\Big|\, \mathcal H(t) \Big\},$ (C.22)
where, with $r_{\min} \triangleq \min_m\{r^{(m)}\}$ and $\xi_{\max} \triangleq \max_m\{\xi^{(m)}\}$, we define
$B \triangleq \frac{1}{2}\max_i\bigg\{ \Big[ \max_{j,s:\,j\neq i,\,s\in\mathcal S}\big\{R_{ij,K_i^{tr}}(s)\big\} + R_{i,K_i^{pr}}\big/r_{\min} \Big]^2 + \Big[ \max_{s\in\mathcal S}\Big\{\sum_{j:\,j\neq i} R_{ji,K_j^{tr}}(s)\Big\} + \xi_{\max}\, R_{i,K_i^{pr}}\big/r_{\min} + A_{\max} \Big]^2 \bigg\};$
$Z_i^{pr}(t) \triangleq \sum_{(d,m)} \mu_{i,pr}^{(d,m)}(t)\Big[ Q_i^{(d,m)}(t) - \xi^{(m+1)} Q_i^{(d,m+1)}(t) \Big];$
$Z_i^{tr}(t) \triangleq \sum_{u=1}^{N-1}\sum_{(d,m)} \mu_{i q_{i,u}}^{(d,m)}(t)\Big[ Q_i^{(d,m)}(t) - Q_{q_{i,u}}^{(d,m)}(t) \Big];$ (C.23)
$h_i^{pr}(t) \triangleq \sum_{k=0}^{K_i^{pr}} w_{i,k}^{pr}\, y_{i,k}^{pr}(t); \qquad h_i^{tr}(t) \triangleq \sum_{k=0}^{K_i^{tr}} w_{i,k}^{tr}\, y_{i,k}^{tr}(t).$

Lemma C.2.1. Among the algorithms using single-copy routing, the DWCNC algorithm, in each timeslot $t$, maximizes $\mathbb E\{Z_i^{tr}(t) - V h_i^{tr}(t)\,|\,\mathcal H(t)\}$ subject to (5.5)-(5.6) and $\mathbb E\{Z_i^{pr}(t) - V h_i^{pr}(t)\,|\,\mathcal H(t)\}$ subject to (6.1c)-(5.2).

Proof. See Appendix C.3.

Lemma C.2.1 implies that the right hand side of (C.22) under DWCNC is no larger than the corresponding expression under the optimal stationary randomized policy (characterized in Theorem 7) that supports $(\boldsymbol\lambda + \epsilon\mathbf 1)\in\Lambda$ and achieves average cost $h^*(\boldsymbol\lambda + \epsilon\mathbf 1)$:
$\Delta(\mathcal H(t)) + V\,\mathbb E\{h(t)\,|\,\mathcal H(t)\} \le NB + \sum_{i,(d,m)}\lambda_i^{(d,m)} Q_i^{(d,m)}(t) - \sum_i \mathbb E\Big\{ Z_i^{pr*}(t) - V h_i^{pr*}(t) + Z_i^{tr*}(t) - V h_i^{tr*}(t) \,\Big|\, \mathcal H(t) \Big\} \le NB + V h^*(\boldsymbol\lambda + \epsilon\mathbf 1) - \epsilon \sum_i \sum_{(d,m)} Q_i^{(d,m)}(t).$ (C.24)
Finally, according to the theoretical results of [72], Eq. (C.24) together with satisfaction of the following bounding conditions lead to the network stability and average cost convergence under DWCNC with probability 1:
1. The second moment of $h(t)$ is upper bounded for all $t$: $\mathbb E\{(h(t))^2\} \le \big[\sum_i (w_{i,K_i^{tr}}^{tr} + w_{i,K_i^{pr}}^{pr})\big]^2$.
2. We have $\mathbb E\{h(t)\,|\,\mathcal H(t)\}$ lower bounded for all $\mathcal H(t)$ and $t$: $\mathbb E\{h(t)\,|\,\mathcal H(t)\} \ge 0$.
3.
The conditional fourth moment of the queue length change is upper bounded for all $\mathcal H(t)$, $t$, $i$ and $(d,m)$:
$\mathbb E\Big\{ \big[ Q_i^{(d,m)}(t+1) - Q_i^{(d,m)}(t) \big]^4 \,\Big|\, \mathcal H(t) \Big\} \le \max_i \Big[ \max_{s\in\mathcal S}\Big\{ \sum_{j:\,j\neq i} R_{ji,K_j^{tr}}(s) \Big\} + \xi_{\max}\, R_{i,K_i^{pr}}\big/ r_{\min} + A_{\max} \Big]^4.$

C.3 Proof of Lemma C.2.1

Due to the deterministic nature of the computing channel, maximizing $\mathbb E\{Z_i^{pr}(t) - V h_i^{pr}(t)\,|\,\mathcal H(t)\}$ is equivalent to maximizing $Z_i^{pr}(t) - V h_i^{pr}(t)$. The maximization of $Z_i^{pr}(t) - V h_i^{pr}(t)$ subject to (6.1c)-(5.2) can be achieved by the greedy choice of the commodity $(d,m)$, the resource allocation $k$, and the flow rate $\mu_{i,pr}^{(d,m)}(t)$ input for processing, as described by the local processing decisions of DWCNC in Sec. 5.3.1. With respect to the transmission decisions, it follows by plugging (5.5) into (C.23) that
$Z_i^{tr}(t) = \sum_{(d,m)} \sum_{n=1}^{N-1} \sum_{u=n}^{N-1} \mu_{i q_{i,u},n}^{(d,m)}(t) \Big[ Q_i^{(d,m)}(t) - Q_{q_{i,u}}^{(d,m)}(t) \Big].$ (C.25)
Let $\alpha_{i,tr}^{(d,m)}(t)$ be the fraction of the bandwidth allocated by node $i$ for the transmission of commodity $(d,m)$ in timeslot $t$, and let $\alpha_{ij,n}^{(d,m)}(t)$ be the fraction of the transmitted commodity $(d,m)$ in the $n$-th partition from node $i$ that is forwarded to node $j$, with $n \le q_{i,S(t)}^{-1}(j)$.
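The greedy processing choice invoked above can be sketched as follows; the data layout and names are ours, and the exact DWCNC rule is the one given in Sec. 5.3.1:

```python
def local_processing_decision(Q, xi, r, R, w, V):
    """Greedy maximization of Z_i^pr(t) - V * h_i^pr(t) at one node: for each
    stage m < M and each resource level k, the achievable utility is
        (R[k] / r[m + 1]) * (Q[m] - xi[m + 1] * Q[m + 1]) - V * w[k],
    and the node picks the (m, k) pair with the largest positive utility,
    staying idle otherwise. Q[m]: backlog of commodity stage m; R[k]:
    processing rate with k resource units; r[m + 1]: processing complexity;
    w[k]: resource cost; index 0 of r, R, w, xi is a placeholder."""
    best = (0.0, None, 0)                       # (utility, stage, k): idle
    M = len(Q) - 1
    for m in range(M):                          # final stage needs no processing
        diff = Q[m] - xi[m + 1] * Q[m + 1]      # processing backpressure
        for k in range(1, len(R)):
            util = R[k] / r[m + 1] * diff - V * w[k]
            if util > best[0]:
                best = (util, m, k)
    return best
```

The decision is purely local: it needs only the node's own backlogs, rates, and costs, which is what makes the drift-plus-penalty maximization separable across nodes.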
Then, assuming single-copy routing, it follows from (5.6) that
$\mu_{i q_{i,u},n}^{(d,m)}(t) = \alpha_{i,tr}^{(d,m)}(t)\, \alpha_{i q_{i,u},n}^{(d,m)}(t)\, \big[ R_{i q_{i,n},k}(S(t)) - R_{i q_{i,n-1},k}(S(t)) \big], \quad \forall i,t;$ (C.26)
$\sum_{(d,m)} \alpha_{i,tr}^{(d,m)}(t) \le 1, \quad \forall i,t;$ (C.27)
$\sum_j \alpha_{ij,n}^{(d,m)}(t) \le 1, \quad \forall i,t,(d,m).$ (C.28)
Plugging (C.26) into (C.25) and taking the expectation conditioned on $\mathcal H(t)$ and $\{y_{i,k}^{tr}(t)=1\}$, it follows that
$\mathbb E\big\{ Z_i^{tr}(t) \,\big|\, \mathcal H(t), y_{i,k}^{tr}(t)=1 \big\}$
$\overset{(a)}{\le} \sum_{(d,m)} \sum_{n=1}^{N-1} \mathbb E\Big\{ \alpha_{i,tr}^{(d,m)}(t) \big[ R_{i q_{i,n},k}(S(t)) - R_{i q_{i,n-1},k}(S(t)) \big] \sum_{u=n}^{N-1} \alpha_{i q_{i,u},n}^{(d,m)}(t)\, W_{i q_{i,u}}^{(d,m)}(t) \,\Big|\, \mathcal H(t), y_{i,k}^{tr}(t)=1 \Big\}$
$\overset{(b)}{\le} \sum_{(d,m)} \sum_{n=1}^{N-1} \mathbb E\Big\{ \alpha_{i,tr}^{(d,m)}(t) \big[ R_{i q_{i,n},k}(S(t)) - R_{i q_{i,n-1},k}(S(t)) \big] \max_{j\in\Delta_{i,n}(S(t))}\big\{ W_{ij}^{(d,m)}(t) \big\} \,\Big|\, \mathcal H(t), y_{i,k}^{tr}(t)=1 \Big\}$
$\overset{(c)}{=} \sum_{(d,m)} \mathbb E\big\{ \alpha_{i,tr}^{(d,m)}(t) \,\big|\, \mathcal H(t), y_{i,k}^{tr}(t)=1 \big\} \sum_{n=1}^{N-1} \mathbb E\Big\{ \max_{j\in\Delta_{i,n}(S(t))}\big\{ W_{ij}^{(d,m)}(t) \big\} \big[ R_{i q_{i,n},k}(S(t)) - R_{i q_{i,n-1},k}(S(t)) \big] \,\Big|\, \mathcal H(t), y_{i,k}^{tr}(t)=1 \Big\}$
$\overset{(d)}{\le} \max_{(d,m)} \sum_{n=1}^{N-1} \mathbb E\Big\{ \big[ R_{i q_{i,n},k}(S(t)) - R_{i q_{i,n-1},k}(S(t)) \big] \max_{j\in\Delta_{i,n}(S(t))}\big\{ W_{ij}^{(d,m)}(t) \big\} \,\Big|\, \mathcal H(t), y_{i,k}^{tr}(t)=1 \Big\}$
$\overset{(e)}{=} \max_{(d,m)} \big\{ W_{i,k,tr}^{(d,m)}(t) \big\}.$ (C.29)
In (C.29), inequality (a) follows from the definition of $W_{ij}^{(d,m)}(t)$; inequality (b) follows from (C.28); equality (c) holds because the values of $R_{i q_{i,n},k}(S(t))$ and $\max_{j\in\Delta_{i,n}(S(t))}\{W_{ij}^{(d,m)}(t)\}$ are determined by $S(t)$ given $\mathcal H(t)$ and $\{y_{i,k}^{tr}(t)=1\}$, and are therefore independent of $\alpha_{i,tr}^{(d,m)}(t)$; inequality (d) follows from (C.27); equality (e) follows from the definition of $W_{i,k,tr}^{(d,m)}(t)$ in (5.22). Finally, taking the expectation over $y_{i,k}^{tr}(t)$ in (C.29), we further have
$\mathbb E\big\{ Z_i^{tr}(t) - V h_i^{tr}(t) \,\big|\, \mathcal H(t) \big\} \le \sum_{k=0}^{K_i^{tr}} \Big[ \max_{(d,m)}\big\{ W_{i,k,tr}^{(d,m)}(t) \big\} - V w_{i,k}^{tr} \Big] \Pr\big\{ y_{i,k}^{tr}(t)=1 \big\} \overset{(f)}{\le} \max_{k,(d,m)}\big\{ W_{i,k,tr}^{(d,m)}(t) - V w_{i,k}^{tr} \big\},$ (C.30)
where (f) follows due to the fact that $\sum_{k=0}^{K_i^{tr}} \Pr\{ y_{i,k}^{tr}(t)=1 \} = 1$.
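The chain (C.29)-(C.30) reduces the transmission decision to an argmax of per-resource-level, per-commodity weights; a sketch of that reduced maximization (array layout and names are ours):

```python
def local_transmission_decision(W, dR, w_tr, V):
    """Maximize, over resource level k and commodity index c, the weight
        W_tr[c][k] - V * w_tr[k],
    where W_tr[c][k] = sum_n dR[k][n] * max_j W[c][n][j] mirrors (C.29):
    dR[k][n] is the (expected) rate increment of the n-th partition under k
    resource units, and W[c][n][j] is the backpressure weight toward candidate
    receiver j of partition n. Returns (best weight, c, k); k = 0 with weight
    0.0 means remain idle."""
    best = (0.0, None, 0)
    for c, Wc in enumerate(W):
        for k in range(1, len(dR)):
            weight = sum(dRn * max(Wn) for dRn, Wn in zip(dR[k], Wc))
            weight -= V * w_tr[k]
            if weight > best[0]:
                best = (weight, c, k)
    return best
```

Taking the max over receivers inside each partition corresponds to step (b), and the outer argmax over (c, k) corresponds to steps (d)-(f).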
In (C.29) and (C.30), the upper bounds (a) and (b) can be achieved by implementing step 4) of the local transmission decisions of DWCNC; the upper bounds (d), (e), and (f) can be achieved by implementing steps 2) and 3) of the local transmission decisions of DWCNC. This concludes the proof of Lemma C.2.1.

Appendix D

Proofs in Chapter 6

D.1 Proof of Theorem 12

Since $Z(\mathbf Q)$ is a piecewise linear concave function of $\mathbf Q$, with Assumption 2, the locally polyhedral property is satisfied, i.e., there exists an $L > 0$ such that
$Z(\mathbf Q^*) - Z(\mathbf Q) \ge L\,\|\mathbf Q^* - \mathbf Q\|.$ (D.1)
We then denote the set of queue-length vectors that are $D$-away from $\mathcal H^*$ as
$\mathcal H_D \triangleq \Big\{ \mathbf Q : \sup_{\mathbf Q^*\in\mathcal H^*}\{\|\mathbf Q - \mathbf Q^*\|\} \le D \Big\},$
with $D \triangleq \max\big\{ B/L - L/4,\ L/2 \big\}$, where $B$ is an upper bound for $\|\mathbf A\mathbf f\|^2$ defined as
$B \triangleq \sum_u \Bigg[ \bigg( \sum_{v\in\delta^-(u)} \frac{c_{vu}}{\min_{(\phi,i)} r_{vu}^{(\phi,i)}} \bigg)^2 + \bigg( \sum_{v\in\delta^+(u)} \frac{c_{uv}}{\min_{(\phi,i)} r_{uv}^{(\phi,i)}} \bigg)^2 \Bigg].$
We now state and prove the following lemmas, which are used to prove Theorem 12.

Lemma D.1.1. Under QNSD, if $\mathbf Q(t)\notin\mathcal H_D$, we have that, for all $\mathbf Q^*\in\mathcal H^*$,
$\|\mathbf Q(t+1) - \mathbf Q^*\| \le \|\mathbf Q(t) - \mathbf Q^*\| - \frac{L}{2}.$

Proof.
From (6.5), it follows that
$\|\mathbf Q(t+1) - \mathbf Q^*\|^2 = \big\| [\mathbf Q(t) + \mathbf A\mathbf f(t)]^+ - \mathbf Q^* \big\|^2 \le \|\mathbf Q(t) + \mathbf A\mathbf f(t) - \mathbf Q^*\|^2 \le \|\mathbf Q(t) - \mathbf Q^*\|^2 + B + 2\,(\mathbf Q(t) - \mathbf Q^*)^\dagger \mathbf A\mathbf f(t),$ (D.2)
where (D.2) is due to the fact that $\|\mathbf A\mathbf f(t)\|^2 \le B$, as shown in the following:
$\|\mathbf A\mathbf f\|^2 = \sum_u \sum_{(d,\phi,i)} \bigg( \sum_{v\in\delta^-(u)} f_{vu}^{(d,\phi,i)} - \sum_{v\in\delta^+(u)} f_{uv}^{(d,\phi,i)} \bigg)^2 \le \sum_u \sum_{(d,\phi,i)} \Bigg[ \bigg( \sum_{v\in\delta^-(u)} f_{vu}^{(d,\phi,i)} \bigg)^2 + \bigg( \sum_{v\in\delta^+(u)} f_{uv}^{(d,\phi,i)} \bigg)^2 \Bigg] \le \sum_u \Bigg[ \bigg( \sum_{v\in\delta^-(u)} \sum_{(d,\phi,i)} f_{vu}^{(d,\phi,i)} \bigg)^2 + \bigg( \sum_{v\in\delta^+(u)} \sum_{(d,\phi,i)} f_{uv}^{(d,\phi,i)} \bigg)^2 \Bigg] \le \sum_u \Bigg[ \bigg( \sum_{v\in\delta^-(u)} \frac{c_{vu}}{\min_{(\phi,i)} r_{vu}^{(\phi,i)}} \bigg)^2 + \bigg( \sum_{v\in\delta^+(u)} \frac{c_{uv}}{\min_{(\phi,i)} r_{uv}^{(\phi,i)}} \bigg)^2 \Bigg] \triangleq B.$ (D.3)
From (6.6), we have that, for all $\mathbf Q^*\in\mathcal H^*$,
$Z(\mathbf Q(t)) - Z(\mathbf Q^*) = \inf_{[\mathbf y',\mathbf f']\in\mathcal X} \big\{ V\mathbf w^\dagger\mathbf y' + (\mathbf A\mathbf f')^\dagger\mathbf Q(t) \big\} - \inf_{[\mathbf y',\mathbf f']\in\mathcal X} \big\{ V\mathbf w^\dagger\mathbf y' + (\mathbf A\mathbf f')^\dagger\mathbf Q^* \big\} \ge V\mathbf w^\dagger\mathbf y(t) + (\mathbf A\mathbf f(t))^\dagger\mathbf Q(t) - V\mathbf w^\dagger\mathbf y(t) - (\mathbf A\mathbf f(t))^\dagger\mathbf Q^* = (\mathbf A\mathbf f(t))^\dagger(\mathbf Q(t) - \mathbf Q^*).$ (D.4)
Using (D.4) in (D.2), it follows that
$\|\mathbf Q(t+1) - \mathbf Q^*\|^2 \le \|\mathbf Q(t) - \mathbf Q^*\|^2 + B - 2\big( Z(\mathbf Q^*) - Z(\mathbf Q(t)) \big).$ (D.5)
From the definition of $\mathcal H_D$, it is immediate to show that, for any $\mathbf Q(t)\notin\mathcal H_D$ and any $\mathbf Q^*\in\mathcal H^*$,
$B - 2L\,\|\mathbf Q^* - \mathbf Q(t)\| \le \frac{L^2}{4} - L\,\|\mathbf Q^* - \mathbf Q(t)\|.$ (D.6)
Now, using (D.1) and (D.6) in (D.5),
$\|\mathbf Q(t+1) - \mathbf Q^*\|^2 \le \|\mathbf Q(t) - \mathbf Q^*\|^2 + B - 2L\,\|\mathbf Q^* - \mathbf Q(t)\| \le \|\mathbf Q(t) - \mathbf Q^*\|^2 + \frac{L^2}{4} - L\,\|\mathbf Q^* - \mathbf Q(t)\| = \Big( \|\mathbf Q(t) - \mathbf Q^*\| - \frac{L}{2} \Big)^2.$ (D.7)
Hence, for any $\mathbf Q(t)\notin\mathcal H_D$, the queue length evolution is regulated by
$\|\mathbf Q(t+1) - \mathbf Q^*\| \le \|\mathbf Q(t) - \mathbf Q^*\| - \frac{L}{2}.$ (D.8)

Lemma D.1.2. Assuming the Slater condition holds, and letting
$\mathcal R_D \triangleq \Big\{ \mathbf Q : \sup_{\mathbf Q^*\in\mathcal H^*} \|\mathbf Q - \mathbf Q^*\| \le D + \sqrt B \Big\}, \qquad \tau_{R_D} \triangleq \inf\{ t \ge 0 : \mathbf Q(t)\in\mathcal R_D \},$ (D.9)
then, for all $t \ge \tau_{R_D}$, $\mathbf Q(t)\in\mathcal R_D$.

Proof. We follow by induction. By the definition of $\tau_{R_D}$, we have that $\mathbf Q(t)\in\mathcal R_D$ for $t = \tau_{R_D}$. Now, assume $\mathbf Q(t)\in\mathcal R_D$ holds in iteration $t \ge \tau_{R_D}$. Then, in iteration $t+1$, the following holds: if $\mathbf Q(t)\notin\mathcal H_D$, according to Lemma D.1.1, we have that $\|\mathbf Q(t+1) - \mathbf Q^*\| \le \|\mathbf Q(t) - \mathbf Q^*\| - L/2 \le D + \sqrt B - L/2$; on the other hand, if $\mathbf Q(t)\in\mathcal H_D$, it follows that $\|\mathbf Q(t+1) - \mathbf Q^*\| \le \|\mathbf Q(t) - \mathbf Q^*\| + \|\mathbf Q(t+1) - \mathbf Q(t)\| \le D + \sqrt B$. Hence, in either case, $\mathbf Q(t+1)\in\mathcal R_D$.

Lemma D.1.3.
Letting $h_{\max} = \max_{\mathbf y}\{\mathbf w^\dagger\mathbf y\}$, if the Slater condition holds, the queue-length vector of QNSD satisfies
$\|\mathbf Q(t)\| \le \frac{B/2 + V h_{\max}}{\eta} + \sqrt B, \quad \forall t > 0,$ (D.10)
where $\eta$ is a positive number satisfying $\frac{1}{2}\big( \|\mathbf Q(t+1)\|^2 - \|\mathbf Q(t)\|^2 \big) \le B/2 + V h_{\max} - \eta\,\|\mathbf Q(t)\|$.

Proof. Starting from the queuing dynamics in (6.5), squaring both sides of the equality, recalling that $\|\mathbf A\mathbf f(t)\|^2 \le B$, and adding $V\mathbf w^\dagger\mathbf y(t)$ on both sides, after algebraic manipulation, we get
$\frac{1}{2}\big( \|\mathbf Q(t+1)\|^2 - \|\mathbf Q(t)\|^2 \big) + V\mathbf w^\dagger\mathbf y(t) \le \frac{B}{2} + V\mathbf w^\dagger\mathbf y(t) + \mathbf Q(t)^\dagger \mathbf A\mathbf f(t).$ (D.11)
As described in Section 6.4.1, QNSD computes the resource and flow vectors $[\mathbf y(t),\mathbf f(t)]$ in iteration $t$ as
$[\mathbf y(t),\mathbf f(t)] = \arg\inf_{[\mathbf y',\mathbf f']\in\mathcal X} \big\{ V\mathbf w^\dagger\mathbf y' + (\mathbf A\mathbf f')^\dagger\mathbf Q(t) \big\},$ (D.12)
which implies that, for any feasible $[\hat{\mathbf y},\hat{\mathbf f}]$,
$V\mathbf w^\dagger\mathbf y(t) + \mathbf Q(t)^\dagger \mathbf A\mathbf f(t) \le V\mathbf w^\dagger\hat{\mathbf y} + \mathbf Q(t)^\dagger \mathbf A\hat{\mathbf f}.$ (D.13)
Based on the Slater condition, there exist a positive number $\eta$ and a $[\hat{\mathbf y},\hat{\mathbf f}]$ such that $\mathbf A\hat{\mathbf f} \le -\eta\mathbf 1$, and it follows that
$V\mathbf w^\dagger\mathbf y(t) + \mathbf Q(t)^\dagger \mathbf A\mathbf f(t) \le V h_{\max} - \eta\,\|\mathbf Q(t)\|.$ (D.14)
We now consider two cases. If $\|\mathbf Q(t)\| \ge (B/2 + V h_{\max})/\eta$, using (D.14) in (D.11) yields
$\frac{1}{2}\big( \|\mathbf Q(t+1)\|^2 - \|\mathbf Q(t)\|^2 \big) \le \frac{B}{2} + V h_{\max} - V\mathbf w^\dagger\mathbf y(t) - \eta\,\|\mathbf Q(t)\| \le 0.$ (D.15)
If $\|\mathbf Q(t)\| < (B/2 + V h_{\max})/\eta$, then
$\|\mathbf Q(t+1)\| \le \|\mathbf Q(t)\| + \|\mathbf A\mathbf f(t)\| \le \frac{B/2 + V h_{\max}}{\eta} + \sqrt B.$ (D.16)
Hence, in either case, we can upper bound $\|\mathbf Q(t)\|$ by $(B/2 + V h_{\max})/\eta + \sqrt B$ for all $t > 0$.

We now proceed to prove Theorem 12. Using (D.13) and evaluating the right-hand side of (D.11) at the optimal solution $[\mathbf y^{opt},\mathbf f^{opt}]$, we obtain
$\frac{1}{2}\big( \|\mathbf Q(t+1)\|^2 - \|\mathbf Q(t)\|^2 \big) + V\mathbf w^\dagger\mathbf y(t) \le \frac{B}{2} + V\mathbf w^\dagger\mathbf y^{opt} + \mathbf Q(t)^\dagger \mathbf A\mathbf f^{opt}(t) = \frac{B}{2} + V\mathbf w^\dagger\mathbf y^{opt},$ (D.17)
where the last equality follows from $\mathbf A\mathbf f^{opt}(t) = \mathbf 0$. Now, denoting $h^{opt} = \mathbf w^\dagger\mathbf y^{opt}$, from (D.17), we have that
$\mathbf w^\dagger\mathbf y(t) - h^{opt} \le \frac{B}{2V} - \frac{1}{2V}\big( \|\mathbf Q(t+1)\|^2 - \|\mathbf Q(t)\|^2 \big).$ (D.18)
Next, taking the average of (D.18) over the $j$-th iteration frame $[t_j,\ t_j + \Delta t_j]$, we obtain
$\frac{1}{\Delta t_j + 1} \sum_{\tau=t_j}^{t_j+\Delta t_j} \mathbf w^\dagger\mathbf y(\tau) - h^{opt} \le \frac{B}{2V} + \frac{\|\mathbf Q(t_j)\|^2 - \|\mathbf Q(t_j+\Delta t_j)\|^2}{2V(\Delta t_j + 1)}.$
Next, we use the following two properties:
R1: $\tau_{R_D} = O\big( h_{\max} V \big/ (\eta L) \big) = O(mV)$.
R2: if $t_j \ge \tau_{R_D}$, then $\|\mathbf Q(t_j)\|^2 - \|\mathbf Q(t_j+\Delta t_j)\|^2 = O\big( (D+\sqrt B)\,h_{\max} V \big/ \eta \big) = O(m^2 V)$.
To prove R1, according to Lemma D.1.1, we first have $\tau_{R_D} \le \lceil 2\|\mathbf Q^* - \mathbf Q(0)\| / L \rceil = \lceil 2\|\mathbf Q^*\| / L \rceil$. Moreover, from Lemma D.1.3, note that
$\|\mathbf Q^*\| \le \|\mathbf Q(\tau_{R_D})\| + \|\mathbf Q^* - \mathbf Q(\tau_{R_D})\| \le \frac{B/2 + V h_{\max}}{\eta} + \sqrt B + D.$ (D.19)
Hence, we have
$\tau_{R_D} \le \frac{2}{L}\Big( \frac{B/2 + V h_{\max}}{\eta} + \sqrt B + D \Big),$ (D.20)
and therefore we can write $\tau_{R_D}$ as
$\tau_{R_D} = O\Big( \frac{h_{\max} V}{\eta L} \Big) = O(mV).$ (D.21)
The above relation holds true because $h_{\max} \le m\,h_0^{\max} = O(m)$, where $h_0^{\max}$ is the maximum per-node cost. To prove R2, according to Lemma D.1.2, if $t_j \ge \tau_{R_D}$, by using (D.20), then we have
$\|\mathbf Q(t_j)\|^2 - \|\mathbf Q(t_j+\Delta t_j)\|^2 = \|\mathbf Q(t_j) - \mathbf Q^*\|^2 - \|\mathbf Q(t_j+\Delta t_j) - \mathbf Q^*\|^2 + 2\,\mathbf Q^{*\dagger}\big[ \mathbf Q(t_j) - \mathbf Q(t_j+\Delta t_j) \big] \le \|\mathbf Q(t_j) - \mathbf Q^*\|^2 + 2\,\|\mathbf Q^*\|\,\|\mathbf Q(t_j) - \mathbf Q(t_j+\Delta t_j)\| \le \big( D+\sqrt B \big)^2 + 2\,\|\mathbf Q^*\|\big( D+\sqrt B \big) = O\Big( \frac{(D+\sqrt B)\,h_{\max} V}{\eta} \Big) = O(m^2 V),$ (D.22)
where the last equality is due to $h_{\max} = O(m)$, $D = O(B)$, and
$B \le 2\sum_u \Bigg[ \sum_{v\in\delta^-(u)} \bigg( \frac{c_{vu}}{\min_{(\phi,i)} r_{vu}^{(\phi,i)}} \bigg)^2 + \sum_{v\in\delta^+(u)} \bigg( \frac{c_{uv}}{\min_{(\phi,i)} r_{uv}^{(\phi,i)}} \bigg)^2 \Bigg] \le 2m \max_{(u,v)\in\mathcal E_a} \bigg( \frac{c_{uv}}{\min_{(\phi,i)} r_{uv}^{(\phi,i)}} \bigg)^2 = O(m).$ (D.23)
Using R1 and R2, and letting $V = 1/\epsilon$ and $\Delta t_j \ge \lceil mV - 1 \rceil$, it follows from (D.18) that, for all $t_j \ge \tau_{R_D}$,
$\mathbf w^\dagger \Bigg( \sum_{\tau=t_j}^{t_j+\Delta t_j} \frac{\mathbf y(\tau)}{\Delta t_j+1} \Bigg) - h^{opt} \le \frac{B}{2V} + \frac{h_{\max}\big( D+\sqrt B \big)}{\eta\,(\Delta t_j+1)} + \frac{3\big( D+\sqrt B \big)^2 + B\big( D+\sqrt B \big)}{2V(\Delta t_j+1)} = O\Bigg( \frac{B}{2V} + \frac{h_{\max}\big( D+\sqrt B \big)}{\eta\, m V} \Bigg) = O\Big( \frac{m}{V} \Big) = O(\epsilon m).$ (D.24)
Observing that $\bar y_{uv} = \sum_{\tau=t_j}^{t_j+\Delta t_j} \frac{y_{uv}(\tau)}{\Delta t_j+1}$, Eq. (6.7) in Theorem 12 follows immediately. In order to prove Eq. (6.8), starting from (6.5), we get
$\mathbf Q(t+1) \ge \mathbf Q(t) + \mathbf A\mathbf f(t),$ (D.25)
from which it follows that, for all $t_j \ge \tau_{R_D}$,
$\sum_{\tau=t_j}^{t_j+\Delta t_j} \frac{\mathbf A\mathbf f(\tau)}{\Delta t_j+1} \le \frac{\mathbf Q(t_j+\Delta t_j) - \mathbf Q(t_j)}{\Delta t_j+1}.$ (D.26)
Finally, the $\{u,(d,\phi,i)\}$-th element of the vectors in (D.26) satisfies
$\sum_{v\in\delta^-(u)} \bar f_{vu}^{(d,\phi,i)} - \sum_{v\in\delta^+(u)} \bar f_{uv}^{(d,\phi,i)} \le \frac{\big| Q_u^{(d,\phi,i)}(t_j+\Delta t_j) - Q_u^* \big| + \big| Q_u^{(d,\phi,i)}(t_j) - Q_u^* \big|}{\Delta t_j+1} \le \frac{2\big( D+\sqrt B \big)}{mV} = O\Big( \frac{1}{V} \Big) = O(\epsilon),$ (D.27)
where
$\bar f_{vu}^{(d,\phi,i)} = \sum_{\tau=t_j}^{t_j+\Delta t_j} \frac{f_{vu}^{(d,\phi,i)}(\tau)}{\Delta t_j+1}, \qquad \bar f_{uv}^{(d,\phi,i)} = \sum_{\tau=t_j}^{t_j+\Delta t_j} \frac{f_{uv}^{(d,\phi,i)}(\tau)}{\Delta t_j+1}.$
Note that $\tau_{R_D} = O(mV) = O(m/\epsilon)$ due to (D.21), and $\Delta t_j \ge \lceil mV - 1 \rceil = O(m/\epsilon)$.
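Both (D.2) and (D.25) rest on elementary, easily checked properties of the componentwise projection $[\cdot]^+$ used in the queue update (6.5): it is non-expansive toward any nonnegative point, and it can only raise components relative to the unprojected update. A quick numeric check (names ours):

```python
def queue_update(Q, Af):
    """One queue iteration of the form Q(t+1) = [Q(t) + A f(t)]^+,
    with the positive part taken componentwise."""
    return [max(q + d, 0.0) for q, d in zip(Q, Af)]

def dist(a, b):
    """Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

The non-expansiveness gives the first inequality in (D.2), and the componentwise domination gives (D.25); both hold for every realization, not merely in expectation.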
Thus, from (D.24) and (D.27), for all $t \ge T(\epsilon) \triangleq t_j + \Delta t_j + 1 \ge \tau_{R_D} + mV = O(m/\epsilon)$, the following relations hold:
$\sum_{(u,v)\in\mathcal E_a} w_{uv}\,\bar y_{uv} \le h^{opt} + O(\epsilon m);$ (D.28)
$\sum_{v\in\delta^-(u)} \bar f_{vu}^{(d,\phi,i)} - \sum_{v\in\delta^+(u)} \bar f_{uv}^{(d,\phi,i)} \le O(\epsilon), \quad \forall u,d,\phi,i.$ (D.29)
Note that since QNSD chooses the $j$-th iteration frame as $j = \lfloor \log_2 t \rfloor$, $t_j = 2^j$, and $\Delta t_j = 2^j$, there exists a $j$ such that $2^j \in [\max\{\tau_{R_D}, mV+1\},\ 2\max\{\tau_{R_D}, mV+1\})$, and hence Eqs. (D.28) and (D.29) hold. In addition, since $f_{uv}^{(d,\phi,i)}(t)$ and $y_{uv}(t)$ satisfy (6.1c)-(6.1g) in each iteration of QNSD, so does the final solution $\{\bar f_{uv}^{(d,\phi,i)}, \bar y_{uv}\}$, which concludes the proof of Theorem 12.

Bibliography

[1] A. F. Molisch. Wireless Communications, 2nd edition. IEEE Press - Wiley, 2011.
[2] J. Kleinberg and E. Tardos. Algorithm Design. Addison Wesley, 2005.
[3] A. V. Goldberg and S. Rao. Beyond the flow decomposition barrier. Journal of the ACM (JACM), 45(5):783–797, 1998.
[4] T. Leighton, F. Makedon, S. Plotkin, C. Stein, É. Tardos, and S. Tragoudas. Fast approximation algorithms for multicommodity flow problems. Journal of Computer and System Sciences, 50(2):228–243, 1995.
[5] T. Leighton and S. Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM (JACM), 46(6):787–832, 1999.
[6] S. Biswas and R. Morris. ExOR: Opportunistic multi-hop routing for wireless networks. In ACM SIGCOMM Computer Communication Review, volume 35, pages 133–144, 2005.
[7] D. S. J. De Couto, D. Aguayo, J. Bicket, and R. Morris. A high-throughput path metric for multi-hop wireless routing. Wireless Networks, 11(4):419–434, 2005.
[8] E. Rozner, J. Seshadri, Y. Mehta, and L. Qiu. SOAR: Simple opportunistic adaptive routing protocol for wireless mesh networks. Mobile Computing, IEEE Transactions on, 8(12):1622–1635, 2009.
[9] L. Georgiadis, M. J. Neely, and L. Tassiulas. Resource allocation and cross-layer control in wireless networks. Now Publishers, 2006.
[10] M. J. Neely.
Stochastic network optimization with application to communication and queueing systems. Synthesis Lectures on Communication Networks, 3(1):1–211, 2010. [11] L. Tassiulas and A. Ephremides. Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. Automatic Control, IEEE Transactions on, 37(12):1936–1948, 1992. 155 [12] L. Tassiulas and A. Ephremides. Dynamic server allocation to parallel queues with randomly varying connectivity. Information Theory, IEEE Transactions on, 39(2):466–478, 1993. [13] M. J. Neely, E. Modiano, and C. E. Rohrs. Power and server allocation in a multi-beam satellite with time varying channels. In INFOCOM 2002. Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, volume 3, pages 1451–1460. IEEE, 2002. [14] M. J. Neely, E. Modiano, and C. E. Rohrs. Dynamic power allocation and routing for time-varying wireless networks. Selected Areas in Communications, IEEE Journal on, 23(1):89–103, 2005. [15] E. M. Yeh and R. A. Berry. Throughput optimal control of cooperative relay networks. Information Theory, IEEE Transactions on, 53(10):3827–3833, 2007. [16] M. Luby. LT codes. In Foundations of Computer Science, 2002. Proceedings. The 43rd Annual IEEE Symposium on, pages 271 – 280, 2002. [17] J. W. Byers, M. Luby, and M. Mitzenmacher. A digital fountain approach to asynchronous reliable multicast. Selected Areas in Communications, IEEE Journal on, 20(8):1528–1540, 2002. [18] A. Shokrollahi. Raptor codes. Information Theory, IEEE Transactions on, 52(6):2551–2567, 2006. [19] J. Castura, Y . Mao, and S. Draper. On rateless coding over fading channels with delay constraints. In Information Theory, 2006 IEEE International Symposium on, pages 1124–1128. IEEE, 2006. [20] J. Castura and Y . Mao. Rateless coding for wireless relay channels. Wireless Communications, IEEE Transactions on, 6(5):1638 –1642, may 2007. [21] Y . Liu. 
A low complexity protocol for relay channels employing rateless codes and acknowledgement. In Information Theory, 2006 IEEE International Symposium on, pages 1244–1248. IEEE, 2006. [22] A. F. Molisch, N. B. Mehta, J. S. Yedidia, and J. Zhang. Performance of fountain codes in collaborative relay networks. Wireless Communications, IEEE Transactions on, 6(11):4108–4119, 2007. [23] G. Caire and D. Tuninetti. The throughput of hybrid-ARQ protocols for the gaussian collision channel. Information Theory, IEEE Transactions on, 47(5):1971–1988, 2001. [24] H. Shirani-Mehr, H. Papadopoulos, S.A. Ramprashad, and G. Caire. Joint scheduling and ARQ for MU-MIMO downlink in the presence of inter-cell interference. Communications, IEEE Transactions on, 59(2):578–589, 2011. [25] S. C. Draper, L. Liu, A. F. Molisch, and J. S. Yedidia. Cooperative routing for wireless networks using mutual-information accumulation. Information Theory, IEEE Transactions on, 57(8):5757–5762, 2011. [26] R. Urgaonkar and M. J. Neely. Optimal routing with mutual information accumulation in wireless networks. Selected Areas in Communications, IEEE Journal on, 30(9):1730 –1737, october 2012. 156 [27] M. Kodialam and T. Nandagopal. Characterizing achievable rates in multi-hop wireless networks: the joint routing and scheduling problem. In Proceedings of the 9th annual international conference on Mobile computing and networking, pages 42–54. ACM, 2003. [28] K. Jain, J. Padhye, V . N. Padmanabhan, and L. Qiu. Impact of interference on multi-hop wireless network performance. Wireless networks, 11(4):471–487, 2005. [29] J. Grönkvist. Interference-based scheduling in spatial reuse TDMA. 2005. [30] R. L. Cruz and A. V . Santhanam. Optimal routing, link scheduling and power control in multihop wireless networks. In INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications. IEEE Societies, volume 1, pages 702–711 vol.1, March 2003. 
[31] Metro network traffic growth: An architecture impact study. Bell Labs Strategic White Paper, December 2013. [32] M. Weldon. Defining the future of networks. CommsDay Summit 2014, April 2014. [33] The programmable cloud network - a primer on SDN and NFV. Alcatel-Lucent Strategic White Paper, June 2013. [34] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz. Near optimal placement of virtual network functions. In Computer Communications (INFOCOM), 2015 IEEE Conference on, pages 1346–1354. IEEE, 2015. [35] M. Barcelo, J. Llorca, A. M. Tulino, and N. Raman. The cloud service distribution problem in distributed cloud networks. In Communications (ICC), 2015 IEEE International Conference on, pages 344–350. IEEE, 2015. [36] S. Jeschke, C. Brecher, T. Meisen, D. Özdemir, and T. Eschert. Industrial internet of things and cyber manufacturing systems. In Industrial Internet of Things, pages 3–19. Springer, 2017. [37] A. B. Craig. Understanding augmented reality: Concepts and applications. Newnes, 2013. [38] M. J. Neely and R. Urgaonkar. Optimal backpressure routing for wireless networks with multi-receiver diversity. Ad Hoc Networks, 7(5):862–881, 2009. [39] G. Li and H. Liu. Resource allocation for OFDMA relay networks with fairness constraints. Selected Areas in Communications, IEEE Journal on, 24(11):2061–2069, 2006. [40] K. Karakayali, J. H. Kang, M. Kodialam, and K. Balachandran. Cross-layer optimization for OFDMA- based wireless mesh backhaul networks. In Wireless Communications and Networking Conference, 2007.WCNC 2007. IEEE, pages 276–281, 2007. [41] R. Rashtchi, R. H. Gohary, and H. Yanikomeroglu. Joint routing, scheduling and power allocation in OFDMA wireless ad hoc networks. In Communications (ICC), 2012 IEEE International Conference on, pages 5483–5487, 2012. 157 [42] M. Jain, J. Il Choi, T. Kim, D. Bharadia, S. Seth, K. Srinivasan, P. Levis, S. Katti, and P. Sinha. Practical, real-time, full duplex wireless. 
In Proceedings of the 17th annual international conference on Mobile computing and networking, pages 301–312. ACM, 2011. [43] A Sahai, G. Patel, and A Sabharwal. Asynchronous full-duplex wireless. In Communication Systems and Networks (COMSNETS), 2012 Fourth International Conference on, pages 1–9, Jan 2012. [44] L. Georgiadis, M. J. Neely, and L. Tassiulas. Resource allocation and cross-layer control in wireless networks. Foundations and Trends R in Networking, 1(1):1–144, 2006. [45] M. J Neely. Stochastic network optimization with application to communication and queueing systems. Synthesis Lectures on Communication Networks, 3(1):1–211, 2010. [46] L. Tassiulas and A. Ephremides. Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE transactions on automatic control, 37(12):1936–1948, 1992. [47] M. J Neely and R. Urgaonkar. Optimal backpressure routing for wireless networks with multi-receiver diversity. Ad Hoc Networks, 7(5):862–881, 2009. [48] S. Supittayapornpong and M. J Neely. Quality of information maximization for wireless networks via a fully separable quadratic policy. IEEE/ACM Transactions on Networking, 23(2):574–586, 2015. [49] M. Chiang and T. Zhang. Fog and iot: An overview of research opportunities. IEEE Internet of Things Journal, 3(6):854–864, 2016. [50] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies. The case for vm-based cloudlets in mobile computing. IEEE pervasive Computing, 8(4), 2009. [51] S. Nastic, S. Sehic, D. Le, H. Truong, and S. Dustdar. Provisioning software-defined iot cloud systems. In Future Internet of Things and Cloud (FiCloud), 2014 International Conference on, pages 288–295. IEEE, 2014. [52] Marc Barcelo, Alejandro Correa, Jaime Llorca, Antonia M Tulino, Jose Lopez Vicario, and Antoni Morell. Iot-cloud service optimization in next generation smart environments. IEEE Journal on Selected Areas in Communications, 34(12):4077–4090, 2016. [53] S. 
Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004. [54] H. Feng, J. Llorca, A. M Tulino, and A. F. Molisch. Optimal dynamic cloud network control. In Communications (ICC), 2016 IEEE International Conference on, pages 1–7. IEEE, 2016. [55] M. K. Weldon. The future X network: a Bell Labs perspective. CRC press, 2016. [56] H. Feng, J. Llorca, Antonia M Tulino, and Andreas F Molisch. Dynamic network service optimization in distributed cloud networks. In Computer Communications Workshops (INFOCOM WKSHPS), 2016 IEEE Conference on, pages 300–305. IEEE, 2016. 158 [57] H. Feng, J. Llorca, A. M. Tulino, and A. F. Molisch. Optimal dynamic cloud network control. In Communications (ICC), 2016 IEEE International Conference on, pages 1–7. IEEE, 2016. [58] S. Shamai and A. Steiner. A broadcast approach for a single-user slowly fading mimo channel. IEEE Transactions on Information Theory, 49(10):2617–2635, 2003. [59] A. M. Tulino, G. Caire, and S. Shamai. Broadcast approach for the sparse-input random-sampled mimo gaussian channel. In Information Theory (ISIT), 2014 IEEE International Symposium on, pages 621–625. IEEE, 2014. [60] A. EI Gamal and Y . Kim. Network information theory. Cambridge university press, 2011. [61] M. Barcelo, J. Llorca, A. M. Tulino, and N. Raman. The cloud service distribution problem in distributed cloud networks. In Communications (ICC), 2015 IEEE International Conference on, pages 344–350. IEEE, 2015. [62] N. Garg and J. Koenemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. SIAM Journal on Computing, 37(2):630–652, 2007. [63] D. Bienstock and G. Iyengar. Approximating fractional packings and coverings in o (1/epsilon) iterations. SIAM Journal on Computing, 35(4):825–854, 2006. [64] Z. Cao, M. Kodialam, and T. Lakshman. Traffic steering in software defined networks: Planning and online routing. In ACM SIGCOMM Computer Communication Review, volume 44, pages 65–70. ACM, 2014. 
[65] T. G. Crainic, A. Frangioni, and B. Gendron. Bundle-based relaxation methods for multicommodity capacitated fixed charge network design. Discrete Applied Mathematics, 112(1):73–99, 2001. [66] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz. Near optimal placement of virtual network functions. In Computer Communications (INFOCOM), 2015 IEEE Conference on, pages 1346–1354. IEEE, 2015. [67] N. Dragoni, S. Giallorenzo, A. L. Lafuente, M. Mazzara, F. Montesi, R. Mustafin, and L. Safina. Microservices: yesterday, today, and tomorrow. arXiv preprint arXiv:1606.04036, 2016. [68] S. Supittayapornpong and M. J Neely. Time-average stochastic optimization with non-convex decision set and its convergence. In Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), 2015 13th International Symposium on, pages 490–497. IEEE, 2015. [69] P. Tseng. On accelerated proximal gradient methods for convex-concave optimization. SIAM Journal on Optimization, 2008. [70] J. Liu, A. Eryilmaz, N. B. Shroff, and E. S Bentley. Heavy-ball: A new approach to tame delay and convergence in wireless network optimization. In Computer Communications, IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on, pages 1–9. IEEE, 2016. 159 [71] W. Rudin. Principles of mathematical analysis, volume 3. McGraw-hill New York, 1964. [72] M. J. Neely. Queue stability and probability 1 convergence via lyapunov optimization. arXiv preprint arXiv:1008.3519, 2010. [73] D. P. Bertsekas. Convex optimization theory. Athena Scientific Belmont, 2009. 160
Abstract
This thesis comprises the five works completed during my Ph.D. study to date. ❧ In the first work, algorithms are proposed and analyzed for routing in multi-hop wireless ad-hoc networks that exploit mutual information accumulation as the physical layer transmission scheme, and that are capable of routing multiple packet streams (commodities) when only average channel state information is available, and only locally. The proposed algorithms are modifications of the Diversity Backpressure (DIVBAR) algorithm, under which the packet whose commodity has the largest "backpressure metric" is chosen to be transmitted and is forwarded through the link with the largest differential backlog (queue length). In contrast to traditional DIVBAR, each receiving node stores and accumulates the partially received packet in a separate "partial packet queue", thus increasing the probability of successful reception during a later retransmission. Two variants of the algorithm are presented: DIVBAR-RMIA, under which all the receiving nodes clear the received partial information of a packet once one or more receiving nodes first decode the packet
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Optimal resource allocation and cross-layer control in cognitive and cooperative wireless networks
Scheduling and resource allocation with incomplete information in wireless networks
Dynamic routing and rate control in stochastic network optimization: from theory to practice
Adaptive resource management in distributed systems
Backpressure delay enhancement for encounter-based mobile networks while sustaining throughput optimality
Efficient delivery of augmented information services over distributed computing networks
On practical network optimization: convergence, finite buffers, and load balancing
Algorithmic aspects of energy efficient transmission in multihop cooperative wireless networks
Transport layer rate control protocols for wireless sensor networks: from theory to practice
Resource scheduling in geo-distributed computing
Optimizing task assignment for collaborative computing over heterogeneous network devices
Efficient data collection in wireless sensor networks: modeling and algorithms
Quantum computation in wireless networks
Integrated reconfigurable wireless receivers supporting carrier aggregation
Understanding the characteristics of Internet traffic dynamics in wired and wireless networks
Control and optimization of complex networked systems: wireless communication and power grids
Propagation channel characterization and interference mitigation strategies for ultrawideband systems
Robust routing and energy management in wireless sensor networks
Learning distributed representations from network data and human navigation
Modeling intermittently connected vehicular networks
Asset Metadata
Creator
Feng, Hao (author)
Core Title
Joint routing, scheduling, and resource allocation in multi-hop networks: from wireless ad-hoc networks to distributed computing networks
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
11/27/2017
Defense Date
10/13/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
cross-layer design,distributed computing networks,OAI-PMH Harvest,stochastic network optimization,virtual network function,wireless ad-hoc networks
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Molisch, Andreas F. (committee chair), Golubchik, Leana (committee member), Neely, Michael J. (committee member)
Creator Email
haofeng.fh666@gmail.com,haofeng@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-458084
Unique identifier
UC11266649
Identifier
etd-FengHao-5933.pdf (filename),usctheses-c40-458084 (legacy record id)
Legacy Identifier
etd-FengHao-5933.pdf
Dmrecord
458084
Document Type
Dissertation
Rights
Feng, Hao
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA