ENERGY-EFFICIENT COMPUTING: DATACENTERS, MOBILE DEVICES, AND MOBILE CLOUDS

by Shuang Chen

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

August 2018

Copyright 2018 Shuang Chen

Contents

List of Figures
Abstract
1 Introduction
2 Related Work
3 Peak Shaving in Geo-distributed Datacenters with Hierarchically Deployed Energy Storage Devices
  3.1 Motivation
  3.2 System model
    3.2.1 Cloud infrastructure model
    3.2.2 Cloud service model
    3.2.3 Power consumption model
    3.2.4 ESD model
  3.3 Problem formulation
    3.3.1 Electricity cost calculation
    3.3.2 QoS constraint specification
    3.3.3 Complete formulation
  3.4 Solution methods
    3.4.1 Lyapunov optimization
    3.4.2 Opportunistic control formulation
    3.4.3 Sub-problem decomposition
    3.4.4 Complete solution
  3.5 Experimental results
    3.5.1 Experimental setup
    3.5.2 Simulation results
4 Concurrent Placement, Capacity Provisioning, and Request Flow Control for a Distributed Cloud Infrastructure
  4.1 System model
    4.1.1 User behavior
    4.1.2 Datacenter location and capacity
    4.1.3 Routing and delay modeling
    4.1.4 Power consumption modeling
  4.2 Problem formulation and solution method
    4.2.1 Cost calculation
    4.2.2 Problem formulation
    4.2.3 Solution method
  4.3 Experimental results
    4.3.1 Experimental setup
    4.3.2 Simulation results
5 Optimal Offloading Control in a Mobile Cloud Computing System
  5.1 System model
    5.1.1 Overall system modeling for an MCC system
    5.1.2 Power modeling for a mobile device
    5.1.3 Modeling for the rechargeable battery
  5.2 SMDP-based problem formulation and solution framework
    5.2.1 Device modeling using SMDP
    5.2.2 Problem formulation
    5.2.3 Solution method
  5.3 Experimental results
6 Dynamic Voltage and Frequency Scaling in a Mobile Device with a Heterogeneous Computing Architecture
  6.1 System model
    6.1.1 Heterogeneous computing architecture
    6.1.2 System power consumption
  6.2 Problem formulation
  6.3 Solution method
7 Conclusion
Acknowledgments
Reference List

List of Figures

3.1 The structure of a geo-distributed cloud infrastructure with multi-level power hierarchy and energy storage devices
3.2 The energy flow among different components
3.3 Complete solution method
3.4 Adopted TOU pricing scheme
3.5 Cumulative cost with lead-acid batteries as ESDs
3.6 Cumulative cost with lithium-ion batteries as ESDs
4.1 Simulation result of Scenario 1
4.2 Simulation result of Scenario 2
5.1 System framework of an MCC system
5.2 Equivalent circuit model for Li-ion batteries [1]
5.3 Conceptual diagram of a power conversion tree
5.4 State transition diagram for the joint SMDP
5.5 Simulation result with varying request generation rate
5.6 Simulation result with varying SoC
5.7 Simulation result with varying utilization level of the server in the cloud
5.8 Simulation result with varying RTT
6.1 System model of a mobile device with a heterogeneous computing architecture

Abstract

Energy consumption, and the utility expenses it entails, is a major concern in electrical and electronic systems ranging from warehouse-size datacenters to mobile phones and tablets.
Cost-efficient management of these systems is usually partitioned into various sub-problems, each of which can be addressed by a number of methods and techniques. However, treating the sub-problems in isolation overlooks the inter-dependencies between them, resulting in a sub-optimal overall solution. Our work focuses on the joint consideration and optimization of multiple power management techniques that have been studied separately in prior work.

The methodology is demonstrated with four specific problems. We first consider the peak shaving problem in the context of a geo-distributed cloud infrastructure equipped with energy storage devices (ESDs). The ESD management problem is solved jointly with the request flow control and server consolidation problems using a Lyapunov optimization framework suitable for systems with a large number of servers. We then concurrently solve the total cost of ownership minimization problem of geo-distributed datacenters, which involves both design-time choices such as capacity provisioning and run-time management policies such as user request routing. Third, we study the computation offloading problem in a mobile device using a semi-Markov decision process in which the processing power and the transmission power of the device are controlled in a unified fashion. Lastly, we discuss the problem of joint request dispatch and dynamic voltage and frequency scaling in a heterogeneous computing architecture.

The proposed solution methods are validated with the support of realistic models and data. The experimental results reflect the superiority of our joint optimization methodology in these use cases.

Chapter 1

Introduction

Energy efficiency and energy saving have been popular fields of research. On one hand, the electric power supply has struggled to match demand since the early 2000s. There have even been large-scale blackouts in Texas and California caused by spiking power demand and/or a few malfunctioning power plants.¹
Increasing the energy generation capacity and controlling the energy consumption are the two most straightforward solutions to the problem. In fact, power management is a well-known research topic and has seen a large body of literature in the context of a variety of systems [2]. In addition, power demand shaping, i.e., guiding the demand to meet the supply via means such as dynamic pricing, can also help mitigate the energy shortage, since the profiles of energy generation and energy usage are usually highly time-dependent (due to intermittent power supplies such as wind turbines and photovoltaic cells, and the diurnal pattern of commercial and residential customers).

While grid companies price energy to discourage heavy power consumption during peak hours, efficiently adjusting power consumption is of great interest to energy consumers because it effectively cuts their utility bills. As major customers in the electricity market, datacenters were estimated to consume around 31 GW of electric power globally in 2012 [3]. Many approaches, including dynamic voltage and frequency scaling (DVFS) [4] and the use of energy storage devices (ESDs) [5, 6], have been proposed to address the high utility cost. On the other hand, for consumer electronics such as mobile phones and tablets, power consumption is a major concern because it greatly affects battery lifetime, which plays a key role in user experience. Unfortunately, while mobile phones and tablets are designed with ever more features and computational power, making them increasingly power hungry, the volumetric/gravimetric energy density of rechargeable batteries grows at a much slower pace.

¹ http://www.dallasnews.com/news/community-news/dallas/headlines/20110202-cold-cripples-50-power-plants-triggering-blackouts-for-thousands-across-dallas-fort-worth.ece
To extend the limited battery life, the concept of mobile cloud computing (MCC) [7] has been proposed, in which a mobile device shifts part of its workload to a remote cluster with abundant computing and storage resources. This way, if managed judiciously, the mobile device can trade the energy spent transmitting and receiving data for the results of computation-intensive tasks that would require much more energy to obtain locally.

In our work, which tackles the problem of energy saving and cost control for systems ranging from a warehouse-size datacenter to a mobile phone, we try to look at the big picture and connect seemingly distinct problems while keeping the complexity of the solution practically acceptable. For instance, although resource provisioning (i.e., deciding how much resource should be put within a datacenter) and resource management (e.g., task assignment and scheduling) happen at different times, these two problems interact with each other. However, this inter-dependency, especially the influence of resource management schemes on the design choices of a datacenter, is often overlooked. Specifically, our work has focused on four topics, which are briefly introduced as follows.

First, we solve the problem of joint control of the user request flow and the ESDs in datacenters in the context of dynamic utility pricing. We take into consideration various aspects of datacenter management, including request dispatch, resource allocation, and peak power shaving using battery banks at different levels of a datacenter's power hierarchy, to minimize the overall cost from the perspective of a datacenter operator. Rather than first designing management policies for each part separately and then combining them, we propose a joint optimization framework that finds the request dispatch, resource allocation, server consolidation, and battery management policies at the same time.
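The break-even intuition behind offloading can be made concrete with a back-of-the-envelope sketch. All numbers below (cycle count, CPU power, link rate, radio power) are hypothetical values chosen for illustration, not measurements from this work:

```python
# Back-of-the-envelope offloading tradeoff; all parameter values are
# hypothetical and for illustration only.

def local_energy_j(cycles: float, freq_hz: float, cpu_power_w: float) -> float:
    """Energy to run a task locally: execution time times CPU power."""
    return (cycles / freq_hz) * cpu_power_w

def offload_energy_j(data_bits: float, rate_bps: float, radio_power_w: float) -> float:
    """Energy to ship the task's input/output data over the radio link."""
    return (data_bits / rate_bps) * radio_power_w

# A compute-heavy task: 5e9 CPU cycles on a 1 GHz core drawing 1.2 W,
# versus shipping 1 Mb of data over a 5 Mb/s link with a 1.0 W radio.
e_local = local_energy_j(5e9, 1e9, 1.2)       # 5 s * 1.2 W = 6.0 J
e_offload = offload_energy_j(1e6, 5e6, 1.0)   # 0.2 s * 1.0 W = 0.2 J
print(f"local: {e_local:.1f} J, offload: {e_offload:.1f} J")
```

For this (hypothetical) task the radio energy is a small fraction of the local compute energy, which is exactly the regime in which offloading extends battery life; a data-heavy task with little computation would flip the comparison.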
By doing so, we directly address the inter-dependency between these aspects. To solve the problem efficiently, we adopt a Lyapunov optimization framework and decompose the complex optimization problem into a series of much smaller sub-problems, each addressing the control of one node in the system. A generalized time-of-use (TOU) pricing model that allows energy sell-back is considered. Batteries are installed as the energy storage devices at each level of the power hierarchy of the datacenter. Instead of using a single charging/discharging efficiency, we follow Peukert's law to realistically model the energy conversion losses within the battery.

Second, we propose a generalized joint optimization framework covering both datacenter placement/capacity provisioning and request flow control/resource allocation. The objective is to minimize the total average cost of the datacenters in the network subject to an average delay constraint. The request generation behavior of the cloud users is summarized based on the Google cluster data [8]. The resource allocation problem is formulated using the Generalized Processor Sharing (GPS) model [9, 10], in which the processor allocates a certain amount of its total computation resources to each request running on it, and the response time of a request is calculated accordingly. In the proposed model, we account for all the major aspects of the capital and operational cost of a datacenter, which can be time-dependent and/or location-dependent. In addition to building new datacenters, we also allow the option of upgrading existing datacenters. We provide an example based on the geographical information of the United States, and the optimal offline solution is found.

Third, we address the problem of application management on a mobile device in an MCC system, where requests of the application can either be processed locally or sent to the remote server.
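For concreteness, a generalized TOU tariff with energy sell-back can be sketched as a simple price function; the peak window, rates, and sell-back fraction below are assumptions made up for demonstration and do not reproduce the pricing scheme used in our experiments (Fig. 3.4):

```python
# Toy generalized TOU tariff with sell-back; all rates and hours are assumed.

PEAK_HOURS = range(12, 19)                 # assumed peak window: noon to 7 pm
BUY_RATE = {"peak": 0.24, "off": 0.10}     # $/kWh, assumed rates
SELLBACK_FRACTION = 0.5                    # sell-back price relative to buy price

def electricity_cost(hour: int, grid_kwh: float) -> float:
    """Cost of drawing grid_kwh from the grid in the given hour; negative
    grid_kwh means selling energy back, which yields a negative cost."""
    rate = BUY_RATE["peak"] if hour in PEAK_HOURS else BUY_RATE["off"]
    if grid_kwh >= 0:
        return grid_kwh * rate
    return grid_kwh * rate * SELLBACK_FRACTION

print(f"{electricity_cost(14, 10.0):.2f}")   # peak purchase of 10 kWh
print(f"{electricity_cost(3, -10.0):.2f}")   # off-peak sell-back of 10 kWh
```

With the assumed rates this prints 2.40 and -0.50; shifting battery charging into the off-peak window and discharging (or selling back) during the peak window is precisely what the joint controller exploits.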
An automatic repeat request (ARQ) protocol is adopted to provide reliable communication for request offloading in a noisy wireless environment. In the mobile device, the mobile CPU employs DVFS to dynamically adjust the processing speed and the power consumption depending on the number of waiting requests. The RF transmitter can adaptively select the most appropriate bit rate and modulation scheme for request offloading, depending on the number of waiting requests, the wireless channel capacity, anticipated server congestion levels, etc. We model the mobile device as a semi-Markov decision process (SMDP) [11], in which the states reflect the remaining workload for the mobile CPU and RF module, and the actions are decision pairs of (DVFS level, transmission bit rate). The objective function is a linear combination of the energy drawn from the battery and the average processing latency per request, in order to achieve a balance between performance and battery life. In the SMDP formulation, we account for the power consumption of mobile components that cannot be directly controlled, as well as an accurate battery model to estimate the battery energy loss, which is overlooked by most previous work. We derive the optimal solution, including the optimal offloading rate, DVFS policy, and transmission scheme, using a linear programming approach combined with a one-dimensional heuristic search.

Last but not least, we identify the similarity between the offloading control problem in an MCC system and the DVFS control problem in a heterogeneous computing architecture. The proposed SMDP formulation is extended to solve for the optimal global request dispatch and per-core DVFS policy.

The rest of this dissertation is organized as follows. Related literature is reviewed in Chapter 2. Chapters 3–6 present our work on the four aforementioned topics in detail. Finally, Chapter 7 concludes the dissertation.
Chapter 2

Related Work

Analysis of cloud computing systems: The advantages of cloud computing were reviewed in reference [12]. An overview of the architecture of a cloud computing system can be found in reference [13]. On modeling datacenters and cloud infrastructures, a package called "CloudSim" was developed [14] to simulate a datacenter's behavior under different configurations of both physical servers and virtual machines.

Design space exploration of datacenters: Some prior work has addressed the placement and/or capacity provisioning problem of datacenters. For instance, in [15], the authors identified the main costs in datacenters and stated that their placement and provisioning can have a significant impact on service profitability. The authors of [16] provided a generalized cost calculation method and considered the availability of the cloud service. In [17], the authors addressed the problem of green computing and discussed the tradeoff between the carbon footprint, the average cost, and the latency when different locations are selected to build new datacenters.

ESD modeling: The modeling of ESDs, especially various types of batteries, which is a crucial component of our work, has been studied using a number of different methodologies. References [18, 19] introduced analytical battery models based on electrochemical modeling and analysis. These models are very accurate but too complicated to be used for system-level design. In comparison, battery models in the form of equivalent electric circuits, as provided in references [20, 1], are more suitable for developing a mathematical formulation. Furthermore, reference [21] estimates the specific parameters of this battery model that fit the characteristics of Li-ion batteries. Non-ideal battery characteristics such as the rate capacity effect and capacity fading have been studied in references [22, 23] with carefully calibrated parameters.
Meanwhile, the modeling of other types of ESDs (e.g., super-capacitors and flywheels) can be found in McCluer and Chrisin's work [24].

Power management of datacenters: There exists a large amount of literature on power management techniques and algorithms in datacenters. A summary of the progress, state of the art, and challenges can be found in references [25, 26]. Among the well-known work, Raghavendra et al. [27] developed a coordinated power management algorithm and investigated its effectiveness and stability under different hardware/software configurations. More recently, concepts including power capping and peak shaving have been proposed to further reduce the utility cost of a datacenter by matching the power consumption profile with the dynamic utility pricing policy. Power capping/peak shaving in a datacenter is achieved via either a single technique or a combination of techniques. For instance, Meisner et al. [28] proposed the use of DVFS, Nathuji et al. [29] integrated the control into the virtual machine power manager, and Buchbinder et al. [30] tackled the problem by designing a smart online job migration algorithm. The use of ESDs is a more recently proposed idea [6, 31, 5]. Wang et al. [6] considered the design choice of the type and the capacity of the ESDs to be installed at each point in the power hierarchy of the datacenter, in addition to the management of these ESDs. Kontorinis et al. [31] placed more emphasis on the design of a real-time controller. Aksanli et al. [5] pointed out that the batteries used for peak shaving should be accurately modeled to prevent misleading and over-optimistic results.

Computation offloading: Discussions of offloading policies in cloud computing systems can be found in a series of prior work. Reference [32] reaches the conclusion that an application or task with high computation but limited data communication requirements could benefit the most from computation offloading.
A comparison of the power consumption of local execution versus remote execution is made in [33], in which computation offloading decisions are based on a joint consideration of the application's latency deadline, data size, and wireless channel condition. Reference [34] addresses the problem of carbon footprint profiling and optimization. A number of dynamic computation offloading schemes are presented in references [35, 36, 37]. Finally, some runtime offloading frameworks have been proposed for specific applications [38, 39].

Lyapunov optimization on related topics: The Lyapunov optimization framework adopted in this dissertation has proven effective in various areas of research, especially in the context of computer networks. For instance, Tassiulas and Ephremides [40] used it to maximize the throughput of a network in which not all links can be activated at the same time. Neely et al. [41] managed to decouple the routing problem on independent portions of the network in their study of the fairness-latency tradeoff in a heterogeneous network. Urgaonkar et al. [42] were among the first to apply the Lyapunov optimization framework to the problem of energy storage management in datacenters. The work was further extended by Guo et al. [43] to multiple geo-distributed datacenters. However, both papers treat ESD management as a problem independent of the request flow control problem, and the discussion is limited to the centralized deployment scheme of ESDs.

Chapter 3

Peak Shaving in Geo-distributed Datacenters with Hierarchically Deployed Energy Storage Devices

In this chapter, we propose a solution to the peak shaving problem in a geo-distributed cloud infrastructure with hierarchical ESD deployment by jointly considering request dispatch, resource allocation, server consolidation, and ESD management. We first develop a formal optimization framework capable of modeling different ESD deployment schemes.
Then, applying Lyapunov optimization theory, we derive a scalable algorithm whose complexity grows linearly with the total number of nodes in the system.

3.1 Motivation

ESDs have primarily been used as UPS units in datacenters to temporarily maintain the power supply in the event of a power outage before the diesel generators are brought online. Since it is common for a datacenter to over-provision the capacity of its ESDs by three to five times, there is an opportunity to leverage the ESDs for peak shaving without significantly modifying the power infrastructure of the datacenter [44]. In addition, compared to DVFS and control knobs at the operating system level, charging and discharging batteries or other types of energy storage devices does not interfere with running tasks, and hence causes negligible quality-of-service degradation. Nevertheless, the amount of energy drawn from the power grid over time is an aggregate of all power-consuming elements in the datacenter, including both servers and ESDs. It is therefore crucial to jointly consider server management (e.g., VM mapping, power state switching, task scheduling, etc.) to optimize the utility cost.

As pointed out by studies of ESD topologies in datacenters [45], distributed deployment of ESDs, or a hybrid deployment of a centralized ESD and distributed ESDs, outperforms the centralized-only deployment scheme in terms of energy efficiency. This conclusion is also supported by the fact that distributed ESDs at the rack level or the server level are commonly adopted in the datacenters of Google¹ and Facebook². However, a distributed or hybrid ESD deployment scheme also implies that a large number of ESDs are used, which can make optimal fine-grained control of each individual ESD intractable.

The prevalence of geo-distributed cloud infrastructures comprised of multiple datacenters at different sites makes peak shaving in datacenters an even more challenging problem.
While geo-distributed datacenters are widely adopted by the industry (e.g., Amazon EC2, Microsoft Azure, etc.) to facilitate user access from different parts of the world, the cloud infrastructure is faced not only with more servers, and hence more control variables, but also with utility markets with different pricing schemes. Meanwhile, variations in resource availability and utility pricing between different sites also provide extra room for further operational cost reduction.

¹ https://www.cnet.com/news/google-uncloaks-once-secret-server-10209580/
² http://opencompute.org/projects/rack-and-power/

Table 3.1: List of symbols and definitions

  Symbol                              Definition
  E^{bat}_{k,m}                       The amount of remaining energy within the ESD at node (k,m)
  E^{ch}_{k,m}, E^{ds}_{k,m}          The amount of charging/discharging energy in a time slot for the ESD at node (k,m)
  E^{ch,i}_{k,m}, E^{ds,i}_{k,m}      The amount of increased/decreased internal energy in a time slot within the ESD at node (k,m)
  E^{srv}_{k,m}                       The energy consumption of components at node (k,m) other than the ESD in a time slot
  T^{req}_{i}                         The average response time required by user i
  T^{rtt}_{i,j}                       The round-trip time between user i and server j
  Λ_i                                 The total request arrival rate from user i
  μ_j                                 The average request processing rate of server j
  x_{i,j}                             A binary variable indicating whether requests from user i are routed to server j (1) or not (0)
  y_j                                 A binary variable indicating whether server j is switched ON (1) or OFF (0)
  λ_{i,j}                             The request arrival rate at server j from user i
  φ_{i,j}                             The portion of resources allocated by server j to user i
  η^{ch}, η^{ds}                      Energy conversion efficiency during charging/discharging of an ESD
  η^{tr}_k                            Energy transmission efficiency between levels k and k+1 in the infrastructure
  γ_C, γ_D                            Peukert factors

3.2 System model

We now introduce the key assumptions regarding the structure of the datacenters and the interaction between the cloud and its users.
To improve readability, the definitions of some frequently used symbols in this chapter are summarized in Table 3.1.

3.2.1 Cloud infrastructure model

In this chapter, we model the power delivery infrastructure of the set of geo-distributed datacenters hosting cloud services as a tree with five levels, as shown in Fig. 3.1.³ The root of the tree represents the power grid, while the lower levels of the tree represent the entry points of a datacenter, a power distribution unit (PDU), a server rack, and a single blade server, respectively. In addition to other nodes in the tree, a node can be connected to computation, networking, and/or energy storage elements such as servers, switches, batteries, etc. As an example, a leaf node is mapped from a blade server and a server-level ESD.

We label a node in the tree using a pair of integers (k,m), where k ∈ {0,...,K} is its level in the tree (K = 4 in our model) and m = 1, 2,...,M_k is its index within level k. The nominal energy storage capacity (in units of Joules or kilowatt-hours) of the battery array at node (k,m) is denoted by E^{hi}_{k,m}. In the case that battery arrays are installed on only a portion of the nodes, one can simply set E^{hi}_{k,m} to 0 for the nodes without energy storage capability. All battery arrays and power-consuming elements can exchange energy with the power grid through power lines but may suffer energy losses in transmission. The transmission efficiency between two adjacent levels k and k+1 is captured by η^{tr}_k ∈ [0,1). To simplify the notation, the transmission efficiency is considered uniform among all nodes in the same layer, in both directions. Our formulation does not depend on this simplification and can be straightforwardly extended to a more general model with heterogeneous transmission efficiencies.
Furthermore, we define A(k,m) as the set of ancestors of node (k,m), C(k,m) as the set of child nodes of node (k,m), and D(k,m) as the set of nodes in the subtree rooted at node (k,m), i.e., node (k,m) and all its descendants.

³ While the power conversion modules are not explicitly shown in the figure, the energy conversion and transmission losses are accounted for in the problem formulation section.

[Figure 3.1: The structure of a geo-distributed cloud infrastructure with multi-level power hierarchy and energy storage devices (levels, top to bottom: power grid, datacenter, PDU, rack, server)]

3.2.2 Cloud service model

Since the problem we are interested in is at the scale of a large country such as the United States, it is neither feasible nor necessary to address each individual user in the network. Instead, we divide the whole country into small regions and aggregate the individual users in each region (e.g., a district or a city) into one user node i. According to reference [46], the amount of user activity can be four times greater in peak hours than in off-peak hours, suggesting significant diurnal fluctuation in the service workload. Hence, we use a slotted time model to capture the workload in the cloud and denote the request arrival rate from user i in time slot t by Λ_i[t]. Depending on the tradeoff between accuracy and management overhead, the duration of each time slot can range from a few minutes to more than one hour.

In each time slot, the requests from a user can be served by a subset of servers. For ease of expression, we use "server j" as shorthand for the server at node (K,j). A set of binary variables {x_{i,j}[t]} is defined, in which x_{i,j}[t] is set to 1 if user i is served by server j in time slot t, and 0 otherwise.
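To make the labeling concrete, the following sketch builds the (k,m) indexing and the A(k,m), C(k,m), and D(k,m) sets for a toy tree with a uniform fan-out of two; the fan-out value and the helper-function names are illustrative assumptions, not part of the formulation:

```python
# Toy power-hierarchy tree with K = 4 levels (grid -> datacenter -> PDU ->
# rack -> server) and a uniform fan-out of 2. Nodes are labeled (k, m)
# with level k in {0,...,K} and 1-based index m within the level.

FANOUT = 2  # assumed fan-out for this toy example

def parent(k: int, m: int):
    """Parent of node (k, m); the root (0, 1) has none."""
    return None if k == 0 else (k - 1, (m - 1) // FANOUT + 1)

def children(k: int, m: int, K: int = 4):
    """C(k,m): the child nodes one level down; empty for leaf (server) nodes."""
    if k == K:
        return []
    return [(k + 1, (m - 1) * FANOUT + c) for c in range(1, FANOUT + 1)]

def ancestors(k: int, m: int):
    """A(k,m): all nodes on the path from (k,m) up to the root."""
    out, node = [], parent(k, m)
    while node is not None:
        out.append(node)
        node = parent(*node)
    return out

def subtree(k: int, m: int, K: int = 4):
    """D(k,m): node (k,m) together with all of its descendants."""
    out = [(k, m)]
    for child in children(k, m, K):
        out.extend(subtree(*child, K))
    return out

print(ancestors(4, 1))     # [(3, 1), (2, 1), (1, 1), (0, 1)]
print(children(2, 3))      # [(3, 5), (3, 6)]
print(len(subtree(0, 1)))  # 31 nodes in total for fan-out 2 and K = 4
```

The same indexing scheme extends to non-uniform fan-outs by storing an explicit child list per node instead of computing indices arithmetically.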
To estimate the quality of service (QoS) of user requests, we apply the generalized processor sharing (GPS) model [9, 10], in which a server allocates a certain amount of its total computation resources to each user. Let μ_j denote the expected processing rate of server j and φ_{i,j}[t] denote the portion of resources in server j allocated to user i in time slot t. The average response time, denoted by R_{i,j}[t], can then be calculated as

$$
R_{i,j}[t] =
\begin{cases}
0, & x_{i,j}[t] = 0 \\
\infty, & \mu_j \varphi_{i,j}[t] \le \lambda_{i,j}[t] \\
\dfrac{1}{\mu_j \varphi_{i,j}[t] - \lambda_{i,j}[t]} + T^{rtt}_{i,j}, & \text{otherwise}
\end{cases}
\tag{3.1}
$$

where λ_{i,j}[t] is the request arrival rate at server j from user i in time slot t, and T^{rtt}_{i,j} is the round-trip time between user i and server j, which is generally a function of the distance between the user and the datacenter.

3.2.3 Power consumption model

For a specific node (k,m) in the tree shown in Fig. 3.1, the energy flow is shown in Fig. 3.2. Using the direction of the arrows as the positive directions, the amount of energy flowing during time slot t among the various components connected to the node can be modeled as follows:

$$
E^{in\uparrow}_{k,m}[t] + E^{out\downarrow}_{k,m}[t] + E^{ch}_{k,m}[t] + E^{srv}_{k,m}[t] \equiv E^{in\downarrow}_{k,m}[t] + E^{out\uparrow}_{k,m}[t] + E^{ds}_{k,m}[t]
\tag{3.2}
$$

[Figure 3.2: The energy flow among different components]

where E^{in↑}_{k,m}[t] and E^{in↓}_{k,m}[t] capture the energy transfer with the parent of node (k,m), E^{out↑}_{k,m}[t] and E^{out↓}_{k,m}[t] capture the energy transfer with the children of node (k,m), E^{ch}_{k,m}[t] and E^{ds}_{k,m}[t] are the amounts of energy used for battery charging/discharging, and E^{srv}_{k,m}[t] is the energy consumed by the other devices (e.g., servers, PSUs, etc.) connected to node (k,m). Furthermore, the energy flow between a node (k,m) and its child nodes in C(k,m) satisfies the following equations:

$$
E^{out\uparrow}_{k,m}[t] \equiv \sum_{(k',m') \in C(k,m)} \eta^{tr}_{k}\, E^{in\uparrow}_{k',m'}[t]
\tag{3.3}
$$

$$
E^{out\downarrow}_{k,m}[t] \equiv \sum_{(k',m') \in C(k,m)} \frac{1}{\eta^{tr}_{k}}\, E^{in\downarrow}_{k',m'}[t]
\tag{3.4}
$$

Since leaf nodes, i.e.,
server nodes in our system, do not have any children, we have E^{out↑}_{K,m}[t] = E^{out↓}_{K,m}[t] ≡ 0.

Note that while we define E^{in↑}_{k,m}[t] and E^{in↓}_{k,m}[t] as two separate non-negative variables for convenience of problem formulation, at least one of them must be zero for any given node, because the energy flow on any power line can have only one direction. In contrast, both E^{out↑}_{k,m}[t] and E^{out↓}_{k,m}[t] can be positive, because these two terms are defined without physical implication to capture the summations of the energy flows to each child node.

For components such as network switches and routers, we model the power consumption as a constant because of the small difference between the idle and peak power consumption of these devices. For instance, for a TP-Link TL-SG1008P switch, the idle power is 4.2 W while the peak power is 4.3 W when connected to six clients. On the other hand, the fluctuation of power consumption can be more significant in servers. We adopt the model from Gandhi et al. [47], in which the energy consumption of server j in time slot t, i.e., E^{srv}_{K,j}[t], is a function of the utilization level of the server and can be calculated as

$$
E^{srv}_{K,j}[t] = y_j[t] \cdot \left( P^{\varphi}_{j} \left( \sum_{i} \varphi_{i,j}[t] \right)^{\gamma_F} + P^{con}_{j} \right) \tau
\tag{3.5}
$$

where P^{φ}_j, γ_F, and P^{con}_j are server-dependent coefficients, τ is the duration of a time slot, and y_j[t] is a binary variable which is set to 1 if and only if server j is powered on. More specifically, P^{con}_j is the idle power consumption of the server, while P^{φ}_j captures the fluctuation of the server's power consumption due to its utilization level. γ_F ≥ 1 is a coefficient that accounts for power state changes and DVFS happening in the server. Note that power overheads such as the cooling power consumption are also included in the coefficients P^{φ}_j and P^{con}_j.

3.2.4 ESD model

In this chapter, we use battery arrays as the ESDs in datacenters, but the work can be straightforwardly extended to account for other types of ESDs. In addition to the
In addition to the loss of energy transmission/conversion during the charging and discharging process, we also consider the rate capacity effect from Peukert's law [22]. More specifically, the rate capacity effect means that the rate of energy storage gain/drain inside a battery differs from the input/output power seen from outside at the terminal of the battery. If we denote the input power of the battery by $P_{bat}$ and the internal energy increase rate of the battery by $P_{bat,i}$, then the relationship between $P_{bat}$ and $P_{bat,i}$ can be expressed as

$$P_{bat} = \begin{cases} \dfrac{1}{\eta_{ch}} V_{bat} I_{ref} \cdot \left( \dfrac{P_{bat,i}}{V_{bat} I_{ref}} \right)^{\gamma_C}, & \dfrac{P_{bat,i}}{V_{bat} I_{ref}} > 1 \\[2mm] \dfrac{1}{\eta_{ch}} P_{bat,i}, & 0 \le \dfrac{P_{bat,i}}{V_{bat} I_{ref}} \le 1 \\[2mm] \eta_{ds} P_{bat,i}, & -1 \le \dfrac{P_{bat,i}}{V_{bat} I_{ref}} < 0 \\[2mm] -\eta_{ds} V_{bat} I_{ref} \cdot \left( \dfrac{|P_{bat,i}|}{V_{bat} I_{ref}} \right)^{1/\gamma_D}, & \dfrac{P_{bat,i}}{V_{bat} I_{ref}} < -1 \end{cases} \qquad (3.6)$$

where $\gamma_C$ and $\gamma_D$ are the Peukert factors for charging and discharging, respectively, with typical values ranging from 1.1 to 1.3 depending on the battery type; $\eta_{ch}$ and $\eta_{ds}$ are the power conversion efficiencies of the charging and discharging processes, respectively; $V_{bat}$ is the storage terminal voltage, which is near-constant; and $I_{ref}$ is the reference discharging current with negligible rate capacity effect, which is proportional to the nominal capacity of the battery. When $P_{bat,i} > 0$, Eqn. (3.6) corresponds to the charging process, and when $P_{bat,i} < 0$, it corresponds to the discharging process. Since both $\gamma_C$ and $\gamma_D$ are greater than 1, and both $\eta_{ch}$ and $\eta_{ds}$ are less than 1, the energy efficiency of the battery would be over-estimated if the rate capacity effect were ignored.

To be consistent with the notation in Eqn. (3.2), we define $E^{ch,i}_{k,m}[t] \ge 0$ and $E^{ds,i}_{k,m}[t] \ge 0$ as the internal energy increase and decrease of the battery at node $(k,m)$, respectively. Then we can derive from Eqn.
(3.6) that

$$E^{ch}_{k,m}[t] = \begin{cases} \dfrac{\tau}{\eta_{ch}} V_{bat} I_{ref} \cdot \left( \dfrac{E^{ch,i}_{k,m}[t]}{V_{bat} I_{ref} \tau} \right)^{\gamma_C}, & \dfrac{E^{ch,i}_{k,m}[t]}{V_{bat} I_{ref} \tau} > 1 \\[2mm] \dfrac{1}{\eta_{ch}} E^{ch,i}_{k,m}[t], & \text{otherwise} \end{cases} \qquad (3.7)$$

$$E^{ds}_{k,m}[t] = \begin{cases} \tau \eta_{ds} V_{bat} I_{ref} \cdot \left( \dfrac{E^{ds,i}_{k,m}[t]}{V_{bat} I_{ref} \tau} \right)^{1/\gamma_D}, & \dfrac{E^{ds,i}_{k,m}[t]}{V_{bat} I_{ref} \tau} > 1 \\[2mm] \eta_{ds} E^{ds,i}_{k,m}[t], & \text{otherwise} \end{cases} \qquad (3.8)$$

Denoting the amount of energy remaining in the batteries at node $(k,m)$ at the beginning of time slot $t$ by $E^{bat}_{k,m}[t]$, the change of state of charge (SoC) of the batteries can be expressed as

$$E^{bat}_{k,m}[t+1] - E^{bat}_{k,m}[t] = E^{ch,i}_{k,m}[t] - E^{ds,i}_{k,m}[t] \qquad (3.9)$$

We ignore the self-discharge of batteries since its effect is negligible for our problem (less than 5% per month for most types of rechargeable batteries [48]).

3.3 Problem formulation

In this section, we formally describe the formulation of an electricity cost minimization problem subject to QoS constraints. Unless otherwise noted, the decision variables in our formulation are the $\lambda_{i,j}[t]$'s, $\phi_{i,j}[t]$'s, $E^{ch,i}_{k,m}[t]$'s, $E^{ds,i}_{k,m}[t]$'s, $x_{i,j}[t]$'s, and $y_j[t]$'s. While the $E^{ch,i}_{k,m}[t]$'s and $E^{ds,i}_{k,m}[t]$'s cannot be directly controlled, we use them as decision variables for convenience of formulation. Since there exists a one-to-one mapping between $E^{ch,i}_{k,m}[t]$ (or $E^{ds,i}_{k,m}[t]$) and $E^{ch}_{k,m}[t]$ (or $E^{ds}_{k,m}[t]$), we can derive the needed charging/discharging power after solving the optimization problem. The focus of the proposed solution is the online control of the charging/discharging of the ESDs given a deployment scheme. The energy storage capacity provisioning problem can be solved offline in a separate formulation, which is beyond the scope of this chapter.

3.3.1 Electricity cost calculation

We consider a location-dependent time-of-use pricing scheme in which the electricity purchasing price of the datacenter at node $(1,m)$ in time slot $t$ is denoted by $p_{1,m}[t]$.
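Before turning to the cost calculation, the ESD model of Eqns. (3.7)-(3.9) can be sketched in Python. The numeric parameters below are illustrative only, and the function names are ours:

```python
def terminal_energy(e_int, charge, v_bat, i_ref, tau,
                    eta_ch=0.95, eta_ds=0.95, gamma_c=1.15, gamma_d=1.15):
    """Terminal-side energy corresponding to an internal energy change.

    e_int is E^{ch,i} (charge=True, Eqn. 3.7) or E^{ds,i} (charge=False,
    Eqn. 3.8); v_bat * i_ref * tau is the reference energy per slot beyond
    which Peukert's law penalizes the transfer.
    """
    e_ref = v_bat * i_ref * tau
    ratio = e_int / e_ref
    if charge:                               # energy drawn from the node to charge
        if ratio > 1:
            return e_ref * ratio ** gamma_c / eta_ch
        return e_int / eta_ch
    if ratio > 1:                            # energy delivered back to the node
        return eta_ds * e_ref * ratio ** (1.0 / gamma_d)
    return eta_ds * e_int

def next_soc(e_bat, e_ch_int, e_ds_int):
    """SoC update of Eqn. (3.9)."""
    return e_bat + e_ch_int - e_ds_int
```

Above the reference rate, charging draws more than proportionally and discharging delivers less than proportionally, which is exactly the over-estimation the text warns about if the rate capacity effect is ignored.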
The total electricity cost of the geo-distributed cloud infrastructure in time slot $t$, denoted by $Cost[t]$, can be calculated as

$$Cost[t] = \sum_{1 \le m \le M_1} p_{1,m}[t] \cdot \left( \frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m}[t] - \eta^{tr}_0 E^{in\uparrow}_{1,m}[t] \right) \qquad (3.10)$$

We further define $p_{k,m}[t]$ as the electricity purchasing price at the location of node $(k,m)$, which is the same as $p_{1,m'}[t]$ if node $(k,m)$ is a member of the subtree rooted at datacenter node $(1,m')$. Since power capping is not the emphasis of this problem, we do not consider utility pricing for the daily/monthly peak power consumption. It is worth noting that the cost model can be extended to cover more complicated pricing schemes using the cost amortization technique from the work of Cui et al. [49].

To express the total electricity cost as a function of the decision variables, we iteratively use Eqns. (3.2), (3.3), (3.4), (3.7), and (3.8). Notably, since we have $E^{in\uparrow}_{k,m} \cdot E^{in\downarrow}_{k,m} \equiv 0$, the terms on the right-hand side of Eqns. (3.3) and (3.4) can be expanded as

$$E^{in\uparrow}_{k',m'} = E^{in\downarrow}_{k',m'} - E^{srv}_{k',m'} - E^{ch}_{k',m'} + E^{ds}_{k',m'} + E^{out\uparrow}_{k',m'} - E^{out\downarrow}_{k',m'} \qquad (3.11)$$

$$E^{in\downarrow}_{k',m'} = \left[ E^{srv}_{k',m'} + E^{ch}_{k',m'} - E^{ds}_{k',m'} + E^{out\downarrow}_{k',m'} - E^{out\uparrow}_{k',m'} \right]^{+} \qquad (3.12)$$

where $[\cdot]^{+}$ is defined as $\max(\cdot, 0)$.

3.3.2 QoS constraint specification

We consider the QoS constraint that the average response time of user $i$ should not exceed $T^{req}_i$ on any involved server, or equivalently, $R_{i,j}[t] \le T^{req}_i$, where $R_{i,j}[t]$ is defined as in Eqn. (3.1). Since the power consumption of a server is a monotonically increasing function of the amount of allocated resources, an optimal control decision will always allocate the minimum amount of resources satisfying the QoS constraint, which we denote by $\phi^{req}_{i,j}[t]$ for user $i$ and server $j$ in time slot $t$.
In the case that $x_{i,j}[t] = 1$, $y_j[t] = 1$, and $T^{req}_i - T^{rtt}_{i,j} > 0$, i.e., when the server is powered on, requests of the user are dispatched to the server, and the round-trip time between the user and the server is less than the required response time, we have

$$\phi^{req}_{i,j}[t] = \frac{1}{\mu_j} \cdot \left( \frac{1}{T^{req}_i - T^{rtt}_{i,j}} + \lambda_{i,j}[t] \right) \qquad (3.13)$$

In all other cases, the definition of $\phi^{req}_{i,j}[t]$ is meaningless because the requests from user $i$ will not be served by server $j$.

3.3.3 Complete formulation

Summing up the objective function and constraints, the complete optimization problem can be formulated as follows:

Find: $\lambda_{i,j}[t]$'s, $\phi_{i,j}[t]$'s, $E^{bat}_{k,m}[t]$'s, $E^{ch,i}_{k,m}[t]$'s, $E^{ds,i}_{k,m}[t]$'s, $x_{i,j}[t]$'s, and $y_j[t]$'s

Minimize: $\sum_{t=0}^{T} Cost[t]$

Subject to: Eqns. (3.7), (3.8), (3.9), (3.13), and

$$\sum_j \lambda_{i,j}[t] \ge \Lambda_i[t] \quad \forall i,t \qquad (3.14)$$
$$\sum_i \phi_{i,j}[t] \le 1 \quad \forall j,t \qquad (3.15)$$
$$E^{lo}_{k,m} \le E^{bat}_{k,m}[t] \le E^{hi}_{k,m} \quad \forall k,m,t \qquad (3.16)$$
$$-E^{lim\uparrow}_{k,m} \le E^{in\downarrow}_{k,m}[t] - E^{in\uparrow}_{k,m}[t] \le E^{lim\downarrow}_{k,m} \quad \forall k,m,t \qquad (3.17)$$
$$x_{i,j}[t] \le y_j[t] \quad \forall i,j,t \qquad (3.18)$$
$$\lambda_{i,j}[t] \le x_{i,j}[t] \cdot \Lambda_i[t] \quad \forall i,j,t \qquad (3.19)$$
$$\phi_{i,j}[t] \le y_j[t] \quad \forall i,j,t \qquad (3.20)$$
$$\phi_{i,j}[t] \ge x_{i,j}[t] \cdot \phi^{req}_{i,j}[t] \quad \forall i,j,t \qquad (3.21)$$
$$x_{i,j}[t] \in \{0,1\} \quad \forall i,j,t \qquad (3.22)$$
$$y_j[t] \in \{0,1\} \quad \forall j,t \qquad (3.23)$$
$$E^{ch,i}_{k,m}[t] \in [0, E^{ch,m}_{k,m}] \quad \forall k,m,t \qquad (3.24)$$
$$E^{ds,i}_{k,m}[t] \in [0, E^{ds,m}_{k,m}] \quad \forall k,m,t \qquad (3.25)$$
$$\lambda_{i,j}[t] \ge 0 \quad \forall i,j,t \qquad (3.26)$$
$$\phi_{i,j}[t] \ge 0 \quad \forall i,j,t \qquad (3.27)$$

where $T$ is the control horizon, $E^{bat}_{k,m}[t]$ can be calculated according to Eqn. (3.9), $E^{lo}_{k,m}$ and $E^{hi}_{k,m}$ are the lower and upper bounds on the energy storage of the batteries at node $(k,m)$, respectively, and $E^{lim\uparrow}_{k,m}$ and $E^{lim\downarrow}_{k,m}$ are the maximum amounts of energy that can be transferred between node $(k,m)$ and its parent node in each time slot. In practice, $E^{lo}_{k,m}$ and $E^{hi}_{k,m}$ are set based on the need for energy storage during power outages and the energy capacity of the battery, respectively.

Constraint (3.14) ensures that all user requests will be served by a server. Constraint (3.15) ensures that a server will not be overloaded.
The upper and lower bounds on the SoC of any battery array are specified by Constraint (3.16). Constraint (3.17) reflects the capacity of the power transmission lines. Constraint (3.18) eliminates the possibility of dispatching requests to servers that are powered off. Constraints (3.19) and (3.20) guarantee that only servers that are powered on can be used to process requests. Constraint (3.21) is introduced to satisfy the QoS requirement. Constraints (3.22)-(3.27) specify the domains of the decision variables.

The aforementioned problem is a mixed-integer non-linear programming (MINLP) problem, which is generally hard to solve. Even if polynomial-time solution methods based on convex optimization techniques similar to the one proposed in reference [50] are used, the computational complexity of the solution is a super-linear function of the number of nodes in the system and is not applicable to a typical geo-distributed cloud infrastructure with thousands or tens of thousands of nodes. A control horizon of multiple time slots adds to the complexity: with hourly decisions over a one-day horizon, the number of decision variables increases by more than 20x. Hence, we propose a more scalable solution based on Lyapunov optimization theory [51].

3.4 Solution methods

3.4.1 Lyapunov optimization

The basic idea of Lyapunov optimization is to cast the constraints as operations on a number of queues (or virtual queues) and transform the original problem into minimizing the time average of the cost/penalty function while keeping the queues stable. Consider a system of queues numbered from 1 to $N_Q$. The state of the system is characterized by the backlogs (i.e., lengths) of all the queues in each time slot $t$, denoted by a vector $Q[t] = [Q_1[t], \ldots, Q_{N_Q}[t]]^T$, where

$$Q[t+1] = [Q[t] - b(\pi[t])]^{+} + a(\pi[t]) \qquad (3.28)$$

We use $\pi[t]$ to denote the control action taken in time slot $t$, which can affect the cost/penalty incurred in time slot $t$.
$a(\cdot)$ is a vector-valued function representing the amount of work arriving at the queues in a time slot, while $b(\cdot)$ is a vector-valued function representing the amount of work that can be processed in a time slot. In the case that there exist time-average constraints on the control actions $\pi[t]$ of the form

$$\mathbb{E}\{G(\pi[t])\} \preceq 0 \qquad (3.29)$$

where "$\preceq$" is the generalized relational operator meaning "component-wise less than or equal to", we define a virtual queue vector $Q^V[t]$ with the same cardinality as $G(\cdot)$. $Q^V[t]$ is updated as

$$Q^V[t+1] = \left[ Q^V[t] - b^V(\pi[t]) \right]^{+} + a^V(\pi[t]) \qquad (3.30)$$

where $a^V(\pi[t]) = 0$ and $b^V(\pi[t]) = -G(\pi[t])$. With the addition of virtual queues, we define $\tilde{Q}[t] = [Q[t]^T, Q^V[t]^T]^T$. Similarly, we define $\tilde{a}(\pi[t]) = [a(\pi[t])^T, a^V(\pi[t])^T]^T$ and $\tilde{b}(\pi[t]) = [b(\pi[t])^T, b^V(\pi[t])^T]^T$. A Lyapunov function for this queuing system, denoted by $L(\tilde{Q}[t])$, can be defined as

$$L(\tilde{Q}[t]) = \frac{1}{2} \tilde{Q}[t]^T \tilde{Q}[t] \qquad (3.31)$$

Furthermore, the values of $\tilde{Q}[t]$ can be viewed as a stochastic process, and a conditional Lyapunov drift, denoted by $\Delta(\tilde{Q}[t])$, can be defined as

$$\Delta(\tilde{Q}[t]) = \mathbb{E}\left\{ L(\tilde{Q}[t+1]) - L(\tilde{Q}[t]) \mid \tilde{Q}[t] \right\} \qquad (3.32)$$

If the penalty function in time slot $t$ is denoted by $pen[t]$, then the Lyapunov "drift-plus-penalty" problem aims at minimizing the term

$$\Delta(\tilde{Q}[t]) + V \cdot \mathbb{E}\left\{ pen[t] \mid \tilde{Q}[t] \right\} \qquad (3.33)$$

where $V$ is a coefficient controlling the tradeoff between the penalty and queue stability. One approximate solution to this "drift-plus-penalty" problem is to find an opportunistic control policy in which the control decision in each time slot depends only on the current state of the queues (regardless of the history).
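The queue dynamics and the per-slot surrogate objective can be sketched in Python; this is a minimal scalar illustration, and the function names are ours:

```python
def virtual_queue_update(q, g):
    """One-slot virtual-queue update enforcing E{G(pi[t])} <= 0.

    With a_V = 0 and b_V = -G (Eqns. 3.29-3.30), the update reduces to
    Q[t+1] = max(Q[t] + G(pi[t]), 0): the backlog grows whenever the
    constraint is violated in the current slot.
    """
    return max(q + g, 0.0)

def opportunistic_objective(q_tilde, a_tilde, b_tilde, penalty, v):
    """Per-slot objective of the opportunistic policy:
    V * pen[t] + Q~[t]^T (a~(pi[t]) - b~(pi[t]))."""
    return v * penalty + sum(q * (a - b) for q, a, b in zip(q_tilde, a_tilde, b_tilde))
```

A larger backlog makes actions that drain that queue look cheaper, while $V$ scales how much the raw penalty matters relative to queue stability.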
The opportunistic control problem in time slot $t$ can be formally described as

Find: $\pi[t]$
Minimize: $V \cdot pen[t] + \tilde{Q}[t]^T \left( \tilde{a}(\pi[t]) - \tilde{b}(\pi[t]) \right)$
Subject to: Constraints on $\pi[t]$

It can be proved that the opportunistic control policy is a $C$-additive approximation [51], where $C = O(1/V)$, assuming that the arrival rate and processing rate of a queue are i.i.d. for any given control decision. Note that an opportunistic control policy can be derived even without the i.i.d. assumption, and the solution is within an additive bound of $O(T/V)$ of the optimal control policy over a horizon of $T$ [52].

3.4.2 Opportunistic control formulation

In order to utilize the Lyapunov optimization framework to improve the solution complexity of the problem formulated in Section 3.3.3, we first cast the original problem formulation into an opportunistic control problem as defined in Section 3.4.1, which will be called OCP.

While the opportunistic control framework solves for the control policy separately in each time slot, we should still enforce some constraints across different time slots to avoid greedy discharging of the batteries in an opportunistic control setup. Hence, we introduce a set of virtual queues $\{Z^{E}_{k,m}[t]\}$ capturing energy depletion in the batteries, where $Z^{E}_{k,m}[t] = [E^{hi}_{k,m} - E^{bat}_{k,m}[t]]^{+}$ is the backlog of the virtual queue corresponding to the battery at node $(k,m)$ in time slot $t$. By maintaining the stability of the virtual queues $\{Z^{E}_{k,m}[t]\}$, one can dynamically adjust the charging/discharging power of the batteries. Note that the constraints on $\{Z^{E}_{k,m}[t]\}$ are not equivalent to Constraint (3.16) in the general case, because the backlogs of the $Z^{E}_{k,m}[t]$'s are not intrinsically upper-bounded, which can potentially result in $E^{bat}_{k,m}[t] < E^{lo}_{k,m}$. However, we will show in Section 3.4.4 that Constraint (3.16) can be strictly enforced if the coefficient $V$ defined in Eqn. (3.33) is properly set.
An opportunistic control version of the problem for time slot $t$ is formulated as follows:

Find: $\lambda_{i,j}[t]$'s, $\phi_{i,j}[t]$'s, $E^{bat}_{k,m}[t]$'s, $E^{ch,i}_{k,m}[t]$'s, $E^{ds,i}_{k,m}[t]$'s, $x_{i,j}[t]$'s, and $y_j[t]$'s

Minimize: $V \cdot Cost[t] + \sum_{k,m} Z^{E}_{k,m}[t] \cdot \left( E^{ds,i}_{k,m}[t] - E^{ch,i}_{k,m}[t] \right)$

Subject to: Eqns. (3.7), (3.8), (3.9), (3.13), (3.14), (3.15), (3.17)-(3.27)

The problem is solved repeatedly at the beginning of each time slot to obtain the control decision for that time slot.

The aforementioned OCP formulation effectively decouples the control problem in each time slot from the history and future decisions of the system. However, it remains hard to solve because a large number of binary variables (i.e., the $x_{i,j}[t]$'s and $y_j[t]$'s) must be solved jointly, which results in exponential complexity in the worst case. To further reduce the solution complexity, we decompose the opportunistic control problem into a number of independent sub-problems, each with a constant number of binary variables, that can be solved efficiently using standard solvers.

3.4.3 Sub-problem decomposition

Our goal is to decompose the OCP in such a way that each sub-problem corresponds to a node in the system or a user-server pair. A sufficient condition for an optimization problem to be decomposable this way is that: (i) the objective function can be expressed as a linear function of all decision variables, and (ii) each constraint involves at most one node in the system or one user-server pair. The main obstacles in our formulation as specified in Section 3.4.2 lie in Constraints (3.14), (3.15), (3.17) and the objective function. Notably, within the objective function, we are interested in the decomposition of the electricity cost term, $Cost[t]$.
Decomposing the electricity cost

The decomposition of the $Cost[t]$ term is challenging primarily due to the calculation of the total energy consumption in each datacenter, with the non-linearity introduced by the power transmission hierarchy as specified in Eqns. (3.2) and (3.12). Nevertheless, we can find an upper bound of the energy drawn from the power grid in datacenter $(1,m')$, i.e., $\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'}$, as follows:

$$\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'} \le \sum_{(k,m) \in D(1,m')} \left[ \left( E^{ch}_{k,m} + E^{srv}_{k,m} \right) \prod_{k' < k} \frac{1}{\eta^{tr}_{k'}} - E^{ds}_{k,m} \prod_{k' < k} \eta^{tr}_{k'} \right] \qquad (3.34)$$

where $E^{ch}_{k,m}$ and $E^{ds}_{k,m}$ are calculated using Eqns. (3.7) and (3.8), respectively. Note that when we discuss the system behavior within a single time slot (as opposed to the change between different time slots), we omit the time slot specification in our notation to simplify the expressions; for instance, $E^{in\downarrow}_{1,m'}$ and $E^{in\downarrow}_{1,m'}[t]$ are used interchangeably in this case. We use $U^{*}_{m'} = U^{*}_{m'}(\{E^{ch}_{k,m}\}, \{E^{ds}_{k,m}\}, \{E^{srv}_{k,m}\})$ as a shorthand notation for the right-hand side of Eqn. (3.34).

Lemma 1. For any given node $(k,m)$ and any positive real number $\eta \in (0,1]$, the following inequality holds:

$$\frac{1}{\eta} E^{in\downarrow}_{k,m} - \eta E^{in\uparrow}_{k,m} \le \frac{1}{\eta} \cdot \left( E^{srv}_{k,m} + E^{ch}_{k,m} + E^{out\downarrow}_{k,m} \right) - \eta \cdot \left( E^{ds}_{k,m} + E^{out\uparrow}_{k,m} \right) \qquad (3.35)$$

Proof. For a leaf node $(K,m)$, since $E^{out\uparrow}_{K,m} = E^{out\downarrow}_{K,m} = 0$, the proof is trivial.

For a non-leaf node $(k,m)$ with $k < K$, we use the convexity of $[\cdot]^{+}$ and Jensen's inequality [53], which yield the sub-additivity

$$[\alpha + \beta]^{+} = 2 \cdot \left[ \tfrac{1}{2}\alpha + \tfrac{1}{2}\beta \right]^{+} \le 2 \cdot \tfrac{1}{2} \left( [\alpha]^{+} + [\beta]^{+} \right) = [\alpha]^{+} + [\beta]^{+} \qquad (3.36)$$

where $\alpha$ and $\beta$ can take arbitrary values. Applying this inequality to Eqn. (3.12), we get

$$E^{in\downarrow}_{k,m} \le \left[ E^{srv}_{k,m} \right]^{+} + \left[ E^{ch}_{k,m} \right]^{+} + \left[ E^{out\downarrow}_{k,m} \right]^{+} + \left[ -E^{out\uparrow}_{k,m} \right]^{+} + \left[ -E^{ds}_{k,m} \right]^{+} = E^{srv}_{k,m} + E^{ch}_{k,m} + E^{out\downarrow}_{k,m} \qquad (3.37)$$

The lemma can hence be proved by substituting $E^{in\downarrow}_{k,m}$ in Eqn. (3.11) using Eqn. (3.37) and plugging the result into Eqn. (3.35).

Proposition 1.
$U^{*}_{m'}(\{E^{ch}_{k,m}\}, \{E^{ds}_{k,m}\}, \{E^{srv}_{k,m}\})$ is a valid upper bound of $\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'}$.

Proof. To prove the validity, we show that, for any $k_0 \in [1, K]$, we have

$$\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'} \le \sum_{\substack{(k,m) \in D(1,m') \\ k \le k_0}} \left[ \left( E^{ch}_{k,m} + E^{srv}_{k,m} \right) \prod_{k' < k} \frac{1}{\eta^{tr}_{k'}} - E^{ds}_{k,m} \prod_{k' < k} \eta^{tr}_{k'} \right] + \sum_{\substack{(k,m) \in D(1,m') \\ k = k_0 + 1}} \left[ E^{in\downarrow}_{k,m} \prod_{k' < k} \frac{1}{\eta^{tr}_{k'}} - E^{in\uparrow}_{k,m} \prod_{k' < k} \eta^{tr}_{k'} \right] \qquad (3.38)$$

The inequality in Eqn. (3.38) can be proved by mathematical induction. First, the boundary case with $k_0 = 1$ follows trivially from Eqn. (3.35). Assuming that the inequality holds for a given $k_0$, we apply Lemma 1 with $\eta = \prod_{k' < k} \eta^{tr}_{k'}$, and it can easily be shown that Eqn. (3.38) also holds for $k_0 + 1$. By setting $k_0 = K$, Eqn. (3.38) reduces to Eqn. (3.34).

Moreover, within the family of linear functions of the $E^{ch}_{k,m}$'s, $E^{ds}_{k,m}$'s, and $E^{srv}_{k,m}$'s (of which $U^{*}_{m'}$ is a member), $U^{*}_{m'}(\{E^{ch}_{k,m}\}, \{E^{ds}_{k,m}\}, \{E^{srv}_{k,m}\})$ is the upper bound that is closest to $\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'}$.

Proposition 2. For all $\{c^1_{k,m}\}, \{c^2_{k,m}\}, \{c^3_{k,m}\}$, if

$$\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'} \le \sum_{(k,m) \in D(1,m')} \left( c^1_{k,m} E^{ch}_{k,m} + c^2_{k,m} E^{srv}_{k,m} + c^3_{k,m} E^{ds}_{k,m} \right),$$

then

$$U^{*}_{m'}(\{E^{ch}_{k,m}\}, \{E^{ds}_{k,m}\}, \{E^{srv}_{k,m}\}) \le \sum_{(k,m) \in D(1,m')} \left( c^1_{k,m} E^{ch}_{k,m} + c^2_{k,m} E^{srv}_{k,m} + c^3_{k,m} E^{ds}_{k,m} \right) \quad \forall E^{ch}_{k,m}, E^{ds}_{k,m}, E^{srv}_{k,m}$$

Proof. We rewrite $U^{*}_{m'}$ as

$$U^{*}_{m'}(\{E^{ch}_{k,m}\}, \{E^{ds}_{k,m}\}, \{E^{srv}_{k,m}\}) = \sum_{(k,m) \in D(1,m')} \left( c^{1*}_{k,m} E^{ch}_{k,m} + c^{2*}_{k,m} E^{srv}_{k,m} + c^{3*}_{k,m} E^{ds}_{k,m} \right)$$

where

$$c^{1*}_{k,m} = c^{2*}_{k,m} = \prod_{k' < k} \frac{1}{\eta^{tr}_{k'}}, \qquad c^{3*}_{k,m} = -\prod_{k' < k} \eta^{tr}_{k'}$$

The proposition can be proved by contradiction.
Assume that there exists a combination of $c^1_{k,m}$, $c^2_{k,m}$, and $c^3_{k,m}$ such that

$$U_{m'}(\{E^{ch}_{k,m}\}, \{E^{ds}_{k,m}\}, \{E^{srv}_{k,m}\}) = \sum_{(k,m) \in D(1,m')} \left( c^1_{k,m} E^{ch}_{k,m} + c^2_{k,m} E^{srv}_{k,m} + c^3_{k,m} E^{ds}_{k,m} \right)$$

is a valid upper bound of $\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'}$, and that there exist some values of the $E^{ch}_{k,m}$'s, $E^{ds}_{k,m}$'s, and $E^{srv}_{k,m}$'s such that $U_{m'} < U^{*}_{m'}$. It follows that there exists a pair $(k_0, m_0)$ such that $c^1_{k_0,m_0} < c^{1*}_{k_0,m_0}$ or $c^2_{k_0,m_0} < c^{2*}_{k_0,m_0}$ or $c^3_{k_0,m_0} < c^{3*}_{k_0,m_0}$. Without loss of generality, assume that $c^1_{k_0,m_0} < c^{1*}_{k_0,m_0}$. By setting $E^{ch}_{k,m} = E^{ds}_{k,m} = E^{srv}_{k,m} = 0$ for all $(k,m)$ except for $E^{ch}_{k_0,m_0}$, we get

$$\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'} = E^{ch}_{k_0,m_0} \prod_{k' < k_0} \frac{1}{\eta^{tr}_{k'}} = c^{1*}_{k_0,m_0} E^{ch}_{k_0,m_0} > c^1_{k_0,m_0} E^{ch}_{k_0,m_0} = U_{m'}$$

which contradicts the assumption that $U_{m'}$ is an upper bound of $\frac{1}{\eta^{tr}_0} E^{in\downarrow}_{1,m'} - \eta^{tr}_0 E^{in\uparrow}_{1,m'}$.

For a server node $(K,j)$, we can derive an upper bound for $E^{srv}_{K,j}[t]$ as defined in Eqn. (3.5) as follows:

$$E^{srv}_{K,j}[t] = y_j[t] \cdot \left[ P^{\phi}_j \cdot \left( \sum_i \phi_{i,j}[t] \right)^{\gamma_F} + P^{con}_j \right] \tau \le y_j[t] \cdot \left[ \frac{P^{\phi}_j}{N} \cdot \sum_i \left( N \phi_{i,j}[t] \right)^{\gamma_F} + P^{con}_j \right] \tau \qquad (3.39)$$

where $N$ is the total number of user nodes.

Decomposing the constraints

In order to decompose Constraints (3.14) and (3.15), we relax the OCP formulation and rewrite these two constraints as:

$$\mathbb{E}\left\{ k_{\lambda} \Lambda_i[t] - \sum_j \lambda_{i,j}[t] \right\} \le 0 \quad \forall i \qquad (3.40)$$

$$\mathbb{E}\left\{ \sum_i \phi_{i,j}[t] - k_{\phi} \right\} \le 0 \quad \forall j \qquad (3.41)$$

That is, instead of enforcing a strict bound in each time slot, we only require that request dispatch and resource allocation are balanced in a time-average sense. Two coefficients, $k_{\lambda} \in [1, +\infty)$ and $k_{\phi} \in (0, 1]$, are introduced to compensate for the relaxation and fine-tune the behavior of the system. With the relaxation, we can define two sets of virtual queues, denoted by $\{Z^{\lambda}_i[t]\}$ and $\{Z^{\phi}_j[t]\}$, respectively. As in Eqns.
(3.29) and (3.30), the two queues can be updated as follows:

$$Z^{\lambda}_i[t+1] = \left[ Z^{\lambda}_i[t] + k_{\lambda} \Lambda_i[t] - \sum_j \lambda_{i,j}[t] \right]^{+} \qquad (3.42)$$

$$Z^{\phi}_j[t+1] = \left[ Z^{\phi}_j[t] + \sum_i \phi_{i,j}[t] - k_{\phi} \right]^{+} \qquad (3.43)$$

The two constraints can then be transformed into terms in the objective function.

For Constraint (3.17), we first find an upper bound and a lower bound for the term $E^{in\downarrow}_{k,m} - E^{in\uparrow}_{k,m}$. For any node $(k_0, m_0)$, an obvious upper bound of $E^{in\downarrow}_{k_0,m_0} - E^{in\uparrow}_{k_0,m_0}$ can be derived as

$$E^{in\downarrow}_{k_0,m_0} - E^{in\uparrow}_{k_0,m_0} \le \sum_{(k,m) \in D(k_0,m_0)} \left( E^{ch}_{k,m} + E^{srv}_{k,m} \right) \prod_{k'=k_0}^{k-1} \frac{1}{\eta^{tr}_{k'}} \qquad (3.44)$$

Meanwhile, a lower bound of $E^{in\downarrow}_{k_0,m_0} - E^{in\uparrow}_{k_0,m_0}$ can be calculated as

$$E^{in\downarrow}_{k_0,m_0} - E^{in\uparrow}_{k_0,m_0} \ge - \sum_{(k,m) \in D(k_0,m_0)} E^{ds,i}_{k,m} \prod_{k'=k_0}^{k-1} \eta^{tr}_{k'} \qquad (3.45)$$

For convenience of expression, we denote the upper and lower bounds of $E^{in\downarrow}_{k,m} - E^{in\uparrow}_{k,m}$ derived as in Eqns. (3.44) and (3.45) by $E^{U}_{k,m}$ and $E^{L}_{k,m}$, respectively. A relaxation similar to that of Constraints (3.14) and (3.15) can then be applied by rewriting Constraint (3.17) as

$$\mathbb{E}\left\{ E^{U}_{k,m} \right\} \le k_U E^{lim\downarrow}_{k,m} \quad \forall k,m \qquad (3.46)$$

$$\mathbb{E}\left\{ E^{L}_{k,m} \right\} \ge -k_L E^{lim\uparrow}_{k,m} \quad \forall k,m \qquad (3.47)$$

where $k_U, k_L \in (0, 1]$. Two sets of virtual queues, $\{Z^{U}_{k,m}[t]\}$ and $\{Z^{L}_{k,m}[t]\}$, are thereby introduced, where

$$Z^{U}_{k,m}[t+1] = \left[ Z^{U}_{k,m}[t] + E^{U}_{k,m}[t] - k_U E^{lim\downarrow}_{k,m} \right]^{+} \qquad (3.48)$$

$$Z^{L}_{k,m}[t+1] = \left[ Z^{L}_{k,m}[t] - E^{L}_{k,m}[t] - k_L E^{lim\uparrow}_{k,m} \right]^{+} \qquad (3.49)$$

3.4.4 Complete solution

After transforming the objective function and constraints of the original OCP, we formulate one subproblem for each node $(k,m)$ (solved for $E^{ch,i}_{k,m}[t]$ and $E^{ds,i}_{k,m}[t]$) and one subproblem for each user-server pair $(i,j)$ (solved for the $x_{i,j}[t]$'s, $y_j[t]$, $\lambda_{i,j}[t]$, and $\phi_{i,j}[t]$).
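To make the virtual-queue mechanics concrete, the updates in Eqns. (3.42) and (3.43) amount to the following (a sketch; function names are ours, and the numeric values in the usage lines are illustrative):

```python
def update_request_queue(z, k_lam, big_lambda, dispatched_rates):
    """Eqn. (3.42): the backlog of Z^lambda_i grows when fewer than
    k_lam * Lambda_i[t] requests of user i are dispatched in total."""
    return max(z + k_lam * big_lambda - sum(dispatched_rates), 0.0)

def update_resource_queue(z, k_phi, shares):
    """Eqn. (3.43): the backlog of Z^phi_j grows when server j allocates
    more than a fraction k_phi of its capacity in total."""
    return max(z + sum(shares) - k_phi, 0.0)

# User demand 100 req/s fully dispatched across two servers: backlog stays 0.
z_lam = update_request_queue(0.0, 1.0, 100.0, [60.0, 50.0])
# Server allocating 110% of the k_phi = 0.9 budget: backlog grows by 0.2.
z_phi = update_resource_queue(1.0, 0.9, [0.6, 0.5])
```

A persistent backlog signals a persistently violated time-average constraint, which the opportunistic objective then penalizes.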
Regarding $E^{ch,i}_{k,m}[t]$ and $E^{ds,i}_{k,m}[t]$ for node $(k,m)$ (either server or non-server), we formulate a subproblem for time slot $t$, referred to as Problem PN(k,m), as follows:

Find: $E^{ch,i}_{k,m}[t]$ and $E^{ds,i}_{k,m}[t]$

Minimize:

$$V \cdot p_{k,m}[t] \cdot \left( E^{ch}_{k,m}[t] \prod_{k' < k} \frac{1}{\eta^{tr}_{k'}} - E^{ds}_{k,m}[t] \prod_{k' < k} \eta^{tr}_{k'} \right) + Z^{E}_{k,m}[t] \cdot \left( E^{ds,i}_{k,m}[t] - E^{ch,i}_{k,m}[t] \right) + E^{ch}_{k,m}[t] \sum_{(k',m') \in A(k,m)} Z^{U}_{k',m'}[t] \prod_{k''=k'}^{k-1} \frac{1}{\eta^{tr}_{k''}} + E^{ds,i}_{k,m}[t] \sum_{(k',m') \in A(k,m)} Z^{L}_{k',m'}[t] \prod_{k''=k'}^{k-1} \eta^{tr}_{k''} \qquad (3.50)$$

Subject to: Constraints (3.24) and (3.25)

Given the backlogs of all virtual queues, the Problems PN(k,m) are convex and can be solved efficiently using standard solvers such as CVX [54].

Regarding the $x_{i,j}[t]$'s, $y_j[t]$, the $\lambda_{i,j}[t]$'s, and the $\phi_{i,j}[t]$'s for server $j$, the subproblem can be formulated as follows:

Find: $x_{i,j}[t]$'s, $y_j[t]$, $\lambda_{i,j}[t]$'s, and $\phi_{i,j}[t]$'s

Minimize:

$$V \cdot p_{K,j}[t] \cdot E^{srv}_{K,j}[t] - \sum_i Z^{\lambda}_i[t] \lambda_{i,j}[t] + Z^{\phi}_j[t] \sum_i \phi_{i,j}[t] + E^{srv}_{K,j}[t] \sum_{(k',m') \in A(K,j)} Z^{U}_{k',m'}[t] \prod_{k''=k'}^{K-1} \frac{1}{\eta^{tr}_{k''}} \qquad (3.51)$$

Subject to: Constraints (3.18)-(3.27)

where $E^{srv}_{K,j}[t]$ can be approximated as in Eqn. (3.39).

If $y_j[t] = 0$, i.e., server $j$ is switched off in time slot $t$, it directly follows that $x_{i,j}[t] = \lambda_{i,j}[t] = \phi_{i,j}[t] = 0$, and the problem as formulated above is trivially solved. We denote the optimal value of the objective function in Eqn. (3.51) in this case by $C^{0}_j[t]$.
On the other hand, if $y_j[t] = 1$, we can further decompose the aforementioned problem into a number of subproblems, one for each user $i$, referred to as Problem PU(i,j):

Find: $x_{i,j}[t]$, $\lambda_{i,j}[t]$, and $\phi_{i,j}[t]$

Minimize:

$$V \cdot p_{K,j}[t] \cdot \frac{P^{\phi}_j}{N} \cdot \left( N \phi_{i,j}[t] \right)^{\gamma_F} - Z^{\lambda}_i[t] \lambda_{i,j}[t] + Z^{\phi}_j[t] \phi_{i,j}[t] + \frac{P^{\phi}_j}{N} \cdot \left( N \phi_{i,j}[t] \right)^{\gamma_F} \cdot \sum_{(k',m') \in A(K,j)} Z^{U}_{k',m'}[t] \prod_{k''=k'}^{K-1} \frac{1}{\eta^{tr}_{k''}} \qquad (3.52)$$

Subject to: Constraints (3.18)-(3.22), (3.24)-(3.27)

Since $x_{i,j}[t]$ is the only integer variable in Problem PU(i,j) and it is either 0 or 1, we can enumerate both possible values of $x_{i,j}[t]$. With the value of $x_{i,j}[t]$ fixed, Problem PU(i,j) is convex and can be solved efficiently using standard solvers. Plugging the optimal solutions of Problem PU(i,j) for all $i$ into Eqn. (3.51), we obtain the optimal objective value when $y_j[t] = 1$ for server $j$, denoted by $C^{1}_j[t]$. Comparing the values of $C^{0}_j[t]$ and $C^{1}_j[t]$, we can find the optimal value of $y_j[t]$ and implement the optimal solution for the $x_{i,j}[t]$'s, $\lambda_{i,j}[t]$'s, and $\phi_{i,j}[t]$'s accordingly.

As explained in Section 3.4.1, the coefficient $V$ in the PN(k,m)'s and PU(i,j)'s can be used to adjust the tradeoff between the stability of the virtual queues and the optimality of the cost/penalty function. Furthermore, we show that $V$ can be used to control the bound on the SoC of the batteries.

Proposition 3. $\forall k,m,t$, $E^{bat}_{k,m}[t] \ge E^{lo}_{k,m}$ in the solution to PN(k,m) if $0 < V \le (E^{hi}_{k,m} - E^{lo}_{k,m} - E^{ds,m}_{k,m})/p^{max}_{k,m}$, where $p^{max}_{k,m} = \max_t p_{k,m}[t]$.

Proof. We prove the proposition using mathematical induction. Clearly, $E^{bat}_{k,m}[0] \ge E^{lo}_{k,m}$. Assume that $E^{bat}_{k,m}[t] \ge E^{lo}_{k,m}$. If $E^{bat}_{k,m}[t] \ge E^{lo}_{k,m} + E^{ds,m}_{k,m}$, it trivially follows that $E^{bat}_{k,m}[t+1] \ge E^{lo}_{k,m}$. Otherwise, $Z^{E}_{k,m}[t] > E^{hi}_{k,m} - E^{lo}_{k,m} - E^{ds,m}_{k,m} \ge V p^{max}_{k,m} \ge V p_{k,m}[t]$. As a result, we will have $E^{ds,i}_{k,m}[t] = 0$ at the minimum of the objective function in Eqn.
(3.50), because

$$Z^{E}_{k,m}[t] E^{ds,i}_{k,m}[t] - V p_{k,m}[t] E^{ds}_{k,m}[t] \prod_{k' < k} \eta^{tr}_{k'} \ge \left( Z^{E}_{k,m}[t] - V p_{k,m}[t] \right) E^{ds,i}_{k,m}[t] \ge 0 \qquad (3.53)$$

and

$$E^{ds,i}_{k,m}[t] \sum_{(k',m') \in A(k,m)} Z^{L}_{k',m'}[t] \prod_{k''=k'}^{k-1} \eta^{tr}_{k''} \ge 0 \qquad (3.54)$$

Summarizing both cases, we have $E^{bat}_{k,m}[t+1] \ge E^{lo}_{k,m}$.

The pseudo code of the complete solution is shown in Fig. 3.3. As shown in the figure, solving for the control decision in each time slot involves two major steps. In the first step, using the current backlogs of the virtual queues, the solutions to a set of convex optimization problems are assembled, which yields the control decision for each node in the current time slot. In the second step, the backlogs of the virtual queues (e.g., SoC of batteries, server utilization levels, etc.) are updated as a result of the control decision. In practice, the backlogs can be updated from either real measurements or estimation based on the system model.

1: t ← 0
2: loop
3:   for all nodes (k,m) do
4:     implement solution to PN(k,m)
5:   end for
6:   for all servers j do
7:     calculate C0_j[t]
8:     for all users i do
9:       solve PU(i,j)
10:    end for
11:    calculate C1_j[t]
12:    if C0_j[t] < C1_j[t] then
13:      y_j[t] ← 0
14:      for all users i do
15:        x_{i,j}[t] ← 0, λ_{i,j}[t] ← 0, φ_{i,j}[t] ← 0
16:      end for
17:    else
18:      y_j[t] ← 1
19:      for all users i do
20:        implement solution to PU(i,j)
21:      end for
22:    end if
23:  end for
24:  t ← t + 1
25:  update Z^E_{k,m}[t]'s, Z^U_{k,m}[t]'s, and Z^L_{k,m}[t]'s using Eqns. (3.9), (3.48), and (3.49), respectively
26:  update Z^λ_i[t]'s and Z^φ_j[t]'s using Eqns. (3.42) and (3.43)
27: end loop

Figure 3.3: Complete solution method

The time complexity of the proposed solution is $O(K \sum_k M_k + N M_K)$. Note that the complexity of the proposed solution scales linearly with the total number of nodes in the system.
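The per-server decision step of Fig. 3.3 (lines 7-22) can be sketched in Python. PUSolution and its fields are our own illustrative stand-ins for the output of a PU(i,j) solver, not part of the dissertation's implementation:

```python
from collections import namedtuple

# Stand-in for a solved PU(i, j): optimal objective value plus the
# corresponding x, lambda, phi decisions.
PUSolution = namedtuple("PUSolution", "objective x lam phi")

def choose_server_state(c0, pu_solutions):
    """Compare C0_j (server j powered off) with C1_j, the sum of the
    optimal PU(i, j) objectives, and implement the cheaper option."""
    c1 = sum(s.objective for s in pu_solutions.values())
    if c0 < c1:                                  # cheaper to keep server j off
        return 0, {i: (0, 0.0, 0.0) for i in pu_solutions}
    # power on and implement each PU(i, j) solution
    return 1, {i: (s.x, s.lam, s.phi) for i, s in pu_solutions.items()}
```

Because each PU(i,j) is solved independently, this step parallelizes trivially across servers, which is what gives the linear scaling noted above.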
Figure 3.4: Adopted TOU pricing scheme

3.5 Experimental results

3.5.1 Experimental setup

In our simulation framework, we consider a cloud infrastructure with two datacenters, one in Los Angeles and one in New York. Each datacenter comprises two PDUs, each connected to 32 racks with 24 servers per rack. The computational power and power consumption of the servers are heterogeneous in general. Five user nodes are generated in the five most populous cities in the United States, with the number of user requests proportional to the population of each city. The diurnal workload pattern is extracted from the Google cluster trace of 2011 [8]. The round-trip time of a user request is estimated based on the distance between the user and the datacenter and is assumed to increase by 5ms per 100km of distance.

We adopt time-of-use utility pricing schemes for the two datacenters. For the datacenter in Los Angeles, we use the time-of-use rate schedule A-1 from LADWP⁴. For the datacenter in New York, we adopt the business time-of-use rates from Con Edison⁵. The hourly electricity prices of the two cities are shown in Fig. 3.4. Other system model parameters can be found in Table 3.2.

4 https://www.ladwp.com/ladwp/faces/ladwp/aboutus/a-financesandreports/a-fr-electricrates/a-fr-er-electricrateschedules
5 https://www.coned.com/en/save-money/energy-saving-programs/time-of-use

Table 3.2: Model parameters

  Parameter            Value
  $\eta_{ch}$, $\eta_{ds}$       95%
  $\eta^{tr}_{k}$              97% for all levels
  $E^{lo}_{k,m}$             20% of the nominal energy capacity
  $P^{\phi}_{j}$               100-200W
  $P^{con}_{j}$              100-200W
  $T^{req}_{i}$              1s
  $\tau$                  1h

Figure 3.5: Cumulative cost with lead-acid batteries as ESDs

3.5.2 Simulation results

We first consider the scenario where lead-acid batteries are used as ESDs, with Peukert factors $\gamma_C = \gamma_D = 1.15$.
ESDs are deployed hierarchically within the datacenters, with 0.24kWh, 5kWh, 150kWh, and 300kWh of energy capacity at the server level, rack level, PDU level, and datacenter level, respectively. Note that a 0.24kWh ESD at the server level occupies approximately 3L of volume if lead-acid batteries are used, which is typically the space limit for a server-level ESD. At higher levels, we assume that the energy storage capacity scales proportionally to the number of servers connected below each node. The amortized battery replacement cost is estimated assuming a 750-cycle lifetime at 80% depth of discharge [55], as in our use case. According to reference [56], lead-acid batteries cost $120/kWh. Meanwhile, lead-acid batteries are commonly deployed and over-provisioned in datacenters as UPS units. Considering their relatively short calendar life (about four years [6]), these lead-acid batteries need to be replaced regularly even without aggressive charging/discharging for peak shaving purposes. Therefore, the actual battery replacement overhead in our solution should be less than the amortized cost of purchasing the full energy storage capacity for the datacenters. The cumulative cost, considering energy consumption and battery replacement expenses over a five-day period, is shown in Fig. 3.5. The scenario without peak shaving is used as the baseline. As can be seen from the figure, the proposed solution remains beneficial when the actual battery replacement overhead is up to the amortized purchasing cost of 50% of the total energy storage capacity within the infrastructure.

We then consider the scenario where Lithium-ion batteries are used as ESDs, with Peukert factors $\gamma_C = \gamma_D = 1.05$. To begin with, we use the same energy storage capacity as in the lead-acid battery scenario.
Due to the rapid decrease in the cost of Lithium-ion batteries, we show the performance of our solution using the (estimated) battery costs of different years. In 2014, Lithium-ion batteries cost $200/kWh [57]. By 2016, the cost had been brought down to $145/kWh [58]. It has been projected that the cost of Lithium-ion batteries will drop below $100/kWh by 2025 [59]. The amortized cost of the battery is calculated assuming a 1900-cycle lifetime at 80% depth of discharge [56], as in our use case.

Figure 3.6: Cumulative cost with Lithium-ion batteries as ESDs

The cumulative cost, considering energy consumption and battery replacement expenses over a five-day period, is shown in Fig. 3.6. As can be seen from the figure, the proposed solution can achieve up to 16.7% utility cost saving if there is no cost for battery replacement. While the cost saving is marginal (around 1.6%) with the 2014 battery cost, it becomes increasingly significant as the battery cost decreases over the years. With the $145/kWh and $100/kWh battery costs of 2016 and 2025, the cost saving is 6.1% and 9.1%, respectively. Since the energy density of Lithium-ion batteries is much higher than that of lead-acid batteries, a larger amount of energy storage can be provisioned under the same space constraint (e.g., 3L at the server level). When the energy storage capacity of the Lithium-ion batteries is increased by 2x, 3x, and 4x, the daily cost decreases by 10.3%, 13.0%, and 16.8%, respectively, using the 2016 battery cost.
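One plausible way to compute the amortized replacement cost used above is to spread the capital cost over the lifetime energy throughput implied by the cited cycle life and depth of discharge; the function below reflects that assumed interpretation (the function name is ours):

```python
def amortized_cost_per_kwh_cycled(capital_per_kwh, cycle_life, depth_of_discharge):
    """Capital cost spread over lifetime energy throughput: each kWh of
    capacity delivers cycle_life * depth_of_discharge kWh before wearing out."""
    return capital_per_kwh / (cycle_life * depth_of_discharge)

# Figures used in the text:
lead_acid = amortized_cost_per_kwh_cycled(120.0, 750, 0.8)     # lead-acid [55][56]
li_ion_2016 = amortized_cost_per_kwh_cycled(145.0, 1900, 0.8)  # Li-ion, 2016 price [58]
```

Under this interpretation, Lithium-ion's longer cycle life more than offsets its higher 2016 capital cost per kWh, which is consistent with the growing savings reported above.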
Chapter 4
Concurrent Placement, Capacity Provisioning, and Request Flow Control for a Distributed Cloud Infrastructure

In this chapter, we propose a generalized joint optimization framework for both datacenter placement/capacity provisioning and request flow control/resource allocation. The total cost of ownership of the datacenters is optimized, accounting for all the major aspects of the capital and operational expenses.

4.1 System model

4.1.1 User behavior

Let $\mathcal{U}$ denote the set of cloud users. As previously stated, when the problem we are interested in is at the scale of a large country such as the United States, it is neither feasible nor necessary to address each individual user in the network. In such cases, we divide the whole country into small regions and aggregate the individual users in each region into one user node, corresponding to one element $u \in \mathcal{U}$. Generally, we can divide one day into a set of time periods, denoted by $\mathcal{H}$. For each time period $h \in \mathcal{H}$, the request generation rate of user $u$ is denoted by $\lambda_u^h$. The total lasting time of time period $h$ in one day is denoted by $l^h$. In the case that the users' behavior is similar and can be characterized by the same request generation rates in different time sections, we merge these time sections into one time period for simplicity. In other words, each time period in our problem may be comprised of a set of disjoint time sections. A simple example would be dividing a whole day into peak hours, $h_{peak}$, and off-peak hours, $h_{off\text{-}peak}$. Naturally, $\lambda_u^{h_{peak}}$ should be greater than $\lambda_u^{h_{off\text{-}peak}}$.

4.1.2 Datacenter location and capacity

Let $\mathcal{D}$ denote the set of available locations for datacenters. For each location $d \in \mathcal{D}$, let $y_d = 1$ if a datacenter is going to be built at location $d$, and $y_d = 0$ otherwise. For each location $d \in \mathcal{D}$, let $z_d = 1$ if a datacenter is already built at location $d$, and $z_d = 0$ otherwise.
We assume that the servers in the datacenters are homogeneous, so that the size of a datacenter can be characterized by the number of servers installed. Let $n_d$ denote the number of servers in a new datacenter at location $d$ if $y_d = 1$, or the number of servers added to an existing datacenter if $z_d = 1$. For practical reasons, e.g., limited land space or available electricity generation, we cannot install an unlimited number of servers in a datacenter. Also, it is not practical to build a datacenter with just a few servers. The minimum and maximum numbers of servers that can be added at each location are denoted by $N_{d,min}$ and $N_{d,max}$, respectively. Alternatively, we can express the size of a datacenter in terms of its amount of computation resources. If the processing speed of one server is $\mu_s$, then the total processing speed of a datacenter with $n_d$ servers, denoted by $\mu_d$, can be calculated as $\mu_d = n_d \cdot \mu_s$. In spite of their discrete nature, $n_d$ and $\mu_d$ can be assumed to take continuous values when the total number of servers in a datacenter is large. For existing datacenters, we denote the original number of servers and the original processing speed by $n_{0,d}$ and $\mu_{0,d}$, respectively. For new datacenters, we define $n_{0,d} = 0$ and $\mu_{0,d} = 0$.

4.1.3 Routing and delay modeling

As described in Section 4.1.1, the users behave differently in different time periods. Therefore, the routing strategy should be specified accordingly. Let $x_{u,d}^h = 1$ if, during time period $h$, some of the requests generated by user $u$ are routed to the datacenter at location $d$, and $x_{u,d}^h = 0$ otherwise. And let $\lambda_{u,d}^h$ denote the rate at which requests are routed from user $u$ to the datacenter at location $d$ during time period $h$ when $x_{u,d}^h = 1$.
According to the GPS model, the average processing latency of these requests, $t_{proc,u,d}^h$, can be calculated as

$t_{proc,u,d}^h = \dfrac{1}{\phi_{d,u}^h \cdot (\mu_d + \mu_{0,d}) - \lambda_{u,d}^h}$   (4.1)

where $\phi_{d,u}^h$ is the proportion of computation resources that the datacenter at location $d$ allocates to user $u$ during time period $h$. The equivalent processing speed for requests routed from user $u$ to the datacenter at location $d$, denoted by $\psi_{d,u}^h$, can be calculated as

$\psi_{d,u}^h = \phi_{d,u}^h \cdot (\mu_d + \mu_{0,d})$   (4.2)

It reflects the processing capability seen from a user's side, as if it were exclusively dedicated to that user. Let $t_{prop,u,d}$ and $t_{prop,d,u}$ denote the propagation delay from user $u$ to location $d$ and the propagation delay from location $d$ to user $u$, respectively. Then the average total delay experienced by a request routed from user $u$ to the datacenter at location $d$ during time period $h$, denoted by $t_{total,u,d}^h$, can be calculated as

$t_{total,u,d}^h = t_{prop,u,d} + t_{proc,u,d}^h + t_{prop,d,u}$   (4.3)

where $t_{proc,u,d}^h$ is calculated as in Eqn. (4.1). Generally, the propagation delays in the two directions are not necessarily the same, because the data packets may travel different paths in the network. However, since the analysis of all the routers' behavior in the network is complicated and out of the scope of this work, we estimate the propagation delay based on the Euclidean distance between the source and the destination, in which case the propagation delay in the two directions is the same.

4.1.4 Power consumption modeling

For a datacenter consisting of homogeneous servers, let $P_{idle}$ and $P_{peak}$ denote the power consumption of a server at utilization levels of 0 and 100%, respectively. And let $E_{usage,d}$ denote the power usage effectiveness (PUE) [60] of a datacenter at location $d$.
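The GPS latency model in Eqns. (4.1)-(4.3) can be sketched as a short helper; the function names and argument names are ours, and the stability check (arrival rate must stay below the allocated service rate) is implicit in Eqn. (4.1).

```python
def gps_latency(phi, mu_d, mu_0d, lam):
    """Average processing latency under the GPS model, Eqn. (4.1).

    phi    -- share of datacenter capacity allocated to this user (0..1)
    mu_d   -- processing speed of the new servers (requests/s)
    mu_0d  -- processing speed of the pre-existing servers (requests/s)
    lam    -- request rate routed on this path (requests/s)
    """
    psi = phi * (mu_d + mu_0d)  # equivalent processing speed, Eqn. (4.2)
    if lam >= psi:
        raise ValueError("unstable queue: arrival rate >= allocated service rate")
    return 1.0 / (psi - lam)

def total_delay(t_prop_ud, t_prop_du, phi, mu_d, mu_0d, lam):
    """Average end-to-end delay on one routing path, Eqn. (4.3)."""
    return t_prop_ud + gps_latency(phi, mu_d, mu_0d, lam) + t_prop_du
```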
Then, according to [61], the total power consumption of a datacenter at location $d$ during time period $h$, denoted by $P_{total,d}^h$, can be calculated as

$P_{total,d}^h = (n_d + n_{0,d}) \cdot [P_{idle} + (E_{usage,d} - 1) \cdot P_{peak}] + (n_d + n_{0,d}) \cdot (P_{peak} - P_{idle}) \cdot \gamma_d^h + \epsilon$   (4.4)

where $\gamma_d^h$ is the utilization level of the datacenter during time period $h$, which can be obtained by

$\gamma_d^h = \left( \sum_u x_{u,d}^h \lambda_{u,d}^h \right) / (\mu_d + \mu_{0,d})$   (4.5)

and $\epsilon$ is an empirical constant. If the datacenter treats all of its servers fairly, then the power consumption corresponding to the newly installed servers, denoted by $P_{new,d}^h$, can be approximated as

$P_{new,d}^h = n_d [P_{idle} + (E_{usage,d} - 1) P_{peak}] + n_d (P_{peak} - P_{idle}) \gamma_d^h + \epsilon$   (4.6)

Please note that $P_{total,d}^h$ is equal to $P_{new,d}^h$ for a new datacenter, since $n_{0,d} = 0$. Define $P_{total,d}$ as the maximum power consumption of the datacenter at location $d$. It can be calculated by applying $\gamma_d^h = 1$ to Eqn. (4.4):

$P_{total,d} = (n_d + n_{0,d}) \cdot E_{usage,d} \cdot P_{peak} + \epsilon$   (4.7)

Similarly, we define $P_{new,d}$ as the maximum power consumption of all the new servers. $P_{new,d}$ can be obtained by

$P_{new,d} = n_d \cdot E_{usage,d} \cdot P_{peak} + \epsilon$   (4.8)

These two parameters are useful for the estimation of the cost of a datacenter, which is introduced in Section 4.2.1. PUE is a parameter that characterizes the power consumption of components other than the servers. It is related to the geographical location. For instance, PUE is usually higher in regions with higher temperature or humidity, because the cooling system consumes more energy in these regions.

4.2 Problem formulation and solution method

We formulate our problem as an optimization problem with the objective of minimizing the average total cost of all the new and existing datacenters in the network, subject to the constraint of a maximum allowable average latency for each user. We first calculate the overall cost of a datacenter.
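The power model of Eqn. (4.4) can be written as a one-line helper. This is a sketch with our own names; note that at full utilization it collapses to the maximum-power expression of Eqn. (4.7), $(n_d + n_{0,d}) \cdot E_{usage,d} \cdot P_{peak} + \epsilon$.

```python
def datacenter_power(n_d, n_0d, p_idle, p_peak, pue, utilization, eps=0.0):
    """Total datacenter power during one period, Eqn. (4.4).

    eps is the empirical constant of Eqn. (4.4) (set to 0 in the experiments).
    """
    servers = n_d + n_0d
    static = servers * (p_idle + (pue - 1.0) * p_peak)   # idle + non-IT overhead
    dynamic = servers * (p_peak - p_idle) * utilization  # load-dependent part
    return static + dynamic + eps
```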
4.2.1 Cost calculation

The overall cost of a datacenter over its lifetime is comprised of various capital and operational cost components, each of which may depend on the time, the size of the datacenter, and/or the location of the datacenter. We address these components separately.

Land cost. The land cost, denoted by $C_{land,d}$, is the cost of buying enough land to build a datacenter, and can be calculated as

$C_{land,d} = LandPrice(d) \cdot Area(d)$   (4.9)

where $LandPrice(d)$ is the cost per unit area at location $d$, and $Area(d)$ is the area required for a datacenter at location $d$. According to [62], the required area of a datacenter can be estimated as linearly proportional to its maximum total power consumption. Since we do not need to account for the land space occupied by the original part of an existing datacenter, $P_{new,d}$ is used instead of $P_{total,d}$. Eqn. (4.9) can be rewritten as

$C_{land,d} = LandPrice(d) \cdot k_{p\text{-}a} \cdot P_{new,d}$   (4.10)

where $k_{p\text{-}a}$ is the ratio between the required area of a datacenter and its maximum total power consumption.

Infrastructure cost and server cost. The infrastructure cost, denoted by $C_{infra,d}$, is the cost of installing the power delivery network, the cooling system, and the internal networking within a datacenter. Similar to the land cost, the infrastructure cost of a datacenter can be estimated as

$C_{infra,d} = k_{p\text{-}infra} \cdot P_{new,d}$   (4.11)

where $k_{p\text{-}infra}$, usually in the unit of $/MW, is the infrastructure cost per unit power consumption of the datacenter. The server cost, denoted by $C_{serv,d}$, is the cost of buying new servers for a datacenter and can be straightforwardly calculated as

$C_{serv,d} = n_d \cdot ServerPrice$   (4.12)

where $ServerPrice$ is the price of a single server.

Connection cost.
The connection cost, denoted by $C_{conn,d}$, is the cost of laying out optical fibers to the nearest Internet backbone and power transmission lines to the existing electric power network, and can be calculated as

$C_{conn,d} = TransLinePrice \cdot DistPower(d) + FiberPrice \cdot DistInternet(d)$   (4.13)

where $DistPower(d)$ is the distance from the datacenter at location $d$ to the nearest power plant or existing transmission line, $DistInternet(d)$ is the distance from the datacenter at location $d$ to the nearest Internet backbone, and $TransLinePrice$ and $FiberPrice$ are the prices of transmission line and optical fiber per unit length, respectively. For an existing datacenter, this cost does not arise, since the connections were already built along with the datacenter.

Cooling cost. The cooling cost of a datacenter can be divided into two parts. One is the cost of the electricity consumed by the computer room air conditioners (CRACs), which is characterized by the PUE and covered in the electricity cost. The other is the cost of the water consumed in the cooling circulation, which is specified in the water cost.

Electricity cost. The electricity cost, denoted by $C_{elec,d}^h$, is the cost of the electric energy consumed while operating the datacenter. Since the total electric power consumption of a datacenter can be obtained using Eqn. (4.4), the calculation of $C_{elec,d}^h$ is straightforward:

$C_{elec,d}^h = ElectricityPrice(d) \cdot P_{total,d}^h$   (4.14)

where $ElectricityPrice(d)$ is the price per unit of electric energy at location $d$.

Bandwidth cost. The bandwidth cost, denoted by $C_{band,d}$, is the cost of acquiring sufficient bandwidth for communications from the Internet service providers (ISPs). If we allocate constant bandwidth to each server in the datacenter, then $C_{band,d}$ can be calculated as

$C_{band,d} = BWPrice \cdot (n_d + n_{0,d}) \cdot BWServer$   (4.15)

where $BWPrice$ is the price per unit amount of bandwidth set by the ISPs, and $BWServer$ is the amount of bandwidth allocated to each server.

Water cost.
As introduced in the description of the cooling cost, the water cost, denoted by $C_{water,d}$, is the cost of the water consumed by the water chiller, which can be estimated from the maximum total power consumption of the datacenter. $C_{water,d}$ is calculated as

$C_{water,d} = WaterPrice(d) \cdot k_{p\text{-}w} \cdot P_{total,d}$   (4.16)

where $WaterPrice(d)$ is the price per unit amount of water at location $d$, and $k_{p\text{-}w}$ is the amount of water required per unit of maximum power consumption.

Maintenance cost. The maintenance cost, denoted by $C_{main,d}$, is the cost of hiring personnel to maintain and operate a datacenter, and can be calculated as

$C_{main,d} = k_{p\text{-}main} \cdot P_{total,d} + Salary(d) \cdot (n_d + n_{0,d}) / k_{s\text{-}p}$   (4.17)

where the first term on the right-hand side is the maintenance cost and the second term on the right-hand side is the operational (personnel) cost. $k_{p\text{-}main}$ is the ratio between the maintenance cost and the maximum power consumption, $Salary(d)$ is the salary of an administrator at location $d$, and $k_{s\text{-}p}$ describes how many servers one administrator can manage.

To obtain the overall cost of a datacenter in one day, the one-time investments, including the land cost, the infrastructure cost, the server cost, and the connection cost, should be amortized over the expected lifetime of the datacenter, a server, the transmission line, or the optical fiber. Therefore, if $y_d = 1$, i.e., we will build a new datacenter at location $d$, then the amortized overall cost per day of this datacenter, denoted by $C_{total,d}$, can be calculated as

$C_{total,d} = \dfrac{C_{land,d}}{T_{dc}} + \dfrac{C_{infra,d}}{T_{dc}} + \dfrac{C_{serv,d}}{T_{serv}} + \dfrac{C_{conn,d}}{T_{line}} + \sum_h l^h \cdot C_{elec,d}^h + C_{band,d} + C_{water,d} + C_{main,d}$   (4.18)

where $T_{dc}$, $T_{serv}$, and $T_{line}$ are the expected lifetimes of the corresponding components. For an existing datacenter at location $d$ with $z_d = 1$, the cost calculation is the same, except that the connection cost, $C_{conn,d}$, is set to zero.
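The daily amortization of Eqn. (4.18) can be sketched as follows. This is an illustrative helper with our own names; the one-time costs are taken as already computed (Eqns. (4.9)-(4.13)), lifetimes are expressed in days, and the recurring costs are per-day figures.

```python
def daily_total_cost(c_land, c_infra, c_serv, c_conn,
                     elec_by_period,  # list of (hours l^h per day, cost per hour)
                     c_band, c_water, c_main,
                     t_dc_days, t_serv_days, t_line_days):
    """Amortized cost per day of one datacenter, following Eqn. (4.18)."""
    amortized = (c_land / t_dc_days + c_infra / t_dc_days
                 + c_serv / t_serv_days + c_conn / t_line_days)
    electricity = sum(hours * cost for hours, cost in elec_by_period)
    return amortized + electricity + c_band + c_water + c_main
```

For an existing datacenter, the same call is made with `c_conn = 0`, matching the note above.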
4.2.2 Problem formulation

The average latency is calculated as the weighted average of the latency values of requests routed to each datacenter. Hence, in the case that the requests from one user node are routed to multiple datacenters with different average latencies, the overall average latency can still satisfy the average latency constraint even if a small portion of the requests routed to some datacenters experience extremely long latency. However, we eliminate this possibility by enforcing the latency constraint on every routing path, because it can cause serious unfairness otherwise. This is because each user node in our problem consists of a large number of individual users, and the requests from some individual users may be consistently routed to the datacenters with long latency due to the routing scheme implemented in reality. Based on the assumption of homogeneous servers, we can combine Eqns. (4.4) and (4.5) and get

$P_{total,d}^h = (n_d + n_{0,d}) \cdot [P_{idle} + (E_{usage,d} - 1) \cdot P_{peak}] + \dfrac{P_{peak} - P_{idle}}{\mu_s} \cdot \sum_u x_{u,d}^h \lambda_{u,d}^h + \epsilon$   (4.19)

From Eqn. (4.9) to Eqn. (4.18), one can see that the total cost of a datacenter is a monotonically increasing function of the maximum power consumption of the datacenter. And from Eqn. (4.19), one can see that the maximum power consumption is a non-decreasing function of the number of servers in the datacenter. Therefore, there is no reason to install extra servers to obtain a datacenter with higher processing speed once the average latency constraint is satisfied. This being true, we can transform the constraint of maximum average latency into a required equivalent processing speed. Combining Eqns. (4.1) and (4.3), we get

$\psi_{d,u}^h = \dfrac{1}{t_{u,MAX}^h - t_{prop,u,d} - t_{prop,d,u}} + \lambda_{u,d}^h$   (4.20)

where $t_{u,MAX}^h$ is the maximum allowable average latency for user $u$ during time period $h$.
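The transformation of Eqn. (4.20) (latency bound to minimum equivalent processing speed) can be sketched as a small helper; the names are ours, and the guard mirrors the infeasible case discussed for Constraint (4.26), where the propagation delay alone exceeds the latency bound.

```python
def required_equiv_speed(t_max, t_prop_ud, t_prop_du, lam):
    """Minimum equivalent processing speed meeting the latency bound, Eqn. (4.20).

    t_max      -- maximum allowable average latency t^h_{u,MAX} (s)
    t_prop_ud  -- propagation delay user -> datacenter (s)
    t_prop_du  -- propagation delay datacenter -> user (s)
    lam        -- request rate on this path (requests/s)
    """
    slack = t_max - t_prop_ud - t_prop_du
    if slack <= 0:
        raise ValueError("propagation delay alone violates the latency bound")
    return 1.0 / slack + lam
```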
Given the values of the $z_d$'s, $\lambda_u^h$'s, $n_{0,d}$'s, $N_d$'s, $\mu_s$, and all other information required for the cost calculation, the optimization problem can be formulated as follows:

Find $x_{u,d}^h$, $y_d$, $n_d$, $\lambda_{u,d}^h$, $\forall u \in \mathcal{U}, \forall d \in \mathcal{D}, \forall h \in \mathcal{H}$

Minimize $\sum_{d \in \mathcal{D}} (y_d + z_d) \, C_{total,d}$   (4.21)

Subject to:

$\sum_d \lambda_{u,d}^h = \lambda_u^h, \quad \forall u, h$   (4.22)
$\sum_u x_{u,d}^h \psi_{d,u}^h \leq (n_{0,d} + n_d) \mu_s, \quad \forall d, h$   (4.23)
$y_d \leq 1 - z_d, \quad \forall d$   (4.24)
$\lambda_{u,d}^h / \lambda_u^h \leq x_{u,d}^h, \quad \forall u, d, h$   (4.25)
$x_{u,d}^h \leq x_{th,u,d}^h - \delta, \quad \forall u, d, h$   (4.26)
$n_d / N_d \leq y_d + z_d, \quad \forall d$   (4.27)
$x_{u,d}^h \leq y_d + z_d, \quad \forall u, d, h$   (4.28)
$x_{u,d}^h \in \{0, 1\}, \quad \forall u, d, h$   (4.29)
$y_d \in \{0, 1\}, \quad \forall d$   (4.30)
$n_d \in [N_{d,min}, N_{d,max}] \cap \mathbb{N}, \quad \forall d$   (4.31)
$\lambda_{u,d}^h \geq 0, \quad \forall u, d, h$   (4.32)

where $C_{total,d}$ is specified in Eqn. (4.18), $\psi_{d,u}^h$ is calculated as in Eqn. (4.20), $\delta$ is set to a small positive value for the convenience of optimization, and $x_{th,u,d}^h$ is defined as

$x_{th,u,d}^h = \dfrac{t_{u,MAX}^h - t_{prop,u,d} - t_{prop,d,u}}{t_{u,MAX}^h + t_{prop,u,d} + t_{prop,d,u}} + 1$   (4.33)

$x_{th,u,d}^h$ being less than or equal to 1 means that the propagation delay between the user and the datacenter is at least the maximum allowable average latency, so that no matter how much computation resource is allocated to this portion of requests, the delay constraint will be violated. The objective function is the sum of the total cost of all the datacenters over the 24 hours of a day. Constraint (4.22) ensures that all the user requests at any time are routed to datacenters for processing. Constraint (4.23) ensures that the allocated computation resources in each datacenter do not exceed its maximum amount of available computation resources. We have Constraint (4.24) because the locations of existing datacenters are not candidate locations for building new datacenters. We force $\lambda_{u,d}^h$ to be 0 when we do not choose to route any request from user $u$ to the datacenter at location $d$ during time period $h$ by Constraint (4.25). With Constraint (4.26), no requests are routed to those datacenters that cannot respond within the latency constraint.
With Constraint (4.27), no servers are installed where no datacenter already exists or is to be built. And with Constraint (4.28), no requests are routed to non-existing datacenters. Constraints (4.29)-(4.32) specify the ranges of values for the variables $x_{u,d}^h$, $y_d$, $n_d$, and $\lambda_{u,d}^h$.

4.2.3 Solution method

Generally, this is a mixed integer non-linear programming (MINLP) problem. Even though MINLP problems are NP-hard, it is acceptable to solve the problem directly with standard solvers, such as CPLEX [63], regardless of the computational complexity, since the problem is solved offline only once. Nevertheless, some simplifications can be made to the original problem. As stated in Section 4.1.2, since the number of servers in a datacenter is considerably large, we can assume that the $n_d$'s take continuous values with negligible error in the cost and delay calculations, which reduces the number of integer variables in the problem. Moreover, once the values of the $x_{u,d}^h$'s and $y_d$'s are given, the objective function and all the constraints are linear. Therefore, we can apply stochastic optimization methods such as simulated annealing [64] or a genetic algorithm [65], combined with linear programming, to solve the problem. We also propose a greedy algorithm for sub-optimal solutions, which proceeds in the following steps. Initially, we set all the $y_d$'s to 1 and all the $n_d$'s to 0. For each time period $h$, we first pick one user $u$ and set the routing scheme to the one with the minimum cost increase among the routing schemes in which all the requests are routed to a single datacenter. This process is repeated for every user. When the routing for all the users in every time period is done, we can find the minimum capacity for each datacenter, and any datacenter that is not used will not be built. In the simple case with uniform user behavior over time, no existing datacenters, and $\epsilon$ in Eqn.
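The greedy heuristic described above can be sketched as follows. This is a minimal sketch under stated assumptions: `cost_increase(d, u, h, routing)` is a hypothetical callable that returns the increase in total daily cost (via Eqns. (4.9)-(4.18)) if all of user `u`'s period-`h` requests are routed to datacenter `d`, given the routing decided so far.

```python
def greedy_placement(users, periods, datacenters, cost_increase):
    """Greedy routing/placement sketch: route each user's requests in each
    period to the single datacenter with the minimum cost increase; candidates
    that end up unused are not built."""
    routing = {}  # (user, period) -> chosen datacenter
    for h in periods:
        for u in users:
            best = min(datacenters, key=lambda d: cost_increase(d, u, h, routing))
            routing[(u, h)] = best
    used = set(routing.values())  # only these locations are actually built
    return routing, used
```

With uniform user behavior, no existing datacenters, arbitrary sizing, and negligible connection cost (the simple case noted above), each greedy choice is independent, which is the intuition behind the optimality claim.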
(4.4) equal to 0, if the datacenters can be sized arbitrarily and all the candidate locations can be connected to the power network and the Internet with minimal cost, then the greedy algorithm can be proved to solve the problem optimally.

4.3 Experimental results

In this section, we give an example over a service area of the United States.

4.3.1 Experimental setup

We set the ten most populous cities in the U.S. as the user nodes. The population information is obtained from the U.S. Census Bureau website 1. We select to build the datacenters among eight candidate locations, which are Austin, Bismarck, Los Angeles, New York City, Oklahoma City, Orlando, Seattle, and St. Louis. The World Geodetic System (WGS) coordinates of the user nodes and the candidate building locations are obtained from the GeoNames database 2. The propagation delay between a datacenter and a user node is set to be linearly proportional to the distance between the two locations; the propagation delay increases by 5 ms for every 100 km of distance. For each user and time period, the maximum average latency, $t_{u,MAX}^h$, is set uniformly to $t_{MAX}$. Each day is divided into peak hours and off-peak hours. Through the Google cluster data, we obtain the request arrival pattern by averaging the number of requests arriving in each hour over a period of one month. We then divide the 24 hours of a day into 12 peak hours and 12 off-peak hours and calculate the ratio between the average request arrival rates of peak hours and off-peak hours. We define this ratio to be also the ratio of the request generation rates in each user node during the peak hours and the off-peak hours. We assume a request generation rate of 1 request per second per 40,000 population in each user node during the peak hours. The request generation rate during the off-peak hours can be calculated accordingly.
1 http://www.census.gov/2010census/popmap/
2 http://www.geonames.org/
For each datacenter, the minimum and maximum numbers of servers are set to 1,000 and 50,000, respectively. We use the Dell PowerEdge R610 as the server model, which has a peak power consumption of 260 W and an idle power consumption of approximately 160 W. Each server by itself is assumed to process requests with an average response time of 100s. The empirical constant, $\epsilon$, is set to 0. The PUE of a datacenter is modeled as a function of the temperature as in [16]. The other parameters in the cost calculation are specified as follows. The land price for each location is calculated by averaging the prices looked up on real estate websites. $k_{p\text{-}a}$ in Eqn. (4.10) is set to 6,000 sf/MW as recommended in [62]. $k_{p\text{-}infra}$ in Eqn. (4.11) is set to $15/W [62]. Each server costs $2,000. The electric power grid map is obtained from the National Public Radio website 3, and each candidate city is covered by the existing power transmission lines. The Internet backbone maps can also be found online 4, and the cost of optical fiber is set to $480,000 per mile. Each server is allocated 1 Mbps of network bandwidth, and the price of bandwidth is set to $1/Mbps. The electric power prices in different states are obtained from the U.S. Energy Information Administration website 5. The water prices are obtained from the government websites of each city, and $k_{p\text{-}w}$ in Eqn. (4.16) is set to 24,000 gal/MW/day. The maintenance cost is set to $0.05/W/month [66]. The salary of each administrator is set to $100,000 per year, and $k_{s\text{-}p}$ in Eqn. (4.17) is set to 1,000. The expected lifetimes of a datacenter, a server, and the optical fiber are set to 12 years, 4 years, and 12 years, respectively.
3 http://www.npr.org/templates/story/story.php?storyId=110997398
4 http://www.nthelp.com/maps.htm
5 http://www.eia.gov/electricity/

4.3.2 Simulation results

In this part, we present the simulation results under two different scenarios.
Figure 4.1: Simulation Result of Scenario 1
Figure 4.2: Simulation Result of Scenario 2

In Scenario 1, there are no existing datacenters, i.e., $z_d = 0, \forall d$. Two baselines are chosen by randomly placing three datacenters among the eight candidate locations. In Baseline 1, datacenters are built in Los Angeles, New York City, and Seattle. In Baseline 2, datacenters are built in Bismarck, Oklahoma City, and St. Louis. The relationship between the overall cost per day and $t_{MAX}$ is shown in Fig. 4.1. The cost of Baseline 2 when $t_{MAX}$ = 100 ms is not shown because the latency constraint cannot be satisfied in that case. Generally, the placement and capacity of the datacenters differ for different values of $t_{MAX}$ in our result. For example, when $t_{MAX}$ = 200 ms, datacenters are built in Oklahoma City and St. Louis with 46,671 servers and 23,322 servers, respectively. When $t_{MAX}$ = 600 ms, datacenters are built in Austin, Oklahoma City, and Seattle with 15,741 servers, 38,094 servers, and 9,158 servers, respectively. One can observe that the overall cost is lower with higher $t_{MAX}$, which is consistent with the fact that looser constraints result in better performance. It can also be seen from the result that the overall cost of the proposed scheme is always lower than that of the baselines. When $t_{MAX}$ = 500 ms, the proposed method saves $1.22M and $475k per month compared to Baseline 1 and Baseline 2, respectively. In Scenario 2, there already exists a datacenter in Seattle with 5,000 servers. The same baseline schemes are used for comparison. The result is shown in Fig. 4.2. As can be seen from the figure, the result is similar to that of Scenario 1.

Chapter 5
Optimal Offloading Control in a Mobile Cloud Computing System

In this chapter, we address the problem of application management on a mobile device in an MCC system, where requests of the application can be either processed locally or sent to the remote server.
The optimal offloading rate, DVFS policy, and transmission scheme are jointly solved for using an SMDP model.

5.1 System model

An MCC system, the framework of which is shown in Fig. 5.1, is comprised of a server (or a set of servers) and a set of mobile devices. Assuming that the computation capability of the server is much higher than that of the mobile devices and the total number of mobile devices in the MCC system is relatively large, the influence of one device's behavior on the overall performance of the server is negligible. From the view of the specific mobile device that we are interested in, the server can be characterized by an average processing rate $\mu_s$ and a utilization level $\rho_s$.

5.1.1 Overall system modeling for an MCC system

Figure 5.1: System framework of an MCC system

The wireless channel between a mobile device and the server is noisy and may cause errors in transmission. Therefore, the ARQ protocol is applied. In an ARQ protocol, the receiver uses the redundant information added by the transmitter side, through checksums or channel coding, to determine whether the received frame is correct, and then sends back an ACK signal if no error is detected or a NAK signal otherwise. Given a modulation scheme (corresponding to a transmission bit rate $R_{b,k}$), let $E_s$ denote the average energy per symbol received at the receiver side. $E_s$ is calculated as the arithmetic average of the energy consumption of all the symbols in the symbol set. And $E_b = E_s / \log_2 n$ is the average transmission energy per bit, where $n$ is the order of modulation. Given the power spectral density of the noise, denoted by $N_0$, the symbol error rate (SER) of transmission can be expressed as a function of $E_b / N_0$ [67].
Assuming that there are $L$ bits in one frame, the frame error rate (FER) can be calculated as

$FER = 1 - (1 - SER)^{L / \log_2 n}$   (5.1)

If the noise level is not so high that there are multiple erroneous symbols in one frame, the FER can also be used as the probability that a NAK signal is sent back. According to [68], the energy required per transmitted bit on the transmitter side, denoted by $E_{b,Tx}$, is calculated based on $E_b$, which is measured on the receiver side, as

$E_{b,Tx} = k_T \cdot E_b \cdot d^\beta$   (5.2)

where $k_T$ is a constant depending on the channel bandwidth, antenna gain, and amplifier efficiency; $d$ is the distance between the transmitter and the receiver; and $\beta$ is the path loss exponent. In general, if we assume that the mobile device only moves within a short distance relative to its distance to the server during transmission, then the actual transmission energy per bit can be approximated as proportional to the received energy per bit. The mobile device is comprised of a task dispatching unit, a local processing unit (CPU), a processing queue, a transmitter, a transmission queue, and some other components. Each application is interpreted as a set of tasks in the form of computation requests. The local processing unit can only process one computation request at a time, and the transmitter can only begin to transmit another request after finishing transmitting the current one. All the awaiting requests are placed in the corresponding (processing or transmission) queue in an FIFO manner. Any request arriving at a full queue is simply discarded. In order to find an analytical form of the average processing delay of each request, we assume that request generation follows a Poisson process with an average generation rate of $\lambda$. Suppose that we offload a request to the server (cloud) with probability $p_{off}$.
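The FER expression of Eqn. (5.1) follows from requiring every one of the $L / \log_2 n$ symbols in a frame to be correct; a minimal sketch (function and argument names are ours):

```python
import math

def frame_error_rate(ser, frame_bits, modulation_order):
    """FER from SER for an L-bit frame, Eqn. (5.1).

    A frame of L bits carries L / log2(n) symbols under an order-n modulation;
    the frame is correct only if every symbol is received correctly.
    """
    symbols_per_frame = frame_bits / math.log2(modulation_order)
    return 1.0 - (1.0 - ser) ** symbols_per_frame
```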
Then, according to the characteristics of the Poisson process, the request arrivals at the transmission queue and the processing queue are independent of each other, and both follow Poisson processes with average arrival rates given by $\lambda_t = p_{off} \cdot \lambda$ and $\lambda_p = (1 - p_{off}) \cdot \lambda$, respectively. Since we apply DVFS to the mobile CPU, there is a set of operating frequencies $\{f_0, f_1, \ldots, f_M\}$ that the CPU can choose from. Similarly, the RF transmitter can use any transmission bit rate in the set $\{R_{b,0}, R_{b,1}, \ldots, R_{b,K}\}$, which corresponds to different modulation schemes.

5.1.2 Power modeling for a mobile device

If we control the CPU to run at frequency $f_m$ and the RF transmitter to transmit using bit rate $R_{b,k}$, then the total power consumption of all the components, denoted by $P_{total}^{(k,m)}$, can be divided into three parts and calculated as

$P_{total}^{(k,m)} = P_p^m + P_t^k + P_x^{(k,m)}$   (5.3)

where $P_p^m$ is the power consumption of the mobile CPU and is a superlinear (usually 2nd- to 3rd-order) function of the execution frequency $f_m$; $P_t^k$ is the power consumption of the RF transmitter and is given by $P_t^k = E_{b,Tx} \cdot R_{b,k}$; and $P_x^{(k,m)}$ is the total power consumption of the other components in the mobile device that cannot be controlled directly, e.g., memory, GPU, touch screen, and I/O ports. As mentioned in [69, 68], we can calculate $P_p^m$ and $P_t^k$ precisely using the corresponding power models, but it is difficult to accurately calculate the value of $P_x^{(k,m)}$. Therefore, we treat $P_x^{(k,m)}$ as a random variable that has different probability distributions depending on the decision we make, i.e., the CPU frequency $f_m$ and the transmission bit rate $R_{b,k}$.
We find the probability distributions of $P_x^{(k,m)}$ for different decision pairs through extensive experiments on the Qualcomm Snapdragon Platform, and use the statistics to reflect the behavior of these components in general mobile devices, which provides guidelines for deriving the optimal policy.

5.1.3 Modeling for the rechargeable battery

Figure 5.2: Equivalent circuit model for Li-ion batteries [1]

In this work, we use the battery model described in [1], which is in the form of an equivalent electrical circuit as shown in Fig. 5.2. All the parameters in the circuit, including the open circuit voltage (OCV) and the internal resistances and capacitances, are functions of the state-of-charge (SoC) of the battery, defined as $SoC = C_b / C_{b,full}$, where $C_b$ is the remaining charge of the battery and $C_{b,full}$ is the total charge of the battery when it is fully charged. If we ignore the transient effect caused by the capacitances, because the output voltage of the battery does not change rapidly during a continuous discharging process, then the open circuit voltage, denoted by $V_{OC}$, and the internal resistance of the circuit, which is the sum of $R_s$, $R_{ts}$, and $R_{tl}$ and is denoted by $R_{in}$, can be calculated as follows:

$V_{OC} = b_{11} e^{b_{12} \cdot SoC} + b_{13} \cdot (SoC)^3 + b_{14} \cdot (SoC)^2 + b_{15} \cdot SoC + b_{16}$   (5.4)

$R_{in} = b_{21} e^{b_{22} \cdot SoC} + b_{31} e^{b_{32} \cdot SoC} + b_{41} e^{b_{42} \cdot SoC} + b_{23} + b_{33} + b_{43}$   (5.5)

where all the $b_{ij}$'s are specified as in [21].

Figure 5.3: Conceptual diagram of a power conversion tree

A power delivery network (PDN) comprised of multiple DC-DC converters is considered to connect the battery and the various components in the mobile device (Fig. 5.3), so as to provide the potentially different supply voltage levels of those components (e.g., on the Qualcomm Snapdragon Platform, the CPU cores require 0.8 V-1.225 V; the digital core requires 1.1 V; the touch screen requires 2.85 V; the display backlight requires 3.8 V).
Each DC-DC converter connects the battery to a (set of) mobile components (with the same supply voltage level) [70]. In our battery model, the input voltage of a DC-DC converter, i.e., the closed circuit output voltage (CCV) of the battery, denoted by $V_{CC}$, can be calculated through KVL as

$V_{CC} = V_{OC} - I_{in} \cdot R_{in}$   (5.6)

where $I_{in}$ is the output current of the battery. Let $P_{total}$ denote the total power consumption of all the components in the mobile device, and $\eta_c$ denote the energy conversion efficiency of the DC-DC converters in the PDN; then we have

$V_{CC} \cdot I_{in} = \dfrac{P_{total}}{\eta_c}$   (5.7)

Combining Eqns. (5.6) and (5.7), we have

$(V_{CC})^2 - V_{OC} \cdot V_{CC} + \dfrac{P_{total} \cdot R_{in}}{\eta_c} = 0$   (5.8)

$(I_{in})^2 \cdot R_{in} - V_{OC} \cdot I_{in} + \dfrac{P_{total}}{\eta_c} = 0$   (5.9)

Based on the values of $V_{OC}$, $R_{in}$, $P_{total}$, and $\eta_c$, we can solve Eqns. (5.8) and (5.9) and get

$V_{CC} = \left( V_{OC} + \sqrt{(V_{OC})^2 - \dfrac{4 P_{total} \cdot R_{in}}{\eta_c}} \right) / 2$   (5.10)

$I_{in} = \dfrac{1}{2 R_{in}} \left( V_{OC} - \sqrt{(V_{OC})^2 - \dfrac{4 P_{total} \cdot R_{in}}{\eta_c}} \right)$   (5.11)

Furthermore, the rate capacity effect causes the actual battery charge loss rate to be greater than the discharging current, or equivalently, the energy loss rate to be greater than the battery's output power. According to Peukert's Law [36], an empirical rule that accurately relates the total discharging time and the discharging current of the battery, the total discharging time of the battery, denoted by $T_d$, is determined by

$T_d = \dfrac{Q_{ref}}{I_{eq}}$   (5.12)

where $Q_{ref}$ is the total charge of the battery measured with a small reference discharging current $I_{ref}$ (with negligible rate capacity effect), and the equivalent discharging current $I_{eq}$ can be calculated as

$I_{eq} = \left( \dfrac{I_{in}}{I_{ref}} \right)^{\gamma_c} I_{ref}$   (5.13)

where $\gamma_c$ is the Peukert constant, which depends on the battery type (e.g., larger for lead-acid batteries and smaller for Li-ion batteries [22]). Combining Eqns.
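The quadratic solutions of Eqns. (5.10)-(5.11) and the Peukert correction of Eqn. (5.13) can be sketched together in one helper; the function name and the guard on the discriminant (the load exceeding what the battery can deliver at this SoC) are ours.

```python
import math

def battery_draw(v_oc, r_in, p_total, eta_c, i_ref, gamma_c):
    """Solve Eqns. (5.10)-(5.11) for the battery operating point, then apply
    Peukert's law, Eqn. (5.13). Returns (V_CC, I_in, I_eq)."""
    disc = v_oc ** 2 - 4.0 * p_total * r_in / eta_c
    if disc < 0:
        raise ValueError("load too high for this battery state")
    v_cc = (v_oc + math.sqrt(disc)) / 2.0          # Eqn. (5.10)
    i_in = (v_oc - math.sqrt(disc)) / (2.0 * r_in)  # Eqn. (5.11)
    i_eq = (i_in / i_ref) ** gamma_c * i_ref        # Eqn. (5.13): rate capacity
    return v_cc, i_in, i_eq
```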
(5.11) and (5.13), we get

I_{eq} = \left[ \frac{1}{2 R_{in}} \left( V_{OC} - \sqrt{(V_{OC})^2 - \frac{4 P_{total} R_{in}}{\eta_c}} \right) \right]^{\gamma_c} (I_{ref})^{1 - \gamma_c}    (5.14)

This I_{eq} value will be used to reflect the actual battery charge loss rate. Moreover, the energy loss rate of the battery is given by V_{OC} \cdot I_{eq}, which is measured from the point of view of the battery and is higher than the sum of the power consumptions of all the mobile components.

5.2 SMDP-based problem formulation and solution framework

5.2.1 Device modeling using SMDP

An SMDP is comprised of a set of states S and a set of actions A. After a transition, if state s ∈ S is observed, an action is chosen from a subset A_s ⊆ A. A policy, denoted by π, is a description of what action to take in each state of the system. In general, different policies result in different transition probabilities between states and different average stay times in each state. A policy in an SMDP can be either deterministic or randomized. In the case of a deterministic policy, we use π = {(s, a) | s ∈ S, a ∈ A_s}; each (state, action) pair specifies the action to take in one state. For an SMDP, the distribution of the direction of transition and the inter-transition time are functions of the current (state, action) pair and do not depend on the history of the states or actions. In this paper, we consider the case where the system can transit from a state to itself.

For the purpose of demonstrating the model clearly, we first model the transmission queue and the processing queue as separate SMDPs. For the transmission queue, the state set is S^{(t)} = {0, 1, ..., Q_t}, each state representing the corresponding length of the transmission queue (in this paper, the length of a queue includes the request that is being processed), where Q_t is the maximum length of the transmission queue. The action set is A^{(t)} = {R_{b,0}, R_{b,1}, ..., R_{b,K}}, each action representing a bit rate that the RF transmitter supports.
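Equations (5.11), (5.13), and (5.14) combine into a small numeric routine. The sketch below is ours; the default parameter values (η_c = 0.7, γ_c = 1.05) match the simulation setup of Section 5.3, while the reference current i_ref is an illustrative placeholder.

```python
import math

def i_in(p_total, v_oc, r_in, eta_c=0.7):
    """Battery output current for a given total load power, Eqn. (5.11)."""
    disc = v_oc**2 - 4.0 * p_total * r_in / eta_c
    if disc < 0:
        raise ValueError("load power exceeds what the battery can deliver")
    return (v_oc - math.sqrt(disc)) / (2.0 * r_in)

def i_eq(p_total, v_oc, r_in, eta_c=0.7, gamma_c=1.05, i_ref=0.2):
    """Equivalent discharging current under Peukert's Law, Eqn. (5.14).
    i_ref is an illustrative reference current, not a value from the thesis."""
    return (i_in(p_total, v_oc, r_in, eta_c) / i_ref) ** gamma_c * i_ref

# The battery energy loss rate used in Eqn. (5.22) is V_OC * I_eq, which
# exceeds the delivered power p_total because of conversion loss (eta_c < 1)
# and the rate capacity effect (gamma_c > 1).
```

For example, a 1.5 W load on a cell with V_{OC} = 3.7 V and R_{in} = 0.15 Ω draws slightly more current than the ideal P/(η_c V_{OC}), and the battery-side energy loss rate V_{OC}·I_{eq} exceeds even the converter-input power P/η_c.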
We assume that every request is transmitted in a separate frame and there is no fragmentation of the frame. We also assume that the length of each request, in terms of transmitted bit count, follows an exponential distribution; then the length of a transmitted frame also follows an exponential distribution. Denote the mean value of this distribution by L̄, which is the average request length. Then the transmission time for a request when action R_{b,k} ∈ A^{(t)} is taken follows an exponential distribution with mean L̄ / R_{b,k}; we denote the corresponding service rate by μ_t(k) = R_{b,k} / L̄. Due to the presence of the ARQ protocol, upon receiving a NAK signal, the corresponding frame is added back to the transmission queue to wait to be transmitted again. We assume that the receiver on the cloud gives out ARQ responses quickly and that the ARQ response travels at high speed, so that the time between a request being transmitted and the NAK signal being received can be omitted. Using the information provided above, the state transition probabilities of the transmission queue can be calculated as follows:

p^{t,k}_{i,i'} =
  1,                                      i = 0, i' = 1
  q(k),                                   i = i' = Q_t
  1 - q(k),                               i = Q_t, i' = Q_t - 1
  λ_t / (λ_t + μ_t(k)),                   1 ≤ i ≤ Q_t - 1, i' = i + 1
  q(k) μ_t(k) / (λ_t + μ_t(k)),           1 ≤ i ≤ Q_t - 1, i' = i
  (1 - q(k)) μ_t(k) / (λ_t + μ_t(k)),     1 ≤ i ≤ Q_t - 1, i' = i - 1
  0,                                      otherwise                      (5.15)

where p^{t,k}_{i,i'} is the probability that the system makes a transition to state i' given that it is currently in state i, and q(k) is the FER under bit rate R_{b,k}. For the processing queue, the state set is S^{(p)} = {0, 1, ..., Q_p}, where Q_p is the maximum length of the processing queue, and the action set is A^{(p)} = {f_0, f_1, ..., f_M}, each action representing an execution frequency of the processor. Assuming that the execution time for a request follows an exponential distribution with rate μ_p(m) if frequency f_m is chosen, the transition probabilities can be calculated in a form similar to Eqn.
(5.15).

As mentioned in Section 5.1.3, the energy loss rate of the battery is a super-linear function of the total power consumption of all the mobile components. In order to accurately translate the components' power consumption into its impact on the battery lifetime, we need to calculate the actual energy loss rate of the battery, which is affected jointly by the power consumptions of the two controllable parts (CPU and RF module) as well as the other components. Therefore, we combine the two aforementioned processes into one SMDP, as shown in Fig. 5.4. In the new SMDP, the state set is S = {(i, j) | 0 ≤ i ≤ Q_t, 0 ≤ j ≤ Q_p}, and the action set is A_{i,j} = {(k, m) | R_{b,k} ∈ A^{(t)}_i, f_m ∈ A^{(p)}_j}. Since all the events in the two processes are independent, the transition probabilities can be calculated in a straightforward way. For instance, with 1 ≤ i ≤ Q_t - 1 and 1 ≤ j ≤ Q_p - 1, we have

p^{(k,m)}_{(i,j),(i',j')} =
  λ_t / (λ + μ_t(k) + μ_p(m)),               i' = i + 1, j' = j
  λ_p / (λ + μ_t(k) + μ_p(m)),               i' = i, j' = j + 1
  μ_p(m) / (λ + μ_t(k) + μ_p(m)),            i' = i, j' = j - 1
  q(k) μ_t(k) / (λ + μ_t(k) + μ_p(m)),       i' = i, j' = j
  (1 - q(k)) μ_t(k) / (λ + μ_t(k) + μ_p(m)), i' = i - 1, j' = j
  0,                                         otherwise                   (5.16)

Also, we can calculate the average transition time for a state, denoted by τ^{(k,m)}_{(i,j)}. For instance, τ^{(k,m)}_{(i,j)} = 1 / (λ + μ_t(k) + μ_p(m)) for 1 ≤ i ≤ Q_t - 1 and 1 ≤ j ≤ Q_p - 1. For a given policy π, the proposed SMDP is irreducible, aperiodic, and positive recurrent, and has only a finite number of states. Hence, the system will eventually reach a steady state where the probability that any specific state is observed remains constant and is independent of the initial state of the system. We denote the steady-state probability of state (i, j) under policy π, i.e., the probability that state (i, j) is observed when the system is in the steady state, by p̃^π_{(i,j)}.
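The per-queue transition structure of Eqn. (5.15) can be sketched as follows (the function name and data layout are ours; the thesis does not prescribe an implementation):

```python
def trans_queue_probs(Q_t, lam_t, mu_t_k, q_k):
    """Transition probability matrix of the transmission-queue SMDP under a
    fixed bit-rate action R_{b,k}, following Eqn. (5.15).
    lam_t: arrival rate of offloaded requests; mu_t_k: transmission service
    rate under bit rate R_{b,k}; q_k: frame error rate at that bit rate."""
    P = [[0.0] * (Q_t + 1) for _ in range(Q_t + 1)]
    P[0][1] = 1.0                                   # empty queue: next event is an arrival
    P[Q_t][Q_t] = q_k                               # full queue: NAK puts the frame back
    P[Q_t][Q_t - 1] = 1.0 - q_k                     # full queue: ACK frees one slot
    denom = lam_t + mu_t_k
    for i in range(1, Q_t):
        P[i][i + 1] = lam_t / denom                 # arrival wins the race
        P[i][i] = q_k * mu_t_k / denom              # transmission finishes, NAK
        P[i][i - 1] = (1.0 - q_k) * mu_t_k / denom  # transmission finishes, ACK
    return P
```

Each row sums to one, as a transition distribution must; the self-loop in the interior rows is exactly the NAK retransmission path.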
Figure 5.4: State transition diagram for the joint SMDP

5.2.2 Problem formulation

Since a user is usually concerned about both the service quality and the battery lifetime, we use a linear combination of the average processing latency and the average energy loss per request as the cost function, denoted by C(p_off, π). C(p_off, π) can be calculated as

C(p_off, π) = D̄(p_off, π) + k_E · Ē_eq(p_off, π)    (5.17)

where D̄(p_off, π) is the average processing latency per request, Ē_eq(p_off, π) is the average energy loss of the battery per request, and k_E is the coefficient that controls the trade-off between the average latency and energy loss. The value of k_E may vary for different mobile devices or for different types of users. In general, we can optimize D̄(p_off, π) with Ē_eq(p_off, π) as a constraint, or vice versa, using the same optimization algorithm as discussed in Section 5.2.3.

According to Little's Theorem [71], the average time that a request stays in a queueing system is equal to the average number of waiting requests in the system divided by the request generation rate. In our SMDP model, the average processing latency for a locally executed request, denoted by D̄_p(p_off, π), can be calculated as

D̄_p(p_off, π) = ( Σ_{i,j} j · p̃^π_{(i,j)} ) / λ_p    (5.18)

On the other hand, an offloaded request is transmitted to the cloud for execution and then transmitted back. Therefore, such requests also incur propagation delay on the transmission path and processing delay on the cloud. The average processing latency for an offloaded request, denoted by D̄_t(p_off, π), can be calculated as

D̄_t(p_off, π) = ( Σ_{i,j} i · p̃^π_{(i,j)} ) / λ_t + ( T̄_ps + RTT )    (5.19)

where T̄_ps is the average processing time for a request on the server, and RTT is the round trip time for the request.
If the server has an average processing rate of μ_s and a utilization level of ρ_s, then

T̄_ps = 1 / [μ_s (1 - ρ_s)]    (5.20)

D̄(p_off, π) can be calculated as the weighted average of D̄_p(p_off, π) and D̄_t(p_off, π):

D̄(p_off, π) = p_off · D̄_t(p_off, π) + (1 - p_off) · D̄_p(p_off, π)    (5.21)

Ē_eq(p_off, π) can be calculated using the average value of I_eq, denoted by Ī_eq, as

Ē_eq(p_off, π) = V_{OC} · Ī_eq / λ    (5.22)

For a given action pair (k, m), the equivalent discharging current, denoted by I^{(k,m)}_eq, can be calculated using the value of P^{(k,m)}_total as in Eqn. (5.3). Since P^{(k,m)}_total is a random variable, I^{(k,m)}_eq is also a random variable. Based on Eqn. (5.14), the mean value of I^{(k,m)}_eq, denoted by Ī^{(k,m)}_eq, can be calculated as

Ī^{(k,m)}_eq = (I_{ref})^{1-γ_c} · ∫ [ I_{in}(P^m_p + P^k_t + P_x) ]^{γ_c} · f_{P_x}(P_x; k, m) dP_x    (5.23)

where f_{P_x}(P_x; k, m) is the probability density function of the power consumption of the other modules (those that cannot be controlled directly) when action pair (k, m) is taken, and the function I_{in}(P) is given by

I_{in}(P) = \frac{1}{2 R_{in}} \left( V_{OC} - \sqrt{(V_{OC})^2 - \frac{4 P R_{in}}{\eta_c}} \right)    (5.24)

which is derived from Eqn. (5.11). Ī_eq can then be calculated as the weighted average of the Ī^{(k,m)}_eq values, where the relative weights are functions of the policy π.

It can be seen from the above that the overall cost function C(p_off, π) is a function of the offloading probability p_off and the policy π. So the objective is to find the optimal offloading probability p*_off and the optimal control policy π*, satisfying

p*_off = argmin_{p_off} min_π C(p_off, π)    (5.25)

π* = argmin_π C(p*_off, π)    (5.26)

As discussed in Section 5.1.3, the SoC of the battery decreases through the discharging process, causing V_{OC} and R_{in} to change continuously in the optimization problem formulation. However, the typical battery lifetime is much longer than the processing latency of a request. Therefore, we divide the whole discharging process into a series of sections.
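The latency side of the cost, Eqns. (5.18) through (5.21), reduces to a few expectations over the steady-state distribution. A minimal sketch, with our own function name and data layout (a dict mapping joint states (i, j) to steady-state probabilities):

```python
def avg_latency(p_tilde, lam, p_off, mu_s, rho_s, rtt):
    """Average per-request latency D_bar from Eqns. (5.18)-(5.21).
    p_tilde[(i, j)]: steady-state probability of joint state (i, j);
    lam: total request generation rate, split into lam_t = p_off * lam
    (offloaded) and lam_p = (1 - p_off) * lam (local)."""
    lam_t, lam_p = p_off * lam, (1.0 - p_off) * lam
    d_p = sum(j * p for (i, j), p in p_tilde.items()) / lam_p        # Eqn. (5.18)
    t_ps = 1.0 / (mu_s * (1.0 - rho_s))                              # Eqn. (5.20)
    d_t = (sum(i * p for (i, j), p in p_tilde.items()) / lam_t
           + t_ps + rtt)                                             # Eqn. (5.19)
    return p_off * d_t + (1.0 - p_off) * d_p                         # Eqn. (5.21)
```

For example, a uniform steady-state distribution over the four states (i, j) ∈ {0, 1}² with λ = 1, p_off = 0.5, μ_s = 20, ρ_s = 0.9, and RTT = 0.6 gives D̄_p = 1.0, D̄_t = 2.1, and D̄ = 1.55 (normalized time units).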
In each section, we treat the SoC (and hence V_{OC} and R_{in}) as a constant value in the optimization problem. The accuracy degradation is negligible if the duration of a section is set small enough.

5.2.3 Solution method

The optimal offloading probability p_off and the optimal policy π are found using an iterative method comprised of an outer loop to find the optimal p_off and an inner kernel algorithm to find the optimal policy π for a given p_off value. In the kernel algorithm, we solve a linear programming problem, and in the outer loop, we conduct an effective one-dimensional search.

We first discuss the kernel algorithm. With a given p_off value, all the parameters in the SMDP, including the transition probabilities, are known. Therefore, the problem of finding the optimal π can be transformed into a Markov renewal programming problem [72]. We formulate the optimization problem as follows:

Find: f^{(k,m)}_{(i,j)}
Minimize: C(p_off)
Subject to:
  Σ_{i',j',k,m} f^{(k,m)}_{(i',j')} · p^{(k,m)}_{(i',j'),(i,j)} = Σ_{k,m} f^{(k,m)}_{(i,j)},  ∀(i,j) ∈ S    (5.27)
  Σ_{i,j,k,m} f^{(k,m)}_{(i,j)} · τ^{(k,m)}_{(i,j)} = 1    (5.28)
  f^{(k,m)}_{(i,j)} ≥ 0,  ∀(i,j) ∈ S, ∀(k,m) ∈ A    (5.29)

where f^{(k,m)}_{(i,j)} is the frequency with which the system enters state (i,j) and action (k,m) is taken, and C(p_off) is the cost function calculated with the specified p_off value, taking into account the expected time a request spends in the device, the energy it costs, as well as the extra time it spends if it is offloaded to the cloud. The relationship between f^{(k,m)}_{(i,j)} and p̃^π_{(i,j)} is given by

p̃^π_{(i,j)} = Σ_{k,m} f^{(k,m)}_{(i,j)} · τ^{(k,m)}_{(i,j)}    (5.30)

Combining Eqns. (5.17), (5.18), (5.19), (5.21), (5.22), and (5.30), C(p_off) is given by

C(p_off) = (1/λ) Σ_{i,j,k,m} ( i + j + k_E V_{OC} Ī^{(k,m)}_eq ) f^{(k,m)}_{(i,j)} τ^{(k,m)}_{(i,j)} + p_off · ( 1 / [μ_s (1 - ρ_s)] + RTT )    (5.31)

Constraint (5.27) expresses the balance condition for the system in the steady state. Constraint (5.28) normalizes the sum of the steady-state probabilities over all states.
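Assembling this Markov renewal program in standard LP form is mechanical once the transition probabilities, stay times, and per-state costs are known. The sketch below is ours (the thesis solves the resulting LP with MOSEK [73]); it only builds the constraint data of Eqns. (5.27) and (5.28) and the objective coefficients, ready to hand to any LP solver.

```python
def renewal_lp_data(states, actions, P, tau, cost):
    """Assemble the LP of Section 5.2.3 in standard equality form.
    Decision variables f_{s,a} are flattened as x[idx[(s, a)]].
    P[s][a]: dict mapping successor state -> transition probability;
    tau[s][a]: average stay time; cost[s][a]: per-unit-time cost.
    Returns (c, A_eq, b_eq): minimize c.x s.t. A_eq x = b_eq, x >= 0."""
    idx = {(s, a): n for n, (s, a) in
           enumerate((s, a) for s in states for a in actions[s])}
    nvar = len(idx)
    A_eq, b_eq = [], []
    for s in states:                          # balance constraints, Eqn. (5.27)
        row = [0.0] * nvar
        for (s2, a), n in idx.items():
            row[n] += P[s2][a].get(s, 0.0)    # expected inflow into s
            if s2 == s:
                row[n] -= 1.0                 # outflow from s
        A_eq.append(row)
        b_eq.append(0.0)
    row = [0.0] * nvar                        # normalization, Eqn. (5.28)
    for (s, a), n in idx.items():
        row[n] = tau[s][a]
    A_eq.append(row)
    b_eq.append(1.0)
    c = [0.0] * nvar                          # objective, cf. Eqn. (5.31)
    for (s, a), n in idx.items():
        c[n] = cost[s][a] * tau[s][a]
    return c, A_eq, b_eq
```

A useful sanity check: because each P[s][a] sums to one, the balance rows cancel columnwise, so exactly one of them is redundant, as expected for a stationary-distribution system.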
Constraint (5.29) limits each frequency value to be non-negative. Note that the objective function and the constraints are now all linear functions of the optimization variables. This is a linear programming problem that can be solved using a standard solver such as MOSEK [73]. After all the f^{(k,m)}_{(i,j)} values are found, we construct the optimal policy π using standard methods as mentioned in [74].

Once we know how to find the optimal policy π for any given p_off, the problem in the outer loop of finding p*_off becomes a one-dimensional unconstrained optimization problem. In general, C(p_off) is a quasi-convex (unimodal) function of p_off between 0 and 1 with a unique minimum point. Therefore, we apply a heuristic search technique, such as golden section search [75], to select the p_off value in each iteration and find the optimal p*_off.

We can apply this algorithm in an online manner by monitoring the SoC and recalculating the solution every time the SoC change exceeds a predefined threshold relative to the SoC used to calculate the current solution. To reduce the online computation complexity, we can pre-calculate the optimal solutions under different SoCs offline and store them in a lookup table for online use. In this way, we simply monitor the SoC change online and look up the optimal p_off and π at different SoC levels as the battery discharges.

5.3 Experimental results

We run a set of applications on the Qualcomm Snapdragon Platform, including Google search (web browsing), YouTube (online video playing), AnTuTu Benchmark (a comprehensive benchmark that tests CPU, graphics, and I/O), and GLBenchmark (a benchmark focused on graphics), and extract the power profile of the device using Trepn. We monitor the CPU core sensor to characterize the power consumption of the CPU and the digital core sensor to characterize the power consumption of the RF module. The probability density function f_{P_x} in Eqn.
(5.23) is estimated using the sampling results of the sensors.

The simulation parameters are set as follows. The processor can perform 5-level DVFS at 1x, 1.25x, 1.5x, 1.75x, and 2x of the minimum frequency, and the dynamic power consumption is proportional to the frequency to the power of 2.5. The static power of the CPU is 100 mW, and the total power consumptions at the five DVFS levels are 200 mW, 275 mW, 375 mW, 505 mW, and 665 mW, respectively. The RF transmitter has a static power consumption of 250 mW and can use either QPSK or 16QAM modulation, with total power consumptions of 450 mW and 700 mW, respectively. The FERs for the two modulation schemes are set to 10^{-3} and 5 × 10^{-4}. The processing rate of the CPU running at the minimum frequency is normalized to 1, and the transmission rate of QPSK is set to 8. The distribution of the power consumption of the other modules is derived as mentioned above; we scale this part of the power consumption to have a mean value of 500 mW. V_{OC} and R_{in} are calculated using the parameters in [21]. The power conversion efficiency of the converters in the PDN is 0.7, and the Peukert constant γ_c is set to 1.05. The servers in the cloud have a normalized processing rate of 20. The coefficient k_E in the cost function is set to 20 W^{-1}. The normalized request generation rate λ varies from 0.6 to 2.0, the SoC level varies from 0.1 to 1.0, and the normalized RTT for an offloaded request varies from 0.4 to 1.4.

Our proposed algorithm is compared to a number of baseline algorithms. Baselines 1–3 are baselines without DVFS or computation offloading: Baseline 1 uses only the minimum CPU frequency, Baseline 2 uses only the maximum CPU frequency, and Baseline 3 uses only the 1.5x frequency. Baseline 4 supports DVFS but does not offload any request to the cloud. In contrast, Baseline 5 offloads all the requests to the cloud.

Figure 5.5: Simulation result with varying request generation rate

Fig.
5.5 shows the simulation results when the SoC level is set to 0.5, the normalized RTT is set to 0.6, and the normalized request generation rate varies from 0.6 to 2.0. Fig. 5.6 shows the simulation results when the request generation rate is set to 1.5, the normalized RTT is set to 0.6, and the SoC level varies from 0.1 to 1.0. In all cases, the proposed algorithm results in the lowest total cost.

Figure 5.6: Simulation result with varying SoC

Figure 5.7: Simulation result with varying utilization level of the server in the cloud

Although Baseline 5 also appears to be an acceptable algorithm in the simulation results presented above, its performance can degrade significantly when offloading requests to the cloud suffers from high latency. Fig. 5.7 shows the simulation results when the request generation rate is set to 1.2, the SoC level is set to 0.5, the normalized RTT is set to 0.6, and the utilization level of the server varies from 0.8 to 0.98. Fig. 5.8 shows the simulation results when the request generation rate is set to 1.5, the SoC level is set to 0.5, and the normalized RTT varies from 0.4 to 1.4. Since Baselines 1–4 do not support computation offloading, their performance remains the same as the RTT changes; therefore, we only show Baseline 4, which has the lowest cost among the four. It can be seen that the proposed algorithm still outperforms the baseline algorithms, and that the baseline algorithms result in significantly higher cost under either high or low RTTs.

Figure 5.8: Simulation result with varying RTT

Chapter 6
Dynamic Voltage and Frequency Scaling in a Mobile Device with a Heterogeneous Computing Architecture

While the SMDP model in Chapter 5 is primarily used to account for the dependency between the transmitting and processing elements in a mobile device, a similar model can also be derived to deal with a heterogeneous computing architecture comprised of multiple types of processors/cores.
An example of such a heterogeneous computing architecture is the ARM big.LITTLE architecture [76], in which powerful but power-hungry "big" cores are combined with energy-saving but slower "LITTLE" cores. With per-core DVFS applicable, there is the twofold problem of choosing the type of core for a user task and selecting the most adequate operating frequency.

6.1 System model

The adopted system model of a mobile device with a heterogeneous computing architecture is summarized in Fig. 6.1.

Figure 6.1: System model of a mobile device with a heterogeneous computing architecture (a task scheduler dispatches incoming requests to cores 1 through M; the cores, together with the other components, draw power from the rechargeable battery through a DC/DC converter)

6.1.1 Heterogeneous computing architecture

Consider a system of M cores indexed with m = 1, ..., M, where the operating frequency of the m-th core can be selected from a set of available frequencies F_m = {F_{m,1}, ..., F_{m,k_m}, ..., F_{m,K_m}}. Note that there may exist dependencies between the frequencies of the cores due to constraints set by the power budget or control logic. For instance, if the cores are organized using the "cluster migration" approach [77] for the big.LITTLE architecture, then big cores and LITTLE cores cannot be activated at the same time. We capture the feasible frequency combinations using a set C_F ⊂ N^M_+. A tuple k = (k_1, ..., k_M) belongs to C_F if and only if (F_{1,k_1}, ..., F_{M,k_M}) is a feasible combination of operating frequencies for core 1 to core M.

The average request generation rate of the system is denoted by λ. To obtain an analytical form of the average processing time, we assume that the request generation follows a Poisson process. For the task scheduler, we assume that there are L different dispatch schemes with different probabilities of dispatching requests to each of the cores. In the l-th scheme, we denote the probability of dispatching a request to the m-th core by p_{l,m}.
It is obvious that for any l, we have

Σ_m p_{l,m} = 1    (6.1)

In addition, given a dispatch scheme, the request arrival process at each core is also Poisson, with an average rate of λ_{l,m} = p_{l,m} · λ. Each core is associated with a processing queue to hold pending requests. The maximum length of the queue associated with the m-th core is denoted by N_m. For simplicity, we consider the case in which requests are placed into each queue in a first-in-first-out manner and incoming requests are dropped if the queue is already full. The average processing rate of the m-th core corresponding to operating frequency F_{m,k_m} is denoted by μ_{m,k_m}.

6.1.2 System power consumption

Because of system components with uncontrollable power consumption ("other components" in Fig. 6.1), we cannot accurately estimate the total power consumption of the system in real time even if the DVFS control policy is given. However, as discussed in Chapter 5, we can treat the total power consumption of the system, denoted by P_total, as a random variable whose distribution depends on the DVFS scheme. The probability density function under each frequency combination can be characterized beforehand. One can again apply Eqns. (5.6)–(5.14) to find the equivalent energy loss rate of the rechargeable battery. The mean value of the equivalent energy loss rate of the battery is denoted by Ē_{k_1,...,k_M} when the M cores are operating at frequencies F_{1,k_1}, ..., F_{M,k_M}, respectively.

6.2 Problem formulation

We formulate the joint request dispatch and DVFS problem using a continuous-time Markov decision process (CTMDP), which is a simpler version of the SMDP formulation in Chapter 5. The state space S can be characterized as the M-ary Cartesian product of the sets of possible queue lengths of the M cores. A state s ∈ S can be uniquely defined using a tuple s = (n_1, ..., n_m, ..., n_M), where n_m is the number of requests in the processing queue associated with core m.
The action space A is defined as the Cartesian product of the set of request dispatch options and the set of feasible DVFS schemes across all cores. In other words,

A = {(l, k) | 1 ≤ l ≤ L and k ∈ C_F}    (6.2)

With the definitions of the state space and action space, we can in turn derive the state transition probabilities and the average stay times in each state given an action. For instance, the transition probability from state (n_1, ..., n_M) to (n'_1, ..., n'_M) with action (l, k), denoted by p^{(l,k)}_{(n_1,...,n_M)→(n'_1,...,n'_M)}, can be calculated as follows when none of the queues is empty or full:

p^{(l,k)}_{(n_1,...,n_M)→(n'_1,...,n'_M)} =
  λ_{l,m'} / (λ + Σ_m μ_{m,k_m}),        n'_{m'} = n_{m'} + 1 and ∀m ≠ m', n'_m = n_m
  μ_{m',k_{m'}} / (λ + Σ_m μ_{m,k_m}),   n'_{m'} = n_{m'} - 1 and ∀m ≠ m', n'_m = n_m
  0,                                     otherwise                       (6.3)

The average stay time in state (n_1, ..., n_M), when none of the queues is empty or full and action (l, k) is taken, denoted by τ^{(l,k)}_{(n_1,...,n_M)}, can be calculated as

τ^{(l,k)}_{(n_1,...,n_M)} = 1 / (λ + Σ_m μ_{m,k_m})    (6.4)

The transition probabilities and average stay times in the corner cases can be derived similarly.

Since the aforementioned CTMDP is irreducible, aperiodic, and positive recurrent, there exists a steady-state distribution independent of the initial state of the system. The probability that the system lies in state (n_1, ..., n_M) under policy π is denoted by p̃^π_{(n_1,...,n_M)}.

We formulate the optimization problem with an objective function that is a linear combination of the average processing time and the average energy drain per request. The objective function, denoted by C(π), can be calculated as

C(π) = D̄(π) + α · Ē(π)    (6.5)

where D̄(π) is the average processing time of a request, Ē(π) is the average energy drain from the battery per request, and α is a coefficient that balances energy efficiency against quality of service.
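The interior-state transition structure of Eqns. (6.3) and (6.4) can be sketched as follows (the function name and data layout are ours):

```python
def ctmdp_step(n, l, k, lam, p_dispatch, mu):
    """Transition probabilities out of an interior state n = (n_1,...,n_M)
    (no queue empty or full) under action (l, k), following Eqns. (6.3)-(6.4).
    lam: total request rate; p_dispatch[l][m]: dispatch probability of
    scheme l to core m; mu[m][k[m]]: service rate of core m at frequency
    index k[m]. Returns (probs, tau)."""
    M = len(n)
    total = lam + sum(mu[m][k[m]] for m in range(M))
    probs = {}
    for m in range(M):
        # arrival dispatched to core m wins the exponential race
        arr = tuple(x + 1 if i == m else x for i, x in enumerate(n))
        probs[arr] = probs.get(arr, 0.0) + lam * p_dispatch[l][m] / total
        # service completion at core m wins the race
        dep = tuple(x - 1 if i == m else x for i, x in enumerate(n))
        probs[dep] = probs.get(dep, 0.0) + mu[m][k[m]] / total
    tau = 1.0 / total                    # average stay time, Eqn. (6.4)
    return probs, tau
```

For example, with two cores, state (1, 1), λ = 2, an even dispatch scheme, and service rates 1 and 3, the total event rate is 6, so each arrival branch has probability 1/6 and the fast core's departure branch has probability 1/2.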
The average processing time D̄(π) can be calculated based on Little's theorem [71] as follows:

D̄(π) = (1/λ) Σ_{n_1,...,n_M} p̃^π_{(n_1,...,n_M)} · ( Σ_m n_m )    (6.6)

The average energy drain from the battery, Ē(π), can be calculated as the weighted average of the expected energy loss rates under each of the feasible combinations of operating frequencies (defined as the Ē_{k_1,...,k_M}'s), which will be elaborated in the next section.

6.3 Solution method

For a CTMDP in the steady state, the optimal policy can be found by solving a Markov renewal programming problem [72] using linear programming. The complete formulation is as follows:

Find: f^{(l,k)}_{(n_1,...,n_M)}'s
Minimize: D̄(π) + α · Ē(π)
Subject to:
  Σ_{n'_1,...,n'_M,l,k_1,...,k_M} f^{(l,k)}_{(n'_1,...,n'_M)} · p^{(l,k)}_{(n'_1,...,n'_M)→(n_1,...,n_M)} = Σ_{l,k_1,...,k_M} f^{(l,k)}_{(n_1,...,n_M)},  ∀n_1,...,n_M    (6.7)
  Σ_{n_1,...,n_M,l,k_1,...,k_M} f^{(l,k)}_{(n_1,...,n_M)} · τ^{(l,k)}_{(n_1,...,n_M)} = 1    (6.8)
  f^{(l,k)}_{(n_1,...,n_M)} ≥ 0,  ∀n_1,...,n_M,l,k    (6.9)

where f^{(l,k)}_{(n_1,...,n_M)} is the frequency (in Hz) with which the system enters state (n_1,...,n_M) and action (l,k) is taken. The average processing time D̄(π) can be rewritten as

D̄(π) = (1/λ) Σ_{n_1,...,n_M} Σ_{l,k_1,...,k_M} f^{(l,k)}_{(n_1,...,n_M)} · τ^{(l,k)}_{(n_1,...,n_M)} · ( Σ_m n_m )    (6.10)

while the average energy drain Ē(π) can be calculated as

Ē(π) = (1/λ) Σ_{l,k_1,...,k_M} [ Ē_{k_1,...,k_M} · Σ_{n_1,...,n_M} f^{(l,k)}_{(n_1,...,n_M)} · τ^{(l,k)}_{(n_1,...,n_M)} ]    (6.11)

Constraint (6.7) guarantees the stability of the system by enforcing the balance between exiting old states and entering new states. Constraint (6.8) is a normalization constraint for the steady-state distribution. Constraint (6.9) sets the domain of the decision variables. After the values of the f^{(l,k)}_{(n_1,...,n_M)}'s are found, one can derive the optimal policy straightforwardly.
Chapter 7
Conclusion

With the objective of jointly optimizing multiple control variables in electrical or electronic systems, we propose solutions to four challenging problems in datacenters and mobile devices. First, we derive a scalable solution to the request dispatch, resource allocation, server consolidation, and ESD management problem in a geo-distributed cloud infrastructure to achieve peak shaving under a time-of-use utility pricing scheme. Second, we concurrently optimize the design choices and control decisions in the operation of a cloud infrastructure to minimize the total cost of ownership, comprised of both capital expenses and operational expenses. Third, we design the control policy for a mobile device to balance energy consumption against quality of service in a mobile cloud computing system. Last but not least, we extend the SMDP formulation for the MCC system to address the joint request dispatch and DVFS control of a heterogeneous computing architecture.

Acknowledgment

I would like to thank my advisor, Prof. Massoud Pedram, for guiding me through my exciting yet challenging Ph.D. experience, without whom I would not have had the opportunity to learn about so many interesting research topics. I am also grateful to my qualifying exam and dissertation committee members, Prof. Murali Annavaram, Prof. Aiichiro Nakano, Prof. Shahin Nazarian, and Prof. Viktor Prasanna, for their helpful comments and advice on my work. I would like to thank my fellow group members and research collaborators for the stimulating discussions and the fantastic papers that resulted. Last but by no means least, I would like to thank my family, my parents and my wife in particular, for their patience and support.

Reference List

[1] M. Chen and G. A. Rincon-Mora, "Accurate electrical battery model capable of predicting runtime and I-V performance," IEEE Transactions on Energy Conversion, vol. 21, no. 2, pp. 504–511, 2006.

[2] L. Benini, A. Bogliolo, and G.
De Micheli, "A survey of design techniques for system-level dynamic power management," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 8, no. 3, pp. 299–316, 2000.

[3] G. Cook, "How clean is your cloud? Catalysing an energy revolution," Greenpeace International, 2012.

[4] H. Lim, A. Kansal, and J. Liu, "Power budgeting for virtualized data centers," in 2011 USENIX Annual Technical Conference (USENIX ATC'11), 2011, p. 59.

[5] B. Aksanli, E. Pettis, and T. Rosing, "Architecting efficient peak power shaving using batteries in data centers," in IEEE 21st International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems. IEEE, 2013, pp. 242–253.

[6] D. Wang, C. Ren, A. Sivasubramaniam, B. Urgaonkar, and H. Fathy, "Energy storage in datacenters: what, where, and how much?" ACM SIGMETRICS Performance Evaluation Review, vol. 40, no. 1, pp. 187–198, 2012.

[7] P. Rong and M. Pedram, "Extending the lifetime of a network of battery-powered mobile devices by remote processing: a Markovian decision-based approach," in Design Automation Conference, 2003. Proceedings, 2003, pp. 906–911.

[8] "Google cluster data." [Online]. Available: https://code.google.com/p/googleclusterdata/

[9] A. K. Parekh and R. G. Gallager, "A generalized processor sharing approach to flow control in integrated services networks: the single-node case," IEEE/ACM Trans. Netw., vol. 1, no. 3, pp. 344–357, 1993.

[10] Z.-L. Zhang, D. Towsley, and J. Kurose, "Statistical analysis of generalized processor sharing scheduling discipline," in ACM SIGCOMM Computer Communication Review, vol. 24, no. 4. ACM, 1994, pp. 68–77.

[11] J. Medhi, Stochastic Processes. Wiley, 1982.

[12] B. Hayes, "Cloud computing," Commun. ACM, vol. 51, no. 7, pp. 9–11, Jul. 2008.

[13] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I.
Brandic, "Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility," Future Generation Computer Systems, vol. 25, no. 6, pp. 599–616, 2009.

[14] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. De Rose, and R. Buyya, "CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms," Software: Practice and Experience, vol. 41, no. 1, pp. 23–50, 2011.

[15] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, "The cost of a cloud: research problems in data center networks," ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, pp. 68–73, 2008.

[16] I. Goiri, K. Le, J. Guitart, J. Torres, and R. Bianchini, "Intelligent placement of datacenters for internet services," in Distributed Computing Systems (ICDCS), 2011 31st International Conference on, 2011, pp. 131–142.

[17] A.-H. Mohsenian-Rad and A. Leon-Garcia, "Energy-information transmission tradeoff in green cloud computing," Carbon, vol. 100, p. 200, 2010.

[18] P. Rong and M. Pedram, "An analytical model for predicting the remaining battery capacity of lithium-ion batteries," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 441–451, 2006.

[19] D. Rakhmatov, "Battery voltage modeling for portable systems," ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 14, no. 2, p. 29, 2009.

[20] L. Benini, G. Castelli, A. Macii, E. Macii, M. Poncino, and R. Scarsi, "Discrete-time battery models for system-level low-power design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 9, no. 5, pp. 630–640, 2001.

[21] D. Shin, Y. Kim, J. Seo, N. Chang, Y. Wang, and M. Pedram, "Battery-supercapacitor hybrid system for high-rate pulsed load applications," in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011, March 2011, pp. 1–4.

[22] T. B. Reddy, Linden's Handbook of Batteries. McGraw-Hill, 2011.

[23] A.
Millner, "Modeling lithium ion battery degradation in electric vehicles," in IEEE Conference on Innovative Technologies for an Efficient and Reliable Electricity Supply. IEEE, 2010, pp. 349–356.

[24] S. McCluer and J.-F. Christin, "Comparing data center batteries, flywheels, and ultracapacitors," White paper, vol. 65, p. 202, 2008.

[25] R. Buyya, A. Beloglazov, and J. Abawajy, "Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges," arXiv preprint arXiv:1006.0308, 2010.

[26] M. Pedram, "Energy-efficient datacenters," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 31, no. 10, pp. 1465–1484, 2012.

[27] R. Raghavendra, P. Ranganathan, V. Talwar, Z. Wang, and X. Zhu, "No power struggles: Coordinated multi-level power management for the data center," in SIGARCH Computer Architecture News, vol. 36, no. 1. ACM, 2008, pp. 48–59.

[28] D. Meisner, C. M. Sadler, L. A. Barroso, W.-D. Weber, and T. F. Wenisch, "Power management of online data-intensive services," in ISCA'11. IEEE, 2011, pp. 319–330.

[29] R. Nathuji, K. Schwan, A. Somani, and Y. Joshi, "VPM tokens: virtual machine-aware power budgeting in datacenters," Cluster Computing, vol. 12, no. 2, pp. 189–203, 2009.

[30] N. Buchbinder, N. Jain, and I. Menache, "Online job-migration for reducing the electricity bill in the cloud," in NETWORKING 2011. Springer, 2011, pp. 172–185.

[31] V. Kontorinis, L. E. Zhang, B. Aksanli, J. Sampson, H. Homayoun, E. Pettis, D. M. Tullsen, and T. Simunic Rosing, "Managing distributed UPS energy for effective power capping in data centers," in ISCA'12. IEEE, 2012, pp. 488–499.

[32] K. Kumar and Y.-H. Lu, "Cloud computing for mobile users: Can offloading computation save energy?" Computer, vol. 43, no. 4, pp. 51–56, 2010.

[33] Y. Wen, W. Zhang, and H.
Luo, “Energy-optimal mobile application execu- tion: Taming resource-poor mobile devices with cloud clones,” in INFOCOM, 2012 Proceedings IEEE, 2012, pp. 2716–2720. [34] Y. Ge, Y. Zhang, Q. Qiu, and Y.-H. Lu, “A game theoretic resource alloca- tion for overall energy minimization in mobile cloud computing system,” in Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design (ISLPED), 2012. [35] C. Shankar and R. Campbell, “Managing pervasive systems using role-based obligation policies,” in PerCom Workshops 2006, Proceedings IEEE, 2006, pp. 373–377. [36] X. Gu, K. Nahrstedt, A. Messer, I. Greenberg, and D. Milojicic, “Adaptive offloading inference for delivering applications in pervasive computing envi- ronments,” in IEEE PerCom 2003, 2003, pp. 107–114. [37] D. Huang, P. Wang, and D. Niyato, “A dynamic offloading algorithm for mobile computing,” Wireless Communications, IEEE Transactions on, vol. 11, no. 6, pp. 1991–1995, 2012. [38] M.-R. Ra, A. Sheth, L. Mummert, P. Pillai, D. Wetherall, and R. Govindan, “Odessa: enabling interactive perception applications on mobile devices,” in Proceedings of the 9th international conference on Mobile systems, applica- tions, and services (Mobisys), 2011. [39] E. Cuervo, A. Balasubramanian, D.-k. Cho, A. Wolman, S. Saroiu, R. Chan- dra, and P. Bahl, “MAUI: making smartphones last longer with code offload,” in Proceedings of the 8th international conference on Mobile systems, applica- tions, and services (Mobisys), 2010. [40] L. Tassiulas and A. Ephremides, “Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks,” IEEE transactions on automatic control, vol. 37, no. 12, pp. 1936– 1948, 1992. [41] M. J. Neely, E. Modiano, and C.-P. Li, “Fairness and optimal stochastic con- trol for heterogeneous networks,” IEEE/ACM Transactions On Networking, vol. 16, no. 2, pp. 396–409, 2008. 
[42] R.Urgaonkar, B.Urgaonkar, M.J.Neely, andA.Sivasubramaniam, “Optimal powercostmanagementusingstoredenergyindatacenters,” in Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems. ACM, 2011, pp. 221–232. 90 [43] Y. Guo and Y. Fang, “Electricity cost saving strategy in data centers by using energy storage,” IEEE Transactions on Parallel and Distributed Sys- tems, vol. 24, no. 6, pp. 1149–1160, 2013. [44] S. Govindan, A. Sivasubramaniam, and B. Urgaonkar, “Benefits and limi- tations of tapping into stored energy for datacenters,” in ACM SIGARCH Computer Architecture News, vol. 39, no. 3. ACM, 2011, pp. 341–352. [45] M. Sun, Y. Xue, P. Bogdan, J. Tang, Y. Wang, and X. Lin, “Hierarchical and hybrid energy storage devices in data centers: Architecture, control and provisioning,” PloS one, vol. 13, no. 1, p. e0191450, 2018. [46] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, “Cut- ting the electric bill for internet-scale systems,” ACM SIGCOMM Computer Communication Review, vol. 39, no. 4, pp. 123–134, 2009. [47] A.Gandhi,M.Harchol-Balter,R.Das,andC.Lefurgy,“Optimalpoweralloca- tioninserverfarms,” inACM SIGMETRICS Performance Evaluation Review, vol. 37, no. 1. ACM, 2009, pp. 157–168. [48] “Battery performance characteristics,” 2007. [Online]. Available: http: //www.mpoweruk.com/performance.htm [49] T. Cui, S. Chen, Y. Wang, S. Nazarian, and M. Pedram, “Optimal control of pevs with a charging aggregator considering regulation service provisioning,” ACM Transactions on Cyber-Physical Systems, vol. 1, no. 4, p. 23, 2017. [50] S. Chen, Y. Wang, and M. Pedram, “A joint optimization framework for request scheduling and energy storage management in a data center,” in Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on. IEEE, 2015, pp. 163–170. [51] L. Georgiadis, M. J. Neely, L. 
Tassiulas et al., “Resource allocation and cross- layer control in wireless networks,” Foundations and Trends R in Networking, vol. 1, no. 1, pp. 1–144, 2006. [52] M.J.Neely, “Stochasticnetworkoptimizationwithapplicationtocommunica- tion and queueing systems,” Synthesis Lectures on Communication Networks, vol. 3, no. 1, pp. 1–211, 2010. [53] J. L. W. V. Jensen, “Sur les fonctions convexes et les inégalités entre les valeurs moyennes,” Acta mathematica, vol. 30, no. 1, pp. 175–193, 1906. [54] M. Grant, S. Boyd, and Y. Ye, “CVX: Matlab software for disciplined convex programming,” 2008. 91 [55] “Lead acid is the cheapest battery: *conditions apply,” Aug 2011. [Online]. Available: https://saurorja.org/2011/08/30/lead-acid-is- the-cheapest-battery-conditions-apply/ [56] G. Albright, J. Edie, and S. Al-Hallaj, “A comparison of lead acid to lithium- ion in stationary storage applications,” 2012. [57] J. Pyper, “Tesla to miss 2020 delivery target by 402014. [Online]. Available: http://www.greentechmedia.com/articles/read/tesla- to-miss-2020-delivery-target-by-40-analyst-forecast [58] J. Cobb, J. Pritchard, S. Szymkowski, and E. Williams, “Chevy bolt production confirmed for 2016,” Oct 2015. [Online]. Available: http://www.hybridcars.com/chevy-bolt-production-confirmed-for-2016/ [59] M. Stevenson, “Lithium-ion battery packs now $209 per kwh, will fall to $100 by 2025: Bloomberg analysis,” Dec 2017. [Online]. Avail- able: https://www.greencarreports.com/news/1114245_lithium-ion-battery- packs-now-209-per-kwh-will-fall-to-100-by-2025-bloomberg-analysis [60] R. Brown et al., “Report to congress on server and data center energy effi- ciency: Public law 109-431,” Lawrence Berkeley National Laboratory, 2008. [61] X.Fan, W.-D.Weber, andL.A.Barroso, “Powerprovisioningforawarehouse- sized computer,” in ACM SIGARCH Computer Architecture News, vol. 35, no. 2. ACM, 2007, pp. 13–23. [62] W. P. Turner and J. H. 
Seader, “Dollars per kw plus dollars per square foot are a better datacenter cost model than dollars per square foot alone,” Uptime Institute White Paper, 2006. [63] I. CPLEX, “11.0 user’s manual,” ILOG SA, Gentilly, France, 2007. [64] P. J. Van Laarhoven and E. H. Aarts, Simulated annealing. Springer, 1987. [65] D. Whitley, “A genetic algorithm tutorial,” Statistics and computing, vol. 4, no. 2, pp. 65–85, 1994. [66] L. A. Barroso, J. Clidaras, and U. Hölzle, “The datacenter as a computer: An introduction to the design of warehouse-scale machines,” Synthesis lectures on computer architecture, vol. 8, no. 3, pp. 1–154, 2013. [67] J. Proakis and M. Salehi, Digital Communications, ser. McGraw-Hill higher education. McGraw-Hill, 2008. 92 [68] T.Rappaport,Wireless communications: principles and practice,ser.Prentice Hall communications engineering and emerging technologies series. Prentice Hall PTR, 1996. [69] T. D. Burd and R. W. Brodersen, “Energy efficient cmos microprocessor design,” in System Sciences, 1995. Proceedings of the Twenty-Eighth Hawaii International Conference on, vol. 1, pp. 288–297. [70] W. Lee, Y. Wang, D. Shin, N. Chang, and M. Pedram, “Power conversion efficiency characterization and optimization for smartphones,” in Proceedings of ISLPED, 2012, pp. 103–108. [71] L. Kleinrock, Queueing systems. volume 1: Theory. Wiley, 1975. [72] E. V. Denardo, “On linear programming in a markov decision problem,” Man- agement Science, vol. 16, no. 5, pp. 281–288, 1970. [73] E. D. Andersen and K. D. Andersen, The MOSEK interior point optimization for linear programming: an implementation of the homogeneous algorithm. Kluwer Academic Publishers, 1999, pp. 197–232. [74] U. Bhat and G. Miller, Elements of applied stochastic processes, ser. Wiley series in probability and statistics. Wiley, 2002. [75] J. Kiefer, “Sequential minimax search for a maximum,” Proceedings of the American Mathematical Society, vol. 4, no. 3, pp. 502–506, 1953. [76] P. Greenhalgh, “Big. 
little processing with arm cortex-a15 & cortex-a7,” ARM White paper, vol. 17, 2011. [77] B. Jeff, “Ten things to know about big.little.” [Online]. Avail- able: https://community.arm.com/processors/b/blog/posts/ten-things-to- know-about-big-little 93
Abstract
Energy consumption, and the utility expense it entails, is a major concern in electrical and electronic systems ranging from warehouse-scale datacenters to mobile phones and tablets. Cost-efficient management of these systems is usually partitioned into various sub-problems, each of which can be addressed by a number of methods and techniques. Solving the sub-problems in isolation, however, overlooks the inter-dependencies among them and yields sub-optimal solutions. Our work focuses on the joint consideration and optimization of multiple power management techniques that have been studied separately in prior work.

The methodology is demonstrated on four specific problems. We first consider the peak shaving problem in the context of a geo-distributed cloud infrastructure equipped with energy storage devices (ESDs). The ESD management problem is solved jointly with the request flow control and server consolidation problems using a Lyapunov optimization framework suitable for systems with a large number of servers. We then concurrently solve the total cost of ownership minimization problem for geo-distributed datacenters, which involves both design-time choices such as capacity provisioning and run-time management policies such as user request routing. Third, we study the computation offloading problem in a mobile device using a semi-Markov decision process in which the processing power and the transmission power of the device are controlled in a unified fashion. Lastly, we discuss the problem of joint request dispatch and dynamic voltage and frequency scaling in a heterogeneous computing architecture.

The proposed solution methods are validated with the support of realistic models and data. The experimental results demonstrate the superiority of our joint optimization methodology in these use cases.
Asset Metadata
Creator: Chen, Shuang (author)
Core Title: Energy-efficient computing: Datacenters, mobile devices, and mobile clouds
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Publication Date: 07/20/2018
Defense Date: 06/01/2018
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tags: cloud computing, data center, energy storage devices, mobile cloud, OAI-PMH Harvest
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Pedram, Massoud (committee chair), Annavaram, Murali (committee member), Nakano, Aiichiro (committee member)
Creator Email: chsh_8912@hotmail.com, shuangc@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-21233
Unique Identifier: UC11671527
Identifier: etd-ChenShuang-6437.pdf (filename), usctheses-c89-21233 (legacy record id)
Legacy Identifier: etd-ChenShuang-6437.pdf
Dmrecord: 21233
Document Type: Dissertation
Rights: Chen, Shuang
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA