SATISFYING QOS REQUIREMENTS THROUGH USER-SYSTEM
INTERACTION ANALYSIS
by
Bo-Chun Wang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
May 2015
Copyright 2015 Bo-Chun Wang
Table of Contents
List of Tables iv
List of Figures v
List of Algorithms viii
Abstract ix
Chapter 1: Introduction 1
Chapter 2: Related Work 9
2.1 P2P Streaming Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Virtual Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Chapter 3: A Comprehensive Study of the Use of Advertisements as Incen-
tives in P2P Streaming Systems 17
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 The Proposed System . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2.1 System Framework . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.2 Token-Based Schemes . . . . . . . . . . . . . . . . . . . . . . 26
3.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3.1 Experimental Setting . . . . . . . . . . . . . . . . . . . . . . . 31
3.3.2 Contributions vs. Advertisements . . . . . . . . . . . . . . . . 34
3.3.3 Advertisement Distribution . . . . . . . . . . . . . . . . . . . . 35
3.3.4 Reusing tokens vs. Non-reusing tokens . . . . . . . . . . . . . 37
3.3.5 Content Interval . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4.1 Extending to multi-layer streaming . . . . . . . . . . . . . . . . 41
3.4.2 Peer selection . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.3 Overhead of distributing advertisement . . . . . . . . . . . . . . 42
3.4.4 Overhead of token decryption . . . . . . . . . . . . . . . . . . 42
3.4.5 Comparisons between token-based schemes . . . . . . . . . . . 43
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Chapter 4: Resource Estimation for Network Virtualization through Users
and Network Interaction Analysis 48
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3 Proposed Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3.1 Traffic Equilibrium Analysis Model Extension . . . . . . . . . . 55
4.3.2 Performance Estimation Mechanism . . . . . . . . . . . . . . . 58
4.3.3 Virtual Network Embedding . . . . . . . . . . . . . . . . . . . 64
4.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.4.1 Evaluation Environment . . . . . . . . . . . . . . . . . . . . . 66
4.4.2 Accuracy of Performance Estimation Mechanism . . . . . . . . 67
4.4.3 Robustness: Effect of Link Delay and Buffer Sizes . . . . . . . 69
4.4.4 Heterogeneous User Behavior and File Sizes . . . . . . . . . . . 71
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 5: An Equilibrium-Based Modeling Framework for QoS-Aware Task
Management in Cloud Computing 82
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.2 Our Modeling Framework . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.1 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.2 Proposed Framework . . . . . . . . . . . . . . . . . . . . . . . 88
5.3 Evaluation and Validation . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.3.1 Simulation Settings . . . . . . . . . . . . . . . . . . . . . . . . 99
5.3.2 Re-launch model evaluation . . . . . . . . . . . . . . . . . . . 100
5.3.3 Re-instantiation model evaluation . . . . . . . . . . . . . . . . 107
5.4 Utility of Analytical Models . . . . . . . . . . . . . . . . . . . . . . . 108
5.4.1 Energy vs. Performance . . . . . . . . . . . . . . . . . . . . . . 109
5.4.2 Heterogeneous QoS Requirements . . . . . . . . . . . . . . . . 112
5.5 Extensibility of Framework . . . . . . . . . . . . . . . . . . . . . . . . 116
5.5.1 Combining mechanisms . . . . . . . . . . . . . . . . . . . . . 116
5.5.2 Other VM operations . . . . . . . . . . . . . . . . . . . . . . . 117
5.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Chapter 6: Conclusions 121
References 122
List of Tables
3.1 Notations list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Uploading bandwidth distribution . . . . . . . . . . . . . . . . . . . . 32
3.3 Default values of parameters . . . . . . . . . . . . . . . . . . . . . . . 33
5.1 Summary of Notation (r.v. = random variable) . . . . . . . . . . . . . . 89
5.2 Distribution of number of tasks per job . . . . . . . . . . . . . . . . . . 100
5.3 Default values of parameters . . . . . . . . . . . . . . . . . . . . . . . 100
5.4 Effect of approximation (model vs. mechanism in real systems) . . . . . 106
5.5 Comparison between our model and the mechanism used in real systems
(uniform distribution: [0, 2/(μ∗d())]) . . . . . . . . . . . . . . . . . . . 106
5.6 Comparison between our model and the mechanism used in real systems
(normal distribution: standard deviation 1/(μ∗d())) . . . . . . . . . . . . 107
5.7 Effect of approximation (model vs. mechanism in real systems, v = 10) . 107
5.8 A summary of simulation settings . . . . . . . . . . . . . . . . . . . . 113
List of Figures
3.1 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Time Interval: peer views streaming content during time interval t_{i,p} and
views advertisements during time interval a_{i,p} . . . . . . . . . . . . . 25
3.3 Token Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 Example of a function used by the token server to calculate f_{i,p} . . . . 29
3.5 Population of peers viewing streaming data in the PPLive system . . . . 32
3.6 Average f_{i,p} for peers with different uploading bandwidth in three schemes 35
3.7 Average f_{i,p} for peers with different uploading ratio in three schemes . . 35
3.8 The average of surplus by using different f . . . . . . . . . . . . . . . . 37
3.9 The average of repeat by using different f . . . . . . . . . . . . . . . . . 37
3.10 The value of α under different δ . . . . . . . . . . . . . . . . . . . . . . 38
3.11 The average of difference under different t_{i,p} . . . . . . . . . . . . . . 39
3.12 The average of difference when a peer's uploading ratio changes from
60% to 100%, t_{i,p} = 10 min . . . . . . . . . . . . . . . . . . . . . . . 40
3.13 The overhead in token servers. . . . . . . . . . . . . . . . . . . . . . . 41
4.1 Surfing session model (corresponds to Fig.3 in [76]). . . . . . . . . . . 52
4.2 Flow of downloads (corresponds to Fig.4 in [76]). . . . . . . . . . . . . 52
4.3 Two submodels: a user curve and a network curve (corresponds to Fig.5
in [76]). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Relationship between average TCP connection bandwidth b_TCP, average
number of concurrent downloads k, and p_abort . . . . . . . . . . . . . . 59
4.5 The connection completion rate at equilibrium is where the user demand
(Eq.4.1) and the network supply (Eq.4.2) curves intersect. Notice how
the curves differ in shape between (a) and (b). . . . . . . . . . . . . . . 67
4.6 The average error rate for changing QoS and changing r_session . . . . . 68
4.7 The equilibrium and our estimation mechanism are robust with respect
to uncertainty over link delay and buffer sizes. . . . . . . . . . . . . . . 70
4.8 The average error rate for changing QoS with different link delay and
buffer sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.9 Comparison between homogeneous settings and heterogeneous settings. 73
4.10 The average error rate for changing QoS and changing r_session . . . . . 73
4.11 web traffic: Comparison under different file size distributions. . . . . . 74
4.12 video traffic: Comparison under different file size distributions. . . . . 75
4.13 The average error rate for changing QoS and changing r_session . . . . . 76
4.14 The connection completion rate for the corresponding network capacity
and r_session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.15 Resource estimation with different data points . . . . . . . . . . . . . . 77
5.1 Speculation mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.2 Running time of task copies . . . . . . . . . . . . . . . . . . . . . . . . 98
5.3 Equilibrium point - simulation vs. model (exponential distribution) . . . 102
5.4 Task QoS probability (exponential distribution) . . . . . . . . . . . . . 102
5.5 Equilibrium point - simulation vs. model (uniform distribution: [0, 2/(μ∗d())]) 103
5.6 Task QoS probability (uniform distribution: [0, 2/(μ∗d())]) . . . . . . . . 103
5.7 Equilibrium point - simulation vs. model (normal distribution: standard
deviation 1/(μ∗d())) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.8 Task QoS probability (normal distribution: standard deviation 1/(μ∗d())) . 104
5.9 Ground truth (simulation results) vs. estimation (using regression) . . . 105
5.10 Performance degradation functions . . . . . . . . . . . . . . . . . . . . 106
5.11 Effect of performance degradation functions . . . . . . . . . . . . . . . 106
5.12 Effect of threshold on Re-launch vs. Re-instantiation . . . . . . . . . . 106
5.13 System model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.14 Number of active racks . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.15 Mean number of active racks as a function of dataset copies . . . . . . . 113
5.16 Number of active racks under all mechanisms . . . . . . . . . . . . . . 114
5.17 Task QoS probability that the service time of a task is less than t_q under
all mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.18 Task QoS probability . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.19 Task QoS probability for each category . . . . . . . . . . . . . . . . . . 114
5.20 Job QoS probability for each category . . . . . . . . . . . . . . . . . . 114
5.21 Extensions: VM migration and cloning . . . . . . . . . . . . . . . . . . 118
List of Algorithms
1 Calculation of the minimum amount of capacity needed to satisfy mul-
tiple QoS constraints simultaneously. . . . . . . . . . . . . . . . . . . . 63
Abstract
Provision of quality of service (QoS) is an important issue for service providers. To
support the required QoS, service providers need to ensure that the amount of resources
allocated is sufficient and must choose appropriate resource management approaches.
Since users react to experienced QoS, service providers should consider user-system
interaction in resource allocation and resource management. For example, when users
open a web page or watch a video, if the downloading rate is slow, they may abort the
connection before it finishes. In such a case, the resources used by the connection are
wasted. Moreover, a high abort rate implies that the required QoS is not being satisfied.
This user behavior affects resource allocation. Service providers should consider how
to allocate resources so as to reduce the abort rate and the amount of wasted resources.
In this dissertation, we focus on several services and study how service providers can
satisfy required QoS through appropriate resource allocation or resource management.
The first service we study is P2P streaming. In order to provide satisfactory perfor-
mance in P2P streaming systems, the number of peers with high upload capacities
should be sufficiently high. Thus, one important problem in provid-
ing streaming services is that of providing appropriate incentives for peers to contribute
their upload capacity. To this end, we propose and evaluate the use of advertisements as
an incentive for peers to contribute upload capacity.
We then consider web and video services and study resource estimation in virtual
networks (VNs). Existing efforts focusing on resource allocation in VNs assume that
VN requests indicate the exact amount of required resources. However, they do not
consider how to determine the amount of resources required to support a needed QoS.
To this end, we propose an alternative approach - namely that of considering QoS as
a constraint. That is, when VN requests are made, service providers should be able to
use the minimum required QoS as constraints of that request, rather than the amount of
resources needed. The infrastructure provider must then determine the resource allo-
cation necessary for this QoS. In particular, the provider must take into account user
reaction to perceived performance and adjust the allocation dynamically. Consequently,
we study web and video services and propose an estimation mechanism that is based
on analyzing the interaction between user behavior and network performance. The pro-
posed approach can satisfy user performance requirements through appropriate resource
estimation. Moreover, our approach can adjust resource estimations efficiently and accu-
rately.
Finally, we focus on cloud computing. The use of cloud computing has been widely
adopted; however, satisfying QoS requirements while reducing costs can be a complex
problem for cloud providers. To facilitate resource management in such systems, we
focus on developing a modeling framework that can capture the interaction between
demand for service and system resources as well as facilitate the tradeoff between
QoS and cost as needed. Our extensive simulation-based model validation and eval-
uation indicates that our framework is accurate and can provide insight into appropriate
resource management for facilitating tradeoffs between performance commitments and
cost. Moreover, we demonstrate that our modeling framework is extensible and can be
applied to a number of task management mechanisms.
Chapter 1
Introduction
Provision of quality of service (QoS) is an important consideration for service providers.
When service providers deploy a service, they have to make sure that (1) the amount of
resources allocated is sufficient to support the needed QoS, and (2) the corresponding
resource management approaches can allocate resources efficiently while satisfying QoS
requirements. In such a system, users react to QoS. For example, when users open a web
page or watch a video, if the downloading rate is slow, they may abort the connection
before it finishes. In such a case, the resources used by the connection are wasted. More-
over, a high abort rate could mean that the required QoS is not being provided. Such user
behavior affects resource allocation. That is, to reduce aborting rates and to reduce the
amount of wasted resources, service providers need to allocate resources appropriately.
Therefore, service providers should consider user-system interaction when designing
a system, determining the amount of resources, and choosing resource management
approaches.
In this dissertation, we consider two types of user-system interactions, and we
demonstrate how service providers can satisfy the needed QoS by taking these inter-
actions into account. The first type of interaction is one that is incorporated into a
system’s design. Here, we use P2P file sharing systems as an example. In P2P file shar-
ing systems, to provide satisfactory performance, the total upload capacity in a system
should be sufficiently high. Therefore, when service providers design a new P2P system,
they have to consider how they can provide incentives to encourage users to contribute
greater upload capacity.
The second type of interaction is one that is due to system performance, especially
in the context of delay-sensitive services. Here, we consider client-server systems and
use web and video services as an example. In web and video services, users react to the
QoS they experience. For example, users may abort a connection due to a slow downloading
rate. Therefore, when service providers deploy resources, they need to take user reaction
into account.
Of course, both types of these interactions can exist in a variety of services simul-
taneously. For instance, in P2P streaming systems, service providers have to provide
incentives to encourage peers to contribute their upload capacity. Moreover, since
streaming services are delay-sensitive, peers will also react to system performance they
experience by canceling the connection or leaving the system. Another example is
cloud computing services. In this case, the user is not a real person, but rather a vir-
tual machine (VM) that reacts to system performance (the second type of interaction)
by going through migration processes or generating a copy. Moreover, when cloud
providers design a system, they have to determine what types of actions VMs can take
(the first type of interaction), e.g., migrating a VM from one server to another server or
making a copy to address straggler problems [5, 8, 89].
In this dissertation, we consider several different services and explore different
QoS requirements, resource types, and interaction types. We demonstrate how service
providers can satisfy QoS requirements through user-system interaction analysis. In
Chapter 3, we study the incentive problem in P2P streaming systems. We propose a
mechanism that uses advertisements as incentives. In Chapter 4, we propose a resource
estimation mechanism that is based on analyzing the interaction between user behavior
and network performance. In Chapter 5, we then focus on evaluating system perfor-
mance in the context of cloud computing. We propose a modeling framework which can
provide insight into appropriate resource management for facilitating tradeoffs between
performance commitments and cost. We now give a high level overview of each of these
works and corresponding contributions.
The first service we study is P2P streaming systems. Such systems have become
popular with the widespread deployment of broadband networks and have been used to
support different services, including file downloading and streaming. In our work, we
focus on video streaming. In P2P streaming systems, peers download data from other
peers. In such a case, the video quality a peer can obtain is a function of uploading
capacity that other peers contribute. Kumar et al. [47] argued that, in order to provide
satisfactory performance, the quantity of peers with high upload capacities in streaming
systems should be sufficiently high. Thus, one important problem in providing QoS
in streaming systems is that of providing appropriate incentives for peers to contribute
their upload capacity.
To address the incentives problem in streaming systems, several techniques have
been proposed in the literature, such as [39, 53–55, 59, 64, 65, 67, 72]. Briefly, the basic
idea in these techniques is that peers obtain a better video quality (supported by layered
coding/MDC based schemes) when they upload more data. Orthogonal to these works,
we pursue an alternative direction for providing incentives in P2P streaming systems.
Specifically, motivated by popular advertisement business models, in Chapter 3, we
propose the use of advertisements as an incentive for peers to contribute upload capac-
ity. In our proposed system, peers enjoy the same quality of streamed media, with the
difference in quality of service being achieved through different amounts of advertise-
ments viewed, based on the resource contributions to the system. Our approach can be
combined with previous efforts whose goals are to provide better video quality.
Our contributions in Chapter 3 can be summarized as follows.
• We propose the use of advertisements as an incentive in P2P streaming systems
for peers to contribute uploading capacity. To the best of our knowledge, our work
is the first to propose the use of advertisements as incentives in P2P systems.
• We develop a token-based framework and several token-based schemes and
explore their characteristics, such as overhead, reliability, and token management.
Our results provide system developers with insight into efficient development of
P2P-based streaming systems that utilize advertisements as incentives for resource
contribution.
In Chapter 3, we study user-system interaction in P2P streaming systems. We
demonstrate that, by providing appropriate incentives (e.g., the use of advertisements),
service providers can encourage peers to contribute more uploading capacity. We then
shift our focus in Chapter 4 and consider client-server based web and video services in
the context of resource estimation in virtual networks (VN).
In a network virtualization environment, several virtual networks with heteroge-
neous resources can coexist over a shared physical infrastructure. Service providers can
create their own VNs by leasing resources from infrastructure providers and offering
customized end-to-end services without significant modifications to the physical infras-
tructure [30,80]. That is, one advantage of network virtualization is that VNs can satisfy
a variety of customized requirements.
When service providers generate VN requests, they can indicate the amount of
resources needed. Thus, when a VN embedding process maps a VN request onto phys-
ical networks, it must result in satisfaction of constraints on resources. Since physi-
cal resources shared by VNs are finite, the VN embedding problem has been proven
to be NP-hard (in offline and online scenarios [9, 46]). Addressing this problem has
been an active area of research, with a number of heuristics having been proposed,
e.g., [18,23,29,31,52,57,75,87,91,95]. These approaches are designed to achieve objec-
tives defined from the perspective of infrastructure providers. Moreover, they assume
that a VN request requests a specific amount of resources, such as network capacity or
computing power. However, from the perspective of service providers, the main con-
cern is having sufficient resources to support a certain level of service quality. Existing
efforts are not concerned with this, i.e., they do not consider what resources are required
to support a needed QoS, which is an important consideration for service providers.
To this end, in Chapter 4, we propose an alternative approach - namely that of con-
sidering QoS as a constraint. That is, when VN requests are made, service providers
should be able to use the minimum required QoS as constraints of that request, rather
than the amount of resources needed. The infrastructure provider must then determine
the resource allocation necessary for this QoS. In particular, the provider must take into
account user reaction to perceived performance and adjust the allocation dynamically.
Consequently, we study web and video services and propose an estimation mechanism
that is based on analyzing the interaction between user behavior and network perfor-
mance.
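To make this interaction concrete, the sketch below is our illustration only: the demand
and supply curve shapes and the parameter r_session are assumptions for exposition,
not the models developed in Chapter 4. It finds the equilibrium connection completion
rate by fixed-point iteration between a user submodel and a network submodel, then
searches for the minimum capacity that satisfies a QoS constraint:

```python
# Illustrative sketch: users react to experienced QoS (aborting and retrying),
# which changes the offered load; the network's completion rate depends on
# that load. The equilibrium is where the two curves meet.

def user_demand(completion_rate, r_session=10.0):
    """Offered connection rate as a function of experienced QoS; aborted
    connections are retried, inflating load (assumed reaction curve)."""
    abort_prob = max(0.0, 1.0 - completion_rate)
    return r_session * (1.0 + abort_prob)

def network_supply(offered_load, capacity):
    """Fraction of connections completed at a given load (assumed model)."""
    return min(1.0, capacity / offered_load)

def equilibrium_completion_rate(capacity, iters=200):
    """Fixed-point iteration between the user and network submodels."""
    rate = 1.0
    for _ in range(iters):
        rate = network_supply(user_demand(rate), capacity)
    return rate

def min_capacity_for_qos(target_rate, lo=0.1, hi=100.0, eps=1e-3):
    """Binary search: smallest capacity whose equilibrium meets the QoS."""
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        if equilibrium_completion_rate(mid) >= target_rate:
            hi = mid
        else:
            lo = mid
    return hi

print(min_capacity_for_qos(0.95))  # capacity needed for a 95% completion rate
```

Because the equilibrium completion rate is monotone in capacity under these assumed
curves, the binary search converges to the minimum feasible allocation.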
Our contributions in Chapter 4 can be summarized as follows:
• Unlike previous efforts, we focus on the use of QoS as a constraint for VN
requests. We develop models and corresponding techniques that allow deter-
mination of resource amounts needed to satisfy QoS constraints. Our proposed
approach is orthogonal to (and can be combined with) existing efforts for VN
embedding.
• Our approach can also dynamically adjust resource estimations when QoS
requirements change. Our simulation-based experiments demonstrate that the
proposed approach can satisfy user performance requirements through appropri-
ate resource estimation. Moreover, our approach can adjust resource estimations
efficiently and accurately.
Continuing with user-system interaction but now in the context of data centers, in
Chapter 5 we focus on evaluating system performance in cloud computing.
In a cloud computing environment, cloud providers have to consider several goals.
First, cloud providers have to satisfy service level agreements (SLAs), especially under
worst-case scenarios (e.g., peak loads). Therefore, data centers are typically over-
provisioned. However, supporting such large server resources results in energy costs
on the order of millions of dollars per year [13]. Therefore, energy cost reduction is
another goal for cloud providers. Satisfying QoS requirements while reducing costs can
be a complex problem.
How to achieve these goals is the focus of numerous studies, and several approaches
have been proposed to address different challenges. Many such efforts focus on a spe-
cific design goal, e.g., improving system performance or reducing energy cost. How-
ever, our goal is to develop a modeling framework that can (i) incorporate different
approaches and (ii) be used to evaluate system performance under different goals simul-
taneously (e.g., performance and cost). Because the interaction between the different
resource management approaches is complex and their effect on performance character-
istics is (often) not straightforward, it is challenging to develop such an accurate model-
ing framework while maintaining its tractability.
Specifically, the goal of our framework is to facilitate evaluation of the effects of the var-
ious resource management approaches and to offer an abstraction for researchers and
engineers to reason intuitively about system performance. The challenges here include
the following. (1) The framework should be able to consider multiple resource man-
agement approaches simultaneously, and it should be extendable, as new approaches
become available. (2) The framework should be able to evaluate several performance
metrics under different goals. (3) It should be simple (and efficient) for cloud providers
to use the framework and understand the effects of considered resource management
approaches.
To this end, we propose a modeling framework which includes two main parts. The
first part is an inflow based on system workload. The second part is an outflow controlled
by the system. The advantage of this decomposition is that it makes it easy to observe
the effects of different factors. Thus, it can facilitate efficient system development
and provide cloud providers with insight into how to adjust system settings to satisfy
different performance and cost goals. In this dissertation, to explore characteristics and
utility of our framework, we focus on two common resource management approaches,
VM consolidation [33, 50, 51, 58, 94] and speculation [5, 8, 89], and propose corresponding
models. We also give an overview of how cloud providers can extend our framework to
include other approaches, including VM migration and task iteration.
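As a miniature illustration of this inflow/outflow decomposition, the sketch below
balances a workload inflow against a system-controlled outflow and picks the cheapest
configuration meeting a QoS target. The service model, parameter names, and numbers
are assumptions for exposition, not the models of Chapter 5:

```python
# Illustrative sketch: inflow is the task arrival rate (workload); outflow is
# service capacity controlled by the system via the number of active servers.

import math

def qos_probability(num_servers, arrival_rate, service_rate, t_q):
    """P(task service time < t_q) under an assumed exponential model, where
    each active server absorbs an equal share of the inflow."""
    effective_rate = service_rate - arrival_rate / num_servers
    if effective_rate <= 0:
        return 0.0  # inflow exceeds outflow: QoS cannot be met
    return 1.0 - math.exp(-effective_rate * t_q)

def cheapest_config(arrival_rate, service_rate, t_q, qos_target, max_servers=1000):
    """Fewest active servers (lowest energy cost) meeting the QoS target."""
    for n in range(1, max_servers + 1):
        if qos_probability(n, arrival_rate, service_rate, t_q) >= qos_target:
            return n
    return None

# e.g., 50 tasks/s inflow, 5 tasks/s per server, 1 s deadline, 95% QoS target
print(cheapest_config(arrival_rate=50.0, service_rate=5.0, t_q=1.0, qos_target=0.95))
```

Separating the workload-driven inflow from the system-controlled outflow makes the
performance/energy trade-off explicit: adding servers raises the QoS probability while
raising the energy cost.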
Our contributions in Chapter 5 can be summarized as follows:
• We develop a modeling framework that can consider several resource management
approaches simultaneously and be used to evaluate system performance.
• We propose models that can evaluate system performance and expose effects of
different approaches easily.
• Our framework can provide insight into appropriate resource management tech-
niques for facilitating tradeoffs between performance commitments and cost.
• Our modeling framework can be extended to include a number of resource man-
agement approaches.
The remainder of this dissertation is organized as follows. Chapter 2 gives an
overview of related work. In Chapter 3, we present our incentive mechanism in P2P
streaming systems. In Chapter 4, we describe our resource estimation mechanism
through user and network interaction analysis. Chapter 5 proposes our modeling frame-
work for evaluating system performance in the context of cloud computing. We then
conclude in Chapter 6.
Chapter 2
Related Work
In this chapter, we give an overview of literature related to this dissertation. In Chapter
2.1, we describe existing literature on two main factors in our framework for P2P stream-
ing systems: incentive mechanisms and token-based schemes. In Chapter 2.2, we focus
on virtual networks. We describe existing efforts solving VN embedding problems and
discuss the difference between existing efforts and our resource estimation mechanism.
In Chapter 2.3, we give an overview of literature related to our modeling framework for
evaluating system performance in cloud computing.
2.1 P2P Streaming Systems
Since P2P streaming services became popular, there have been a number of efforts focus-
ing on measurement and analysis of P2P streaming systems, e.g., as in [38, 48, 82]. For
instance, [38] and [82] both focus on the PPLive system, where [38] characterizes user
behavior and traffic profiles, while [82] indicates characteristics of the PPLive system
that are different from those of P2P file-sharing systems. In [48] the authors focus on the
Coolstreaming system and discuss various associated issues, including highly skewed
resource distribution, excessive start-up time, and high failure rates during flash crowds.
These works demonstrate several characteristics of existing P2P streaming systems that
could be helpful for improving current streaming systems or designing future systems.
Although several deployed systems have a significant number of users, such as Cool-
Streaming [92], PPLive, PPStream, and TVUPlayer, the architectures of these systems
are not open. Therefore, a number of other design efforts can be found in the litera-
ture, e.g., [66, 83]. For instance, [66] proposes an unstructured, mesh-based system and
indicates that in such systems, the tit-for-tat mechanism can result in reasonable stream-
ing performance. In [83], the authors investigate a framework for multi-channel P2P
streaming systems, which can address two problems, excessively long channel switch-
ing delays and poor performance in channels with a small number of peers.
One of the main challenges in P2P streaming systems is provision of incentives for
users to contribute their resources to the system - these are the efforts that, in terms of a
high level goal (i.e., incentives) are closer to ours. A number of efforts have considered
this problem, from a variety of perspectives, e.g., [39, 53–55, 64, 65, 67]. For instance,
[54] and [53] propose layered coding/MDC schemes with a tit-for-tat type strategy. The
schemes use video quality as an incentive for peers to increase their upload rates. In [39],
the authors design an unstructured protocol that can reduce the end-to-end streaming
delay and improve delivered content quality. Specifically, they combine a push-pull
scheme and a score-based scheme as an incentive method. In [67], the authors address
two main problems in tree-based systems and propose an unstructured swarm overlay
system. A credit-based incentive scheme is adopted in order to fully utilize upload
bandwidth. In [65] the authors study the effects of a local pairwise incentive mechanism
and focus on resource availability and the average quality of paths. The work in [64]
demonstrates that a tit-for-tat scheme may not work well in streaming systems due to
bandwidth and delay constraints. In order to solve such a problem, an incentive scheme
(termed Token Stealing), based on an Iterated Prisoner’s Dilemma, is proposed. Lastly,
[55] proposes a substreaming framework that can be applied to a variety of video coding
schemes, including single-layer video, layered video, MDC, and simulcast. This work
designs a partner selection scheme, based on a tit-for-tat mechanism. The goal of this
partner selection scheme is to provide better streamed media quality for peers who can
make greater contributions.
While the above works provide interesting insight into P2P streaming systems, to
the best of our knowledge, our work is the first to use advertisements as an incentive
in P2P streaming systems. Our results demonstrate that, in addition to streamed media
quality, advertisements could provide another useful incentive for peers to contribute
their resources to the system. Thus, our work provides system developers another choice
for peer incentives. Moreover, our efforts are orthogonal to those on media quality
incentives, i.e., both can be combined in one system. For instance, a peer can choose
different levels of media quality and different amounts of advertisements, based on its
needs. That is, our schemes can be combined with other related efforts - e.g., system
developers can provide streamed media quality and fewer advertisements as incentives
by applying peer selection strategies or layered coding techniques in the context of our
framework.
Another factor in our framework is the use of token-based schemes, in order to
account for peers' contributions. The use of token-based schemes in P2P file sharing
systems can be found, e.g., in [49, 60, 77, 84]. Most of these efforts focus on credit type
systems or micro-payment type systems for P2P service, using token-based schemes.
In these systems, peers can exchange tokens, as virtual currency, to receive resources
or service. For instance, [60] discusses two common incentive methods, reputation and
payment protocols, which share some common characteristics. Specifically, a general
protocol is discussed (termed, stamp trading), that has properties of the two methods
and can satisfy trust and token-compatibility requirements. In [84], the authors design
a micro-payment system for P2P applications, while [77] uses self-issued tokens for
accounting in a Grid computing type system. In [49], the authors design a decentral-
ized token-based accounting scheme, where tokens are regarded as proof of resource
or service usage. This scheme is used to collect transaction accounting information,
which can provide a basis for pricing and can be used to control the behavior of peers,
to achieve better system performance. In our work, we adopt the concept of using tokens
as payment in designing our system. Our main purpose in utilizing tokens is to create a
mechanism for accounting for peers’ contributions. Our main goal, however, is to pro-
pose an incentive scheme that uses advertisements as incentives for peers to contribute.
This goal is different from previous works that use tokens for charging for services,
representing reputation, or downloading data.
2.2 Virtual Networks
Network virtualization is viewed as a potential approach to overcome the ossification
problem of the Internet [10]. In a network virtualization environment, several vir-
tual networks (VNs) with heterogeneous resources can coexist over a shared physical
infrastructure. Service providers can create their own VNs by leasing resources from
infrastructure providers and offering customized end-to-end services without significant
modifications to the physical infrastructure [30, 80]. That is, one advantage of network
virtualization is that VNs can satisfy a variety of customized requirements. Virtual-
ization can also provide a heterogeneous experimental environment for researchers to
evaluate new protocols [15, 41, 42]. When service providers generate VN requests, they
can specify resources for every virtual node and link. Thus, when a VN embedding pro-
cess maps a VN request onto specific physical nodes and links in the substrate network,
it must result in satisfaction of constraints on virtual nodes and links.
Since substrate resources shared by VNs are finite, the VN embedding problem is
one of the fundamental challenges in network virtualization. That is, an efficient VN
embedding is necessary to increase resource utilization. However, the VN embedding
problem has been proven to be NP-hard (in offline and online scenarios [9, 46]).
Addressing the VN embedding problem has been an active area of research, with a
number of heuristics having been proposed, e.g., [18,23,24,26,29,31,34,52,57,74,75,
78, 87, 91, 95], where some of the works restrict the problem space in order to enable
efficient heuristics and reduce complexity. These limitations include: (i) considering
offline versions of the problem and assuming all virtual network requests are known
in advance [57, 75, 95]; (ii) assuming that resources are unlimited without the need for
admission control [29, 57, 95]; and (iii) focusing on specific topologies [57].
Fan et al. [29] focus on finding optimal reconfiguration policies that can minimize
cost where cost is considered to be either (a) the occupancy cost or (b) the reconfigura-
tion cost. Lu et al. [57] focus on a family of backbone-star topologies and aim at finding
the best topology. Szeto et al. [75] use the multi-commodity flow algorithm to solve the
VN embedding problem. The works in [23,52,87] do not restrict the problem space and
consider link and node constraints together with admission control mechanisms. All of
these works [23, 52, 87] consider the online version of the problem.
Yu et al. [87] propose a two stage mapping algorithm based on shortest path and
multi-commodity flow algorithm, handling the node mapping in the first stage and the
link mapping in the second stage. In addition, they allow path splitting and migration.
Lischka and Karl [52] consider node and link mapping in one stage. Their approach
is faster than the two-stage approach, especially for large virtual networks with high
resource consumption.
Beck et al. [78] propose a distributed, parallel and generic framework to reduce
message overhead. This framework can also be combined with cost-reducing embed-
ding algorithms. Cui et al. [24] propose an algorithm based on maximum convergence-
degree that can (a) improve network utilization efficiency, (b) decrease the complexity
of the embedding problem, and (c) improve load balancing. In [26], Dietrich et al. focus
on the multi-domain virtual network embedding problem. They propose a framework that
enables VN request partitioning under limited information disclosure.
Existing works assume that VN requests indicate the exact amount of required
resources (e.g., “the capacity of a link between two nodes should be 10 Mbps”). Given
this, heuristics are designed to achieve objectives defined from the perspective of infras-
tructure providers, e.g., balancing load [23,24,95], maximizing revenue [23,52,87], and
minimizing cost [23, 29, 34, 52, 57, 74, 78].
However, from the perspective of service providers, the main concern is having
sufficient resources to support a certain level of service quality. Existing efforts are
not concerned with this, i.e., they do not consider what resources are required to sup-
port a needed quality of service (QoS), which is an important consideration for service
providers.
To this end, we propose an alternative approach - namely that of considering QoS as
a constraint. That is, when VN requests are made, service providers should be able to
use the minimum required QoS as constraints of that request, rather than the amount of
resources needed. We develop an estimation mechanism that is based on analyzing the
interaction between user behavior and network performance.
2.3 Cloud Computing
In this dissertation, we propose a modeling framework for evaluating system perfor-
mance in cloud computing. In our framework, we consider speculation and performance
degradation due to resource contention and propose corresponding models. Moreover,
our analytical models can be used to explore the tradeoff between system performance
(benefits) and energy usage (cost). Here, we describe existing literature related to our
modeling framework.
Speculation: The straggler problem was originally identified in [25], and specu-
lation has been proposed as an approach to fixing this problem in a number of efforts
[5, 7, 8, 44, 89]. The basic idea is to launch duplicate copies for tasks which have been
detected as stragglers or have a high probability of becoming stragglers (e.g., hav-
ing unusually high service times). Unlike techniques that wait for an indication of a
task being a straggler, [5] focuses on small jobs and generates copies immediately upon
job arrival. In contrast to all these works, our goal is not to propose new speculative
approaches but rather to focus on an analytical framework that can include one or more
speculative techniques in modeling data centers, in order to evaluate resulting perfor-
mance characteristics as well as tradeoffs between performance and cost. We demon-
strate later how our framework can be applied and extended to existing speculative tech-
niques.
Energy Cost Reduction: Efforts on energy cost reduction can be divided into three
categories: single server level, single data center level, and multiple data centers level.
For the single server level, power-speed scaling has been proposed to reduce energy
consumption (e.g., as in [11, 88]). Briefly, the basic idea is to save energy usage by
adjusting the CPU speed of a single server. At the single data center level, energy cost
reduction can be achieved by dynamically managing the number of activated servers
in a data center to save energy (e.g., [33, 50, 51, 58, 94]). VM consolidation is widely
adopted at this level, where VMs are colocated on physical servers, while either shutting
down the resulting idling physical servers or allowing them to idle in sleep mode. At the
multiple data centers level, a basic approach is to dynamically route jobs to data centers
with lower prices (e.g., [68, 85]). Our goal is not to propose a new energy management
technique but rather to illustrate how our modeling framework can be used to explore an
appropriate tradeoff between performance requirements and energy costs. We illustrate
later how our framework can be applied at the single datacenter level.
Performance Degradation: Resource contention affects VM performance. Even
if certain types of resources, such as CPU cores, memory, and disk capacities, can be
isolated by schedulers or static partition of memory and disk capacities, it can still be
challenging to isolate performance of other resources, including CPU caches, memory
bandwidth, network, and disk I/O bandwidth. Therefore, resource contention results in
VM performance degradation and impacts overall system performance. A number of
efforts in the literature studied performance degradation due to various resource con-
tention types, e.g., [16, 22, 36, 45, 61, 93, 94]. Our goal is not to propose a mechanism
for modeling performance degradation due to specific types of resources, but rather to
design a modeling framework that is generic and can incorporate various contention
(performance degradation) models.
Chapter 3
A Comprehensive Study of the Use of
Advertisements as Incentives in P2P
Streaming Systems
3.1 Introduction
The use of Peer-to-Peer (P2P) technology has led to efficient distribution of content over
large scale networks. For instance, BitTorrent (http://www.bittorrent.com/) is one of
the most successful file sharing
applications in use today. In the past few years, P2P-based streaming has become
another popular service, with the widespread deployment of broadband networks. P2P-
based approaches to streaming have become popular, as compared to traditional client-
server-based approaches, due to the following advantages: low cost, scalability, and ease
of deployment. There are already several P2P streaming applications deployed on the
Internet, such as PPLive (http://www.pplive.com/), PPStream (http://www.ppstream.com/),
and TVUPlayer (http://www.tvunetworks.com/). Moreover, a number of efforts
have focused on measurement and analysis of such P2P streaming systems [38, 48, 82].
However, P2P streaming systems still suffer from free-riding problems, i.e., simi-
larly to the free-riding problems observed in P2P file sharing systems. For example,
Hei et al. [38] provide four measurement results of PPLive (two from university cam-
puses and two from residential locations). Peers in one of the residential locations
perform almost no uploading. These peers may be considered free-riders. Moreover, they have
also shown that deployed streaming systems’ performance depends on peers with high
upload capacities, given that there is significant upload bandwidth heterogeneity on the
Internet. In order to provide satisfactory performance, the quantity of peers with high
upload capacities in streaming systems should be sufficiently high [47]. Therefore, the
problem of how to provide appropriate incentives for peers to contribute their upload
capacity to the system is an important one in the context of P2P-based streaming sys-
tems. Although some approaches exist for addressing this problem in the context of
P2P file sharing systems, providing incentives in streaming systems can be a more com-
plex problem. For instance, such approaches in file-sharing systems typically provide
a reduction in file downloading time as an incentive. However, in streaming systems, a
peer’s quality of service depends on video quality (e.g., smoothness of video delivery),
rather than on how fast the entire stream (e.g., video) can be downloaded. Hence, there
is a need for re-considering the problem of providing appropriate incentives for peers to
contribute resources in the context of P2P streaming systems.
To address the incentives problem in streaming systems, several techniques have
been proposed in the literature, such as [39, 53–55, 59, 64, 65, 67, 72]. Most of these
works use improved video quality as an incentive, achieved through layered coding-
based or MDC-based techniques [53, 54, 56]. Briefly, the basic idea here is that peers
will have better video quality when they upload more data.
In this chapter we consider an alternative direction for providing incentives in P2P
streaming systems, namely that of using the amount of advertisements viewed as an
incentive to contribute more resources, as described next. Our approach is orthogonal
to (and can be combined with) previous efforts whose goals are to provide better video
quality (as detailed below).
Advertisement supported service has become a popular business model. Many pop-
ular systems ask users to watch advertisements before they start to watch videos (e.g.,
YouTube (http://www.youtube.com/) and Hulu (http://www.hulu.com/)). Since
advertisements could provide additional revenue for a ser-
vice provider in a P2P streaming system, such providers could include advertisements in
the distributed content (e.g., as is currently done on television). Since all peers (whether
high or low capacity) are concerned with video quality (e.g., in the form of smooth-
ness in data delivery), controlling the amount of advertisements viewed by a peer would
be another approach to providing incentives for peers to contribute resources in a P2P
streaming system. (In a sense, our system is analogous to having broadcast (free) TV
and paid TV channels, such as HBO. For instance, a user can view movies on a
paid TV channel without advertisements, or the user can view the same movie on a
free TV channel with advertisements. In our case, the “payment” we consider is the
peers’ resource contribution.) Briefly, in our proposed system, peers who contribute
more upload capacity view fewer advertisements than peers who contribute less upload
resources. Thus, unlike in layered coding/MDC based schemes, low capacity peers can
still enjoy good video quality but with more advertisements than higher capacity peers.
In this chapter, we focus on the architecture of such a system and its evaluation, includ-
ing issues such as tracking of peers’ contributions while preventing malicious behavior
as well as efficient distribution of advertisements.
One important challenge in such a system is accounting for peers’ contributions, on
which the system can in turn base the computation of the amount of advertisements a
peer should view. One possible approach is to allow peers to report their own contribu-
tions. However, since malicious peers may report more than their actual contribution,
such a method is open to abuse by free-riders or malicious peers. Therefore, in this
chapter, we design token-based schemes to address the problem of determining how
much contribution each peer is making. Token-based schemes have been used in P2P
file sharing systems [49,60,77,84] as well as other systems. In [49,77,84], token-based
schemes are considered as a form of credit systems or micro-payment systems, where
peers can exchange tokens, as virtual currency, to receive resources or services, and a
service provider can charge based on the amount of tokens attained. In [60], token-
based schemes are used for two incentive purposes, reputation and payment. Peers
should maintain high reputation, otherwise, other peers may refuse to interact with them.
In addition, peers “pay” with tokens for downloading content and receive tokens for
uploading content to other peers. Thus, a free-rider cannot download data due to lack of
tokens. In this chapter, we adopt the idea of using tokens as payment in designing our
system. Our main purpose in utilizing tokens is in creating a mechanism for accounting
for peers’ contributions. Unlike other efforts where tokens are used to limit downloading
or support reputation, we use tokens to determine the amount of advertisements peers
should view. (The details of our system architecture are given below.)
Briefly, in our token-based schemes, a peer “pays” with tokens when downloading
streaming content. A peer receives tokens when contributing its upload capacity for
streaming data to other peers. Then, the amount of advertisements shown to each peer is
based on the number of tokens each peer possesses. The advantage of this token-based
approach is that the use of tokens can prevent malicious peers from reporting incorrect
contributions. In addition, we ensure that malicious peers do not "fake" tokens (i.e.,
generate tokens on their own) through the use of cryptographic-based signatures.
Our contributions in this chapter can be summarized as follows.
• We consider the use of advertisements as an incentive in P2P streaming systems
for peers to contribute upload resources and propose a corresponding P2P-based
streaming system architecture. Since advertisement-supported services are likely
to become more and more popular, our approach could provide peers an important
incentive for contributing their upload capacity. In Chapter 3.3, our simulation-
based study demonstrates the utility of this approach - for instance, when peers
increase their uploading rate, the system provides a reduced amount of advertise-
ments based on their true contributions. Moreover, we also focus on efficient distribution
of advertisements in Chapter 3.3.3.
• We propose a token-based framework as an approach to address the problem
of accounting for peers’ contributions so as to determine the amount of adver-
tisements peers should view (see Chapter 3.2). We propose three token-based
schemes and explore their characteristics, such as overhead, reliability, manage-
ment, and resilience to malicious behavior (see discussion in Chapter 3.4). In
Chapter 3.3, we present our simulation-based study that illustrates several useful
characteristics of our schemes. These include a demonstration of how to reduce
overhead needed for implementing such token-based schemes in Chapter 3.3.4
and what are the trade-offs between overhead reduction and accuracy in Chapter
3.3.5. Our results provide system developers with insight into efficient develop-
ment of P2P-based streaming systems that utilize advertisements as incentives for
resource contribution.
3.2 The Proposed System
In this chapter, we present our system framework in Chapter 3.2.1, including the system
architecture and the token-based framework. We propose three token-based schemes in
our work. The details of these schemes are described in Chapter 3.2.2. Moreover, we
discuss how to reduce system overhead towards the end of Chapter 3.2.2.
3.2.1 System Framework
In this chapter, we present the proposed system which uses advertisements as incentives
in a P2P streaming system. In this system, peers view different amounts of advertise-
ments, based on their resource contributions to the system. An advertisement in our
system can be any object - e.g., it can be a video clip, flash animation, streamed media,
and so on. In general, the type of advertisement is determined by the content provider.
For ease of exposition, in our work we focus on one type of advertisement, namely
streamed media.
The main challenge in our system is that of determining peers’ true resource con-
tributions. One possible approach is to allow peers to report their own contributions.
However, this method does not prevent malicious peers from reporting incorrect contri-
butions. That is, in order for advertisement-based incentives to produce a desired effect,
we need to construct a system where peers’ contributions can be determined reasonably
accurately (at least relative to each other).
To this end, we design token-based schemes. Our token-based schemes include three
functions (a minimal interface sketch follows this list):
• token generation: How are tokens generated?
• token exchange: How do peers exchange tokens in order to download streamed
content?
• contribution calculation: How are peers’ contributions calculated, and how is the
amount of advertisements peers have to view determined?
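These three functions can be summarized as a common interface; the sketch below is
illustrative only (class and method names are our assumptions), with each concrete
scheme deciding whether a function runs at the peer side or the server side:

```python
# Illustrative interface for a token-based scheme; the concrete schemes
# introduced below (TPCP, TSCP, TPCS) place these responsibilities at the
# peer side or the server side.

from abc import ABC, abstractmethod

class TokenScheme(ABC):
    @abstractmethod
    def generate_tokens(self, peer_id: str, n: int):
        """Token generation: mint n tokens at the peer or the token server."""

    @abstractmethod
    def exchange(self, downloader: str, uploader: str, nbytes: int):
        """Token exchange: pay tokens for downloaded streamed content."""

    @abstractmethod
    def contribution(self, peer_id: str, interval_index: int) -> float:
        """Contribution calculation: compute c_{i,p}, which determines the
        amount of advertisements the peer views."""
```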
Since token generation and contribution calculation could be done at the peer-side or
the server-side, we consider different combinations and propose the following schemes
in our work:
• Token generation at Peer-side and Contribution calculation at Peer-side (TPCP)
• Token generation at Server-side and Contribution calculation at Peer-side (TSCP)
• Token generation at Peer-side and Contribution calculation at Server-side (TPCS)
Note that using different token-based schemes does not affect the amount of peers' con-
tributions. The motivation for considering different schemes is to explore the overhead
and reliability characteristics of each scheme. This information can help system developers
choose appropriate schemes by considering the trade-offs between overhead and relia-
bility. A further discussion and comparison of the three schemes is presented in Chapter
3.4.5. Note that we do not consider token generation at the server-side together with
contribution calculation at the server-side, because such a scheme is completely
centralized and is not appropriate for P2P systems.
Fig. 3.1 illustrates our system architecture. The system consists of a streaming
server and peers. In the TSCP and TPCS schemes, the system also includes a token
server. The streaming server and the token server are managed by the service provider.
The streaming server is responsible for generating the streamed content as well as the
advertisements, while the token server is responsible for generating tokens or comput-
ing peers’ resource contributions. The sources of advertisements could be the service
provider or sponsors.
The streaming format considered in the remainder of the work is single-layer video
with substreams, as in [55]. The basic idea of substreams is to encode a video stream into
several substreams, each having the same video rate, i.e., the streaming server encodes
streamed content into k substreams and distributes k substreams through the P2P sys-
tem. The main advantage of a system using substreams is that it can be extended to
using different coding schemes. In [55], the authors have shown that such a substream-
based mechanism can be applied to several different coding schemes, including single-
layer video, layered coding-based video, and MDC. Since the use of video quality as
an incentive is not our goal here, for simplicity of illustration, we use single-layer video
with substreams. However, as noted earlier, our approach can be combined with layered
coding-based video or MDC schemes.
Figure 3.1: System Architecture. (The figure shows the streaming server and token
server, with streamed media, tokens, and advertisements flowing to peers with high and
low uploading rates.)
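As a small illustration (the chunk framing and round-robin assignment are assumptions
of this sketch, not necessarily the encoder of [55]), splitting a single-layer stream into
k equal-rate substreams can look like:

```python
# Illustrative sketch: a single-layer video is split into k substreams of
# equal rate by round-robin chunk assignment; a peer that downloads all k
# substreams interleaves them back into the original order.

def split_into_substreams(chunks, k):
    """Distribute an ordered list of video chunks into k substreams."""
    substreams = [[] for _ in range(k)]
    for i, chunk in enumerate(chunks):
        substreams[i % k].append(chunk)  # round-robin: equal video rate
    return substreams

def merge_substreams(substreams):
    """Interleave k substreams back into the original chunk order."""
    k = len(substreams)
    total = sum(len(s) for s in substreams)
    return [substreams[i % k][i // k] for i in range(total)]

chunks = list(range(12))  # e.g., 12 chunks, k = 4 substreams
assert merge_substreams(split_into_substreams(chunks, 4)) == chunks
```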
In the TSCP and TPCS schemes, when a peer wants to join or depart from the overlay
system, it has to contact the token server. The token server keeps login information of
each peer, as a way of checking whether the peer exists in the system or not. The login
information is used when the token server distributes tokens to peers or calculates a
peer’s contribution.
After a peer joins the overlay network, it connects to n neighbors to download
streaming content and advertisements. In addition, peers randomly choose new neigh-
bors periodically. In the system, a peer has to provide tokens when it downloads data
from other peers. Conversely, it receives tokens from other peers when it streams
(uploads) data to them. The amount of data, δ, that a peer can download upon providing
one token is determined by the service provider. Moreover, uploading advertisements is
also considered as a peer’s contribution, since such upload bandwidth contribution can
Figure 3.2: Time Interval: a peer views streaming content during time interval $t_{i,p}$ and views advertisements during time interval $a_{i,p}$ (the intervals alternate: $t_{i,p}, a_{i,p}, t_{i,p+1}, a_{i,p+1}, \ldots$)
Figure 3.3: Token Structure: (a) peers $P_i \to P_j \to P_k$; (b) reuse: Serial Number, Timestamp, $P_i$'s ID, $P_j$'s ID, $P_k$'s ID; (c) non-reuse: Serial Number, Timestamp, $P_i$'s ID, $P_j$'s ID, # of tokens
reduce the load on the streaming server. A peer making a greater contribution accord-
ingly receives a greater number of tokens from other peers.
In our system, the downloaded content and advertisements are kept in the local cache
of a peer. Peers must view advertisements after they have viewed some amount of
content. For instance, consider the example in Fig. 3.2, where peer i views content
in the $p$-th time interval, $t_{i,p}$. Then, the system displays advertisements of length $a_{i,p}$, which is calculated based on peer $i$'s contribution during $t_{i,p}$. Therefore, $a_{i,p}$ differs from interval to interval. The length of the time interval $t_{i,p}$ is the same for every peer, and it is determined by the service provider. At the end of each time interval, $t_{i,p}$, peers' contributions are calculated and used to determine the amount of advertisements that each peer should view. Below, we describe in detail our token-based schemes and how peers' contributions are calculated in our system. For convenience, a summary of the notation used in the work is given in Table 3.1.
$k$           number of substreams
$n$           number of a peer's neighbors
$\delta$      amount of data a peer can download per one token
$N$           number of tokens the token server sends to a peer for each content interval
$t_{i,p}$     $p$-th time interval for viewing content
$a_{i,p}$     $p$-th time interval for viewing advertisements
$c_{i,p}$     peer $i$'s contribution during $t_{i,p}$
$r_{i,p}$     number of tokens received by peer $i$ during $t_{i,p}$
$g_{i,p}$     number of tokens generated by peer $i$ during $t_{i,p}$
$s_{i,p}$     number of tokens sent by peer $i$ to other peers during $t_{i,p}$
$f_{i,p}$     ratio of advertisement time length to time length of streamed content
$Max(f)$      maximum value of $f_{i,p}$
$Def(f)$      default value of $f_{i,p}$
$Min(f)$      minimum value of $f_{i,p}$
$Min(c)$      a peer's contribution if it is a free-rider
$Equal(c)$    a peer's contribution when its income is equal to its spending amount

Table 3.1: Notations list
3.2.2 Token-Based Schemes
We now present the three token-based schemes in detail. We describe how to perform
the functionality described above at the peer-side and at the server-side. Finally, we
explore how to reduce overhead in these schemes.
Token generation: Now, we describe how to generate tokens in our system. In the
TPCP and TPCS schemes, tokens are generated by peers; tokens are generated by the
token server in the TSCP scheme. The corresponding token structure is depicted in Fig.
3.3(b). Each token has a different serial number and timestamp for uniqueness. More-
over, a token includes the generator’s ID and the receiver’s ID. In the TSCP scheme, the
generator's ID is the token server's ID; e.g., in Fig. 3.3, peer $i$ is the generator and peer $j$ is the receiver. After peer $i$ generates a token, it encrypts the token to prevent other peers from faking tokens.^7 In the TPCP and TPCS schemes, peers can generate tokens as needed. In the TSCP scheme, tokens are generated by the token server. The token server then sends tokens to peers at the start of each content interval, $t_{i,p}$, as depicted in Fig. 3.2. The number of tokens sent, $N$, is sufficient for that peer to download (stream) all content needed for that content interval. Therefore, a peer does not need to generate tokens on its own.
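To make the token structure and the encryption step concrete, the following is a minimal Python sketch of peer-side token generation, assuming tokens are serialized as JSON and encrypted under the token server's RSA public key using the `cryptography` package; the field names follow Fig. 3.3, but the encoding and the exact key arrangement are illustrative assumptions, not part of the design above.

```python
import json
import time
import uuid

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

def generate_token(generator_id: str, receiver_id: str, server_public_key) -> bytes:
    # Fields follow the reusable token structure of Fig. 3.3(b).
    token = {
        "serial": uuid.uuid4().hex,  # unique serial number
        "timestamp": time.time(),    # uniqueness / freshness
        "generator": generator_id,   # peer ID (the token server's ID in TSCP)
        "receiver": receiver_id,
    }
    # Encrypting the token so that only the token server can decrypt it is
    # what prevents other peers from faking or modifying tokens.
    return server_public_key.encrypt(
        json.dumps(token).encode(),
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )

# Usage: peer i issues a token to peer j; the key pair belongs to the token server.
server_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
token_for_j = generate_token("peer_i", "peer_j", server_key.public_key())
```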
Token exchange: The token exchange function is the same for all schemes. When
a peer downloads a stream from other peers, it can “pay” for it by either (i) using tokens
it received from other peers, or by (ii) generating new tokens, e.g., if that peer does not
have sufficient tokens accumulated from serving other peers. If a peer pays for content
by using received tokens, then the new owner of that token appends its ID to the token
(e.g., peer k’s ID in Fig. 3.3(b)). In our system, received tokens can be reused, and
a peer does not have to generate new tokens each time it wants to download (stream)
content.
Contribution calculation: Now, we describe how to calculate peers' contributions and how to calculate the amount of advertisement, $a_{i,p}$, that each peer should watch. In our system, the amount of advertisement is based on a peer's resource contribution. Therefore, we have to calculate the peers' contributions first. Contribution calculation is done at the end of $t_{i,p}$. In the TPCP scheme, a peer's contribution is calculated at the peer-side. The number of tokens received by peer $i$, $r_{i,p}$, is regarded as the peer's income, and the number of tokens generated by peer $i$, $g_{i,p}$, is considered as the peer's spending amount. Note that, if a peer reuses its received tokens, those tokens are not accounted for in $r_{i,p}$ or $g_{i,p}$. We define the peer's contribution, $c_{i,p}$, as the difference between $r_{i,p}$ and $g_{i,p}$, normalized by time, i.e.,

$$c_{i,p} = \frac{r_{i,p} - g_{i,p}}{t_{i,p}} \qquad (3.1)$$

^7 In our system, we encrypt tokens, rather than use digital signatures. The main reason is that the use of encryption results in a more secure protocol than one with digital signatures. We discuss this in more detail in Chapter 3.4.
In the TPCS scheme, the token server is responsible for calculating contributions. Therefore, peers have to report received tokens to the token server. The definition of a peer's contribution in TPCS is the same as Eq. (3.1). In the TSCP scheme, a peer's contribution is also calculated at the peer-side. However, the definition of a peer's contribution is different from Eq. (3.1). In the TSCP scheme, tokens are generated by the token server. Therefore, a peer's income includes the number of tokens received from the token server, $N$, and the number of tokens received from other peers, $r_{i,p}$. A peer's spending amount is the number of tokens sent to other peers, $s_{i,p}$. A peer's contribution in the TSCP scheme, $c_{i,p}$, is

$$c_{i,p} = \frac{N + r_{i,p} - s_{i,p}}{t_{i,p}} \qquad (3.2)$$
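For concreteness, here is a minimal sketch of both contribution definitions; the function names are ours, and the token counts are those observed during one interval.

```python
def contribution_tpcp_tpcs(r_ip: int, g_ip: int, t_ip: float) -> float:
    # Eq. (3.1): income (tokens received) minus spending (tokens generated),
    # normalized by the interval length t_{i,p}.
    return (r_ip - g_ip) / t_ip

def contribution_tscp(N: int, r_ip: int, s_ip: int, t_ip: float) -> float:
    # Eq. (3.2): income includes the N server-issued tokens; spending is the
    # number of tokens sent to other peers.
    return (N + r_ip - s_ip) / t_ip
```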
The method used to calculate $a_{i,p}$ is the same in all schemes. We define $f_{i,p}$, the ratio of the advertisement time length to the time length of streaming content; e.g., in Fig. 3.2, $f_{i,p}$ is the ratio of $a_{i,p}$ to $t_{i,p}$. Intuitively, the value of $f_{i,p}$ is inversely proportional to $c_{i,p}$. Therefore, there are many functions which can be used to determine the value of $f_{i,p}$, e.g., exponential, linear, logarithmic, etc. This choice depends on how "quickly" the system would like to "reward" peer contributions; e.g., if an exponential function is used, the value of $f_{i,p}$ would decrease quickly as a peer's contribution increases. In our work, we use a piecewise linear function, as explained next. We classify peers into three groups based on their $c_{i,p}$. (The service provider defines default, minimum, and maximum values of $f_{i,p}$, which are described next.) Intuitively, if a peer is a free-rider who does not contribute resources, its resource contribution is $Min(c)$. The value of $Min(c)$ derived from Eq. (3.1) in TPCP and TPCS is $-g_{i,p}/t_{i,p}$, and the value of $Min(c)$ derived from Eq. (3.2) in TSCP is $(N - s_{i,p})/t_{i,p}$. Then, a free-rider's $f_{i,p}$ is set to the maximum length of advertisements for free-riding, $Max(f)$. Otherwise, a peer's $f_{i,p}$ decreases as its resource contribution increases. If a peer's income is equal to its spending amount, its $c_{i,p}$ is $Equal(c)$ and its $f_{i,p}$ is set to the default value, $Def(f)$, whereas if a peer's $c_{i,p}$ exceeds a threshold, $\lambda$, its $f_{i,p}$ is set to the minimum value, $Min(f)$. The value of $Equal(c)$ derived from Eq. (3.1) in TPCP and TPCS is 0, and the value of $Equal(c)$ derived from Eq. (3.2) in TSCP is $N/t_{i,p}$. The definition of $f_{i,p}$ is given in Eq. (3.3), with a corresponding depiction in Fig. 3.4.

$$f_{i,p} = \begin{cases} \dfrac{Max(f)-Def(f)}{Min(c)-Equal(c)}\, c_{i,p} + \dfrac{Min(c)\, Def(f) - Equal(c)\, Max(f)}{Min(c)-Equal(c)}, & \text{if } c_{i,p} < 0 \\[2ex] \dfrac{Def(f)-Min(f)}{Equal(c)-\lambda}\, c_{i,p} + \dfrac{Equal(c)\, Min(f) - \lambda\, Def(f)}{Equal(c)-\lambda}, & \text{if } 0 \le c_{i,p} < \lambda \\[2ex] Min(f), & \text{if } \lambda \le c_{i,p} \end{cases} \qquad (3.3)$$
Figure 3.4: Example of a function used by the token server to calculate $f_{i,p}$ ($f_{i,p}$ decreases piecewise-linearly from $Max(f)$ at $c_{i,p} = Min(c)$, through $Def(f)$ at $Equal(c)$, to $Min(f)$ at $\lambda$).
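To make the mapping concrete, the following is a minimal Python sketch of Eq. (3.3). The values of $Max(f)$, $Def(f)$, and $Min(f)$ are the Table 3.3 defaults, while $Min(c)$, $Equal(c)$, and $\lambda$ are illustrative placeholders (as derived above, their actual values are peer- and scheme-dependent).

```python
def f_ip(c, max_f=1.0, def_f=0.3, min_f=0.1, min_c=-1.0, equal_c=0.0, lam=1.0):
    if c < 0:  # free-rider side: interpolate between Max(f) and Def(f)
        slope = (max_f - def_f) / (min_c - equal_c)
        intercept = (min_c * def_f - equal_c * max_f) / (min_c - equal_c)
        return slope * c + intercept
    elif c < lam:  # net contributor: interpolate between Def(f) and Min(f)
        slope = (def_f - min_f) / (equal_c - lam)
        intercept = (equal_c * min_f - lam * def_f) / (equal_c - lam)
        return slope * c + intercept
    return min_f  # contribution above the threshold: minimum advertisement ratio

# A free-rider (c = Min(c)) gets Max(f); a break-even peer (c = Equal(c)) gets Def(f).
assert abs(f_ip(-1.0) - 1.0) < 1e-9 and abs(f_ip(0.0) - 0.3) < 1e-9
```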
Finally, we define the amount of advertisement, $a_{i,p}$, for each peer $i$ as $a_{i,p} = f_{i,p} \cdot t_{i,p}$. In the TPCP and TSCP schemes, the software at the peer-side calculates each peer's $a_{i,p}$. In the TPCS scheme, the token server sends each peer a message with its $a_{i,p}$ value. After receiving the $a_{i,p}$ value, the display of advertisements, kept in the local cache, begins. In addition, the value of $f_{i,p}$ is reported to the streaming server. The streaming server can use this information to decide on the proportion of advertisements to content when it distributes streamed media. We define $f$ as the ratio of advertisements to content. In our system, the streaming server chooses $f$ to be equal to the average value of $f_{i,p}$.
Reducing overhead: We now discuss the issue of token reuse. In our system, received tokens can be reused, and a peer does not have to generate new tokens each time it wants to download (stream) content. An advantage of reusing tokens is the decreased overhead, particularly the amount of computation needed to perform decryption at the token server (as each token is encrypted by its generator to prevent malicious users from generating fake tokens). We give an example to demonstrate the computational overhead of decryption, using an Intel Xeon E5420 and 32GB of memory: given 10000 strings of 1500 bytes each, encrypted with 1024-bit RSA, the total time to decrypt these strings is about 4 minutes. Although the token server can afford such computational overhead, reusing tokens can decrease it. We depict the token reporting packet structure for reusing tokens and non-reusing tokens in Fig. 3.3. The two structures are similar - both include a serial number, a timestamp, the generator's ID, and the receiver's ID. The difference is that in the reusing mechanism, there is a signature of the token's new "owner". As a result, one drawback of reusing tokens is an increase in the token reporting packet's length, as the token's new "owner" has to append its signature to the token reporting packet. If tokens cannot be reused, a peer can send a packet which includes several tokens, i.e., the last row of the non-reuse packet structure gives the "number of tokens", as depicted in Fig. 3.3. However, even though a packet can include several tokens, the number of reported packets in the non-reusing mechanism is still higher than in the reusing mechanism. The reason is that peers have to report "every" token in the non-reusing mechanism. We note that, in the reusing mechanism, each packet indicates only one token because the receiver cannot split the packet into two packets with a smaller number of tokens. For example, if peer $i$ sends peer $j$ a packet which indicates 5 tokens, and the packet has been encrypted by peer $i$, then peer $j$ cannot divide this packet into two packets with 2 and 3 tokens, respectively, as the packet has been encrypted. This limitation may result in more transmitted packets between peers in the reusing tokens mechanism. In Chapter 3.3, we compare the overhead and corresponding performance of the reusing tokens and non-reusing tokens mechanisms.
3.3 Evaluation
In this chapter, we evaluate the proposed token-based schemes. We note that the goal of
our work is to explore the utility of an approach that uses advertisements as incentives in
P2P streaming systems, rather than exploring such a system’s implementation. To this
end, we use simulations to demonstrate essential characteristics of our system and to
study its characteristics in a more controlled environment. An advantage of using simu-
lations is that we can control different parameters and observe effects of each parameter.
3.3.1 Experimental Setting
We develop a simulator to evaluate the proposed token-based schemes, where we use traces to simulate peer dynamics. The traces are available from the PPLive Project,^8 collected from the PPLive streaming system. Specifically, the PPLive crawler takes system snapshots every ten minutes and records peer IDs. The traces used in our simulation are collected during one day. Fig. 3.5 depicts the population of peers viewing media during a one-day period.

^8 PPLive Project: http://cairo.cs.uiuc.edu/~longvu2/pplive.html

Figure 3.5: Population of peers viewing streaming data in the PPLive system

Upload bandwidth (kbps)  256  384  512  768  1024  2048
Distribution (%)          12   40   31    4     7     6

Table 3.2: Uploading bandwidth distribution
The simulator assigns the uploading bandwidth of peers based on the distribution
given in Table 3.2, which corresponds to measurements in [27]. Because peers typically
do not contribute all of their uploading capacity, by default (i.e., unless otherwise stated)
we begin with a conservative view and consider peers contributing 60% of their upload-
ing capacity. We also vary this contribution in our simulations (as detailed below).
Furthermore, we assume that the downloading bandwidth of peers is sufficient to stream
the video (i.e., greater than the streaming rate).
We consider single-layer video in our simulations. As stated in [38], most video
rates in PPLive are between 250 kbps and 400 kbps; thus, in our simulations, the video
rate of streaming content is 300 kbps. The number of substreams is k = 10, which
is the same as in [55], i.e., the video is divided into 10 substreams, each with the same
video rate. We assume a peer can download an entire substream from one neighbor.
Therefore, in our simulations, peers connect to 10 neighbors simultaneously. A peer
parameter    default value
$k$          10
$n$          10
$\delta$     500 kb
$t_{i,p}$    10 min
$Max(f)$     1
$Def(f)$     0.3
$Min(f)$     0.1

Table 3.3: Default values of parameters
randomly chooses new neighbors every 30 seconds. (We do not consider peer selection
algorithms here since video quality is not the focus of our work.)
In these experiments, the default amount of data a peer can download from its neighbor when paying one token, δ, is 500 kb. Moreover, the default value of the content interval, $t_{i,p}$, is 10 minutes (which is similar to content intervals used in TV programs), and the amount of advertisements a peer has to view, $a_{i,p}$, is calculated at the end of each $t_{i,p}$. In what follows, we vary the values of δ and $t_{i,p}$ (as detailed below). Table 3.3 summarizes the default parameter values.
All results are based on the following assumptions: tokens are received or reported correctly and there are no malicious attacks. Because tokens are encrypted, we can assume malicious peers cannot fake tokens, and tokens can be received or reported correctly. We discuss malicious behavior in Chapter 3.4. In our simulations, we need to set the maximum, default, and minimum values of $f_{i,p}$. We set $Max(f) = 1$: the time length of advertisement a peer views is not greater than the time length of content a peer views, and if peer $i$ is a free-rider, the time length of advertisements it has to view is equal to the time length of content it has viewed. Then, we set $Def(f)$ and $Min(f)$ to 0.3 and 0.1, respectively. We base these on statistics given in [70], which show that from 1952 to 2010, the average $f_{i,p}$ of TV programs increased from 0.15 to 0.47. Because service providers of P2P streaming systems may provide fewer
advertisements than TV, to attract users, we believe these are reasonable settings for $Def(f)$ and $Min(f)$. In addition, in our experiments, we use $Min(f) > 0$, motivated as follows: as a service provider can receive revenue from displaying advertisements to peers, it is likely to have peers view some amount of advertisements. We also set $\lambda$ to $g_{i,p}/t_{i,p}$ in the TPCP and TPCS schemes and to $(N+s_{i,p})/t_{i,p}$ in the TSCP scheme; that is, if a peer's received tokens are more than twice its generated or sent tokens (derived from Eqs. (3.1) and (3.2)), its $f_{i,p} = 0.1$. Since the ratio of average uploading bandwidth to streaming rate is less than 2 (in our work), we believe this is a reasonable threshold setting.
3.3.2 Contributions vs. Advertisements
We illustrate that our system can provide peers differentiated service, via the amount of advertisements they view, based on their resource contributions. Fig. 3.6 shows the average $f_{i,p}$ for different peer classes in the three token-based schemes. The results of the three schemes are similar because we use the same equation to calculate a peer's contribution; the results are not affected by where tokens are generated and contributions are calculated. Our results show that the average $f_{i,p}$ decreases as a peer's contribution increases. Since the uploading rate of a peer in the first three classes is less than its downloading rate, its $f_{i,p}$ is higher than $Def(f)$. If a peer wants to view fewer advertisements, it needs to increase its uploading rate. Thus, in the next experiment, we change a peer's uploading rate from 60% of its total uploading capacity to 40% and 80%. The trends of $f_{i,p}$ for different peer classes, under 40% and 80% uploading capacity, are qualitatively similar to those when uploading bandwidth is 60% of the capacity; therefore, we only depict the average $f_{i,p}$, in Fig. 3.7.

Figure 3.6: Average $f_{i,p}$ for peers with different uploading bandwidth in the three schemes

Figure 3.7: Average $f_{i,p}$ for peers with different uploading ratios in the three schemes

Fig. 3.7 indicates that the average $f_{i,p}$ decreases by about 28% when a peer increases its uploading rate from 60% to 80% of the capacity. If peers are selfish and not willing to contribute their resources, our system can also adjust the amount of advertisements for such behavior: when a peer decreases its uploading rate from 60% to 40%, the average $f_{i,p}$ increases by about 23%. This indicates that we can use advertisements as incentives in P2P streaming systems, i.e., in order to view fewer advertisements, a peer needs to contribute more.
3.3.3 Advertisement Distribution
Recall that in our system, after peers download advertisements, they are kept in the local
cache. If the amount of downloaded advertisement is more than the amount of adver-
tisement a peer has to view, some downloaded advertisements would not be displayed.
On the other hand, if the amount of downloaded advertisement is less than the amount of
advertisement a peer has to view, some downloaded advertisements would be repeated.
35
Therefore, it is an important consideration for the streaming server to determine how
much advertisement should be distributed. That is, distributing too much advertisement
data causes network overhead and results in some useless advertisements being down-
loaded by peers. However, distributing too little advertisement content results in peers
viewing a lot of repeated advertisements. This situation may not be satisfactory for a
service provider as their profit might depend on how often a particular advertisement
is viewed. We now consider simple strategies for advertisement distribution. (More
sophisticated techniques are possible but are outside the scope of our work.)
To this end, we define the following two performance metrics:

• surplus: the ratio of advertisement which is not viewed relative to all downloaded advertisement. The value of surplus is $\frac{\sum_i [\max(f, f_{i,p}) - f_{i,p}]}{\sum_i f}$, where $f$ is the ratio of advertisement to streamed media content and $f_{i,p}$ is based on a peer's contribution;

• repeat: the ratio of repeated advertisement to all downloaded advertisement. The value of repeat is $\frac{\sum_i [f_{i,p} - \min(f, f_{i,p})]}{\sum_i f}$ (a computational sketch of both metrics follows this list).
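A minimal computational sketch of the two metrics, assuming `f_ips` collects the per-peer ratios $f_{i,p}$ for one interval and `f` is the ratio chosen by the streaming server:

```python
def surplus_and_repeat(f: float, f_ips: list[float]) -> tuple[float, float]:
    total = f * len(f_ips)                                   # denominator: sum_i f
    surplus = sum(max(f, fi) - fi for fi in f_ips) / total   # downloaded but not viewed
    repeat = sum(fi - min(f, fi) for fi in f_ips) / total    # viewed more than once
    return surplus, repeat

# E.g., distributing f = Max(f) = 1 yields repeat = 0 but a large surplus.
print(surplus_and_repeat(1.0, [0.3, 0.5, 1.0]))  # (0.4, 0.0)
```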
In our experiments, we use three different values of $f$: $Min(f)$, $Max(f)$, and the average of $f_{i,p}$, $Avg(f)$, and evaluate their surplus and repeat. Because the results of the three schemes are similar, we show the result from the TPCP scheme only. Based on our experiments, the averages of surplus and repeat are depicted in Fig. 3.8 and Fig. 3.9, respectively.

Figure 3.8: The average surplus when using different $f$

Figure 3.9: The average repeat when using different $f$

The results indicate that using $Min(f)$ as $f$ causes less network overhead, with no surplus advertisements. However, this strategy has to display the same advertisement repeatedly. On the other hand, using $Max(f)$ guarantees that peers do not view repeated advertisements; however, it results in significant surplus in advertisement content. Fig. 3.8 demonstrates that the average surplus advertisement is more than 50% of all downloaded advertisement. One approach to finding a balance between these two metrics is to use $Avg(f)$. Our results indicate that the average surplus and repeat (when using $Avg(f)$) are both about 0.1. That is, compared to $Min(f)$ and $Max(f)$, using $Avg(f)$ as $f$ can reduce surplus advertisement, while also reducing repeated advertisement.
3.3.4 Reusing tokens vs. Non-reusing tokens
We now focus on reusing tokens vs. non-reusing tokens mechanisms. In the above
experiments, peers were allowed to reuse received tokens. As noted earlier, the main
advantage of reusing tokens is to reduce overhead of the system, especially in the TPCS
scheme, where peers have to report tokens to the token server. Since each token has been
encrypted by its generator, there is significant overhead involved in the token decryption
process; hence the motivation for reusing tokens in our system. In this chapter,
we use results from the TPCS scheme. Now, recall that in the non-reuse mechanism,
after a peer receives tokens from its neighbors, it reports these tokens to the token server
immediately, whereas in the reuse mechanism, a peer can reuse received tokens.
Figure 3.10: The value of $\alpha$ under different $\delta$ (reuse vs. non-reuse)
First, we focus on the overhead of these two mechanisms. To this end, we define the following parameter:

• $\alpha$: the average number of packets reported to the token server by each peer during time interval $t_{i,p}$. The value of $\alpha$ is $\frac{\#\text{ of packets}}{(\#\text{ of peers}) \cdot t_{i,p}}$. Since the token server's overhead increases as the value of $\alpha$ increases, our goal is to reduce $\alpha$.
Fig. 3.10 depicts the value of $\alpha$ in these two mechanisms under different values of $\delta$. In the non-reuse mechanism, $\alpha$ remains constant as a function of $\delta$. The reason for this is that in the non-reuse scheme, each packet can include several tokens, as shown in Fig. 3.3. Fig. 3.10 illustrates that $\alpha$ decreases in the reuse mechanism as $\delta$ increases. This indicates that the reuse mechanism reports fewer packets than the non-reuse mechanism when $\delta$ is sufficiently large (e.g., 250 kb and 500 kb in our experiments). Hence, reusing tokens can reduce overhead. However, the performance of the reuse mechanism degrades when $\delta$ is too small (e.g., 100 kb in our experiments). If $\delta$ is too small, a peer has to generate too many tokens; then, the reuse mechanism behaves worse in terms of overhead than the non-reuse mechanism. Thus, the value of $\delta$ should not be too small; otherwise, peers would generate too many tokens, thus increasing the token server's overhead.
Figure 3.11: The average of difference under different $t_{i,p}$
3.3.5 Content Interval
Now, we focus on the effects of $t_{i,p}$. If the service provider chooses a small value of $t_{i,p}$, then the system can reflect peers' behavior better. However, if $t_{i,p}$ is too small, the system interrupts peers frequently to display advertisements, which would not lead to good service. Thus, we focus on whether choosing different content intervals, $t_{i,p}$, affects accuracy when calculating peers' contributions. We also consider overhead at the token server under different $t_{i,p}$ values. Therefore, we use results from the TPCS scheme.

We vary $t_{i,p}$ from 1 minute to 20 minutes and calculate $f_{i,p}$ under different $t_{i,p}$'s. We use $f_{i,p}$ with the 1-minute setting as our baseline. Then, we define difference as the difference between $f_{i,p}$ under different $t_{i,p}$'s and the baseline. The equation for difference is $\frac{\sum_i |f_{i,p} - f^{baseline}_{i,p}|}{\sum_i f^{baseline}_{i,p}}$, and the results are given in Fig. 3.11. (Here, we focus on the overall system performance rather than on per-class behavior.)

When $t_{i,p}$ increases, the difference also increases. By using short $t_{i,p}$, the system can account for peers' contributions in real time. On the other hand, if the system updates using longer time intervals, the results represent peers' average contributions during $t_{i,p}$.
Figure 3.12: The average of difference when a peer's uploading ratio changes from 60% to 100%, $t_{i,p}$ = 10 min
In addition to $t_{i,p}$, peers' uploading rate also affects results. In Fig. 3.12, $t_{i,p}$ is fixed at 10 min, and a peer's uploading rate changes from 60% of upload capacity to 100%. The results indicate that the difference increases when a peer increases its uploading rate. When a peer increases its uploading rate to 100% of the capacity, the average uploading bandwidth is higher than the video rate. Then, a peer has more choices when selecting from which peers to download. Therefore, the number of uploading connections for each peer may change dramatically while the number of downloading connections stays the same. In such a case, the difference may increase.
We now consider the token server's overhead under different $t_{i,p}$'s. We depict $\alpha$, the average number of reported tokens, in Fig. 3.13(a). In addition, we define $\beta$ as the average number of times each token has been reused, with results depicted in Fig. 3.13(b). Fig. 3.13 demonstrates that $\alpha$ decreases and $\beta$ increases when the token server chooses longer $t_{i,p}$'s. The results indicate that using longer $t_{i,p}$'s can reduce overhead at the token server, i.e., tokens can be reused more times when $t_{i,p}$ is longer.
Figure 3.13: The overhead at the token server: (a) $\alpha$, the average number of reported tokens; (b) $\beta$, the average number of times each token has been reused.
3.4 Discussion
3.4.1 Extending to multi-layer streaming
Our experiments use single-layer streaming. However, our token-based schemes can be
extended to multi-layer-based streaming. Briefly, in the TPCP and TPCS schemes, a
peer with low uploading bandwidth can just download the base layer and generate fewer
tokens. A peer with high upload bandwidth can download more enhanced layers and
pay more tokens. In the TSCP scheme, the token server sends peers tokens. Although
the number of tokens is sufficient to download all streaming content, including base and
enhanced layers, a peer with low upload bandwidth can just download the base layer
and keep the rest of the tokens. In such a case, its contribution can increase because its $s_{i,p}$ decreases, as shown in Eq. (3.2). The usage depends on a peer's preference. If a peer
prefers higher video quality, it can use the tokens to obtain enhanced layers and view
more advertisement. Otherwise, a peer can choose lower video quality streaming and
view fewer advertisements.
3.4.2 Peer selection
In our work, our goal is to demonstrate that service providers can use advertisements
as an incentive for resource contribution, rather than to focus on a specific neighbor
selection approach. Service providers can choose peer selection algorithms based on
their needs, e.g., they may use a tit-for-tat type of approach, which is widely used
in P2P systems. A number of peer selection algorithms have been proposed in the
literature [3,32,37,71], where some approaches consider locality-aware selection, based
on ISPs [71] or ASs [32], which may improve the overall performance.
3.4.3 Overhead of distributing advertisement
In our system, peers have to download advertisements. When peers’ connection capac-
ities are poor, distribution of advertisements may make such a situation even worse.
There are two possible solutions for solving the problem. The first solution is to pre-load
advertisements before peers start to download content. Then, peers can just download
content later. The second solution is to display downloaded advertisements repeatedly.
Therefore, peers do not have to download new advertisements and can save bandwidth
for downloading content. However, the two solutions may conflict with the service
provider’s profit. The service provider may prefer that peers watch new advertisements
each time. In such a case, the service provider can deliver low quality advertisements
or different types of advertisements whose sizes are small, such as flash animation, to
reduce transmission overhead.
3.4.4 Overhead of token decryption
In our system, every token is encrypted to prevent peers from faking tokens. However,
decryption is a significant computational overhead in our system. Therefore, in order to
reduce such overhead, peers are allowed to reuse tokens. When the length of a token is
sufficiently large, an alternative mechanism that can reduce computational overhead is
the use of digital signatures. In such a mechanism, peeri uses hash functions to generate
a message digest of a token. Then, peer i encrypts the message digest with its private
key. The result is peer i’s digital signature for the token. Finally, peer i appends the
digital signature to the token. After peer j receives this token from peer i, peer j can
decrypt the signature by using peer i’s public key and verify whether the token is sent
by peeri. Because the length of a digital signature is shorter than the length of a token,
the overhead of decrypting a digital signature is less than the overhead of decrypting
an encrypted token. However, this would prove that a token is sent by a specific peer,
but will not prevent peers from faking tokens. After malicious peers receive tokens
with digital signatures, they can extract tokens’ content and fake tokens. In addition,
malicious peers can just remove appended digital signatures from tokens and modify
tokens. This is in contrast to encryption algorithms where a token can be decrypted only
by token servers. Hence, we adopt encryption algorithms.
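To illustrate the trade-off discussed above, here is a minimal sketch of the digital-signature alternative, using RSA-PSS from the `cryptography` package; the token bytes are illustrative. It verifies who sent a token but, as argued above, leaves the token fields readable and reusable by a malicious peer.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

peer_i_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
token = b"serial=42;timestamp=1000;generator=peer_i;receiver=peer_j"

# Peer i signs a digest of the token and appends the (short) signature.
signature = peer_i_key.sign(token, PSS, hashes.SHA256())

# Peer j (or the token server) verifies the origin with peer i's public key;
# verify() raises InvalidSignature if the token or signature was tampered with.
peer_i_key.public_key().verify(signature, token, PSS, hashes.SHA256())

# Note: the token content itself stays in the clear, so a malicious peer could
# strip the signature and reuse the fields, which is why encryption is adopted.
```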
3.4.5 Comparisons between token-based schemes
In Chapter 3.3, we show that using different schemes does not affect peers’ contribu-
tion. In our work, the motivation for considering different schemes is to explore several
issues. We discuss these issues below.
Reliability: First, we discuss reliability characteristics of the three schemes. Among
these schemes, TPCP is more reliable than TPCS and TSCP. The reason is that TPCP is a
distributed scheme and it does not have to worry about token server failure. However, in
the other schemes, the system has to deal with token server failure. In TPCS, the token
server is responsible for calculating peers’ contributions and informing peers about the
amount of advertisement they have to watch. If the token server fails, peers do not know
how much advertisement they have to watch. To solve this problem, we can set a timer.
If the waiting time for a token server’s message exceeds the timer, the system starts to
display advertisements. The length of advertisements can be based on peers’ history or
the default value set by the service provider. In TSCP, the token server is responsible
for delivering tokens to peers. If the token server fails, there are no tokens sent to peers,
and peers cannot download data because of lack of tokens. In our future work, we will
focus on how to increase system reliability.
Overhead: In all token-based schemes, token generation, distribution, and decryp-
tion all represent overhead for the system. In TPCP, the overhead is shared by peers.
However, in TPCS and TSCP, the token server has to deal with tokens. In TPCS, the
token server needs to decrypt tokens before it calculates peers’ contributions. To reduce
decryption overhead, we propose the reusing token mechanism, and our results demon-
strate that this mechanism can reduce the amount of reported tokens. In the TSCP
scheme, the token server is responsible for encrypting and distributing tokens to peers.
To reduce such overhead, the token server can send tokens using a staggered sched-
ule, rather than sending tokens to all peers at once. However, this may reduce system
reliability.
Moreover, in TPCS and TSCP, the token server has to keep login information of each
peer. The information is used as a way of checking whether the peer exists in the system
or not. Keeping login information is also overhead for the token server. To reduce such
overhead, the token server can reduce the frequency of updating login information. For
example, the token server can increase the time interval between two heartbeat packets.
Another method is for the token server to keep the newest login information, but not the
past login information. Because the information is used to check whether the peer is in
the system, past information is not useful in such a case.
Token Management: Although TPCS and TSCP are less reliable than TPCP, they
result in better token management. In TSCP, tokens are generated by the token server.
Therefore, the token server can control the number of tokens distributed to peers and
force peers to contribute their resources. In our work, we assume the number of tokens
generated by the token server is enough for peers to download the entire content. In
order to force peers to contribute resources, the token server can generate fewer tokens.
In such a case, peers would have to contribute more to earn more tokens for downloading
the entire content. In TPCS, tokens are reported to the token server. Therefore, the token
server has better information about peers’ contributions and behavior. The information
can be used for further system analysis and development. Service providers can use
the information to decide how to distribute advertisements and which advertisements
should be distributed. In our system, we use the information to determine the amount of
advertisements distributed during each time interval. Service providers can use differ-
ent mechanisms. For example, since service providers must satisfy advertisers’ needs,
they can design a display schedule of advertisements based on how long peers watch
advertisements and make sure every advertisement is viewed at least a specific number
of times. They can also have several schedules to maximize revenues based on different
peers’ behavior. Another example is that service providers can change system settings
based on peers’ behavior. They can measure how much bandwidth peers are willing
to contribute and determine whether they should increase or decrease the amount of
advertisements. Such directions are outside the scope of our work and are part of future
efforts. In addition, peers' contributions are calculated by the token server in TPCS. This
can eliminate falsification of contributions at the peer-side.
Malicious attacks: The motivation for using token-based schemes, rather than relying on contributions reported by peers, is to determine peers' true contributions. Using
encrypted tokens can also prevent peers from faking tokens. Here, we discuss some
attack methods and show how to protect our system from these attacks.
One possible attack is from colluding peers. For instance, a peer with high upload
bandwidth can forward its tokens to other peers without receiving uploads, in order to
help them increase their contributions. To address this attack, when the system cal-
culates a peer’s contribution, it can consider a peer’s income and spending amount
simultaneously. For example, when malicious peers forward their tokens to others, the
tokens would be considered as their spending. When their spending amount increases,
their contribution would decrease. Thus, malicious peers would hurt their contributions
by forwarding their tokens to other peers. Another solution is for service providers to
change contribution calculation functions. We give two possible methods to demon-
strate the idea of how to modify such functions. We define t
i
as the number of tokens
peeri has andc
i
as its contribution. In our work, we use a linear function to calculatec
i
.
Therefore, one possibility for service providers is to change coefficients of the function.
For example, the contribution calculated by the new function, c
new
i
, is
c
i
k
, where k is
larger than 1. In such a case, a peer needs k times the number of tokens to maintain
the same contribution. Another alternative for service providers is to consider non-liner
functions. Then, when a peer receives a token, the increased value of its contribution is
different and is based on the number of tokens it has. For example, service providers can
use an exponential function to calculate peers’ contributions (e.g., c
new
i
= e
t
i
). In such
a case, when a peer forwards its tokens to other peers, the decreased value of its contri-
bution may be larger than the increased value of other peers’ contributions. Therefore,
it may cause a significant loss for malicious peers’ contributions to forward their tokens
to others. The solutions may not eliminate colluding attacks totally if a peer has a lot of
tokens. However, such solutions can reduce the the number of peers that can participate.
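A minimal numerical sketch of this convexity argument: with an exponential contribution function $c_i = e^{t_i}$, forwarding one token from a token-rich colluder to a token-poor one lowers their combined contribution (the token counts below are illustrative).

```python
import math

def contribution(tokens: int) -> float:
    # Hypothetical exponential contribution function c_i = e^{t_i}.
    return math.exp(tokens)

rich, poor = 10, 2
loss = contribution(rich) - contribution(rich - 1)  # rich peer's decrease
gain = contribution(poor + 1) - contribution(poor)  # poor peer's increase
print(loss > gain)  # True: forwarding tokens is a net loss for the colluders
```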
Another possible attack in TPCS is to ignore messages from the token server. For
this attack, the solution is the same as the method used when the token server is down.
The software will display advertisements after a set timer is exceeded.
Finally, in TPCP and TSCP, contributions are calculated at the peer-side. Therefore,
malicious peers may try to modify software parameters, such asMax(f),Def(f), and
Min(f), and affect results of contribution calculation. However, to execute such attacks,
malicious peers need to modify the software. It is typically difficult (i.e., requires a
certain level of expertise) to make such modifications. This issue is outside the scope of
our work and is part of future efforts.
3.5 Conclusions
In this chapter, we proposed an incentive mechanism, in the context of P2P streaming
systems, for peers to contribute their upload resources. In contrast to previous efforts
that used quality of streamed media as an incentive, we suggested that service providers
can use advertisements as an incentive and proposed a framework for doing so. Our
experimental evaluations indicate that our approach can encourage peers to contribute
greater amounts of their upload resources. Our framework design includes three token-
based schemes, which help avoid malicious peer behavior, i.e., reporting of fake contri-
butions. We discuss various properties of our token-based schemes. Our experimental
results provide guidelines and intuition for the various parameter settings in our frame-
work. Our approach is orthogonal to earlier efforts in the literature that use media quality
as an incentive.
Chapter 4
Resource Estimation for Network
Virtualization through Users and
Network Interaction Analysis
4.1 Introduction
Although the Internet has been a tremendous success in providing a variety of (packet-
delivery based) services, the Internet’s rapid growth and deployment have also become
obstacles to modification of its current architecture and adoption of new protocols.
Specifically, due to the coexistence of multiple ISPs, such modifications require their
mutual agreement. However, given their conflicting policies and goals, ISP agreement
is hard to achieve.
Network virtualization is viewed as a potential approach to overcome the ossifica-
tion problem of the Internet. Service providers can create their own VNs by leasing
resources from infrastructure providers and offering customized end-to-end services.
When service providers generate VN requests, they can specify resources. Thus, when
a VN embedding process maps a VN request onto specific physical nodes and links in
the substrate network, it must result in satisfaction of constraints on virtual nodes and
links.
However, as we mentioned in Chapter 2.2, the VN embedding problem has been
proven to be NP-hard (in offline and online scenarios [9, 46]). Addressing this problem
has been an active area of research, with a number of heuristics having been proposed,
e.g., [18,23,29,31,52,57,75,87,91,95]. These approaches are designed to achieve objec-
tives defined from the perspective of infrastructure providers. Moreover, they assume
a VN request will ask for specific resources, such as network capacity or computing
power. However, from the perspective of service providers, the main concern is having
sufficient resources to support a certain level of service quality. Existing efforts are not
concerned with this, i.e., they do not consider what resources are required to support a
needed quality of service, which is an important consideration for service providers.
To this end, in this chapter we propose an alternative approach - namely that of
considering QoS as a constraint. That is, when VN requests are made, service providers
should be able to use the minimum required QoS as constraints of that request, rather
than the amount of resources needed. It is typically not straightforward to determine the
amount of resources that should be requested in order to satisfy a desired level of QoS.
Thus, we reconsider the problem in this light.
There are three roles in virtual networks: infrastructure providers, service providers,
and customers, where each role has different goals:
• Infrastructure providers typically focus on balancing the load, maximizing rev-
enue, and minimizing cost.
• Service providers typically focus on maximizing revenue (e.g., by supporting as
many customers or requests as possible); they are the intermediaries between cus-
tomers and infrastructure providers.
• Customers typically focus on quality of service, such as downloading time (in file
downloading applications) or bit-rate (in video viewing applications).
Previous efforts largely focus on infrastructure providers. Our work also includes a
focus on service providers as well as customers. Specifically, in contrast to previous
efforts, we consider and focus on the use of QoS as a constraint in the VN embedding
problem.
In our work, we consider two types of services, web and videos (e.g., YouTube [43]).
These two types of traffic correspond to more than 60% of the Internet’s traffic in North
America [69]. As mentioned above, from the perspective of service providers, revenue
maximization is an important goal. Several metrics can be used to evaluate revenue.
For example, service providers can measure how many customers they can support
or how many requests (connections) can be completed over a certain time period. In our
work, we use connection completion rate as our QoS constraint, i.e., when generating a
VN request, service providers would specify a desired connection completion rate.
Since connection completion rates are affected by network conditions and user
behavior, service providers should consider these two factors when requesting resources
for a specific connection completion rate. The difficulty with using connection comple-
tion rates for provisioning is that users react to QoS, so their demand becomes a moving
target.
To model this user reaction, Tay et al. [76] decompose traffic equilibrium into user
demand and network supply and then apply this model to web traffic. The user curve
and network curve intersect to determine traffic equilibrium. We adopt their model and
extend it to support video traffic by considering different user behavior characteristics.
Our contributions in this chapter can be summarized as follows. Unlike previous
efforts, we focus on the use of QoS as a constraint in the VN embedding problem.
We develop models and corresponding techniques that allow determination of resource
amounts needed to satisfy QoS constraints. Specifically:
• We adopt the model proposed in [76] in the context of treating traffic equilibrium
as a balance between user and network behavior and extend it to video traffic by
considering different user behavior characteristics (see Chapter 4.3.1). By com-
bining the original model with the extension, we can model the two most popular
types of traffic in North America.
• Moreover, we develop a corresponding mechanism for estimating traffic equilib-
rium when different link capacities are given (see Chapter 4.3.2). This mechanism
allows infrastructure providers and service providers to determine the amount of
resources needed to satisfy QoS constraints (e.g., connection completion rate) effi-
ciently, particularly when user behavior and QoS constraints change dynamically.
• In addition to connection completion rate, we also consider QoS metrics from the
perspective of customers (e.g., bit-rate). We propose an algorithm which estimates
the amount of resources needed by considering multiple QoS constraints simul-
taneously. In Chapter 4.3.3, we demonstrate how our approach can be used by
infrastructure providers in offering them insight into efficient resource mapping
techniques in a virtual network environment.
• Our extensive validation results demonstrate that our estimation mechanism is
accurate and robust under heterogeneous settings, including link delay, buffer
sizes, user behavior, and file sizes (see Chapter 4.4). We also compare our mecha-
nism with traditional regression analysis based approaches. We explore the limita-
tions of the regression analysis based approaches and demonstrate that our mech-
anism is more accurate (see Chapter 4.5). Our proposed approach is orthogonal
to (and can be combined with) existing efforts (see Chapter 4.5).
Figure 4.1: Surfing session model, with states wait-abort, wait-complete, and think, and transition probabilities $p_{retry}$, $q_{retry}$, $p_{abort}$, $p_{next}$, $q_{next}$ (corresponds to Fig. 3 in [76]).
Figure 4.2: Flow of downloads: clicks at rate $r_{click}$ enter the wait-state with $k$ downloads in progress; $r_{in} = (1-p_{abort})\, r_{click}$ is the rate of unaborted clicks and $r_{out}$ the rate of completed downloads (corresponds to Fig. 4 in [76]).
4.2 Background
In our work, we adopt the traffic equilibrium analysis model for web traffic proposed
in [76]. For completeness of presentation, we first provide a summary of background
information needed in the remainder of this chapter; for clarity of presentation, several
figures are reproduced here (the corresponding figures in [76] are noted accordingly).
We present our proposed model in Chapter 4.3.
Tay et al. consider traffic equilibrium as a balance between an inflow controlled by
users and an outflow controlled by the network (e.g., link capacity, congestion control,
Figure 4.3: Two submodels: a user curve and a network curve, interacting through $r_{in} = r_{out}$ (corresponds to Fig. 5 in [76]).
etc.) [76]. That is, the number of active connections is controlled by users, where the
network conditions affect how fast a connection can be completed. Since users react to
congestion, the interaction between users and the network form a loop:
1. TCP’s control mechanism reduces congestion window due to network congestion.
2. The reduced congestion window causes downloading time to increase, so users
may generate fewer connections or abort connections.
3. User reaction reduces the number of active connections and, consequently, TCP
increases transfer rate per connection.
4. The increased downloading rate encourages users to launch more connections,
thus causing congestion to increase, looping back to 1 (above).
This loop makes it hard to reason about the shift in an equilibrium when there is a flash
crowd, or some link breaks.
Tay et al. use a surfing session model for user behavior, as depicted in Fig. 4.1; it is assumed that the session arrival rate, $r_{session}$, is independent of network congestion because users are unaware of network conditions until they arrive.
In each session, a user generates requests, e.g., by clicking on hyperlinks, buttons, etc.^1 $r_{click}$ is defined as the click rate. Each click may launch multiple responses, and the traffic sent to the user is termed a download. For simplicity, we use the terms download and connection interchangeably in what follows.
After a click, a user waits for download completion (the wait-state in Fig. 4.1). When the user enters the wait-state, two different cases are possible, the wait-abort state and the wait-complete state; i.e., if the downloading time is too long due to network congestion, the user may decide to abort the download. The probability of aborting a download is $p_{abort}$. After the user aborts a download, it may retry again with probability $p_{retry}$. Otherwise, it quits the session with probability $q_{retry} = 1 - p_{retry}$. If the user completes the download, it enters the think state (when viewing downloaded content). Then, it may click (generate another request) with probability $p_{next}$ or finish the session with probability $q_{next} = 1 - p_{next}$.
Focusing on the wait-state (Fig. 4.2), $k$ is defined as the average number of ongoing concurrent downloads in the wait-state. This $k$ is a measure of network congestion.

The wait-state is decomposed into a user-network model (depicted in Fig. 4.3). This model includes a user demand curve $r_{in}$ and a network supply curve $r_{out}$. The user curve represents the relationship between the unaborted click rate and the congestion level $k$, and the network curve describes the relationship between the rate of completed downloads and $k$. The decomposition in Fig. 4.3 gives

$$r_{in} = \frac{(1-p_{abort})\, r_{session}}{1 - p_{retry}\, p_{abort} - p_{next}(1-p_{abort})} \qquad (4.1)$$

^1 Typing in a URL is also regarded as a click.
and

$$r_{out} = \frac{k}{\dfrac{p_{abort}}{1-p_{abort}}\, T_{abort} + \dfrac{S_{completed}}{b_{TCP}}}, \qquad (4.2)$$

where $T_{abort}$, $b_{TCP}$, and $S_{completed}$ are the average time spent in the wait-abort state, the average bandwidth provided by TCP for a download, and the average size of a completed download, respectively. These two equations describe a pair of user and network curves that determine the traffic equilibrium where they intersect (see Fig. 4.5(a)). Analyzing how a flash crowd affects the equilibrium thus reduces to examining how the user curve is affected. Similarly, analyzing the impact of a link failure reduces to examining how the network curve moves. We thus break the feedback loop illustrated earlier for TCP.
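To make the equilibrium concrete, the following is a minimal Python sketch that locates the intersection of Eqs. (4.1) and (4.2) numerically. The shapes of p_abort(k) and b_tcp(k), and all parameter values, are hypothetical stand-ins for measured relationships such as those in Fig. 4.4; they are not values from this work.

```python
def p_abort(k: float) -> float:
    return min(0.9, 0.0001 * k)        # abort probability grows with congestion

def b_tcp(k: float) -> float:
    return 10_000.0 / (k + 1.0)        # per-connection TCP bandwidth (kbps) shrinks with k

R_SESSION, P_RETRY, P_NEXT = 5.0, 0.5, 0.8  # user-behavior parameters (illustrative)
T_ABORT, S_COMPLETED = 30.0, 2_000.0        # wait-abort time (s); download size (kb)

def r_in(k: float) -> float:   # user curve, Eq. (4.1)
    pa = p_abort(k)
    return (1 - pa) * R_SESSION / (1 - P_RETRY * pa - P_NEXT * (1 - pa))

def r_out(k: float) -> float:  # network curve, Eq. (4.2)
    pa = p_abort(k)
    return k / (pa / (1 - pa) * T_ABORT + S_COMPLETED / b_tcp(k))

# Bisection on r_in(k) - r_out(k): the gap is positive at low k and negative
# at high k, so the sign change brackets the equilibrium point k*.
lo, hi = 0.1, 10_000.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if r_in(mid) > r_out(mid) else (lo, mid)
print(f"equilibrium k* ~ {lo:.1f}, completion rate r_out(k*) ~ {r_out(lo):.2f}/s")
```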
4.3 Proposed Mechanism
Below, we extend the traffic equilibrium analysis model to video traffic (see Chapter
4.3.1). Moreover, based on the traffic equilibrium analysis model, we design a mecha-
nism to estimate the amount of resources needed to satisfy QoS constraints (see Chapter
4.3.2). We then demonstrate how infrastructure providers can use our mechanism (see
Chapter 4.3.3). We present the evaluation of our mechanism in Chapter 4.4.
4.3.1 Traffic Equilibrium Analysis Model Extension
The work presented in [76] only considers web traffic. We now extend the model
to videos by considering different user behavior characteristics. In our work, we use
YouTube [43] as an example of video traffic. YouTube provides video sharing services
55
and allows users to upload videos of up to 15 minutes in length.^2 We also describe how to apply our model to long videos (e.g., movies).
In web services, a user aborts a download mostly because it is slow. In video sharing
services, however, a user may abort a download for other reasons. Gill et al. have
studied the YouTube system and found that approximately 24% of video downloads
were interrupted [35]. They argued that there were two main reasons for users to abort
connections before the video ended: (1) poor performance due to slow downloading rate
(i.e., as in web services); (2) poor content quality (e.g., video content is uninteresting,
or video resolution is low).
Because user behavior is different in video sharing services, the model described
above cannot be applied directly, and the two cases described above need to be recon-
sidered. We define $p_{rate}$ as the probability that the downloading rate of a connection is too slow and $p_{quality}$ as the probability that users abort a connection because of poor content quality, even if the downloading rate is fast. Then, $p^{video}_{abort}$ in video sharing services can be given as:

$$p^{video}_{abort} = p_{rate} + (1 - p_{rate})\, p_{quality} \qquad (4.3)$$
This equation gives the user curve a shape that is different from the one for web traffic
(compare Figs. 4.5(a) and 4.5(b)).
We assume $p_{retry}$ is the same whether users abort a connection due to having a poor rate or poor quality. Therefore, we can simply replace $p_{abort}$ in Eq. 4.1 by $p^{video}_{abort}$, which can be calculated using Eq. 4.3. Then, the equation for the user curve ($r_{in}$) for web services (Eq. 4.1) still works for video sharing services.
^2 By default, YouTube allows users to upload videos that are 15 minutes long. If users want to upload videos that are longer than 15 minutes, they have to verify their accounts. In our work, we use the default setting.
We also define $T_{rate}$ as the average amount of time users wait before aborting a connection because of a slow downloading rate, and $T_{quality}$ as the average amount of time users wait before aborting a connection because of poor content quality. By Little's Law, the value of $k$ is

$$k = \left[\, p_{rate} T_{rate} + (1-p_{rate})\, p_{quality} T_{quality} + (1-p^{video}_{abort})\, t_{completed} \,\right] r_{click},$$

where $t_{completed}$ is the average amount of time spent in the wait-complete state. Then, the equation for $r_{click}$ in video sharing services is:

$$r_{click} = \frac{k}{p_{rate} T_{rate} + (1-p_{rate})\, p_{quality} T_{quality} + (1-p^{video}_{abort})\, t_{completed}}$$

and the equation for the network curve, $r_{out}$, is:

$$r_{out} = (1-p^{video}_{abort})\, r_{click} = \frac{k}{\dfrac{p_{rate}}{1-p^{video}_{abort}}\, T_{rate} + \dfrac{(1-p_{rate})\, p_{quality}}{1-p^{video}_{abort}}\, T_{quality} + \dfrac{S_{completed}}{b_{TCP}}}, \qquad (4.4)$$

where $t_{completed} = S_{completed}/b_{TCP}$.
Again, this equation gives the network curve a shape that is different from that for
web services (see Fig. 4.5).
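As a companion to the web-traffic sketch in Chapter 4.2, the following minimal sketch evaluates the video network curve via Eqs. (4.3) and (4.4); all parameter values are again illustrative assumptions, and b_TCP is held fixed only for simplicity.

```python
P_QUALITY, T_RATE, T_QUALITY = 0.2, 20.0, 60.0  # abort-quality prob.; wait times (s)
S_COMPLETED, B_TCP = 9_000.0, 300.0             # video size (kb); TCP bandwidth (kbps)

def video_p_abort(p_rate: float) -> float:
    # Eq. (4.3): abort due to a slow rate, or (if the rate is fine) poor content.
    return p_rate + (1 - p_rate) * P_QUALITY

def video_r_out(k: float, p_rate: float) -> float:
    # Eq. (4.4): rate of completed video downloads at congestion level k.
    pa = video_p_abort(p_rate)
    denom = (p_rate / (1 - pa)) * T_RATE \
          + ((1 - p_rate) * P_QUALITY / (1 - pa)) * T_QUALITY \
          + S_COMPLETED / B_TCP
    return k / denom

print(video_r_out(k=50.0, p_rate=0.05))  # completed downloads per second
```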
The above assumes that users will view the entire video, if they do not abort a video
download. Since more than 75% of videos on YouTube are within 300 seconds [19], we
believe that this assumption is reasonable. However, this assumption may not be valid
when considering Video-on-Demand (VoD) services that provide long videos, such as movies.
Yu et al. have studied user behavior in VoD systems [86], where a typical file is roughly 100 minutes in length. They found that more than 75% of sessions were terminated within 25 minutes; most of the users just scanned through videos rather than watching the entire movie. In such a case, we can use a threshold to define what constitutes a completed connection: if a connection is killed after this threshold is reached, then we do not consider this kill an abort. The threshold can be defined in terms of time, fraction of video length, etc.

Moreover, compared to short video sharing systems, user behavior in VoD type systems is more complicated, e.g., users may seek forward or backward while viewing a movie [17]. Such user behavior may change the time spent viewing a video. In our model, the time users spend viewing a video is required information. Thus, given such information, our model can be applied to long videos.
An important consideration in our work is that users do react to network congestion. This is a reasonable model for web and video traffic when users are monitoring their session's progress. However, when users download a large file, such monitoring may not occur (they may take a break, do something else, etc.). Tay et al. have studied such non-reactive connections and found that such traffic may cause a loss of equilibrium and induce a performance collapse [76]. We therefore assume there is a separate mechanism (e.g., admission control) for dealing with non-reactive flows, and focus only on reactive connections in our work.
4.3.2 Performance Estimation Mechanism
In our work, we consider service providers as well as customers. We have described the traffic equilibrium analysis model for web and video traffic above. This model is based on a user-network model that separates user and network behavior into two curves, and the traffic equilibrium occurs where these two curves intersect. We use the connection completion rate at the equilibrium point as a QoS metric from the perspective of service providers. In Chapter 4.3.2, we develop a mechanism to estimate the amount of resources needed to satisfy such a QoS constraint. Then, in Chapter 4.3.2, we propose an algorithm for estimating the amount of resources needed when different QoS constraints are requested.

[Figure 4.4: Relationship between average TCP connection bandwidth b_TCP, average number of concurrent downloads k, and p_abort. Panels: (a) b_TCP vs. k; (b) p_abort vs. k; (c) p_abort vs. b_TCP.]
Connection Completion Rate
To calculate the connection completion rate, we would need to know user behavior
characteristics and network conditions. For example, user arrival rate in the daytime is
typically higher than at night. Thus, to maintain the same QoS, infrastructure providers
should allocate more capacity in the daytime. Moreover, service providers may make
requests for higher QoS. For example, when service providers release new services or
new videos, they may expect a higher connection load from customers. If the neces-
sary information is not available (e.g., initially), traffic conditions need to be measured.
However, since network equilibrium changes with time, it is not efficient to measure
traffic conditions continuously.
One possible approach for infrastructure providers to cater to changing demands of
customers and service providers is to increase capacity gradually and then check whether
the assigned resources can satisfy the QoS requirements. Although straightforward, this
method is inefficient and slow. We therefore use our demand/supply model to develop
a mechanism that can estimate the amount of necessary capacity based on measurement
data collected by infrastructure providers.
We note that each VN can run its own services. Because user behavior differs from
service to service, we assume that the data used for estimation purposes must be col-
lected from the same type of service, e.g., data collected from web services should not
be applied to video services. Since, in their basic construction, the models presented
above are similar, we use the model for web traffic as an example to demonstrate the
proposed mechanism.
Recall that the traffic equilibrium is a balance between an inflow controlled by users (Eq. 4.1) and an outflow due to the network (Eq. 4.2). We can classify parameters in Eq. 4.1 and Eq. 4.2 into three categories based on what causes them to change:
• user: p_abort, r_session, p_retry, p_next, T_abort, k
• network: b_TCP, k
• application: S_completed
Because data are collected from the same service, we can assume that S_completed and T_abort are unchanged³. Since users do not know network conditions before they arrive, we can assume that r_session remains constant over a time window. The analysis of collected traces in [79] indicates that p_retry is fairly constant and that p_next can be represented as a function of p_abort. Therefore, our proposed mechanism focuses on the relationship between p_abort, k, and b_TCP.

³ The change in T_abort (depending on time of day) is a slow change that can be measured leisurely (e.g., offline). Within the time granularity of a connection, T_abort can be considered constant. The value of content can determine abort time; this would make abort time a random variable, of which we would use the average value, T_abort.
The first step is to determine the relationship between k and b_TCP. Recall that b_TCP is the average bandwidth provided by TCP for a download, and k is the number of concurrent connections. When k is 1, b_TCP will be the throughput of that connection. Altman et al. [4] and Barakat et al. [12] have studied TCP throughput over single-hop and multiple-hop paths, respectively. Thus, we can calculate the TCP throughput of a connection based on the path type.
However, the TCP throughput calculated by [4] or [12] is an ideal value. Reaching the ideal value requires a large enough maximum congestion window size. If the maximum congestion window size is small, we can estimate TCP throughput by W_m/RTT, where W_m is the maximum congestion window size and RTT is the round-trip time [63]. We define thrp_ideal as the TCP throughput calculated by the models proposed in [4] and [12]. We then have the following equation for TCP throughput, thrp:

$thrp = \min\left(thrp_{ideal}, \frac{W_m}{RTT}\right)$
When k increases, the value of b_TCP is determined by the capacity C of the virtual link. We therefore have

$b_{TCP} = \begin{cases} thrp, & \text{if } thrp \cdot k \le C \\ C/k, & \text{if } thrp \cdot k > C \end{cases}$    (4.5)
Fig. 4.4(a) illustrates Eq. 4.5, using data from ns2 simulations [40]. In this example, C is 2 Mbps, and users abort a connection if the connection cannot be completed within 20 seconds.
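As a small illustration of Eq. 4.5, combined with the thrp = min(thrp_ideal, W_m/RTT) cap, the sketch below computes b_TCP for a few values of k; thrp_ideal would come from the models in [4] or [12], so here it is simply passed in as an assumed input, and all numbers are illustrative.

def per_connection_bandwidth(k, C, thrp_ideal, W_m, rtt):
    # b_TCP for k concurrent connections over a virtual link of capacity C
    thrp = min(thrp_ideal, W_m / rtt)   # cap by the maximum congestion window
    if thrp * k <= C:
        return thrp                     # the link is not the bottleneck
    return C / k                        # fair share of the link capacity

# Example: a 2 Mbps link, as in the setting of Fig. 4.4(a); W_m in bits
for k in (1, 5, 20, 80):
    print(k, per_connection_bandwidth(k, C=2e6, thrp_ideal=5e5,
                                      W_m=65535 * 8, rtt=0.1))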
Users will react to network congestion: when b_TCP decreases, p_abort will increase. It is difficult to determine their relationship directly. Since Eq. 4.5 already relates b_TCP to k, we can relate p_abort to b_TCP through k instead. There are tools (e.g., [79]) that can extract information from traffic measurements for expressing p_abort in terms of k. Fig. 4.4(b) illustrates such a relationship using simulation data.
Given these relationships (b_TCP vs. k and p_abort vs. k), we can determine the value of p_abort for different b_TCP. Fig. 4.4(c), derived from Figs. 4.4(a) and 4.4(b), is an example illustration of the relationship between p_abort and b_TCP. We can think of this relationship as representing how users react to network congestion. When a connection can be completed within a reasonable downloading time, users will not abort the connection. However, when network congestion becomes worse and b_TCP becomes small, users may become impatient while waiting for connection completion. From Fig. 4.4(c), we can see that p_abort increases dramatically when b_TCP decreases.
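The step behind Fig. 4.4(c) can be sketched as follows: tabulate b_TCP vs. k and p_abort vs. k, invert the first relationship, and compose the two; the tabulated values below are synthetic stand-ins for measurement or simulation data.

import numpy as np

k_grid = np.arange(1, 81, dtype=float)    # number of concurrent connections
b_of_k = 2e6 / k_grid                     # b_TCP vs. k (link always the bottleneck here)
p_of_k = 1 - np.exp(-k_grid / 400.0)      # p_abort vs. k (synthetic placeholder)

def p_abort_for_bandwidth(b):
    # b_TCP(k) is decreasing in k, so reverse the tables before interpolating
    k_est = np.interp(b, b_of_k[::-1], k_grid[::-1])
    return float(np.interp(k_est, k_grid, p_of_k))

print(p_abort_for_bandwidth(1e5))   # heavy congestion -> larger p_abort
print(p_abort_for_bandwidth(4e5))   # light congestion -> smaller p_abort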
After determining the relationship between b_TCP and p_abort from earlier measurements, we can use it to estimate QoS in an offline manner. For example, when service providers ask for higher QoS, infrastructure providers can use our mechanism to estimate the corresponding amount of resources needed offline. They can select a new capacity of the virtual link, C, and calculate the corresponding b_TCP and p_abort based on the existing measurement data. They can then plot the user and network curves using Eq. 4.1 and Eq. 4.2; the intersection point (Fig. 4.5) gives the connection completion rate.
Next, they can change the value of C iteratively, until they find the new traffic equilibrium that satisfies the QoS requirement. Resources can then be assigned to service providers based on the estimated results. This approach allows infrastructure providers to avoid having to take a more online approach (in contrast to the offline approach described above), i.e., they can avoid having to increase resources gradually and measure traffic continuously.

Algorithm 1 Calculation of the minimum amount of capacity needed to satisfy multiple QoS constraints simultaneously.
1: QoS Requirement 1 (R1): connection completion rate
2: QoS Requirement 2 (R2): bit-rate
3: // R1 and R2 are evaluated by intersecting the
4: // user curve (Eq. 4.1) and network curve (Eq. 4.4); see Fig. 4.5
5: // assumes there is a feasible solution for capacity
6: C_low ← C_min
7: C_high ← 2 * C_min
8: while true do
9:   if R1(C_low) AND R2(C_low) then
10:    return C_low
11:  else if R1(C_high) AND R2(C_high) then
12:    if (C_high − C_low) / C_low ≤ δ then
13:      return C_high
14:    end if
15:    C_temp ← (C_high + C_low) / 2
16:    if R1(C_temp) AND R2(C_temp) then
17:      C_high ← C_temp
18:    else
19:      C_low ← C_temp
20:    end if
21:  else
22:    C_low ← C_high
23:    C_high ← 2 * C_high
24:  end if
25: end while
Multiple QoS Constraints
Above, we gave an approach for estimating the amount of resources needed to satisfy
connection completion rate requirements. However, from the perspective of customers,
different QoS metrics may be of interest, such as downloading time (in file downloading
applications) or bit-rate (in video viewing applications). In order to attract customers,
service providers need to consider multiple QoS constraints simultaneously.
As an example, we use video services and consider two QoS constraints, connection completion rate and bit-rate, from the perspectives of service providers and customers, respectively. We design an algorithm to calculate the amount of resources needed to satisfy multiple QoS constraints.
The equation of the network curve for video, r_out, is derived in Eq. 4.4. Initially, we assume there are no user reactions and all probabilities in Eq. 4.4 are 0. In such a case, the equation of the network curve is $r_{out} = \frac{k}{S_{completed}/b_{TCP}}$. Then, we replace b_TCP by C/k, where C is the capacity of a link, and the equation of the network curve becomes $r_{out} = \frac{C}{S_{completed}}$.

If the connection completion rate requested by service providers is r_rate, the amount of resources needed should be at least C_min, where C_min = r_rate · S_completed. Then, we can use Algo. 1 to calculate the amount of resources, C_required, that is needed to satisfy multiple QoS constraints (file throughput and the video bit-rate). The idea here is to set bounds C_low and C_high for the link capacity and iteratively adjust them, to converge on a value C that satisfies the QoS requirements⁴.
In Algo. 1, Eq. 4.1 and Eq. 4.4 are used to determine the user and network curves; the intersection point (Fig. 4.5) gives the connection completion rate (which corresponds to R1 in Algo. 1), while the equilibrium k value determines b_TCP (Eq. 4.5) (which corresponds to R2 in Algo. 1).
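A direct Python transcription of Algo. 1 is given below; R1 and R2 stand for predicates that, for a candidate capacity C, compute the equilibrium point (the intersection of Eq. 4.1 and Eq. 4.4) and check the completion-rate and bit-rate requirements, respectively.

def minimum_capacity(C_min, R1, R2, delta=0.01):
    # bracket the feasible capacity, then bisect (assumes a solution exists)
    C_low, C_high = C_min, 2 * C_min
    while True:
        if R1(C_low) and R2(C_low):
            return C_low
        if R1(C_high) and R2(C_high):
            if (C_high - C_low) / C_low <= delta:
                return C_high                  # the bounds have converged
            C_temp = (C_high + C_low) / 2      # bisect the bracket
            if R1(C_temp) and R2(C_temp):
                C_high = C_temp
            else:
                C_low = C_temp
        else:
            # neither bound satisfies the QoS yet: double the search range
            C_low, C_high = C_high, 2 * C_high

Since the procedure only ever queries R1 and R2, any routine that evaluates the curves, analytically or from tabulated measurement data, can be plugged in.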
4.3.3 Virtual Network Embedding
We presented the traffic equilibrium analysis model in Chapter 4.2 and Chapter 4.3.1 and proposed our estimation mechanism for the amount of resources in Chapter 4.3.2.

⁴ In Algo. 1, we assume there is a feasible solution for capacity. If a feasible solution does not exist, infrastructure providers would reject the request.
In this chapter, we explain in detail how infrastructure providers can use our mechanism
to calculate the amount of resources needed.
Here, service providers can ask for a certain QoS (e.g. connection completion rate)
instead of the amount of resources (e.g. capacity) in virtual network requests. Infras-
tructure providers need to estimate the amount of resources which should be assigned
to satisfy the QoS specifications. Moreover, the infrastructure providers should also
be able to modify resource allocations dynamically. Our mechanism allows infrastruc-
ture providers to efficiently estimate the amount of capacity necessary to satisfy such
requests, based on existing data from earlier measurements.
However, there is a requirement for our resource estimation mechanism: infrastructure providers should have measured p_abort, r_session, p_retry, p_next, T_abort, b_TCP, k, and S_completed for a particular application. If infrastructure providers do not have such data, they cannot use the estimation mechanism proposed in Chapter 4.3.2. To address this issue, we propose an iterative approach.
Suppose the service provider specifies a value for the connection completion rate, r_rate, and the infrastructure provider must determine the capacity allocation C for the corresponding virtual link. The infrastructure provider has to collect data to compute r_session, b_TCP, k, and S_completed. Since they do not have data on user behavior at the beginning of this process, they initially set all probabilities to 0 and use a lower bound C = r_rate · S_completed, as in Chapter 4.3.2. After collecting sufficient data for p_abort, T_abort, etc., the infrastructure provider uses the collected data to calculate the user and network curves and adjust C accordingly, as in Algo. 1. This can be repeated if better estimates of user behavior data become available.
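A minimal sketch of this bootstrap loop is shown below; collect_measurements and capacity_from_curves are hypothetical helpers standing in for, respectively, the provider's measurement infrastructure and the curve-intersection step of Algo. 1.

def bootstrap_capacity(r_rate, S_completed, collect_measurements,
                       capacity_from_curves, rounds=3):
    # initial lower bound: no user reactions, all abort probabilities set to 0
    C = r_rate * S_completed
    for _ in range(rounds):
        data = collect_measurements(C)          # run with capacity C for a while
        C = capacity_from_curves(r_rate, data)  # refine using measured behavior
    return C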
4.4 Evaluation
4.4.1 Evaluation Environment
We use measurement results from [73] to generate a substrate network topology. In
[73], Spring et al. designed Rocketfuel, a mapping engine, to measure router-level ISP
topologies. We use the Tiscali topology from the Rocketfuel traces. The topology has 276
nodes and around 400 links, a scale that corresponds to a medium-size ISP. The link
capacities follow a uniform distribution from 40Mbps to 200Mbps.
We consider two services, web and video. For web services, each download transfers thirty 536-byte packets, and T_abort is 20 seconds. These settings are the same as those used in [76]. For video services, the file size of a video is 10 MBytes, and T_rate and T_quality are 6 minutes and 30 seconds, respectively. These settings are based on measurement results in [35].
We use the ns-2 simulator [40] to evaluate the network curve. We determine p_abort from the network curve simulation. We randomly select two end nodes from the Tiscali topology to form a path and consider the path as a virtual link between these two nodes. Then, we use dumbbell configurations with k sources sending data to k destinations over this virtual link, where the k sources are attached to one end node, and the k destinations are attached to the other node. Although there is only one topology in the experiments, traffic is generated over multiple virtual links, with each link having its own topology. To maintain k concurrent connections, all sources are in busy states all the time. After we determine p_abort, the user curve (Eq. 4.1) can be plotted as follows: in [76], collected traces are analyzed and p_retry is found to be close to 0.97. We can then use the p_abort determined from the network curve simulation and do a linear regression fit to determine p_next.
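The fitting step can be sketched as follows, with p_next fit linearly against p_abort; the sample arrays below are placeholders standing in for the values extracted from the ns-2 runs.

import numpy as np

# placeholder samples standing in for values extracted from the ns-2 runs
p_abort_samples = np.array([0.001, 0.002, 0.004, 0.006, 0.009])
p_next_samples = np.array([0.60, 0.58, 0.55, 0.51, 0.46])

slope, intercept = np.polyfit(p_abort_samples, p_next_samples, 1)

def p_next(p_abort):
    # linear model: p_next = slope * p_abort + intercept
    return slope * p_abort + intercept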
[Figure 4.5: The connection completion rate at equilibrium is where the user demand (Eq. 4.1) and the network supply (Eq. 4.2) curves intersect; notice how the curves differ in shape between (a) the network curve and the user curve in web services and (b) the network curve and the user curve in video services.]
We give examples of user/network curves for web and video traffic in Fig. 4.5(a) and Fig. 4.5(b), respectively. In Fig. 4.5(a), r_session is fixed at 0.4, and we use two capacity settings for web traffic: 1 Mbps and 2 Mbps. Fig. 4.5(a) demonstrates that different capacities result in different connection completion rates when the network traffic reaches equilibrium. As expected, the connection completion rate increases as the capacity increases.
In Fig. 4.5(b), the capacity of the virtual link for video traffic is 30 Mbps, and we use two r_session values: 0.02 and 0.03. In this example, when r_session increases, the connection completion rate at equilibrium decreases. Therefore, to satisfy the same QoS requirement, infrastructure providers should increase the capacity. Fig. 4.5 demonstrates that network conditions (e.g., capacity) and user behavior (e.g., r_session) affect the network equilibrium.
4.4.2 Accuracy of Performance Estimation Mechanism
We presented our performance estimation mechanism in Chapter 4.3.2. Here, we evaluate its accuracy. As mentioned above, measurement data is necessary to carry out the estimation mechanism. We collect data from ns2; the capacity for web traffic and for video traffic is 1 Mbps and 20 Mbps, respectively. We use this data as measurement data in the estimation mechanism.

[Figure 4.6: The average error rate for changing QoS and changing r_session.]
First, we fix the value of r_session for web and video traffic and use different connection completion rate requirements to simulate service providers changing QoS requirements. For web service, the maximum r_out value is about 8, so we pick a QoS value uniformly at random between 4 and 80 for the connection completion rate; for video service, the maximum r_out value is about 0.25, and we similarly pick a QoS value between 0.2 and 2.
To satisfy this changing QoS, an infrastructure provider should be able to allocate capacity dynamically. When a connection completion rate is given, we determine the estimated capacity, C_model, using our mechanism; we use ns2 to determine the actual amount of capacity needed, C_sim. Then, we compare C_model with C_sim and calculate the error rate $= \left| \frac{C_{sim} - C_{model}}{C_{sim}} \right|$. The average error rate is presented in Fig. 4.6.
We then fix the connection completion rate (QoS) requirement and change the values of r_session. This scenario simulates the case where the user arrival rate changes dynamically. For web service, r_session is selected uniformly at random between 0.2 and 2; for video service, r_session is similarly chosen between 0.01 and 0.1. The infrastructure provider has to adjust resource allocation to satisfy the fixed QoS requirement despite the changing r_session. We calculate the error rate and depict the results in Fig. 4.6.
Fig.4.6 shows that our mechanism can estimate the amount of resources needed
accurately in both scenarios. In this evaluation, we only use measurement data corre-
sponding to a particular capacity for each service. If infrastructure or service providers
have multiple sets of measurement data (e.g., data measured from different bandwidth
capacities), they should be able to improve accuracy.
4.4.3 Robustness: Effect of Link Delay and Buffer Sizes
Infrastructure providers use our mechanism to estimate the amount of resources needed. Thus far, the main link information used in our work is link capacity. Since the data measurements can be collected from different links, we need to consider the different possible settings of each link (e.g., RTT and buffer sizes). Here, we evaluate the impact of link delay and buffer sizes on our performance estimation mechanism.

We use web traffic as an example. The default settings are: a capacity of 1 Mbps, a link delay of 10 ms, and a buffer size of 50. We increase the link delay to 40 ms in the first experiment, and we increase the buffer size to 100 in the second experiment. Then, we observe the network equilibrium under the different settings. The results are depicted in Fig. 4.7, where r_session is 0.4.
Fig. 4.7(a) shows that link delay affects the network curve when the number of concurrent connections is small because, in this case, b_TCP is dominated by RTT (Eq. 4.5); when the number of concurrent connections is large enough, b_TCP is determined by bandwidth capacity rather than RTT. Moreover, the user curves are almost the same and are not affected by link delay. Thus, variability in link delays does not affect the network equilibrium significantly.
[Figure 4.7: The equilibrium and our estimation mechanism are robust with respect to uncertainty over link delay and buffer sizes: (a) the network and user curves with different link delays; (b) the network and user curves with different buffer sizes.]
Fig.4.7(b) shows that buffer sizes affect the network curve when the number of con-
current connections is large, because packets experience longer queuing delays. When
congestion starts, long queuing delays make the situation worse, so the network curve
decreases more rapidly. Nonetheless, the connection completion rate at equilibrium is
robust with respect to congestion level.
To evaluate the effect of link delay and buffer sizes on our mechanism, we randomly select a path i from our topology and collect traffic data from this path. Then, we randomly select 50 paths which have different link delays and buffer sizes and assign different QoS requirements to each of them. We use the data collected from path i to estimate the amount of resources needed for these 50 paths. We also collect traffic data from the 50 paths. We compare the estimation results based on data collected from path i with the actual measurement results from the 50 paths. The resulting average error rate is less than 10% (see Fig. 4.8).
[Figure 4.8: The average error rate for changing QoS with different link delays and buffer sizes.]
4.4.4 Heterogeneous User Behavior and File Sizes
In the experiments above, we use homogeneous user behavior and file sizes, i.e., the values of T_abort, T_rate, T_quality, and S_completed are fixed. However, in real systems, user behavior and file sizes are heterogeneous. Here, we evaluate the accuracy of using average values in estimating performance under heterogeneous environments.
In our first experiment, we use uniform distributions to simulate heterogeneous user
behavior and file sizes. We now describe the settings of the homogeneous and the het-
erogeneous environments.
• Homogeneous: For web services, each download transfers thirty 536-byte packets, and T_abort is 20 seconds. For video services, the average video file size is 10 MBytes. T_rate and T_quality are 6 minutes and 30 seconds, respectively.
• Heterogeneous: For web services, the number of packets each download transfers follows a uniform distribution from 25 to 35. The size of each packet is 536 bytes. T_abort follows a uniform distribution from 15 seconds to 25 seconds. For video services, the video file size follows a uniform distribution from 5 MBytes to 15 MBytes. T_rate and T_quality also follow uniform distributions, where T_rate varies from 5 minutes to 7 minutes, and T_quality varies from 20 seconds to 40 seconds.
We give examples of network/user curves for each traffic type in Fig.4.9(a) and
Fig.4.9(b). The capacity for web and video traffic is 1 Mbps and 30 Mbps, respectively.
From Fig.4.9, we can observe that the network curves of the homogeneous and hetero-
geneous settings are similar when the number of concurrent connections is small. When
the number of concurrent connections is large, the performance of the heterogeneous
setting is worse than the performance of the homogeneous setting. This is not unex-
pected, as the variability in the heterogeneous setting (as compared to the homogeneous
one) likely results in degraded performance.
We then use the same settings used in Chapter 4.4.2 to verify the accuracy of our
mechanism. We vary the connection completion rate (QoS) requirement and the values of r_session. The results are shown in Fig. 4.10. Compared to Fig. 4.6, the error rate under
heterogeneous settings is higher. However, the error rate increases only by about 2%.
This can be explained as follows. We use a uniform distribution for file size and abort
time in this experiment. Since the variance of a uniform distribution is not large, our
mechanism still can estimate the amount of resources needed accurately (with a less
than 10% error) in the homogeneous and heterogeneous environments. Because it is
reasonable not to expect a large variance in user behavior with respect to the parameters
in our model (e.g., abort time should not vary from seconds to hours), we believe that it
is reasonable to use a distribution that does not have a very large variance. In the next
experiment, we use a number of different file size distributions to demonstrate that, even
if we consider a file size distribution with large variance, our mechanism still provides
good accuracy.
In the second experiment, we focus on file sizes. In real systems, the variance of
file sizes could be large. Here, we study the effect of different file size distributions.
[Figure 4.9: Comparison between homogeneous settings and heterogeneous settings: (a) the network curve and the user curve in web services; (b) the network curve and the user curve in video services.]
[Figure 4.10: The average error rate for changing QoS and changing r_session.]
We consider normal, exponential, and Zipf distributions and compare these with the
homogeneous environment. The settings of the homogeneous environment are the same
as the settings in the previous experiment. We generate different file sizes using the
three distributions, but with the same mean (thirty 536-byte packets in web traffic and
10 MBytes in video traffic).
We show example results for web and video traffic in Fig. 4.11 and Fig. 4.12, respectively. For web traffic, the capacity is 1 Mbps, and r_session is 0.4. For video traffic, the capacity is 30 Mbps, and r_session is 0.03. We can observe that the performance under the different distributions is very similar.

[Figure 4.11: Web traffic: comparison under different file size distributions. Panels: (a) homogeneous; (b) normal distribution; (c) exponential distribution; (d) Zipf distribution.]
We also use the same settings as in Chapter 4.4.2 to validate the accuracy of our mechanism under different file size distributions. The average error rates under all file size distributions are shown in Fig. 4.13. We can observe that the error rate does not change significantly as compared to Fig. 4.6; it increases only by about 1% to 2%. These results indicate that even a large variance in file sizes does not affect the accuracy significantly. This can be explained as follows: when a user aborts a connection, even if this connection was downloading a large file, the file size does not affect results beyond a certain point because the rest of the file is not downloaded once the connection is aborted. In summary, these results demonstrate that our mechanism can estimate the amount of resources needed accurately under different file size distributions.
[Figure 4.12: Video traffic: comparison under different file size distributions. Panels: (a) homogeneous; (b) normal distribution; (c) exponential distribution; (d) Zipf distribution.]
4.5 Discussion
Estimation without Recalculating User and Network Curves: In our mechanism, to estimate the amount of resources needed, infrastructure providers have to select a capacity of the virtual link, C, and calculate the corresponding network and user curves from earlier measurements. They can change the value of C iteratively, until they find a new traffic equilibrium that satisfies the QoS requirement. In Chapter 4.4.2, we demonstrated that, even when we only use measurement data corresponding to a particular capacity, our mechanism can estimate the amount of resources needed accurately. Now, if infrastructure providers have multiple sets of measurement data (e.g., data measured from different bandwidth capacities), is it possible for them to estimate the amount of resources needed directly, without using our mechanism? We now explore this direction and its limitations.

[Figure 4.13: The average error rate for changing QoS and changing r_session.]

[Figure 4.14: The connection completion rate for the corresponding network capacity and r_session.]
In Fig. 4.5, we demonstrated that changing network capacity or r_session can shift the equilibrium point. Here, we focus on network capacity. We use web traffic and the same homogeneous settings as in Chapter 4.4. We increase the network capacity from 1 MBps to 10 MBps in ns2 simulations and collect the simulation results as our measurement data. The value of r_session is 1.2. We calculate the connection completion rate at the equilibrium point for the corresponding network capacity and r_session. The results are shown in Fig. 4.14, where we observe that the connection completion rate increases
[Figure 4.15: Resource estimation with different data points: (a) extrapolation, estimating by using the first four points; (b) extrapolation, estimating by using the last four points; (c) interpolation, estimating by using the first, the fourth, the seventh, and the tenth points.]
when network capacity increases. However, since r_session is fixed, the connection completion rate is bounded. Therefore, there are diminishing returns in the connection completion rate as capacity increases. Moreover, in Fig. 4.14, we also compare the measured connection completion rates with the connection completion rates estimated using our mechanism. We use data collected under the 1 MBps network capacity setting to estimate the connection completion rates under the remaining 9 capacities. The results demonstrate that our estimated results are very close to the measured results.
We now explore the possibility of infrastructure providers estimating the amount of resources needed without using our mechanism when there are multiple sets of measurement data. A typical approach is to apply regression analysis to the data. Here, we use linear regression and polynomial regression. In applying such techniques, one typically encounters two types of problems: (1) extrapolation and (2) interpolation. We can extrapolate by using the first few data points or the last few points. Here, we use four out of the ten points to do extrapolation. The results are shown in Fig. 4.15(a),(b). These results demonstrate that, in the case of extrapolation, regression-based estimation lacks accuracy. This is expected since, typically, extrapolation is less reliable and, in some cases, is not doable [20, 21]. In our scenario, even if we use nearly half of the data points, the results are still quite poor; in some cases, the error rates are greater than 100%. If we choose fewer data points to do extrapolation, the results will be worse. To interpolate, we also use four points: the first, the fourth, the seventh, and the tenth points. The results are shown in Fig. 4.15(c). Here, we can observe that, in the case of interpolation, polynomial regression provides accurate estimation because the curve is simple. However, to perform interpolation, we need to collect sufficient data points for accurate estimation.
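The comparison above can be reproduced in outline with numpy; the ten (capacity, completion rate) points below are synthetic stand-ins with a saturating shape like that of Fig. 4.14.

import numpy as np

capacity = np.arange(1, 11, dtype=float)          # capacities 1..10
rate = 60 * (1 - np.exp(-capacity / 4.0))         # synthetic saturating curve

# extrapolation: fit on the first four points, predict the tenth
lin = np.polyfit(capacity[:4], rate[:4], 1)
poly = np.polyfit(capacity[:4], rate[:4], 3)
print(np.polyval(lin, 10.0), np.polyval(poly, 10.0), rate[-1])  # both overshoot

# interpolation: fit on the 1st, 4th, 7th, and 10th points, predict in between
idx = [0, 3, 6, 9]
poly_i = np.polyfit(capacity[idx], rate[idx], 3)
print(np.polyval(poly_i, 5.0), rate[4])           # close to the true value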
From these results, we can observe that the considered regression analysis methods do not provide accurate results in the case of extrapolation. Moreover, there is one requirement in the regression analysis method: r_session must be fixed. When r_session is fixed, we can calculate the connection completion rate for the corresponding network capacity and then use the measured data to perform regression analysis. However, since changes in r_session can shift the equilibrium point, regression analysis may not work in such a scenario. In contrast, by using our mechanism, infrastructure providers can select a capacity of the virtual link and plot the user and network curves; the intersection point gives the connection completion rate. They can change the capacity value iteratively and calculate the corresponding network and user curves, until they find the new traffic equilibrium that satisfies the QoS requirement. Therefore, our mechanism does not have such limitations.
Resource Estimation at Service Providers: Here, we discuss how service
providers can use our mechanism. Previous work assumes that service providers will
ask for a specific amount of resources when they send infrastructure providers a virtual
network request. However, previous work does not explain how to calculate the amount
of resources needed and does not consider how to satisfy QoS requirements. Thus, our
approach can complement previous efforts. Specifically, when service providers want
to create a new virtual network, they can define the minimum QoS which should be
achieved. They can then calculate the required capacity allocation. Then, they can send
infrastructure providers VN requests and ask for that resource amount. Infrastructure
providers can then use other approaches (e.g., [23, 52, 87]) to assign resources.
However, there is a limitation when service providers use our mechanism. In our mechanism, b_TCP equals the TCP throughput of an isolated connection when the number of concurrent connections is low (Eq. 4.5). There are several factors that can affect TCP throughput, such as RTT and buffer sizes. Since service providers do not know which physical paths will be assigned, the only information they have is the capacity of the virtual link; other network information (e.g., RTT and buffer sizes) is typically unavailable. Therefore, in our mechanism, the equation used by service providers to calculate b_TCP is C/k. This estimation of b_TCP does not affect our mechanism significantly. Service providers can monitor QoS metrics, and if they realize that their QoS is not being satisfied, they can adjust the resource requests they make to infrastructure providers.
Network or application configurations: In our work, we do not focus on any
particular configuration for specific services or customized protocols. However, today,
some service providers tune TCP to improve their performance. For instance, Google
modifies TCP on their routers for better performance. Some applications may open
multiple connections to reduce downloading time. Moreover, Dobrian et al. find that
buffering algorithms in video services have an impact on user engagement [28]. These
configurations may change network equilibrium by affecting the network curve or the
user curve. Since our approach focuses on a generic case, such issues are outside the
scope of our work and are part of future efforts.
Combining with existing efforts: A number of heuristics have been proposed to
address the VN embedding problem. Such work assumes that VN requests indicate the
exact amount of resources required. Our approach can be combined with these efforts
to estimate the amount of required resources.
In [87], Yu et al. propose two mechanisms to simplify virtual network embedding: i) splitting a virtual link over multiple substrate paths and ii) path migration. These mechanisms have also been considered in [52] and [23].
However, there is no conflict between our approach and path splitting and migration
mechanisms, and our approach can also be used together with these mechanisms. We
described how to use our mechanism at the infrastructure provider side in Chapter 4.3.3
and at the service provider side above. If path splitting and migration are allowed, our
mechanism can be used by the infrastructure provider only.
We believe that service providers should ask for specific QoS requirements, rather than for specific capacity. The rationale for this is that the sum of QoS metrics obtained from split paths may not be equal to the QoS achieved on a single path. Because service providers do not know whether infrastructure providers will split paths or not, it is difficult for service providers to ask for appropriate amounts of resources to achieve their QoS goals.
Therefore, service providers should send infrastructure providers virtual network
requests with QoS requirements instead. The goal of infrastructure providers would then
be to make sure that the QoS achieved over all paths can satisfy these QoS requirements.
4.6 Conclusions
In this chapter, we introduced an alternative direction for generating VN requests,
namely that of using QoS as a constraint. In contrast to previous efforts that assume
a VN request will indicate the amount of resources needed, we suggested that service
providers can use QoS as a constraint when generating VN requests.
We focused on two popular services, web and video, and proposed an estimation mechanism based on an analysis of the interaction between user behavior and network performance under different services. This mechanism can be used to estimate the amount of resources needed by a VN request. Our approach can also adjust the resource estimation when user behavior and QoS constraints change dynamically.
Our simulation-based experiments demonstrate that our mechanism can satisfy QoS
requirements through appropriate resource estimation. Moreover, our approach can
adjust resource estimations efficiently and accurately.
Chapter 5
An Equilibrium-Based Modeling
Framework for QoS-Aware Task
Management in Cloud Computing
5.1 Introduction
With the ability to provide scalable and heterogeneous computing resources, cloud com-
puting has achieved widespread adoption and has been applied to many areas, such as
data analysis, search, image processing, social networks, and others. To execute their
jobs, users/companies can rent computing resources from public cloud providers (e.g.,
Amazon EC2 [1] and Microsoft Azure [2]) or build their own private clouds. In the
latter case, they act in both roles, tenants and cloud providers, and manage performance
and cost.
Cloud providers have several goals. First, to satisfy service level agreements (SLAs), they have to guarantee quality of service (QoS), especially under worst-case scenarios (e.g., peak loads). QoS here can be resource availability, computing power, or response time. Therefore, data centers are typically over-provisioned. As reported, a number of cloud providers own hundreds of thousands of servers or more (e.g., Google (∼1 million), Microsoft (∼200K)) [13]. However, supporting such large server resources results in energy costs on the order of millions of dollars per year [13]. Therefore, energy cost reduction is another goal for cloud providers.
How to achieve these goals has been studied in previous efforts, and several approaches have been proposed to address different challenges. Many such efforts focus on a specific design goal. In this work, our goal is to develop a modeling framework that can (i) incorporate different approaches and (ii) be used to evaluate system performance under different goals simultaneously (e.g., performance and cost). Because the interaction between the different resource management approaches is complex and their effect on performance characteristics is (often) not straightforward, it is challenging to develop such an accurate modeling framework while maintaining its tractability.
In evaluating the utility of our modeling framework, in this work, we consider two
typical goals for cloud providers - performance and energy cost. Due to the scale of
data centers and their heterogeneous nature, one important challenge in satisfying SLAs
is dealing with the unpredictable nature of datacenters’ performance; even when data-
centers are over-provisioned, it is still difficult to guarantee certain performance char-
acteristics (e.g., a bound on response time). Common causes of such unpredictabil-
ity include hardware failure and performance interference due to resource contention.
Although in cloud computing a job is typically divided into multiple (relatively small) tasks that are run in parallel, in a distributed setting it is not easy to predict
the response time of a job. Stragglers, i.e., tasks running slower than others, are still a
fundamental challenge in cloud computing [25]. Because a job cannot be finished until
its last task finishes, the response time of a job can be significantly delayed by stragglers.
These stragglers are caused by various factors, including hardware failure, resource con-
tention, and management issues. Thus, it is typically difficult to predict which tasks will
become stragglers. To mitigate stragglers, an approach termed speculation has been pro-
posed [5, 8, 89]. In brief, the idea of speculation is to launch duplicate copies of tasks,
called speculative copies, hoping that the copies will complete faster than the original
task.
Energy cost reduction is also an important goal for cloud providers. Because energy
requirements are huge in datacenters, a reduction in energy use can result in savings
of millions of dollars. Energy cost reduction has been extensively studied, with one
widely adopted approach being virtual machine (VM) consolidation [33, 50, 51, 58, 94].
The basic idea of VM consolidation is to colocate VMs on fewer physical servers while
shutting down those physical servers that become idle after consolidation. Thus, VM
consolidation can result in improved utilization of physical servers and reduction in
energy cost.
Although these proposed approaches are designed to achieve particular (individual)
goals, most of them cannot satisfy multiple goals simultaneously. For instance, VM
consolidation can reduce energy cost. However, it can violate performance require-
ments (SLAs). Due to contention for resources, such as memory, disk I/O, CPU cache,
and network bandwidth, a large number of VMs colocated on physical servers will
inevitably result in noticeable performance degradation, as observed in a number of
studies [36, 62]. Another example is the use of speculation to mitigate stragglers. Since
speculation generates multiple copies of tasks, it can also result in performance degra-
dation as copies compete for the same resources. Although provisioning more physi-
cal servers/resources can reduce performance degradation, it conflicts with the goal of
reducing cost. Therefore, there is a tradeoff between reducing cost and improving VM
performance.
Thus, to achieve a better understanding of how multiple (conflicting) goals interact, one needs a modeling framework, which is the focus of this work. Specifically, the goal of our framework is to facilitate evaluation of the effects of the various resource management approaches (VM consolidation, speculation) and to offer an abstraction for researchers and engineers to reason intuitively about system performance. The challenges here include the following. (1) The framework should be able to consider multiple approaches simultaneously, and it should be easily extendable as new approaches become available. (2) The framework should be able to evaluate several performance metrics under different goals. (3) It should be simple (and efficient) for cloud providers to use the framework and understand the effects of the considered resource management approaches.
To this end, here we propose a modeling framework which includes two main parts. The first part is an inflow based on system workload. The second part is an outflow controlled by the system (task service rate, performance degradation caused by resource contention, aggressiveness of the speculation mechanism, etc.). The advantage of this decomposition is that it makes it easy to observe the effects of different factors (as illustrated in the remainder of the work). Thus, it can facilitate efficient system development and provide cloud providers with insight into how to adjust system settings to satisfy different performance and cost goals.
Our contributions in this chapter can be summarized as follows.
• Cloud providers have two main goals: satisfying performance requirements and reducing energy cost. Previous efforts propose different approaches targeting one of these goals. In this work, we argue that one should be able to evaluate how these multiple approaches affect each other. To this end, we develop a modeling framework that can consider several approaches simultaneously and evaluate system performance (see Chapter 5.2).
• In our framework, we consider speculation and performance degradation due to resource contention (e.g., VM consolidation). We propose models that can evaluate system performance and expose the effects of different approaches easily (see Chapter 5.2).
• In Chapter 5.3, we demonstrate that our models can evaluate system performance accurately under a variety of system settings (e.g., service rate distributions, workload demands, and QoS requirements).
• Our models have several types of usage, including balancing the tradeoff between performance and energy cost and satisfying heterogeneous performance requirements (see Chapter 5.4).
• Our modeling framework can be extended easily to include a number of resource management approaches. We discuss several potential extensions in Chapter 5.5.
5.2 Our Modeling Framework
We begin this chapter with an overview of the system (in Chapter 5.2.1). We then
describe the proposed modeling framework (in Chapter 5.2.2) and in particular (a) how
it can be applied to two speculation mechanisms and (b) how the results can be used to
estimate performance characteristics, namely the response time distribution.
5.2.1 System Overview
Although a typical data center consists of multiple racks (as depicted in Figure 5.13),
for simplicity of presentation, we first describe our models in the context of a single
rack. (In Chapter 5.4, we extend the models to multi-rack environments.) Specifically,
we consider a model of a rack that can support a maximum of v VMs. Each job consists
of one or more tasks, depending on the application. Each task is served by a single VM.
Because VMs running on the same rack compete for physical resources (e.g., mem-
ory, disk I/O, CPU cache, and network capacity), the performance of a VM is affected
by the total number of VMs running on the same rack. Let μ be the mean service rate
of a VM when there is no resource contention. We use a function d(i), 0 < d(i) ≤ 1, where i is the number of VMs running on a rack, to represent the performance degradation caused by resource contention due to VM consolidation. As the number of VMs running on the same rack increases, d(i) decreases. Thus, under resource contention between VMs, the service rate of a VM is μ · d(i). Note that our goal here is not to study resource contention; a number of works in the existing literature focus on a variety of resources, including [16, 22, 36, 45, 61, 93, 94]. Rather, our goal is to use a simple mechanism for accounting for the resulting performance degradation in our modeling framework. We evaluate the effects of different performance degradation functions in Chapter 5.3.
Speculation Mechanisms: As noted in Chapter 5.1, speculation has been used
widely to mitigate stragglers. There are (at least) two mechanisms for generating spec-
ulative copies: (i) immediately, at the time of job arrival or (ii) when the service time of
a task exceeds a threshold. Mechanism (i) is simpler to model, and we discuss how our
framework is applicable to this mechanism in Chapter 5.5.2. In this work, we mainly
focus on mechanism (ii), as described next.
Briefly, when the service time of a task exceeds a threshold, t_th, it is assumed that this task has a higher probability of becoming a straggler [25]. A cloud provider then makes a (speculative) copy of this task and runs it on another VM. (Here, we assume that this speculative copy is assigned to the same rack as the original task.) There are then two options for dealing with the original task:
• Re-launch option: terminate the original task (i.e., only run the speculative copy).
• Re-instantiation option: continue running the original task (in addition to the speculative one) and use the results of whichever copy completes first.
In the latter case, there can be multiple copies of a task running (as speculative copies can result in additional speculative copies when their running time is greater than t_th). Regardless of the number of copies, when one of these copies completes, all remaining ones are terminated. Compared to Re-launch, Re-instantiation can reduce response time by increasing the effective service rate of a task, but it also results in greater energy consumption (i.e., cost) due to running more VMs. In this work, we focus on these two speculation mechanisms and develop analytical models for each in Chapter 5.2.2. For convenience, a summary of the notation used in the work is given in Table 5.1.
5.2.2 Proposed Framework
Our proposed modeling framework aims to capture the interaction between demand for
service and available system resources. Specifically, we consider system equilibrium as
a balance between an inflow controlled by demand for service, and an outflow controlled
by how the system manages its resources. That is, when these two flows intersect, the
system reaches an equilibrium. Moreover, one advantage of considering the two flows
separately is to aid cloud providers in understanding how system equilibrium is affected
by different factors (e.g., demand for service, service rate, and performance degradation
due to resource contention).
This modeling framework can be applied to a number of task management mecha-
nisms. As described above, here we consider two speculation mechanisms and derive
analytical models for these mechanisms. We also illustrate how this framework can be
used by cloud providers, specifically, in estimating response time distribution. Other
uses of our framework are discussed in Chapter 5.4, e.g., facilitation of tradeoff between
performance commitments and cost as well as satisfying heterogeneous QoS require-
ments.
v : max number of VMs a rack can support
i : number of VMs running on a rack
μ : mean service rate of a VM without performance degradation (tasks per time unit)
d(i) : performance degradation function
λ_n : mean arrival rate of new tasks
λ_t^l : mean arrival rate of total tasks (Re-launch)
λ_t^i : mean arrival rate of total tasks (Re-instantiation)
k_l : average number of VMs running on a rack (Re-launch)
k_l^eq : average number of VMs running on a rack at the equilibrium (Re-launch)
k_i : average number of VMs running on a rack (Re-instantiation)
k_i^eq : average number of VMs running on a rack at the equilibrium (Re-instantiation)
t_c^l : average service time of a completed task copy (Re-launch)
t_c^i : average service time of a completed task copy (Re-instantiation)
t_t : average running time of a terminated task copy (Re-instantiation)
t_th : time threshold used to determine if a speculative copy should be generated
t_q : QoS requirement on the response time of a task
m : ⌊t_q / t_th⌋
p_l : probability that a task is re-launched
p_i : probability that a task is re-instantiated
p_t : probability that a task copy is terminated before it finishes (Re-instantiation)
T_r^l : r.v. for the response time of a task (Re-launch)
T_rj^l : r.v. for the running time of the j-th copy of a task (Re-launch)
T_r^i : r.v. for the response time of a task (Re-instantiation)
T_rj^i : r.v. for the running time of the j-th copy of a task (Re-instantiation)
T_job : r.v. for the response time of a job
Table 5.1: Summary of Notation (r.v. = random variable)
[Figure 5.1: Speculation mechanisms: (a) Re-launch, showing the flows λ_n, λ_t^l, p_l λ_t^l, and (1 − p_l) λ_t^l through the k_l running VMs; (b) Re-instantiation, showing the flows λ_n, λ_t^i, p_t λ_t^i, p_i (1 − p_t) λ_t^i, and (1 − p_t) λ_t^i through the k_i running VMs.]
Lastly, in our models we assume that there is no queueing delay, i.e., that there are sufficient resources in the data center to support the workload. Thus, in our models jobs arriving to a full data center are rejected. Because data centers are typically over-provisioned (i.e., their mean server utilization is typically low [14]), this assumption does not significantly affect the utility of our models. For instance, if (for simplicity of this example) we model the data center as an M/M/k queue with k = 100, then even with a utilization of 0.7, the probability that an arriving task finds all servers occupied is less than 5 × 10⁻⁴. A typical mean utilization for a data center is 0.3 [14].
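This figure can be checked numerically with the Erlang C formula (the probability that an arrival to an M/M/k system finds all k servers busy); the sketch below uses a log-sum-exp formulation to avoid overflow for k = 100.

from math import exp, lgamma, log

def erlang_c(k, rho):
    # probability of waiting in an M/M/k queue with per-server utilization rho
    a = k * rho                                        # offered load a = lambda/mu
    log_terms = [n * log(a) - lgamma(n + 1) for n in range(k)]
    log_tail = k * log(a) - lgamma(k + 1) - log(1 - rho)
    m = max(log_terms + [log_tail])
    denom = sum(exp(t - m) for t in log_terms) + exp(log_tail - m)
    return exp(log_tail - m) / denom

print(erlang_c(100, 0.7))   # approximately 4.6e-4, i.e., below 5 * 10^-4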
Analytical Model of Re-launch
The first mechanism we consider is Re-launch. Recall that in this mechanism, the original copy of a task is terminated once the speculative copy is created. A depiction of this mechanism is given in Fig. 5.1(a), where λ_n is the mean arrival rate of new tasks, λ_t^l is the total mean arrival rate of tasks (including new tasks and speculative copies), k_l is the mean number of VMs running on a rack, and p_l is the probability that a task is re-launched. This probability is a function of the threshold, t_th, i.e., as t_th increases, p_l decreases. Moreover, since (given resource contention) the service rate of a VM is affected by i, the number of other VMs running on the same rack, p_l is also a function of i. That is, when i increases, so does p_l.
Given Fig. 5.1(a), when the system reaches steady state, the arrival rate of new tasks equals the departure rate of completed tasks, which we will refer to as goodput. That is, flow balance in Fig. 5.1(a) gives $\lambda_t^l = \lambda_n + p_l \lambda_t^l$, i.e., the rate of completed tasks is

$(1 - p_l)\, \lambda_t^l = \lambda_n.$    (5.1)

One can view Eq. 5.1 as the demand for VMs, measured in terms of λ_t^l.
Let t_c^l be the average service time of a completed task (original or copy). By Little's Result, we have (refer to Fig. 5.1(a)):

$k_l = p_l \lambda_t^l t_{th} + (1 - p_l) \lambda_t^l t_c^l,$

that is,

$\lambda_t^l = \frac{k_l}{p_l t_{th} + (1 - p_l) t_c^l}.$

Consequently, the goodput is

$(1 - p_l)\, \lambda_t^l = \frac{k_l}{\frac{p_l}{1 - p_l} t_{th} + t_c^l}.$    (5.2)

One can also view Eq. 5.2 as the system's supply of resources (or VMs), measured by goodput.
The above model does not restrict the VM service time distribution in any way. Under the conditions of an exponential service time distribution, we can rewrite Eq. 5.2 as:

$(1 - p_l)\, \lambda_t^l = k_l\, \mu\, d(k_l)$  for exponential service time.    (5.3)

That is, in the case of a memoryless distribution, as expected, the goodput is not a function of t_th. Note that there are no service distribution assumptions in Eq. 5.1 and Eq. 5.2.
When Eq. 5.1 and Eq. 5.2 intersect (see Fig. 5.3), the system reaches an equilibrium. Setting Eq. 5.1 equal to Eq. 5.2 gives

$k_l = \lambda_n \left( \frac{p_l}{1 - p_l} t_{th} + t_c^l \right).$

Thus, the equilibrium point gives us the mean number of VMs running on a rack and the corresponding goodput of tasks.
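For exponential service times, Eq. 5.3 reduces the supply side to k_l · μ · d(k_l), so the equilibrium can be located by solving k · μ · d(k) = λ_n for k. The sketch below does this by bisection; the degradation function d(i) is an assumed example, not one from our evaluation.

def d(i):
    # assumed contention penalty; decreasing in the number of co-located VMs
    return 1.0 / (1.0 + 0.01 * i)

def equilibrium_k(lam_n, mu, k_max=1000.0, tol=1e-6):
    # goodput k * mu * d(k) is increasing in k for this d, so bisection applies
    lo, hi = 0.0, k_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * mu * d(mid) < lam_n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k_eq = equilibrium_k(lam_n=50.0, mu=1.0)
print(k_eq, k_eq * 1.0 * d(k_eq))   # k_eq = 100; goodput matches lambda_n = 50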
Derivation of Eq. 5.3: From Eq. 5.2, we have

$(1 - p_l)\, \lambda_t^l = \frac{k_l}{\frac{p_l}{1 - p_l} t_{th} + t_c^l}.$

The service time of a VM follows an exponential distribution with mean $\frac{1}{\mu d(k_l)}$, where d(k_l) is the performance degradation function. We have

$p_l = e^{-\mu d(k_l) t_{th}},$

and

$1 - p_l = 1 - e^{-\mu d(k_l) t_{th}}.$
t
l
c
is the average service time of a completed task copy. If a task copy can be completed,
then it means that the service time of this copy is less thant
th
. Otherwise, this copy will
be re-launched. Therefore, the value oft
l
c
is
t
l
c
=
R
t
th
0
xμd(k
l
)e
−μd(k
l
)x
dx
R
t
th
0
μd(k
l
)e
−μd(k
l
)x
dx
=
−t
th
p
l
1−p
l
+
1
μd(k
l
)
(5.4)
After we replace t_c^l in Eq. 5.2 with −t_th p_l / (1 − p_l) + 1/(μ d(k_l)) from Eq. 5.4, we have

    (1 − p_l) λ_t^l = k_l / ( (p_l / (1 − p_l)) t_th + t_c^l )
                    = k_l / ( (p_l / (1 − p_l)) t_th − (p_l / (1 − p_l)) t_th + 1/(μ d(k_l)) )
                    = k_l μ d(k_l).
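Eq. 5.4 is also easy to sanity-check numerically; the sketch below (with arbitrary illustrative values for μ d(k_l) and t_th) compares the closed form against a Monte Carlo estimate of the truncated-exponential mean.

    import math
    import random

    mu_d = 0.8   # assumed effective service rate mu * d(k_l)
    t_th = 1.4   # assumed re-launch threshold

    p_l = math.exp(-mu_d * t_th)
    t_c_closed = 1.0 / mu_d - t_th * p_l / (1.0 - p_l)  # Eq. 5.4

    # Monte Carlo: mean of exponential samples that fall below t_th
    samples = [random.expovariate(mu_d) for _ in range(1000000)]
    completed = [x for x in samples if x < t_th]
    t_c_mc = sum(completed) / len(completed)

    print(t_c_closed, t_c_mc)  # the two values should agree closely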
Response time distribution: As noted above, the intersection of Eq. 5.1 and Eq. 5.2 gives us the goodput of tasks when the system reaches equilibrium. One possible use of our model in this case is the estimation of the response time distribution. Recall that for cloud providers it is important to maintain SLAs, which can be expressed as a requirement to provide a bound on the response time of jobs. Consequently, it is important for a cloud provider to determine the probability of violating SLAs (as that could result in loss of revenue). In our model, given a required job response time t_q, we can estimate the probability that this requirement will be satisfied, i.e., the probability that the response time of a job will be less than t_q. We derive this next.
Before estimating this probability at the job level, we first consider it at the task level; since the response time of a job is determined by its last completed task, it is useful to consider this probability at the task level first. We then extend our estimation to the job level.

In our model, a task can be re-launched several times. We define the response time of a task, T_r^l, as the time from the start of the first copy to the end of the last copy, where T_{rj}^l is the running time of the j-th copy of the task. We refer to P(T_r^l ≤ t_q) as the task QoS probability, which we now estimate. Note that, as mentioned above, the probability that a task is re-launched, p_l, is a function of t_th and i. In what follows, we use k_l^eq, the mean number of VMs running on a rack at the equilibrium point, as i. Moreover, we assume that t_q > t_th, because it would not make sense to set a re-launch threshold that is greater than the required QoS (i.e., to re-launch tasks only after the SLA is already violated). We define m as ⌊t_q / t_th⌋. Then,
    P(T_r^l ≤ t_q)
      = P(T_{r1}^l ≤ t_th) + P(T_{r1}^l > t_th) P(T_{r2}^l ≤ t_th) + ··· + ( ∏_{j=1}^m P(T_{rj}^l > t_th) ) P(T_{r(m+1)}^l ≤ t_q − m t_th)
      = (1 − p_l) + p_l (1 − p_l) + ··· + (p_l)^m P(T_{r(m+1)}^l ≤ t_q − m t_th)
      = 1 − (p_l)^m + (p_l)^m P(T_{r(m+1)}^l ≤ t_q − m t_th).    (5.5)
If the service time of a VM follows an exponential distribution, then Eq. 5.5 can be re-written as:

    P(T_r^l ≤ t_q) = 1 − e^{−μ d(k_l^eq) t_q}.    (5.6)
As expected (given the memoryless service time distribution), the probability in Eq. 5.6 is not a function of t_th. That is, re-launching tasks does not affect the probability of satisfying the QoS requirement.
Given the above estimation of the task QoS probability, we can then extend this estimation to the job level. When a job with n tasks arrives, all of its tasks are executed in parallel. Because the job cannot be completed until its last task is completed, we have:

    P(T_job ≤ t_q) = ( P(T_r^l ≤ t_q) )^n.
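The sketch below evaluates the task QoS probability of Eq. 5.5 for exponential service (where, by memorylessness, P(T_{r(m+1)}^l ≤ t_q − m t_th) = 1 − e^{−μ d (t_q − m t_th)}) and raises it to the n-th power for a job; the function names and parameter values are illustrative, not part of the dissertation's tooling.

    import math

    def task_qos_relaunch(t_q, t_th, mu_d_eq):
        # Eq. 5.5 with exponential service at rate mu_d_eq = mu * d(k_l^eq);
        # it collapses to Eq. 5.6, 1 - exp(-mu_d_eq * t_q).
        m = math.floor(t_q / t_th)
        p_l = math.exp(-mu_d_eq * t_th)
        tail = 1.0 - math.exp(-mu_d_eq * (t_q - m * t_th))
        return 1.0 - p_l ** m + p_l ** m * tail

    def job_qos(t_q, t_th, mu_d_eq, n_tasks):
        # A job completes only when the last of its n parallel tasks does.
        return task_qos_relaunch(t_q, t_th, mu_d_eq) ** n_tasks

    p = task_qos_relaunch(2.2, 1.4, 0.85)
    assert abs(p - (1.0 - math.exp(-0.85 * 2.2))) < 1e-12  # matches Eq. 5.6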
Derivation of Eq. 5.6: From Eq. 5.5, we have

    P(T_r^l ≤ t_q) = 1 − (p_l)^m + (p_l)^m P(T_{r(m+1)}^l ≤ t_q − m t_th).

We assume that the system reaches equilibrium and use k_l^eq to calculate P(T_r^l ≤ t_q). The service time of a VM follows an exponential distribution with mean 1/(μ d(k_l^eq)), where d(k_l^eq) is the performance degradation function. Then, we have

    P(T_r^l ≤ t_q)
      = 1 − (p_l)^m + (p_l)^m P(T_{r(m+1)}^l ≤ t_q − m t_th)
      = 1 − (e^{−μ d(k_l^eq) t_th})^m + (e^{−μ d(k_l^eq) t_th})^m (1 − e^{−μ d(k_l^eq)(t_q − m t_th)})
      = 1 − (e^{−μ d(k_l^eq) t_th})^m + (e^{−μ d(k_l^eq) t_th})^m (1 − e^{−μ d(k_l^eq) t_q} / e^{−μ d(k_l^eq) m t_th})
      = 1 − e^{−μ d(k_l^eq) t_q}.
Analytical Model of Re-instantiation

The second mechanism we consider is Re-instantiation. Recall that in this mechanism, after a speculative copy is created, the original copy continues running; results from the first completed copy are used, with all other copies then terminated. In our analytical model, we simplify this mechanism a bit (for ease of analysis). Specifically, we do not generate a speculative copy at the instant when the running time of a VM exceeds t_th. Rather, we "mark" the task as requiring a speculative copy, but generate it only when some VM completes a task. Thus, our model approximates the mechanism used in real systems; we study the accuracy of this approximation in Chapter 5.3, where our results indicate that it is a reasonable one. Intuitively, this approximation is close to the real mechanism because the time interval between any two VM/task completion instances is relatively short, particularly when the number of VMs running in the system is sufficiently large.
A depiction of this mechanism (including our modification) is given in Fig. 5.1(b), where λ_n is the mean arrival rate of new tasks, λ_t^i is the total mean arrival rate of tasks (including new tasks and speculative copies), k_i is the mean number of VMs running on a rack, p_i is the probability that a task is re-instantiated when t_th is exceeded, and p_t is the probability that a task copy is terminated before it completes execution. Both p_i and p_t are functions of t_th (the threshold) and i (the number of VMs running on the same rack).
We use a similar approach to that in Chapter 5.2.2. Here, flow balance in Fig. 5.1(b) gives λ_t^i = λ_n + p_i (1 − p_t) λ_t^i, i.e., the rate of completed tasks is

    (1 − p_t) λ_t^i = (1 − p_t) λ_n / ( 1 − p_i (1 − p_t) ).    (5.7)
Let t_c^i be the average service time of a completed task copy (original or speculative), and t_t be the average running time of a terminated copy. By Little's Result, we have (refer to Fig. 5.1(b)):

    k_i = p_t λ_t^i t_t + (1 − p_t) λ_t^i t_c^i,

that is,

    λ_t^i = k_i / ( p_t t_t + (1 − p_t) t_c^i ).
Consequently, the goodput is

    (1 − p_t) λ_t^i = k_i / ( (p_t / (1 − p_t)) t_t + t_c^i ).    (5.8)
When Eq. 5.7 and Eq. 5.8 intersect, the system reaches equilibrium. Setting Eq. 5.7 equal to Eq. 5.8 gives

    k_i = ( (1 − p_t) λ_n / ( 1 − p_i (1 − p_t) ) ) × ( (p_t / (1 − p_t)) t_t + t_c^i ).
Response time distribution: When a required response time t_q is given, we can estimate the QoS probability that this requirement will be satisfied, i.e., the probability that the response time of a task will be less than t_q. We consider the QoS probability at the task level here; as shown in Chapter 5.2.2, the estimation can easily be extended to the job level.
In this mechanism, a task can have multiple copies. When one of these copies completes, the task is finished, and the remaining copies are terminated (as illustrated in Fig. 5.2). We define T_r^i as the response time of a task, where T_{rj}^i is the running time of the j-th copy of the task.

[Figure 5.2: Running time of task copies; the j-th copy, T_{rj}^i, starts at time (j − 1) t_th.]

As noted earlier, we assume that t_q > t_th and define m as ⌊t_q / t_th⌋. Then,
    P(T_r^i ≤ t_q)
      = P(T_{r1}^i ≤ t_th)
        + P(T_{r1}^i > t_th) P(T_{r1}^i ≤ 2 t_th OR T_{r2}^i ≤ t_th | T_{r1}^i > t_th)
        + ···    (5.9)
If the service time of a VM follows an exponential distribution, we can simplify this further. Specifically, the probability that there are 2 copies of a task in the system and that this task can be completed before the third copy is generated is p_i (1 − (p_i)^2). The first term, p_i, is the probability that the running time of the first copy is more than t_th (i.e., a speculative copy has to be generated). The latter term, (1 − (p_i)^2), is the probability that at least one of the two copies finishes during the second time period of length t_th. (Here, we apply the memoryless property.) By continuing in this manner, we can derive the probability that there are j copies in the system and the task can be completed before the (j + 1)-th copy is generated, i.e., (p_i)^{j(j−1)/2} (1 − (p_i)^j).
Then, Eq. 5.9 can be re-written as:

    P(T_r^i ≤ t_q)
      = (1 − p_i) + p_i (1 − (p_i)^2) + ··· + (p_i)^{m(m−1)/2} (1 − (p_i)^m)
        + (p_i)^{m(m+1)/2} (1 − P(T_{r(m+1)}^i > t_q − m t_th)^{m+1})
      = 1 − (p_i)^{m(m+1)/2} P(T_{r(m+1)}^i > t_q − m t_th)^{m+1}.    (5.10)
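Under exponential service, the residual lifetimes in the final period are again fresh exponentials, so P(T_{r(m+1)}^i > t_q − m t_th) = e^{−μ d (t_q − m t_th)} and Eq. 5.10 can be evaluated directly; a minimal sketch (with illustrative names, mirroring the Re-launch sketch above):

    import math

    def task_qos_reinstantiation(t_q, t_th, mu_d_eq):
        # Eq. 5.10 with exponential service at rate mu_d_eq: in the final
        # period, all m+1 live copies must exceed t_q - m*t_th for a miss.
        m = math.floor(t_q / t_th)
        p_i = math.exp(-mu_d_eq * t_th)          # one copy survives a period
        miss = math.exp(-mu_d_eq * (t_q - m * t_th)) ** (m + 1)
        return 1.0 - p_i ** (m * (m + 1) // 2) * miss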
5.3 Evaluation and Validation
In this chapter, we evaluate and validate the proposed models. We note that our goal is
to explore the utility of our framework that models performance characteristics of cloud
computing type systems, rather than exploring such a system’s design or implementa-
tion. To this end, we use simulation-based experiments. This allows us to study and
demonstrate essential characteristics of our models in a more controlled environment.
We first focus on a single-rack environment. In Chapter 5.4, we demonstrate how our
models can be extended to multiple-rack environments.
5.3.1 Simulation Settings
We first describe our simulation settings. We use a Poisson job arrival process with a mean arrival rate of λ. The number of tasks per job is based on the distribution given in Table 5.2, which corresponds to measurements in [6]. All simulation results presented here are obtained with 95% ± 3% confidence intervals. In our experiments, the default settings are as follows: (a) the mean service rate, μ, is 1; (b) the number of VMs a rack can support is 1000; and (c) the performance degradation function, d(), is linear: when a rack is fully utilized, the mean service rate is reduced to 0.6 (d(1000) = 0.6). However, we also explore the effects of alternative parameter settings, including different performance degradation functions. For the service time of VMs, we use three different distributions: an exponential distribution, a uniform distribution with range [0, 2/(μ·d())], and a normal distribution with standard deviation 1/(μ·d()). Table 5.3 summarizes the default parameter values. The time units of μ and t_th are the same.

Table 5.2: Distribution of the number of tasks per job

    # of tasks per job    fraction of jobs
    1 - 10                85%
    11 - 50               4%
    51 - 150              8%
    150 - 500             3%

Table 5.3: Default values of parameters

    parameter    default value
    μ            1
    v            1000
    t_th         1.4

As described above, in our models we assumed that the total amount of resources is sufficient to support all jobs; therefore, there is no queueing delay for any job/task. Based on this assumption, in our experiments, if the job arrival rate is higher than the system service rate, some jobs may be dropped. Thus, the number of tasks in the system is never more than the number of VMs the system can support.
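As an illustration of the workload generator, the sketch below draws the number of tasks per job from the buckets of Table 5.2; the choice of a uniform draw within each bucket is our assumption, since [6] reports only the bucket frequencies.

    import random

    # Bucket ranges and frequencies as printed in Table 5.2
    TASK_BUCKETS = [((1, 10), 0.85), ((11, 50), 0.04),
                    ((51, 150), 0.08), ((150, 500), 0.03)]

    def sample_tasks_per_job(rng=random):
        # Pick a bucket by its frequency, then a size uniformly within it
        # (the within-bucket distribution is assumed, not measured).
        r, acc = rng.random(), 0.0
        for (lo, hi), w in TASK_BUCKETS:
            acc += w
            if r <= acc:
                return rng.randint(lo, hi)
        return TASK_BUCKETS[-1][0][1]  # guard for floating-point slack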
5.3.2 Re-launch model evaluation
We first present results of the analytical model of the Re-launch mechanism.
Experiment 1: Validating Equilibrium Points and Response Time Distribution: The goal of our first experiment is to validate the use of equilibrium points and of the response time distribution in our model. In Chapter 5.2.2, we argued that when Eq. 5.1 and Eq. 5.2 intersect, the system reaches an equilibrium, which can be used to determine the mean number of VMs running on a rack and the corresponding goodput of tasks. Moreover, our estimation of the response time distribution is also based on the equilibrium point. In this experiment, we demonstrate that the equilibrium point calculated based on our model is in agreement with the simulation-based experiments when the system reaches steady state. Furthermore, we can estimate the response time distribution accurately, based on the equilibrium point.
In this experiment, the service time of a VM follows an exponential distribution. In Chapter 5.2.2, we noted that Eq. 5.1 and Eq. 5.2 are not functions of t_th when the service time follows an exponential distribution. We vary the mean arrival rate of new jobs, i.e., λ_n takes the values 150, 300, and 450. We draw the model-based curves in Fig. 5.3, where we identify each curve with its corresponding equation. (Note that Eq. 5.2 is not a function of λ_n, and thus remains the same under different values of λ_n.) When Eq. 5.1 and Eq. 5.2 intersect, the system reaches an equilibrium. We run our simulations using different values of λ_n and measure the corresponding mean number of VMs running on the rack and the goodput of tasks. We compare these results with those calculated at the equilibrium points as determined by our model. The results in Fig. 5.3 demonstrate that the performance metrics (task goodput and mean number of VMs running on a rack) calculated based on our model (at the equilibrium point) are the same as those measured from the simulation experiments when the system reaches steady state.
In Eq. 5.5 and Eq. 5.6, we showed how cloud providers can estimate the task QoS probability, i.e., the probability that the response time of a task is less than the performance requirement, t_q. We calculate the average number of VMs running on the rack at the equilibrium points from our model and use it to calculate the performance degradation and the task QoS probability. We compare the estimation results with the results obtained from simulations, as depicted in Fig. 5.4. We use two different values of λ_n, 300 and 450, and vary t_q over 1.8, 2.2, and 2.6. The results in Fig. 5.4 demonstrate that our model can estimate the task QoS probability accurately. Note that, in the simulations, the number of VMs changes with time; however, when we estimate the task QoS probability based on our model, we only use the information from the equilibrium point. This indicates that cloud providers can use our model to evaluate system performance by simply considering the equilibrium point. Note also that, when we fix the value of t_q, the results in Fig. 5.4 show that the task QoS probability decreases when λ_n increases. This can be explained as follows: when λ_n increases, the mean number of VMs running on the rack also increases, which results in a greater performance degradation. Finally, note that at λ_n = 450, the system utilization (the fraction of busy VMs) is ≈ 60%. Thus, even at utilizations higher than is currently typical for data centers, our model is accurate.

[Figure 5.3: Equilibrium point, simulation vs. model (exponential distribution); goodput vs. the average number of VMs running on a rack, showing the Eq. 5.2 curve, the Eq. 5.1 curves for λ_n = 150, 300, 450, and the corresponding simulation results.]

[Figure 5.4: Task QoS probability (exponential distribution), model vs. simulation, for t_q = 1.8, 2.2, 2.6: (a) λ_n = 300, (b) λ_n = 450.]
Experiment 2: Validating the Model under Different Distributions and Studying the Effect of t_th: The goal of this experiment is to demonstrate that our model works well under different service time distributions, as well as to study the effect of t_th. We have experimented with a number of distributions; due to space limitations, here we show the results from using the uniform and the normal distributions. (The results with other distributions are qualitatively similar.) We fix λ_n at 250 and vary t_th from 1.4 to 1.8. We draw the curves of Eq. 5.1 and Eq. 5.2 in Fig. 5.5 and Fig. 5.7. Note that the curve of Eq. 5.1 is not a function of t_th. We compare the goodput and the number of VMs running on a rack measured from our simulations to those computed by our models. The results in Fig. 5.5 and Fig. 5.7 show that our models produce accurate results, even if the service time distribution is not exponential.

Note that, in Fig. 5.5 and Fig. 5.7, increasing t_th from 1.4 to 1.8 can result in a reduction in the mean number of VMs running on the rack. This is a result of p_l decreasing when t_th increases. Because these distributions are not memoryless, re-launching tasks frequently increases the mean number of VMs running on the rack. From the results in Fig. 5.3, Fig. 5.5, and Fig. 5.7, we can observe that the increase in goodput slows down gradually as the mean number of VMs running on a rack increases. This is due to an increase in performance degradation as a result of a greater number of VMs running on a rack.

We use our model to estimate the task QoS probability, i.e., the probability that the response time of a task is less than t_q, based on Eq. 5.5. We vary t_q over 2.0, 2.4, and 2.8. We also measure the corresponding probability in our simulation experiments. The comparison is shown in Fig. 5.6 and Fig. 5.8, which demonstrates that our models can estimate this probability accurately under different service time distributions.

[Figure 5.5: Equilibrium point, simulation vs. model (uniform distribution: [0, 2/(μ·d())]); the Eq. 5.1 curve, the Eq. 5.2 curves for t_th = 1.4 and 1.8, and the corresponding simulation results.]

[Figure 5.6: Task QoS probability (uniform distribution: [0, 2/(μ·d())]), model vs. simulation, for t_q = 2.0, 2.4, 2.8: (a) t_th = 1.4, (b) t_th = 1.8.]

[Figure 5.7: Equilibrium point, simulation vs. model (normal distribution: standard deviation 1/(μ·d())); the Eq. 5.1 curve, the Eq. 5.2 curves for t_th = 1.4 and 1.8, and the corresponding simulation results.]

[Figure 5.8: Task QoS probability (normal distribution: standard deviation 1/(μ·d())), model vs. simulation, for t_q = 2.0, 2.4, 2.8: (a) t_th = 1.4, (b) t_th = 1.8.]
Experiment 3: Using the Analytical Model with Historical Data: To draw the curves of Eq. 5.1 and Eq. 5.2, we need to know the values of several parameters, including p_l and t_c^l. In the above two experiments, we calculated these values directly using the known service time distributions. In a real system, however, the service time distributions may not be known. Even if such information is not available, our models can still be used. In this case, cloud providers can collect historical data, including p_l and t_c^l, and calculate the goodput and the number of VMs running on a rack with the corresponding λ_n using our models. Since it may not be practical to collect historical data at numerous equilibrium points, cloud providers could collect information for equilibrium points at several values of λ_n, and then extrapolate or interpolate as needed, using regression. We explore the utility of this approach in this experiment.

Specifically, we use three different distributions, but assume that the cloud providers do not have this information. We use several different values of λ_n and collect the corresponding data, including p_l and t_c^l. We view this as historical data and perform second-degree polynomial regression. The historical data and the regression results are depicted in Fig. 5.9. (Note that the decrease in the regression results in Fig. 5.9(b)(c) is an artifact of the polynomial regression.) We then vary the value of λ_n and collect the corresponding data as ground truth. We compare the ground truth with our estimation based on the polynomial regression; the comparison is presented in Fig. 5.9. In the case of interpolation, the regression results match the equilibrium points well under all distributions. However, in the case of extrapolation, as expected, the regression results do not match the simulation results well, particularly for the ground truth having the largest value of λ_n. This is expected, as the target point is far from the historical data, i.e., a condition under which accurate extrapolation is difficult. In this case, cloud providers would need to collect additional historical data to improve the regression results. In summary, these results indicate that even without knowing the service time distributions, our analytical models are useful.

[Figure 5.9: Ground truth (simulation results) vs. estimation (using regression); goodput vs. the average number of VMs running on a rack, showing the historical data, the regression-based estimation, and the ground truth: (a) exponential distribution, (b) uniform distribution, (c) normal distribution.]
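A sketch of this regression step (the data points below are placeholders; in practice they would be the measured equilibrium points):

    import numpy as np

    # Historical equilibrium points: (mean number of VMs, goodput),
    # collected at a few values of lambda_n (placeholder values here)
    k_hist = np.array([120.0, 260.0, 410.0, 560.0])
    goodput_hist = np.array([105.0, 220.0, 330.0, 420.0])

    # Second-degree polynomial regression, as in Experiment 3
    supply_curve = np.poly1d(np.polyfit(k_hist, goodput_hist, deg=2))

    # Interpolate (or, less reliably, extrapolate) at a new point
    print(supply_curve(300.0))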
Experiment 4: Effect of the Performance Degradation Function: In previous experiments, we used a linear performance degradation function, d(). In this experiment, we use different d() functions (with different degrees and rates of degradation in performance) and demonstrate d()'s effect on Eq. 5.2. The functions used here are shown in Fig. 5.10. For the first two, d_1() and d_2(), the performance degradation under d_2() is greater. The third, d_3(), is a logarithmic function. When the rack is fully utilized, d_3() has the same performance degradation as d_1(); however, when the number of VMs running on a rack is small, d_3() results in a worse performance degradation than d_1(). We now draw the curve of Eq. 5.2 using the different performance degradation functions, as depicted in Fig. 5.11, where the service time of a task follows an exponential distribution. From Fig. 5.11, we can observe that a higher degradation rate can result in a higher mean number of VMs running on a rack (e.g., under d_3()), and that a worse level of performance degradation can result in a smaller goodput that the system is able to reach (e.g., under d_2()). Moreover, under more severe performance degradation characteristics, it is possible that the goodput will decrease when the average number of VMs running on a rack increases (e.g., under d_2()).

[Figure 5.10: Performance degradation functions; d(i) vs. the number of VMs running on a rack (i), for d_1(), d_2(), and d_3().]

[Figure 5.11: Effect of performance degradation functions; the Eq. 5.2 curve drawn using d_1(), d_2(), and d_3(); goodput vs. the average number of VMs running on a rack.]

[Figure 5.12: Effect of the threshold on Re-launch vs. Re-instantiation; average task service time vs. t_th ∈ {1.4, 1.8, 2.2}: (a) uniform distribution, (b) normal distribution.]

Table 5.4: Effect of approximation (model vs. mechanism in real systems)

    λ_n    error rate (goodput)    error rate (average # of VMs)
    100    1.22%                   0.71%
    200    1.01%                   1.25%
    300    0.72%                   0.96%
    400    0.34%                   0.82%
    500    0.32%                   1.09%

Table 5.5: Comparison between our model and the mechanism used in real systems (uniform distribution: [0, 2/(μ·d())])

    λ_n    error rate (goodput)    error rate (average # of VMs)
    100    0.89%                   0.04%
    200    0.48%                   0.37%
    300    0.17%                   0.11%
    400    0.77%                   0.42%

Table 5.6: Comparison between our model and the mechanism used in real systems (normal distribution: standard deviation 1/(μ·d()))

    λ_n    error rate (goodput)    error rate (average # of VMs)
    100    0.34%                   0.29%
    200    0.03%                   0.32%
    300    0.91%                   0.98%
    400    0.45%                   0.25%

Table 5.7: Effect of approximation (model vs. mechanism in real systems, v = 10)

    λ_n    error rate (goodput)    error rate (average # of VMs)
    1      0.41%                   0.72%
    2      0.02%                   0.21%
    3      1.19%                   3.19%
    4      1.37%                   3.09%
    5      0.33%                   2.87%
5.3.3 Re-instantiation model evaluation
We now evaluate and validate our model of Re-instantiation.
Experiment 5: Validating Our Approximation: Recall that, compared to the mechanism used in real systems, we approximate the generation of speculative copies in our model as follows: when the running time of a VM exceeds t_th, we do not make a speculative copy immediately, but rather do so at the time of the next task completion. The goal of this experiment is to verify that our approximation still produces accurate results.

We experimented with a number of service time distributions. We simulate the real mechanism in our experiment and measure the resulting goodput of tasks and the mean number of VMs running on a rack. We compare the simulation results with those calculated based on our analytical model. This comparison is given in Tables 5.4, 5.5, and 5.6 as a % error between the simulated and model results. It indicates that our approximation has little effect on accuracy. This is due to the fact that the time interval between two events, (1) crossing of the threshold and (2) task/VM completion, is quite short, which in turn is due to the following: when the number of VMs running in the system is large, the time interval between two VM departures is short.
Since the significance of the discrepancy is partly a function of the total number of VMs that can be running simultaneously, we also consider how this inaccuracy might grow as we decrease the total number of VMs on a rack. To this end, we consider an extremely small-scale rack, with v = 10. The corresponding results are given in Table 5.7. We observe that, even if the number of VMs a rack can support is quite small, the inaccuracy due to our approximation is not significant.
Experiment 6: Studying the Effect of t_th: The goal of this experiment is to study how the task response time is affected by t_th in the speculation mechanisms. Unlike the Re-launch mechanism, the Re-instantiation mechanism allows a task to have multiple copies, which can reduce the response time of a task. In this experiment, we vary t_th over 1.4, 1.8, and 2.2 and compare the mean task response time from the model of the Re-launch mechanism and from the model of the Re-instantiation mechanism. We depict results from two service time distributions, a uniform distribution and a normal distribution, in Fig. 5.12. These results demonstrate that, under the Re-instantiation mechanism, when t_th is reduced, the mean task service time is also smaller, as the number of copies of the same task tends to be larger. However, in the Re-launch mechanism, small values of t_th can result in larger mean task service times, as in this case using smaller values of t_th results in more frequent re-launching of a task. Because the same t_th can have different effects on the two mechanisms, cloud providers can use our models to explore a good value for t_th, based on the mechanism they would like to use.
5.4 Utility of Analytical Models
In this chapter, we explore the potential utility of our models. We first consider how the models can be used to explore the tradeoff between system performance (benefits) and energy usage (cost). We then consider how cloud providers can satisfy heterogeneous response time requirements based on our estimation of the response time distribution (as presented in Chapter 5.2.2).
5.4.1 Energy vs. Performance
To mitigate performance degradation, and thus obtain smaller response times, cloud providers can assign fewer tasks to each rack. However, given the same workload, this would require the use of more racks, thus increasing energy usage and cost. (We explain in more detail how energy cost is related to rack usage below.) Consequently, there is a tradeoff between a system's performance characteristics and its energy cost. Next, we consider how our models can be used to study this tradeoff and find an appropriate balance between cost and performance.
Here, we consider a multi-rack environment, with the corresponding model illus-
trated in Fig. 5.13. Specifically, a data center contains multiple racks, with a job man-
ager at the front end. When a job arrives, the job manager is responsible for assigning
its tasks to racks. Depending on specific performance and energy usage goals, cloud
providers can choose different task assignment mechanisms. Two possible (simple) baseline mechanisms are as follows: one focused only on performance and the other only on energy usage.
• Performance first (PF): One approach to reducing job response time would be
to reduce as much as possible the response time of each of its tasks. Given the
performance degradation (due to resource contention), when a job arrives, a rea-
sonable mechanism would be to assign each task (individually) to the rack with
the least number of VMs running. This should reduce response time, but would
result in quite a few racks running, thus increasing energy cost.
• Energy first (EF): One approach to reducing energy usage/cost is to use as few racks as possible, i.e., consolidate the workload on a few racks and let the others idle or shut them off. To this end, when a job arrives, EF assigns each task (individually) to the rack with the greatest number of VMs running, as long as that rack still has VMs available. This would reduce energy cost (by reducing the number of running racks), but can result in significant performance degradation, due to the high workload on those racks that are running.
A simple approach to trading off between these two extremes is a hybrid mechanism. We consider the following specific hybrid approach. The task assignment mechanism in our approach is similar to that of EF, i.e., we favor assigning tasks to a rack with more VMs running. However, unlike EF, which continues to assign tasks to a rack until the rack cannot accept more tasks, our mechanism stops assigning tasks to a rack when the mean arrival rate of new tasks to that rack reaches a threshold, as calculated using our models (details below). By doing this, we can make sure that the system satisfies its performance requirements while reducing energy cost.
We view the performance (or QoS) requirement of the system as having two parameters: (i) t_q, a bound on the response time of a task, and (ii) the task QoS probability, i.e., the probability that the response time of tasks is less than t_q. Recall the estimation of the response time distribution in Chapter 5.2.2, which is calculated based on the assumption that the system reaches the equilibrium point. Moreover, as demonstrated in Chapter 5.3, the equilibrium point in our models shifts when λ_n changes. The shifting of the equilibrium point results in a change in the mean number of VMs running on a rack, and affects response time due to changes in performance degradation. Therefore, cloud providers can control the equilibrium point of a rack by controlling the workload (i.e., the number of tasks) that is assigned to the rack, i.e., by changing λ_n. To calculate λ_n for each rack, we can vary λ_n iteratively, until a new system equilibrium that satisfies the requested task QoS probability is found. Note that the assumption here is that cloud providers have the information they need, such as the service time distribution and the performance degradation function (see Experiments 1 and 2), or that they can obtain historical data (see Experiment 3). Thus, cloud providers can determine λ_n based on our models without generating real workload.
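A sketch of this iteration for the Re-launch model under exponential service, reusing the equilibrium_k, d(), and task_qos_relaunch sketches shown earlier (the linear search and step size are illustrative; a binary search would work equally well):

    def max_admissible_rate(t_q, t_th, target_prob, mu=1.0, step=5.0):
        # Largest lambda_n whose equilibrium still meets the requested
        # task QoS probability, found by increasing lambda_n step by step.
        lam, best = step, 0.0
        while True:
            try:
                k_eq = equilibrium_k(lam, mu=mu)
            except ValueError:          # beyond the rack's maximum goodput
                return best
            if task_qos_relaunch(t_q, t_th, mu * d(k_eq)) < target_prob:
                return best
            best, lam = lam, lam + step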
In our simulation-based experiments, we vary several parameters, including the number of racks, t_q, and the required task QoS probability. We also consider different service time distributions. First, we use the following settings: the total number of racks is 10, t_q = 3, the required task QoS probability is 0.9, the service time follows an exponential distribution, and we use the analytical model of the Re-launch mechanism.

In our experiments, when a job arrives, its tasks are assigned to different racks. Here, we add another constraint on task assignment: cloud providers need to consider dataset locality when doing task assignment, as that can reduce response time as well [25]. Thus, in our experiments, a task is assigned to a rack which has the dataset it needs. There could be several such racks due to data replication; data is typically replicated due to reliability (and performance) considerations [25]. In our simulations, we vary the number of copies per dataset from 2 to 10. The smaller the number of copies, the less flexibility there is in assigning tasks to racks, but a greater number of copies is more costly (e.g., due to data storage and management costs). We also explore how the number of copies affects the tradeoff between system performance and energy costs.
We consider the results of our experiments under two metrics: (i) the task QoS probability, i.e., the probability that the response time of tasks is less than t_q, and (ii) the number of active racks. Because the workload is the same under all mechanisms, the total number of active VMs (across all racks) is also the same. Cloud providers can save energy by turning off racks that are not in use, to reduce the number of idle physical resources, although that can in turn result in delays (when racks need to be turned on). The exploration of the "idle vs. off" tradeoff is beyond the scope of this work. Thus, below we use the number of active racks as an indicator of energy costs.
In Fig. 5.14, we depict the number of active racks when the number of copies per dataset is 4 and 10. The results demonstrate that, when cloud providers have greater flexibility in task assignment, they can obtain a greater reduction in energy cost. We demonstrate the corresponding results for our two metrics in Fig. 5.15 and Fig. 5.18. The results in Fig. 5.15 indicate that, as compared to PF, the hybrid mechanism uses fewer racks. Although the hybrid mechanism uses more racks than EF, the results in Fig. 5.18 demonstrate that it can always satisfy the required performance characteristics, while EF cannot guarantee performance. Because PF and EF each target only one of the two metrics, they cannot guarantee to satisfy the other. Our models can help cloud providers explore the appropriate tradeoff between performance commitments and cost.
Now, we show additional results to demonstrate that cloud providers can use our models to facilitate the tradeoff between performance and energy cost. We vary the service time distribution, t_q, and the required task QoS probability. A summary of the settings for these additional results is given in Table 5.8. The number of racks in these experiments is 20. The results are depicted in Fig. 5.16 and Fig. 5.17. Similarly to the results in Fig. 5.15 and Fig. 5.18, the results in Fig. 5.16 indicate that, as compared to PF, the hybrid mechanism uses fewer racks. Moreover, the results in Fig. 5.17 demonstrate that (aided by our models) the hybrid mechanism can always satisfy the required performance characteristics, while EF cannot make performance guarantees.
5.4.2 Heterogeneous QoS Requirements
In our previous experiments, every task had the same response time requirement, t_q, and the same requested QoS probability. Here, we demonstrate how our model can be adapted to satisfy heterogeneous QoS requirements. We first consider this at the task level and then extend it to the job level. Each task has its own t_q and required QoS probability, and cloud providers can assign tasks to different racks based on these requirements.

[Figure 5.13: System model; jobs arrive at a job manager, which assigns their tasks to racks.]

[Figure 5.14: Number of active racks over simulation time under PF, EF, and Hybrid: (a) number of copies per dataset = 4, (b) number of copies per dataset = 10.]

[Figure 5.15: Mean number of active racks as a function of the number of copies per dataset, under PF, EF, and Hybrid.]

Table 5.8: A summary of simulation settings

    service time distribution    t_q    required task QoS probability
    exponential                  4      0.95
    uniform                      4      0.9
    normal                       4.5    0.9
We classify tasks into different categories based on their requirements. In our simulation-based experiments, we explore several different settings and present a subset of the results (due to space limitations). In our experiments, there are four categories of tasks, and each category has its own t_q. The values of t_q for the four categories are 2.4, 2.7, 3.0, and 3.3. Although these values differ across categories, it is likely desirable that each of them is satisfied with reasonably high probability; thus, we fix the required QoS probability of each category at 0.9.

[Figure 5.16: Mean number of active racks under all mechanisms (PF, EF, Hybrid) as a function of the number of copies per dataset: (a) exponential distribution, (b) normal distribution, (c) uniform distribution.]

[Figure 5.17: Task QoS probability (the probability that the service time of a task is less than t_q) under all mechanisms, compared to the required probability: (a) exponential distribution, (b) normal distribution, (c) uniform distribution.]

[Figure 5.18: Task QoS probability under PF, EF, and Hybrid, compared to the required probability, as a function of the number of copies per dataset.]

[Figure 5.19: Task QoS probability for each category (t_q = 2.4, 2.7, 3.0, 3.3) vs. the requested probability.]

[Figure 5.20: Job QoS probability for each category (3, 5, 7 tasks per job) vs. the requested probability.]
Based on the t_q and required QoS probability of each task, cloud providers can assign tasks to different racks. If a task has a smaller t_q, cloud providers should assign it to a rack with fewer VMs running. Our models can help cloud providers determine which rack a task should be assigned to, i.e., by checking λ_n of each rack and selecting a rack where the system equilibrium can satisfy the requested task QoS probability. In Fig. 5.19, we depict the task QoS probability for each category from our simulation results. The results demonstrate that the required QoS probabilities of the four categories are satisfied. Thus, our models can help cloud providers satisfy heterogeneous QoS requirements by controlling the value of λ_n for each rack.
Now, let us consider the response time at the job level. Recall that we assume that all tasks of the same job are executed in parallel. Therefore, the QoS probability of a job with n tasks is

    P(T_job ≤ t_q) = ( P(T_r^l ≤ t_q) )^n.

Given t_q and the required job QoS probability, we can compute the required task QoS probability as

    P(T_r^l ≤ t_q) = ( P(T_job ≤ t_q) )^{1/n}.    (5.11)
In the previous experiment, we focused on the required task QoS probability. In calculating the required job QoS probability, we need to account for the number of tasks per job. We now consider the following experiment: there are three categories of jobs, with 3, 5, and 7 tasks per job, respectively. The t_q values for the three categories are 4, 4.3, and 4.6, respectively, and the required job QoS probability is fixed at 0.9. Using Eq. 5.11, we calculate the task QoS probability for each category, resulting in required task QoS probabilities of 0.965, 0.979, and 0.985, respectively. Similarly to the previous experiment, when the QoS requirements of each task are given, including t_q and the task QoS probability, cloud providers can determine which rack a task should be assigned to by checking the system equilibrium of each rack. The results depicted in Fig. 5.20 demonstrate that all job QoS probabilities are satisfied. Thus, our models can be used to determine task assignment in order to satisfy heterogeneous QoS requirements at the task/job level.
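These per-category targets follow directly from Eq. 5.11, as a one-line check confirms:

    # Required task QoS probability per category (Eq. 5.11): 0.9 ** (1/n)
    print([round(0.9 ** (1.0 / n), 3) for n in (3, 5, 7)])
    # -> [0.965, 0.979, 0.985], matching the values used above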
5.5 Extensibility of Framework
Above, we presented our modeling framework and its application to two speculation mechanisms, as well as to considering tradeoffs between performance characteristics and energy costs. In this chapter, we discuss the extensibility of our framework in the context of several potential directions of extension.
5.5.1 Combining mechanisms
In a real system, one might consider using multiple speculation mechanisms. We demonstrated in Experiment 6 that using Re-instantiation can reduce the response time of a task. Therefore, from a system perspective, if a task requires a shorter response time, Re-instantiation can be chosen. However, using Re-instantiation can result in a greater number of copies running, thus increasing energy cost. Hence, when a task arrives, cloud providers can consider its QoS requirements as well as their cost and choose an appropriate mechanism accordingly.

In such a case, our models of the two mechanisms (see Chapter 5.2.2) can be combined, for instance, in the following manner. Upon task arrival, one of the mechanisms is chosen probabilistically. After this choice is made, the task always follows the same speculation mechanism. Let p_rl be the fraction of all tasks using Re-launch and p_ri be the fraction of all tasks using Re-instantiation, where p_rl + p_ri = 1. Then, the new task arrival rates for Re-launch and Re-instantiation are p_rl λ_n and p_ri λ_n, respectively.
From Eq. 5.1 and Eq. 5.7, the rate of completed tasks is

    p_rl λ_n + ( (1 − p_t) p_ri λ_n ) / ( 1 − p_i (1 − p_t) ).    (5.12)

From Eq. 5.2 and Eq. 5.8, the goodput is

    k_l / ( (p_l / (1 − p_l)) t_th + t_c^l ) + k_i / ( (p_t / (1 − p_t)) t_t + t_c^i ).    (5.13)

Given Eq. 5.12 and Eq. 5.13, we can determine the equilibrium point and then use the rest of the framework in a similar manner.
5.5.2 Other VM operations
Other potential extensions, given current resource management techniques in datacen-
ters, include VM migration, other types of speculation mechanisms, and task iteration
(where completed tasks may be (re)generated as part of the same job, e.g., as is common
in machine learning applications [96]). We consider each in turn.
VM Migration
VM migration is widely used in data centers [90]. It facilitates balancing of workload for mitigating performance degradation, as well as improvement in utilization and reduction of energy costs. Our models can be extended to VM migration easily, as depicted in Fig. 5.21(a), where λ_m is the mean arrival rate of tasks migrated from other racks, λ_t^m is the total mean arrival rate of tasks (including new tasks and migrated tasks), k_m is the mean number of VMs on a rack, and p_m is the probability that a task is migrated to other racks. Since VM migration is controlled by the system, p_m can be determined by cloud providers.
[Figure 5.21: Extensions; (a) VM migration: new tasks (λ_n) and migrated tasks (λ_m) form the total arrival rate λ_t^m to the k_m VMs on a rack, with tasks migrated out with probability p_m; (b) cloning speculation.]
Similarly to Chapter 5.2.2, the rate of completed tasks is then

    (1 − p_m)(λ_n + λ_m),    (5.14)

and the goodput is

    k_m / ( (p_m / (1 − p_m)) t_m + t_c^m ),    (5.15)

where t_m is the mean running time of migrated VMs, and t_c^m is the mean service time of a completed task.
The estimation of the response time distribution is similar to that for Re-launch, since a VM receives a new service time after it is migrated, i.e., migration is akin to re-launching a task. The only difference is that, in the case of VM migration, we can consider a task as being re-launched on a different rack, whereas in Re-launch, a task is re-launched on the same rack.
Cloning-based speculation
In the speculation mechanisms considered above, the system waits for some time before generating a speculative copy. Another approach is considered in [5], where the focus is on small jobs, i.e., jobs that consist of a few tasks. They consider full cloning of small jobs, where the basic idea is to launch multiple clones (speculative copies) immediately upon job arrival.

In this case, we can assume that there are α copies per task; thus, α VMs are allocated to these copies simultaneously, and they are released simultaneously when one of the copies completes. Therefore, we can model these α VMs as one powerful VM whose service rate is αμ. Moreover, a rack then supports fewer tasks (in this mechanism), as multiple VMs have to be allocated to each arriving task simultaneously. The cloning speculation mechanism is depicted in Fig. 5.21(b). Our modeling framework can then be used, simply by setting p_l = 0 in Fig. 5.1(a).
Task iteration
In our models, we assume that a task runs only once. However, in some applications, each task may have several iterations. By iteration here we mean that, when a task finishes, it may be (re)generated, still as part of the same job. This technique is widely adopted in machine learning, deep learning, and search applications [81]. Such applications continue updating parameter values to build a better training model; additional iterations can also refine and provide better results (e.g., in search applications). Task iteration is similar to Re-instantiation in that, when a task is finished, it may generate another task. The probability that a task will generate a successor task is controlled by the application: if an application determines that a larger number of iterations is needed, it can set this probability higher in our analytical model. Important differences between these two mechanisms include the following: (1) task iteration does not generate speculative copies and does not terminate tasks, and (2) in task iteration, a task can have multiple iterations. Thus, to estimate the response time distribution, we would need to consider how to estimate the response time distribution of sequential tasks. This is beyond the scope of this work and is part of future efforts.
5.6 Conclusions
We introduced a modeling framework for evaluating system performance in the context
of cloud computing. Specifically, we focused on speculation mechanisms and perfor-
mance degradation due to resource contention and proposed analytical models based on
analysis of interaction between demand for service and system resources. Our mod-
els can aid cloud providers in evaluating several performance metrics (e.g., goodput of
tasks, mean number of VMs running on a rack, and task/job response time distribution)
and can provide insight into choosing appropriate speculation mechanisms to satisfy
a particular QoS requirement (e.g., using the Re-instantiation mechanism can reduce
response time). Our simulation-based experiments in Chapter 5.3 demonstrated that our
models can estimate response time distribution accurately. Moreover, we explored a
hybrid task management mechanism based on our analytical models in Chapter 5.4 and
showed how the hybrid mechanism can facilitate a tradeoff between performance and
energy cost. In Chapter 5.4, we also showed that our models can help cloud providers
satisfy heterogeneous QoS requirements through appropriate task management. Fur-
thermore, we demonstrated in Chapter 5.5 that our modeling framework is extensible
and can be applied to a number of task management mechanisms.
Chapter 6
Conclusions
In this work, we study how service providers can satisfy QoS requirements through user-system interaction analysis. We focus on two types of interactions: (1) those that are incorporated into a system's design, and (2) those that result from observed system performance. We consider several different services, including P2P streaming systems, web/video services, and cloud computing. In Chapter 3 of this thesis, we first consider the important problem of how to provide incentives to encourage peers to contribute their upload capacity in P2P streaming systems. We demonstrate that service providers can use advertisements as incentives in P2P streaming systems and thus encourage peers to contribute greater upload capacity. Our study provides system developers with insight into the efficient development of such systems. In Chapter 4, we study resource estimation problems in virtual networks through user and network interaction analysis. We propose a resource estimation mechanism, and our evaluation results demonstrate that it can estimate the amount of resources accurately and efficiently. We then study two problems in the context of cloud computing: (1) how cloud providers can evaluate system performance under different goals, and (2) how cloud providers can facilitate the tradeoff between performance and cost. To this end, we propose a modeling framework for evaluating system performance. Our framework can provide insight into appropriate resource management for facilitating tradeoffs between performance commitments and cost.
References
[1] Amazon EC2. http://aws.amazon.com/ec2/.
[2] Microsoft Azure. http://azure.microsoft.com/.
[3] M. Adler, R. Kumar, K. Ross, D. Rubenstein, T. Suel, and D.D. Yao. Optimal peer
selection for p2p downloading and streaming. In Proceedings of IEEE INFOCOM,
2005.
[4] E. Altman, F. Boccara, J. Bolot, P. Nain, P. Brown, D. Collange, and C. Fenzy.
Analysis of the TCP/IP flow control in high-speed wide-area networks. In IEEE
CDC, 1995.
[5] Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, and Ion Stoica. Effective
straggler mitigation: Attack of the clones. In NSDI, pages 185–198, 2013.
[6] Ganesh Ananthanarayanan, Ali Ghodsi, Andrew Warfield, Dhruba Borthakur,
Srikanth Kandula, Scott Shenker, and Ion Stoica. Pacman: Coordinated memory
caching for parallel jobs. In NSDI, 2012.
[7] Ganesh Ananthanarayanan, Michael Chien-Chun Hung, Xiaoqi Ren, Ion Stoica,
Adam Wierman, and Minlan Yu. Grass: Trimming stragglers in approximation
analytics. In USENIX NSDI, 2014.
[8] Ganesh Ananthanarayanan, Srikanth Kandula, Albert Greenberg, Ion Stoica,
Yi Lu, Bikas Saha, and Edward Harris. Reining in the outliers in Map-Reduce
clusters using Mantri. In USENIX OSDI, 2010.
[9] David G. Andersen. Theoretical approaches to node assignment. Unpublished
Manuscript, Dec. 2002.
[10] T. Anderson, L. Peterson, S. Shenker, and J. Turner. Overcoming the internet
impasse through virtualization. IEEE Computer, April 2005.
[11] Nikhil Bansal, Tracy Kimbrel, and Kirk Pruhs. Speed scaling to manage energy
and temperature. J. ACM, 54(1):1–39, March 2007.
[12] Chadi Barakat and Eitan Altman. Analysis of TCP with Several Bottleneck Nodes.
Technical Report RR-3620, INRIA, 1999.
[13] Brian Barrett. Google’s Insane Number of Servers Visualized.
http://gizmodo.com/5517041/googles-insane-number-of-servers-visualized.
[14] Luiz Andre Barroso. Warehouse-scale computing: Entering the teenage decade.
In ACM ISCA, 2011.
[15] Andy Bavier, Nick Feamster, Mark Huang, Larry Peterson, and Jennifer Rexford.
In vini veritas: realistic and controlled network experimentation. In ACM SIG-
COMM, 2006.
[16] Xiangping Bu, Jia Rao, and Cheng-Zhong Xu. Interference and locality-aware task
scheduling for mapreduce applications in virtual clusters. In HPDC, 2013.
[17] Bin Cheng, Xuezheng Liu, Zheng Zhang, and Hai Jin. A measurement study of a
peer-to-peer video-on-demand system. In IPTPS, 2007.
[18] Xiang Cheng, Sen Su, Zhongbao Zhang, Hanchi Wang, Fangchun Yang, Yan Luo,
and Jie Wang. Virtual network embedding through topology-aware node ranking.
SIGCOMM Comput. Commun. Rev., April 2011.
[19] Xu Cheng, C. Dale, and Jiangchuan Liu. Statistics and social network of YouTube
videos. In IEEE IWQoS, Jun 2008.
[20] L. Cheung, L. Golubchik, and Fei Sha. A study of web services performance
prediction: A client’s perspective. In IEEE MASCOTS, 2011.
[21] C.L. Chiang. Statistical Methods of Analysis. World Scientific, 2003.
[22] Ron C. Chiang and H. Howie Huang. Tracon: Interference-aware scheduling for
data-intensive applications in virtualized environments. In ACM/IEEE SC, 2011.
[23] N.M.M.K. Chowdhury, M.R. Rahman, and R. Boutaba. Virtual network embed-
ding with coordinated node and link mapping. In IEEE INFOCOM, 2009.
[24] Hongyan Cui, Shaohua Tang, Xu Huang, Jianya Chen, and Yunjie Liu. A novel
method of virtual network embedding based on topology convergence-degree. In
IEEE ICC, pages 246–250, June 2013.
[25] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on
large clusters. In USENIX OSDI, 2004.
[26] D. Dietrich, A. Rizk, and P. Papadimitriou. Multi-domain virtual network embed-
ding with limited information disclosure. In IFIP Networking Conference, pages
1–9, May 2013.
[27] Marcel Dischinger, Andreas Haeberlen, Krishna P. Gummadi, and Stefan Saroiu.
Characterizing residential broadband networks. In Proceedings of the 7th ACM
SIGCOMM conference on Internet measurement, pages 43–56, 2007.
[28] Florin Dobrian, Vyas Sekar, Asad Awan, Ion Stoica, Dilip Joseph, Aditya Ganjam,
Jibin Zhan, and Hui Zhang. Understanding the impact of video quality on user
engagement. In ACM SIGCOMM, 2011.
[29] J. Fan and M. H. Ammar. Dynamic topology configuration in service overlay
networks: A study of reconfiguration policies. In IEEE INFOCOM, 2006.
[30] Nick Feamster, Lixin Gao, and Jennifer Rexford. How to lease the internet in your
spare time. ACM SIGCOMM Comput. Commun. Rev., 37:61–64, Jan. 2007.
[31] A. Fischer, J. Botero, M. Beck, H. De Meer, and X. Hesselbach. Virtual network
embedding: A survey. IEEE Communications Surveys & Tutorials, PP(99):1–19,
2013.
[32] Y. Fukushima, Yin Tao, K. Inada, and T. Yokohira. AS-friendly peer selection algorithms without AS topology information in p2p live streaming. In the 8th Asia-Pacific Symposium on Information and Telecommunication Technologies (APSITT), 2010.
[33] Anshul Gandhi, Varun Gupta, Mor Harchol-Balter, and Michael A. Kozuch. Opti-
mality analysis of energy-performance trade-off for server farm management. Per-
form. Eval., 67(11):1155–1171, 2010.
[34] T. Ghazar and N. Samaan. Pricing utility-based virtual networks. IEEE Transac-
tions on Network and Service Management, 10(2):119–132, June 2013.
[35] Phillipa Gill, Martin Arlitt, Zongpeng Li, and Anirban Mahanti. Youtube traffic
characterization: a view from the edge. In IMC, 2007.
[36] Sriram Govindan, Jie Liu, Aman Kansal, and Anand Sivasubramaniam. Cuanta:
Quantifying effects of shared on-chip resource interference for consolidated virtual
machines. In ACM Symposium on Cloud Computing (SOCC), 2011.
[37] A. Habib and J. Chuang. Incentive mechanism for peer-to-peer media streaming.
In the 12th IEEE International Workshop on Quality of Service, 2004.
[38] Xiaojun Hei, Chao Liang, Jian Liang, Yong Liu, and K.W. Ross. A measure-
ment study of a large-scale p2p iptv system. IEEE Transactions on Multimedia,
9(8):1672–1687, Dec. 2007.
[39] Poo Kuan Hoong and Hiroshi Matsuo. Push-pull incentive-based p2p live media
streaming system. WSEAS Transactions on Communications, 7:33–42, Feb. 2008.
[40] ns-2. http://isi.edu/nsnam/ns/.
[41] GENI. http://www.geni.net/.
[42] PlanetLab. http://www.planet-lab.org/.
[43] YouTube. http://www.youtube.com/.
[44] Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad:
Distributed data-parallel programs from sequential building blocks. In ACM
EuroSys, 2007.
[45] Younggyun Koh, Rob C. Knauerhase, Paul Brett, Mic Bowman, Zhihua Wen, and
Calton Pu. An analysis of performance interference effects in virtual environments.
IEEE ISPASS, 2007.
[46] Stavros G. Kolliopoulos and Clifford Stein. Improved approximation algorithms
for unsplittable flow problems. In FOCS, 1997.
[47] R. Kumar, Y. Liu, and K.W. Ross. Stochastic fluid theory for p2p streaming systems. In Proceedings of IEEE INFOCOM, 2007.
[48] Bo Li, Susu Xie, G.Y. Keung, Jiangchuan Liu, I. Stoica, Hui Zhang, and Xinyan Zhang. An empirical study of the coolstreaming+ system. IEEE Journal on Selected Areas in Communications, 25(9):1627–1639, Dec. 2007.
[49] Nicolas Liebau, Oliver Heckmann, Aleksandra Kovacevic, Andreas Mauthe, and
Ralf Steinmetz. Charging in peer-to-peer systems based on a token accounting
system. In the 5th International Workshop on Advanced Internet Charging and
QoS Technologies, pages 49–60, 2006.
[50] Ching-Chi Lin, Pangfeng Liu, and Jan-Jan Wu. Energy-aware virtual machine
dynamic provision and scheduling for cloud computing. In IEEE Cloud, 2011.
[51] Minghong Lin, A. Wierman, L.L.H. Andrew, and E. Thereska. Dynamic right-
sizing for power-proportional data centers. In INFOCOM, 2011.
[52] Jens Lischka and Holger Karl. A virtual network mapping algorithm based on
subgraph isomorphism detection. In VISA, 2009.
[53] Zhengye Liu, Yanming Shen, Shivendra S. Panwar, Keith W. Ross, and Yao Wang.
Using layered video to provide incentives in p2p live streaming. In Proceedings of
the workshop on Peer-to-peer streaming and IP-TV, pages 311–316. ACM, 2007.
[54] Zhengye Liu, Yanming Shen, S.S. Panwar, K.W. Ross, and Yao Wang. P2P video
live streaming with MDC: Providing incentives for redistribution. In IEEE Inter-
national Conference on Multimedia and Expo, pages 48–51, July 2007.
[55] Zhengye Liu, Yanming Shen, K.W. Ross, S.S. Panwar, and Yao Wang. Substream
trading: Towards an open p2p live streaming system. In IEEE International Con-
ference on Network Protocols, pages 94–103, Oct. 2008.
[56] Zhengye Liu, Yanming Shen, K.W. Ross, S.S. Panwar, and Yao Wang. Layerp2p:
Using layered video chunks in p2p live streaming. IEEE Transactions on Multi-
media, 11(7):1340–1352, Nov. 2009.
[57] Jing Lu and Jonathan Turner. Efficient mapping of virtual networks onto a shared
substrate. Washington University. Technical Report, 2006.
[58] Hui Lv, Yaozu Dong, Jiangang Duan, and Kevin Tian. Virtualization challenges:
A view from server consolidation perspective. In ACM VEE, 2012.
[59] J. J. D. Mol, D. H. J. Epema, and H. J. Sips. The orchard algorithm: P2p mul-
ticasting without free-riding. In IEEE International Conference on Peer-to-Peer
Computing, pages 275–282, 2006.
[60] Tim Moreton and Andrew Twigg. Trading in trust, tokens and stamps. In Proceed-
ings of the 2nd Workshop on Economics of Peer-to-Peer Systems, 2003.
[61] Ripal Nathuji, Aman Kansal, and Alireza Ghaffarkhah. Q-clouds: Managing per-
formance interference effects for QoS-aware clouds. In Eurosys, 2010.
[62] Dejan Novakovi´ c, Nedeljko Vasi´ c, Stanko Novakovi´ c, Dejan Kosti´ c, and Ricardo
Bianchini. Deepdive: Transparently identifying and managing performance inter-
ference in virtualized environments. In USENIX ATC, 2013.
[63] Jitendra Padhye, Victor Firoiu, Donald F. Towsley, and James F. Kurose. Modeling
TCP Reno performance: a simple model and its empirical validation. IEEE/ACM
Trans. Netw., 8:133–145, April 2000.
[64] Vinay Pai and Alexander E. Mohr. Improving robustness of peer-to-peer stream-
ing with incentives. In Proceedings of the First Workshop on the Economics of
Networked Systems, 2006.
[65] Fabio Pianese and Diego Perino. Resource and locality awareness in an incentive-
based p2p live streaming system. In Proceedings of the workshop on Peer-to-peer
streaming and IP-TV, pages 317–322. ACM, 2007.
[66] Fabio Pianese, Diego Perino, Joaqu´ ın Keller, and Ernst W. Biersack. Pulse: An
adaptive, incentive-based, unstructured p2p live streaming system. IEEE Transac-
tions on Multimedia, 9(8):1645–1660, 2007.
[67] Tianhao Qiu, Ioanis Nikolaidis, and Fulu Li. On the design of incentive-aware p2p
streaming. Journal of Internet Engineering, 1(2):61–71, 2007.
126
[68] Asfandyar Qureshi, Rick Weber, Hari Balakrishnan, John Guttag, and Bruce
Maggs. Cutting the electric bill for internet-scale systems. In ACM SIGCOMM,
2009.
[69] Sandvine. Sandvine Global Internet Phenomena Report - Spring 2011.
http://www.sandvine.com/.
[70] Wayne Schmidt. How Much TV Commercial Length has Grown over the Years.
http://www.waynesthisandthat.com/commerciallength.htm.
[71] Zhijie Shen and Roger Zimmermann. Isp-friendly peer selection in p2p networks.
In Proceedings of the 17th ACM international conference on Multimedia, 2009.
[72] T. Silverston, O. Fourmaux, and J. Crowcroft. Towards an incentive mechanism for
peer-to-peer multimedia live streaming systems. In IEEE International Conference
on Peer-to-Peer Computing, pages 125–128, Sep. 2008.
[73] Neil Spring, Ratul Mahajan, and David Wetherall. Measuring isp topologies with
rocketfuel. In ACM SIGCOMM, 2002.
[74] Sen Su, Zhongbao Zhang, Xiang Cheng, Yiwen Wang, Yan Luo, and Jie Wang.
Energy-aware virtual network embedding through consolidation. In IEEE Con-
ference on Computer Communications Workshops (INFOCOM WKSHPS), pages
127–132, March 2012.
[75] W. Szeto, Y . Iraqi, and R. Boutaba. A multi-commodity flow based approach to
virtual network resource allocation. In IEEE GLOBECOM, 2003.
[76] Y . C. Tay, Dinh Nguyen Tran, Eric Yi Liu, Wei Tsang Ooi, and Robert Morris.
Equilibrium analysis through separation of user and network behavior. Comput.
Netw., 52:3405–3420, Dec. 2008.
[77] William Thigpen, Thomas J. Hacker, Laura F. Mcginnis, and Brian D. Athey. Dis-
tributed accounting on the grid. In Proceedings of the 6th Joint Conference on
Information Sciences, pages 1147–1150, 2002.
[78] M. Till Beck, A. Fischer, H. de Meer, J.F. Botero, and X. Hesselbach. A distributed,
parallel, and generic virtual network embedding framework. In IEEE ICC, pages
3471–3475, June 2013.
[79] Dinh Nguyen Tran, Wei Tsang Ooi, and Y . C. Tay. Sax: A tool for studying
congestion-induced surfer behavior. In PAM, 2006.
[80] J.S. Turner and D.E. Taylor. Diversifying the internet. In IEEE GLOBECOM,
2005.
127
[81] Shivaram Venkataraman, Aurojit Panda, Ganesh Ananthanarayanan, Michael J.
Franklin, and Ion Stoica. The power of choice in data-aware cluster scheduling. In
OSDI, pages 301–316, October 2014.
[82] Long Vu, Indranil Gupta, Jin Liang, and Klara Nahrstedt. Measurement and mod-
eling of a large-scale overlay for multimedia streaming. In the International ICST
Conference on Heterogeneous Networking for Quality, Reliability, Security and
Robustness, 2007.
[83] Di Wu, Chao Liang, Yong Liu, and K. Ross. View-upload decoupling: A redesign
of multi-channel p2p video systems. In Proceedings of IEEE INFOCOM, 2009.
[84] Beverly Yang and Hector Garcia-Molina. Ppay: micropayments for peer-to-peer
systems. In Proceedings of the 10th ACM conference on Computer and communi-
cations security, pages 300–310, 2003.
[85] Yuan Yao, Longbo Huang, Abhishek B. Sharma, Leana Golubchik, and Michael J.
Neely. Data centers power reduction: A two time scale approach for delay tolerant
workloads. In INFOCOM, 2012.
[86] Hongliang Yu, Dongdong Zheng, Ben Y . Zhao, and Weimin Zheng. Understanding
user behavior in large-scale video-on-demand systems. In EuroSys, 2006.
[87] Minlan Yu, Yung Yi, Jennifer Rexford, and Mung Chiang. Rethinking virtual
network embedding: substrate support for path splitting and migration. SIGCOMM
Comput. Commun. Rev., April 2008.
[88] Lin Yuan and Gang Qu. Analysis of energy reduction on dynamic voltage scaling-
enabled systems. Trans. Comp.-Aided Des. Integ. Cir. Sys., 24(12):1827–1837,
November 2006.
[89] Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz, and Ion Stoica.
Improving MapReduce performance in heterogeneous environments. In USENIX
OSDI, 2008.
[90] Qi Zhang, Lu Cheng, and Raouf Boutaba. Cloud computing: state-of-the-art and
research challenges. Journal of internet services and applications, 1(1):7–18,
2010.
[91] Sheng Zhang, Zhuzhong Qian, Jie Wu, and Sanglu Lu. An opportunistic resource
sharing and topology-aware mapping framework for virtual networks. In IEEE
INFOCOM, 2012.
[92] Xinyan Zhang, Jiangchuan Liu, Bo Li, and Tak-Shing Peter Yum. Coolstream-
ing/donet: A data-driven overlay network for peer-to-peer live media streaming.
In Proceedings of IEEE INFOCOM, 2005.
128
[93] Qian Zhu and Teresa Tung. A performance interference model for managing con-
solidated workloads in QoS-aware clouds. IEEE Cloud, 2012.
[94] Qian Zhu, Jiedan Zhu, and Gagan Agrawal. Power-aware consolidation of scien-
tific workflows in virtualized environments. In ACM/IEEE SC, 2010.
[95] Yong Zhu and Mostafa H. Ammar. Algorithms for assigning substrate network
resources to virtual network components. In IEEE INFOCOM, 2006.
[96] Martin Zinkevich, Markus Weimer, Alexander J. Smola, and Lihong Li. Paral-
lelized stochastic gradient descent. In NIPS, 2010.
129
Abstract

Providing quality of service (QoS) guarantees is an important issue for service providers. To support a required QoS, service providers need to ensure that the amount of allocated resources is sufficient and must choose appropriate resource management approaches. Since users react to the QoS they experience, service providers should account for user-system interaction in both resource allocation and resource management. For example, when users open a web page or watch a video and the download rate is slow, they may abort the connection before it finishes; in that case, the resources already consumed by the connection are wasted. Moreover, a high abort rate indicates that the required QoS is not being met. Because this user behavior affects resource allocation, service providers should allocate resources so as to reduce both the abort rate and the amount of wasted resources.

In this dissertation, we focus on several services and study how service providers can satisfy the required QoS through appropriate resource allocation and resource management.

The first service we study is P2P streaming. To provide satisfactory performance, a P2P streaming system needs a sufficiently large number of peers with high upload capacities. Thus, an important problem in providing streaming services is that of offering peers appropriate incentives to contribute their upload capacity. To this end, we propose and evaluate the use of advertisements as an incentive for peers to contribute upload capacity.

We then consider web and video services and study resource estimation in virtual networks (VNs). Existing efforts on resource allocation in VNs assume that VN requests specify the exact amount of required resources; they do not consider how to determine the amount of resources needed to support a desired QoS. To this end, we propose an alternative approach: treating QoS as a constraint. That is, when a VN request is made, the service provider specifies the minimum required QoS as a constraint of that request rather than the amount of resources needed. The infrastructure provider must then determine the resource allocation necessary to meet this QoS; in particular, it must take into account user reaction to perceived performance and adjust the allocation dynamically. Consequently, we propose an estimation mechanism based on analyzing the interaction between user behavior and network performance. The proposed approach satisfies user performance requirements through appropriate resource estimation, and it can adjust resource estimates efficiently and accurately.

Finally, we focus on cloud computing. The use of cloud computing has been widely adopted
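To make the estimation idea concrete, below is a minimal sketch of QoS-constrained capacity sizing under user-behavior feedback. It illustrates only the general shape of the approach, not the dissertation's actual mechanism: the abort model, the damped fixed-point iteration, and every name and parameter (abort_probability, patience_mbps, estimate_capacity, the 5% abort-rate target) are hypothetical assumptions introduced here for illustration.

```python
# Illustrative sketch only: size a link so that the equilibrium abort rate
# meets a QoS target, capturing the feedback loop in which allocated
# resources shape user behavior, which in turn shapes the offered load.
# The abort model and all parameters below are assumed, not measured.

def abort_probability(throughput_mbps, patience_mbps=2.0):
    """Toy user model: users abort more often as their per-user
    throughput falls below a hypothetical 'patience' threshold."""
    if throughput_mbps >= patience_mbps:
        return 0.0
    return 1.0 - throughput_mbps / patience_mbps

def estimate_capacity(demand_users, target_abort_rate=0.05,
                      step_mbps=10.0, max_iters=1000):
    """Grow capacity until the user/network equilibrium satisfies the QoS
    constraint. Aborted connections consume resources before they leave,
    so under-provisioning wastes capacity as well as degrading QoS."""
    capacity = step_mbps
    for _ in range(max_iters):
        # Find the fixed point of the user-network interaction: active
        # users share capacity, poor throughput drives aborts, and aborts
        # shrink the active population. Damping stabilizes the iteration.
        active = float(demand_users)
        for _ in range(200):
            per_user = capacity / max(active, 1e-9)
            p_abort = abort_probability(per_user)
            active = 0.5 * active + 0.5 * demand_users * (1.0 - p_abort)
        if p_abort <= target_abort_rate:
            return capacity  # smallest tried capacity meeting the target
        capacity += step_mbps
    return capacity

if __name__ == "__main__":
    # e.g., capacity (in Mbps) needed so at most 5% of 100 users abort
    print(estimate_capacity(demand_users=100))
```

In the dissertation's setting, the user model would be derived from measured behavior (for instance, the traffic equilibrium analysis it builds on) and the search would run against the actual network model; the structure of the loop (estimate, observe the equilibrium, adjust) is the point of the sketch.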
Linked assets

University of Southern California Dissertations and Theses

Conceptually similar:
Distributed resource management for QoS-aware service provision
QoS-aware algorithm design for distributed systems
Performance and incentive schemes for peer-to-peer systems
Improving network security through cyber-insurance
Resource scheduling in geo-distributed computing
High-performance distributed computing techniques for wireless IoT and connected vehicle systems
Improve cellular performance with minimal infrastructure changes
Adaptive resource management in distributed systems
QoS based resource management for Internet applications
Elements of next-generation wireless video systems: millimeter-wave and device-to-device algorithms
Scaling-out traffic management in the cloud
Performant, scalable, and efficient deployment of network function virtualization
Cloud-enabled mobile sensing systems
Enabling efficient service enumeration through smart selection of measurements
Optimizing task assignment for collaborative computing over heterogeneous network devices
Enabling massive distributed MIMO for small cell networks
Optimal distributed algorithms for scheduling and load balancing in wireless networks
Measuring the impact of CDN design decisions
Modeling social and cognitive aspects of user behavior in social media
Deriving component-level behavior models from scenario-based requirements
Asset Metadata

Creator: Wang, Bo-Chun (author)
Core Title: Satisfying QoS requirements through user-system interaction analysis
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Publication Date: 02/10/2015
Defense Date: 01/13/2015
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: equilibrium, incentive, OAI-PMH Harvest, performance, quality of service, user-system interaction
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Golubchik, Leana (committee chair), Govindan, Ramesh (committee member), Psounis, Konstantinos (committee member), Yu, Minlan (committee member)
Creator Email: bochun@cyprez.net, bochunwa@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-532729
Unique identifier: UC11298701
Identifier: etd-WangBoChun-3182.pdf (filename), usctheses-c3-532729 (legacy record id)
Legacy Identifier: etd-WangBoChun-3182.pdf
Dmrecord: 532729
Document Type: Dissertation
Rights: Wang, Bo-Chun
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA