PERFORMANCE AND INCENTIVE SCHEMES FOR PEER-TO-PEER SYSTEMS

by

Wei-Cherng Liao

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)

May 2009

Copyright 2009 Wei-Cherng Liao

Acknowledgements

First, I would like to thank my parents, who gave birth to me, raised me, and have supported me throughout my entire life. Without them, I could not have finished and published this thesis. I would also like to thank all the members of my family, who have given me support of many kinds during my Ph.D. studies, as well as throughout my life.

Further, I would like to thank my advisor, Professor Konstantinos Psounis, who guided me throughout my Ph.D. years at USC. I was a newcomer to the networking community when I joined USC, but he patiently taught me how to conduct research, organize my thoughts, present results, and be a good teacher (teaching assistant). All of the above gave me the strength to finish my degree and this thesis.

I would also like to thank all my colleagues in the NETPD Lab and the ANGR Lab, as well as my committee members, for the valuable discussions and help they offered during my student life. In particular, I would like to thank Fragkiskos Papadopoulos, who worked closely with me and helped me with several research projects. Additionally, I would like to thank Pankaj Gupta, who gave me the opportunity to intern at a wonderful company, NetLogic, in 2006 and 2007. The working experience with him has been valuable not only for completing this thesis but also for my future career.

Finally, I would like to thank all my friends at USC, who gave me a colorful life in the wonderful United States.

Table of Contents

Acknowledgements
List of Figures
Abstract

Chapter 1: Introduction

Chapter 2: A Peer-to-Peer Cooperation Enhancement Scheme and its Performance Analysis
  2.1 Related Work
  2.2 A Simple and Effective Algorithm
  2.3 A Mathematical Model For The Proposed Scheme
    2.3.1 Token Dynamics
    2.3.2 A Free-rider Profile Example: The Linear Model
    2.3.3 Steady State Performance Analysis on Gnutella-like Systems
    2.3.4 Choosing the right values for $K_{on}$, $K_{up}$, and $K_{down}$
  2.4 Experiments
    2.4.1 Simulation Setup
    2.4.2 Simulation Results for $P_{on} = 1$
    2.4.3 Simulation Results for $P_{on} < 1$
    2.4.4 The Impact of Malicious Users on System's Performance

Chapter 3: Performance Analysis of BitTorrent-like Systems
  3.1 Related Work
  3.2 The BitTorrent System and the Proposed Token Based Scheme
    3.2.1 The BitTorrent System
    3.2.2 The Proposed Token Based Scheme
  3.3 Steady State Performance Analysis of BitTorrent-like Systems
    3.3.1 A Mathematical Model for the Original BitTorrent System
    3.3.2 A Mathematical Model for the Token Based System
  3.4 System Time Dynamics
    3.4.1 The Original BitTorrent System
    3.4.2 The Token-enhanced BitTorrent System
  3.5 Experiments
    3.5.1 Simulation Setup
    3.5.2 Steady State Performance Prediction and Flash Crowd Scenarios
    3.5.3 Predicting System Time Dynamics and Non-flash Crowd Scenarios
    3.5.4 Impact of the Proposed Token Based Scheme on Fairness
    3.5.5 Impact of the Proposed Token Based Scheme on Freeriders
    3.5.6 Performance Prediction for Large Systems
    3.5.7 Comparison Between the Two Models
  3.6 Comparison with Other Models
  Table 3.1: Model Comparison

Chapter 4: Performance Analysis of P2P Streaming Systems
  4.1 Related Work
  4.2 Performance Analysis for Data-Driven P2P Streaming Systems
    4.2.1 Preliminary
    4.2.2 The Model
    4.2.3 Experiments
  4.3 Trading Social Welfare, Profit and Fairness in P2P Streaming Systems
    4.3.1 The Model
    4.3.2 Simulation Results
    4.3.3 Implementing the Algorithm

Chapter 5: Conclusions

Reference List

List of Figures

2.1 The Linear Model for $P_{up}$.
2.2 Expected amount of tokens for a free-rider vs. time.
2.3 The effect of different values of $K_{up}$.
2.4 The effect of different values of $K_{on}$.
2.5 User's expected download rate.
2.6 User's expected upload rate.
2.7 Average query response time.
2.8 Average file download delay.
2.9 (a) User's expected download rate, and (b) user's expected upload rate.
2.10 (a) Average query response time, and (b) average file download delay.
2.11 (a) User's expected download rate, and (b) average file download delay, with 10% malicious users.
3.1 Time line of optimistic unchoking and choking decision making.
3.2 Evolution of the number of peers: (i) during $(t_0, t_1]$ new users join the system, (ii) during $(t_1, t_2]$ all users are present in the system, (iii) during $(t_2, t_3]$ H-BW users depart the system, and (iv) during $(t_3, t_4]$ only L-BW users are present in the system.
3.3 Average number of L-BW users that a H-BW user is uploading to: (i) Scenario 1, and (ii) Scenario 2.
3.4 Average download rate for H-BW and L-BW users: (i) Scenario 1, and (ii) Scenario 2.
3.5 Average file download delay for H-BW users, L-BW users, and for the system: (i) Scenario 1, and (ii) Scenario 2.
3.6 Average download rate for H-BW and L-BW users: (i) Scenario 1, and (ii) Scenario 2.
3.7 Average file download delay for H-BW users, L-BW users, and for the system: (i) Scenario 1, and (ii) Scenario 2.
3.8 Leecher arrival rate.
3.9 The original BitTorrent system: (i) number of users in the system, and (ii) average file download delay for H-BW, M-BW, and L-BW leechers.
3.10 The token-enhanced system: number of users in the system.
3.11 Upload-to-download ratio for H-BW and L-BW users: (i) Scenario 1, and (ii) Scenario 2.
3.12 The token-based scheme prevents freeriding: (i) Scenario 1, and (ii) Scenario 2.
3.13 Performance prediction for large systems: (i) the proposed token scheme can prevent freeriders from exploiting large systems, (ii) our fluid model can reproduce the results of a real trace, (iii) the fluid model can further show the distribution of different classes of users, and (iv) using the proposed token scheme we can trade off fairness for overall system performance.
3.14 Comparison between our two models.
3.15 Inaccuracy due to the homogeneity assumption.
3.16 Inaccuracy due to the static resource allocation assumption.
4.1 Download rate of L-BW and H-BW users in Scenario 1.
4.2 Download rate of L-BW, M-BW, and H-BW users in Scenario 2.
4.3 Download rate of L-BW, M-BW, and H-BW users in Scenario 3.
4.4 (i) Illustration of the user's utility function $u_i(x)$, and (ii) the tradeoff between social welfare and the operator's profit for Scenario 1.
4.5 (i) Tradeoff between social welfare and fairness for Scenario 2, and (ii) tradeoff between social welfare and fairness for Scenario 3.
4.6 The tradeoff between social welfare and the decision making period.

Abstract

Peer-to-peer (P2P) systems provide a powerful infrastructure for large scale distributed systems, such as file sharing and content distribution. The performance of a peer-to-peer system depends on the level of cooperation of the system's participants. While most existing peer-to-peer architectures have assumed that users are generally cooperative, there is strong evidence from widely deployed systems suggesting the opposite. To date, many schemes have been proposed to alleviate this problem. However, the majority of these schemes are either too complex to use in practice, or tailored to specific applications. Further, how these incentive schemes affect system performance has not been analytically studied. In this work we first propose a general scheme based on the idea that offering uploads brings revenue to a user, and performing downloads has a cost. We then introduce a theoretical model that predicts the performance of the system and computes the values of the scheme's parameters that achieve a desired performance for Gnutella-like systems.

Among all peer-to-peer (P2P) systems, BitTorrent seems to be the most prevalent one.
This success has drawn a lot of research interest. However, despite the large body of work, there has been no attempt to mathematically model, in a heterogeneous (and hence realistic) environment, what is perhaps the most important performance metric from an end user's point of view: the average file download delay. To this end, we propose a mathematical model that accurately predicts the average file download delay in a heterogeneous BitTorrent-like system. Our analysis can be divided into two parts: steady state analysis and fluid model analysis. Further, we propose a flexible token based scheme for BitTorrent-like systems that can be used to trade off overall system performance against fairness to high bandwidth users, by properly setting its parameters. We extend our mathematical model to predict the average file download delays in the token based system, and demonstrate how this model can be used to decide on the scheme's parameters that achieve a target performance/fairness.

The success of BitTorrent provides a vision for P2P realtime video broadcast, i.e. P2P video streaming. The third part of this thesis is the performance analysis of P2P streaming. Despite the large body of work on P2P streaming systems, to the best of our knowledge, there have been few attempts to analytically study the performance of P2P streaming systems with an MDC incentive scheme. Motivated by this, we first propose a mathematical model that predicts the performance of such systems, which can be used to choose system parameters that achieve a target performance. Additionally, there have been few attempts to study the tradeoff between the users' and the network operator's interests. With this in mind, we use a stochastic optimization framework to design algorithms that optimize the social welfare, that is, the sum of the users' utilities, under a given fairness constraint while trading off the operator's profit. Further, we use the framework to study how a fairness constraint affects system performance, i.e. social welfare. In particular, we show that one can also trade off between maintaining fairness among users and maximizing social welfare. Finally, we briefly address practical issues in implementing the proposed algorithms in practical P2P streaming systems.

Chapter 1

Introduction

Peer-to-peer (P2P) systems provide a powerful infrastructure for large-scale distributed applications, such as file sharing. As a result, they have become very popular. For example, 43% of the Internet traffic is P2P traffic [58]. While cooperation among the system's participants is a key element to achieve good performance, there has been growing evidence from widely deployed systems that peers are usually not cooperative. For example, a well known study of the Gnutella file sharing system in 2000 revealed that almost 70% of all peers only consume resources (download files), without providing any files to the system at all [1]. This phenomenon is called "free-riding". Despite the fact that this phenomenon was identified several years ago, recent studies of P2P systems show that the percentage of free-riders has significantly increased [28]. This is not because industry and academia have ignored the problem. There is a large body of work on incentive mechanisms for P2P networks, varying from centralized and decentralized credit-based mechanisms, e.g. [29, 31, 42, 66], to game-theoretic approaches and utility-based schemes, e.g. [10, 57], to schemes that attempt to identify and/or penalize free-riders, e.g. [19, 33, 63], the last two being proposed by the popular KaZaA and eMule systems. The problem of free-riders is hard to tackle because the solution has to satisfy conflicting requirements: minimal overhead, ease of use, and at the same time a good amount of fairness and resilience to hacking.
In this thesis we first propose and study the performance of an efficient algorithm that is very easy to use, enforces users to be fair, and can be implemented in a number of ways that trade off overhead against resilience to malicious users. According to the algorithm, users use tokens as a means to trade bytes within the system. A user earns $K_{up}$ tokens for each byte he/she uploads to the system and spends $K_{down}$ tokens for each byte he/she downloads from the system. The user may also gain $K_{on}$ tokens for each second his/her machine is on the system (i.e. it is online). A user can initiate a download only if the number of tokens that he/she has is large enough to download the complete file. The proposed algorithm relies on the general idea that users should be rewarded for offering uploads and staying online, and pay for performing downloads. While others have proposed solutions that use the same general idea in the past, e.g. [57, 66], there are a number of questions that either have not been addressed at all or have been studied via simulations only: (i) How should one tune the parameters that dictate the gain from uploads and the cost of downloads? Specifically to our scheme, what are the right values for the parameters $K_{on}$, $K_{up}$, and $K_{down}$? (ii) What is the exact effect of such an algorithm on overall system performance over a wide range of conditions? (iii) Would a small number of malicious users who manage to subvert the scheme degrade overall performance noticeably? (iv) Is it possible to trade off one performance metric for another by varying the parameters of the algorithm, e.g. trade off download delay for total system capacity? Our theoretical analysis of the performance of the resulting system, coupled with extensive realistic simulations, gives concrete answers to all these questions.
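The token rules just listed can be captured in a small per-user ledger. The sketch below is illustrative only (the class and method names are ours, not part of any deployed protocol); the parameter values match the numerical example used later in Chapter 2 ($K_{on} = 1000$ tokens/sec, $K_{up} = 0.5$ tokens/byte, $K_{down} = 1$ token/byte):

```python
class TokenLedger:
    """Illustrative sketch of the token rules: earn K_up tokens per byte
    uploaded, pay K_down tokens per byte downloaded, and gain K_on tokens
    per second spent online."""

    def __init__(self, k_on, k_up, k_down, initial_tokens=0.0):
        self.k_on, self.k_up, self.k_down = k_on, k_up, k_down
        self.tokens = initial_tokens

    def tick_online(self, seconds=1.0):
        self.tokens += self.k_on * seconds

    def record_upload(self, nbytes):
        self.tokens += self.k_up * nbytes

    def can_download(self, file_bytes):
        # A download may start only if the whole file is affordable.
        return self.tokens >= self.k_down * file_bytes

    def record_download(self, nbytes):
        self.tokens -= self.k_down * nbytes


ledger = TokenLedger(k_on=1000, k_up=0.5, k_down=1.0)
ledger.tick_online(60)               # one minute online earns 60,000 tokens
assert ledger.can_download(50_000)   # a 50 KB file costs 50,000 tokens
ledger.record_download(50_000)       # balance drops to 10,000 tokens
```

A real deployment would also have to protect this ledger from tampering, which is exactly the localized-versus-non-localized implementation tradeoff discussed in Chapter 2.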
Interestingly enough, it shows that the query response times and file download delays can be reduced by one order of magnitude, while the system is able to sustain higher user download demands.

Among all P2P systems, BitTorrent seems to be the most prevalent one. In particular, more than 50% of all P2P traffic is BitTorrent traffic [51]. The BitTorrent system is designed for efficient large scale content distribution. The complete BitTorrent protocol can be found in [8]. The most important feature of BitTorrent is its rate based TFT (Tit-for-Tat) unchoking scheme. Under this scheme, a user provides uploads to the four neighbors who provide him/her the highest download rates, and to one more, randomly selected neighbor, via a process called optimistic unchoking. This scheme successfully discourages freeriders in the BitTorrent system, because freeriders keep getting choked if they do not provide uploads to other users. This is important given the significant performance degradation of P2P systems due to free-riding [43]. By successfully discouraging freeriders, BitTorrent provides a fast and efficient infrastructure for large scale content distribution.

Because of the prevalence and the success of BitTorrent, there is a large body of work studying various aspects of the BitTorrent system, such as its performance analysis [21, 56, 65], incentive schemes for it [24, 36], traffic measurements [7, 30, 51, 53], and fairness issues [6, 64]. However, despite this large body of research, there has been no attempt to mathematically model, in a heterogeneous and hence realistic environment, what is perhaps the most important performance metric from an end user's point of view: the average file download delay.

Our first contribution in this respect is a simple mathematical model that accurately predicts the average file download delay in a heterogeneous BitTorrent-like system, where users may have different upload/download capacities.
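The rate-based TFT unchoking rule described above can be sketched as follows. This is a deliberately simplified illustration (it ignores the protocol's periodic re-evaluation timers and seed behavior, and the function and variable names are ours):

```python
import random

def select_unchoked(download_rates, n_regular=4):
    """Simplified sketch of BitTorrent's rate-based tit-for-tat unchoking:
    serve the n_regular neighbors currently providing the highest download
    rates, plus one randomly chosen other neighbor (the 'optimistic
    unchoke'), which gives newcomers a chance to bootstrap.

    download_rates maps neighbor id -> observed download rate (bytes/sec).
    """
    by_rate = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = set(by_rate[:n_regular])
    others = [peer for peer in download_rates if peer not in regular]
    optimistic = {random.choice(others)} if others else set()
    return regular | optimistic

rates = {"a": 900, "b": 700, "c": 650, "d": 500, "e": 10, "f": 0}
unchoked = select_unchoked(rates)
# The four fastest neighbors ("a"-"d") are always unchoked; one of the
# remaining neighbors ("e" or "f") receives the optimistic slot.
```

A freerider that uploads nothing contributes a download rate of zero to everyone, so it can only ever win the random optimistic slot, which is the intuition behind the choking argument above.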
In particular, we analyze the steady state performance of BitTorrent under the most common, flash crowd scenario. Despite its simplicity, our model is quite general: it has been derived with minimal assumptions and requires minimal system information. Our second contribution is an alternative TFT scheme based on the general token idea, which is very simple and flexible. In the proposed scheme users use tokens as a means to trade blocks. We show that the proposed scheme can be used to trade off overall system performance against fairness to high bandwidth users, by properly setting its parameters. As a final note, we revise an existing fluid model, which only considers homogeneous scenarios, to study the performance of BitTorrent in a heterogeneous environment. In addition, we also incorporate the proposed alternative TFT scheme into the fluid model. The revised fluid model can be used to study the time dynamics of BitTorrent-like systems.

The success of BitTorrent provides a vision for P2P realtime video broadcast, i.e. P2P video streaming. The third part of this thesis is the performance analysis of P2P streaming. Video broadcast/streaming over the Internet has drawn a lot of interest for a long time. J. Liu et al. give a thorough introduction to Internet video broadcast/streaming systems in [37]. They classify these systems into three categories: IP multicast, Content Distribution Network (CDN), and P2P multicast. Because IP multicast relies on router support for multicasting, and CDNs impose severe bandwidth requirements on streaming servers, P2P multicast is the most promising technique for Internet video streaming [37]. Among all P2P multicast systems, the data driven approach is the most prevalent implementation for P2P streaming systems, e.g. [54, 55, 62]. The approach is similar to the most famous file sharing system, BitTorrent, in the way that users directly "pull" useful data from their neighbors. But it differs from the BitTorrent system in that: (i) it uses a sliding window mechanism to meet the realtime constraint of streaming systems, and (ii) it does not use the TFT incentive scheme adopted in BitTorrent. From now on, we refer to systems that use the data driven approach as P2P streaming systems.

It is widely reported that P2P streaming systems have become more and more popular [2, 27]. This success has prompted plenty of studies on them. For example, [40, 50, 72] study how to design such systems, [2, 27] collect traffic measurements, [34, 70, 71, 73] analyze their performance, and [38, 39] propose incentive schemes for them. However, despite this large body of work, to the best of our knowledge, there is no prior attempt to analytically study the performance of P2P streaming systems with an MDC incentive scheme. Motivated by this, we first propose a mathematical model that predicts the performance of such systems, which can be used to choose system parameters that achieve a target performance.

In addition, there have been few attempts to study the tradeoff between users' interests, which is usually referred to as social welfare, and the operator's profit, or between social welfare and fairness. To study these problems, we use a stochastic optimization framework. The technique of stochastic optimization has been widely used to study various problems in wireless networks, including resource allocation [35, 44], power allocation [45, 47], and dynamic data compression [46]. Based on this framework, we propose a joint demand control and scheduling algorithm for P2P streaming systems: the algorithm helps users not only decide an optimal download rate, i.e. demand control, but also determine where to provide uploads, i.e. scheduling.
We show that the proposed algorithm can achieve a performance (in terms of users' social welfare) within $O(1/V)$ of the optimal, where $V$ is a control parameter, at the cost of sacrificing some of the operator's profit. In addition, we incorporate a fairness constraint into the optimization framework, which can maintain fairness among users.

The rest of this thesis is organized as follows: In Chapter 2 we present the proposed incentive scheme and its impact on Gnutella-like systems. We provide a mathematical model to compare the performance of Gnutella-like systems with and without the proposed scheme. We also present simulation results to validate the proposed model. In Chapter 3 we analyze the performance of BitTorrent in two different ways: steady state analysis and time dynamics analysis. Further, we propose an alternative TFT scheme for BitTorrent-like systems based on the proposed token idea. We also present our experimental results to verify our models. In Chapter 4 we use two different approaches to analyze the performance of P2P streaming systems. Based on our framework for BitTorrent, we first analyze the performance of data-driven P2P streaming systems with the MDC scheme. Then, we use a stochastic optimization framework to optimize and study the performance of P2P streaming systems. Conclusions are drawn in Chapter 5.

Chapter 2

A Peer-to-Peer Cooperation Enhancement Scheme and its Performance Analysis

In this section we propose and study the performance of an efficient algorithm that is very easy to use, enforces users to be fair, and can be implemented in a number of ways that trade off overhead against resilience to malicious users. According to the algorithm, users use tokens as a means to trade bytes within the system. A user earns $K_{up}$ tokens for each byte he/she uploads to the system and spends $K_{down}$ tokens for each byte he/she downloads from the system. The user may gain $K_{on}$ tokens for each second his/her machine is on the system (i.e. it is online).
A user can initiate a download only if the number of tokens that he/she has is large enough to download the complete file.

2.1 Related Work

There has been a large body of work on incentive mechanisms for P2P networks. Three of the most popular localized schemes are the ones implemented by the eMule [63], KaZaA [32], and BitTorrent [8] systems. eMule rewards users that provide files to the system by reducing their waiting time when they upload files, using a scoring function (called QueueRank). Similarly, in KaZaA, each peer locally computes its Participation Level as a function of download and upload volumes, and peers with high participation levels have higher priority [33]. A disadvantage of both of these schemes is that they provide relatively weak incentives for cooperation, since peers that have not contributed to the system at all can still benefit from it if they are patient enough to wait in the upload queues. Other problems include that they favor users with high access bandwidth, which may result in frustration or a feeling of unfairness [9], and that they are vulnerable to the creation and distribution of hacked daemons that do not conform to the desired behavior [23]. BitTorrent uses a different scheme that is specific to its architecture. Each peer periodically stops offering uploads to those of its neighbors that haven't been offering uploads to him/her recently. This scheme is hard to subvert. However, it suffers from some unfairness issues, and it only works with "BitTorrent-style" systems, that is, systems where files are broken into pieces, and downloading a file involves being connected to almost all of one's neighbors in order to collect and reassemble all the pieces of the file. Non-localized proposals are primarily concerned with creating systems that cannot be subverted. Some of them make use of credit/cash-based systems.
They achieve protection from hackers by either using central trusted servers to issue payments (centralized approach), e.g. [29, 42], or by distributing the responsibility of transactions to a large number of peers (distributed approach), e.g. [66]. Other distributed approaches use lighter-weight exchange-based mechanisms, e.g. [3], or reputation management schemes, e.g. [31]. These mechanisms are indeed hard to subvert, but they are also quite complex to use in practice [3].

In this work we decouple the issue of how to design an algorithm to prevent free-riding from the issue of how to implement this algorithm in a P2P system. We first propose an efficient scheme that provides very strong incentives for cooperation. We show this via both theory and simulations. Then, we show that the scheme is generally applicable to any P2P system and comment on how to implement it using either a localized or a non-localized approach. Another important contribution is the theoretical analysis of the performance of a P2P system with and without the proposed scheme. The analysis yields a set of equations that are used to predict the system's performance under a wide range of conditions, and to tune the parameters of the scheme.

2.2 A Simple and Effective Algorithm

As mentioned earlier, the algorithm uses tokens as a means to trade bytes within the system. Each user is given an initial number of tokens $M$ when he/she first joins the network. This allows new users to start downloading a small number of files as soon as they join the system. When a user rejoins the system, he/she uses the amount of tokens he/she previously had.

Users spend $K_{down}$ tokens for each byte they download from the system and earn $K_{up}$ tokens for each byte they upload to the system. This forces users to offer files for upload in proportion to the number of files they want to download. Further, users gain $K_{on}$ tokens/sec while being online. This mechanism of accumulating tokens serves two purposes. First, it allows users who are not contacted frequently for uploads to gain tokens by just being online, which is more fair towards users with low access bandwidth [9]. Second, it provides an incentive for users to keep their machines on the system even when they are not downloading a file, which helps to prevent the so-called problem of "low availability" [4]. Note that the value of $K_{on}$ should be relatively small, in order to prevent users from gaining many tokens by just keeping their machines on without providing any uploads. Finally, a user can initiate a download only if the number of tokens he/she currently possesses is greater than or equal to the number of tokens required to download the requested file.

This scheme provides strong incentives for cooperation. Free-riders are "forced" to provide some uploads to the system in order to gain tokens fast enough to sustain their desired download demands. Some free-riders may decide to share their files as soon as they are out of tokens. Others may adopt a more dynamic behavior and decide to adjust the number of uploads they provide to the system as a function of the number of tokens they currently have. In any case, the change in free-rider behavior increases the amount of available system resources tremendously, which, in turn, significantly improves the system's performance, as we shall see in Section 2.4.

2.3 A Mathematical Model For The Proposed Scheme

In this section we derive a mathematical model, which can be used to tune the parameters of the scheme and to predict the system's performance. Tuning the parameters of the scheme is important because an arbitrary setting may lead to several undesired situations. For example, giving a large value to $K_{on}$ may provide tokens to the free-riders fast enough that there won't be any reason for them to start sharing their files with the system.
As another example, giving relatively small values to both $K_{on}$ and $K_{up}$ may reduce the token accumulation rate of cooperative users so much that they cannot sustain their download demands. Further, predicting the performance of the system from the model is beneficial because the alternative is P2P simulations/experiments, and those either involve a significantly smaller number of peers than the number in reality, or are prohibitively expensive.

2.3.1 Token Dynamics

We assume a system that implements the proposed scheme, which we call the "system with the tokens". Recall that $K_{down}$ and $K_{up}$ are expressed in tokens/byte and $K_{on}$ in tokens/sec. Now, let $C_{down}$ and $C_{up}$ denote the file download and upload speeds of a user (access line bandwidth), both expressed in bytes/sec. The user spends $K_{down} C_{down}\,dt$ tokens if he/she is downloading files from other peers during time $(t, t+dt)$. Also, he/she earns $K_{on}\,dt$ tokens if he/she is online during time $(t, t+dt)$, and $K_{up} C_{up}\,dt$ tokens if other users are downloading files from the user under study during time $(t, t+dt)$. Let $T(t)$ denote the number of the user's tokens at time $t$, with $T(0) \geq 0$. We can then write the following differential equation:

$$\frac{dT(t)}{dt} = K_{on} I_{on}(t) + K_{up} C_{up} I_{up}(t) - K_{down} C_{down} I_{down}(t), \qquad (2.1)$$

where

$$I_{on}(t) = \begin{cases} 1 & \text{if the user is online in } (t, t+dt) \\ 0 & \text{otherwise,} \end{cases}$$

$$I_{up}(t) = \begin{cases} 1 & \text{if the user provides uploads in } (t, t+dt) \\ 0 & \text{otherwise,} \end{cases}$$

$$I_{down}(t) = \begin{cases} 1 & \text{if the user performs downloads in } (t, t+dt) \\ 0 & \text{otherwise.} \end{cases}$$
Taking expectations on both sides of Equation (2.1), and interchanging the expectation with the derivative on the left hand side¹, we get:

$$\frac{dE[T(t)]}{dt} = K_{on} P_{on}(t) + K_{up} C_{up} P_{up}(t) - K_{down} C_{down} P_{down}(t), \qquad (2.2)$$

where $P_{on}(t)$ is the probability that the user is online at time $t$, $P_{up}(t)$ is the probability that the user provides uploads to the system at time $t$, and $P_{down}(t)$ is the probability that the user performs downloads from the system at time $t$. Note that Equation (2.2) can be regarded as a fluid model describing the token dynamics.

$P_{on}(t)$, $P_{up}(t)$, and $P_{down}(t)$ depend on how the user behaves given the number of tokens that he/she has at some point in time, and on his/her download demands. Along these lines, one can define user profiles, e.g. for non-freerider and free-rider users, and solve the corresponding differential equation. The solution can be used to study how the expected amount of tokens of the particular class of users evolves as a function of time, for different values of the scheme's parameters and for different download and upload speeds. We next demonstrate this by considering free-rider users, under a simplistic profile that captures their main behavior. (One can also perform a similar study for non-freerider users.)

¹ Taking into account that $T(t)$ is bounded in practice, we can use the bounded convergence theorem [17] to justify the interchange.

2.3.2 A Free-rider Profile Example: The Linear Model

As mentioned earlier, free-riders are motivated to provide uploads to the system when they do not have enough tokens to sustain their download demands, and they may lose their willingness to provide uploads as the amount of tokens they possess grows larger than the amount of tokens they need. To capture the main characteristics of this behavior we introduce a simple model, which we call the Linear Model.
According to the Linear Model, a free-rider provides uploads to the system with a high (constant) probability, say P_upmax, when his/her amount of tokens is less than some threshold, say T_th1. When his/her amount of tokens reaches T_th1, the probability that he/she provides uploads to the system decreases linearly as the amount of tokens he/she possesses continues to increase. (This is the reason why we call it the Linear Model.) The decrease continues until the amount of tokens he/she has reaches some threshold, say T_th2. After T_th2, we assume that the free-rider provides uploads to the system with some low (constant) probability, say P_upmin. This behavior is depicted in Figure 2.1 and characterized by Equation (2.3).²

[Figure 2.1: The Linear Model for P_up.]

    P_up(t) = P_upmax                                              if T(t) ≤ T_th1,
    P_up(t) = P_upmin                                              if T(t) ≥ T_th2,
    P_up(t) = P_upmax (1 − C_1 (T(t) − T_th1)/(T_th2 − T_th1))     otherwise,    (2.3)

where C_1 = (P_upmax − P_upmin)/P_upmax.

² Determining more realistic free-rider profiles of their upload behavior as a function of the amount of tokens they possess is an interesting and challenging research problem in its own right, and it is out of the scope of this paper.

For ease of exposition, we assume that P_on(t) = 1, i.e. users are always online, and that P_down(t) = P_down, i.e. independent of time t. Then, Equation (2.2) along with Equation (2.3) yields:

    dE[T(t)]/dt = K_on + C_2 − C_3                                            if T(t) < T_th1,
    dE[T(t)]/dt = K_on − C_3 + C_2 (1 − C_1 (T(t) − T_th1)/(T_th2 − T_th1))   otherwise,    (2.4)

where C_2 = K_up C_up P_upmax and C_3 = K_down C_down P_down.

Solving the above differential equation with an initial condition T(0) = 0 gives:

    E[T(t)] = (K_on + C_2 − C_3) t                                                              if t < t_0,
    E[T(t)] = T_th1 + (K_on + C_2 − C_3)/C_4 + ((C_3 − K_on − C_2)/C_4) e^(−C_4 (t − t_0))      otherwise,    (2.5)

where

    C_4 = K_up C_up (P_upmax − P_upmin)/(T_th2 − T_th1),
    t_0 = T_th1/(K_on + K_up C_up P_upmax − K_down C_down P_down).
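The Linear Model and the closed-form solution above can be evaluated directly. The following is a sketch only (all function names are ours); it implements Equations (2.3) and (2.5), together with the steady-state expressions that are derived next from them:

```python
import math

def p_up(T, P_upmax, P_upmin, T_th1, T_th2):
    """Upload probability of the Linear Model, Equation (2.3)."""
    if T <= T_th1:
        return P_upmax
    if T >= T_th2:
        return P_upmin
    C1 = (P_upmax - P_upmin) / P_upmax
    return P_upmax * (1.0 - C1 * (T - T_th1) / (T_th2 - T_th1))

def expected_tokens(t, K_on, K_up, K_down, C_up, C_down,
                    P_down, P_upmax, P_upmin, T_th1, T_th2):
    """Closed-form E[T(t)] of Equation (2.5), with T(0) = 0 and P_on = 1."""
    C2 = K_up * C_up * P_upmax
    C3 = K_down * C_down * P_down
    A = K_on + C2 - C3              # net token earning rate while T < T_th1
    C4 = K_up * C_up * (P_upmax - P_upmin) / (T_th2 - T_th1)
    t0 = T_th1 / A                  # time at which E[T(t)] reaches T_th1
    if t < t0:
        return A * t
    return T_th1 + (A / C4) * (1.0 - math.exp(-C4 * (t - t0)))

def steady_state(K_on, K_up, K_down, C_up, C_down,
                 P_down, P_upmax, P_upmin, T_th1, T_th2):
    """T_ss and P_upss, i.e. the limits implied by setting (2.4) to zero."""
    C1 = (P_upmax - P_upmin) / P_upmax
    C2 = K_up * C_up * P_upmax
    C3 = K_down * C_down * P_down
    T_ss = (K_on + C2 - C3) * (T_th2 - T_th1) / (C1 * C2) + T_th1
    P_upss = (K_down * P_down * C_down - K_on) / (K_up * C_up)
    return T_ss, P_upss
```

Two consistency checks fall out immediately: expected_tokens(t, ...) approaches T_ss as t grows, and p_up evaluated at T_ss returns P_upss, i.e. the token earning and spending rates balance in the long run.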
We can now assign values to K_on, K_up, K_down, C_up, C_down, P_upmax, P_upmin, T_th1, T_th2, and P_down, and use Equation (2.5) to study the dynamics of the expected amount of tokens of a free-rider. For example, Figure 2.2 shows how the expected number of tokens evolves as a function of time, when K_on = 1000 tokens/sec, K_up = 0.5 tokens/byte, K_down = 1 tokens/byte, C_up = 1.5Mbps, C_down = 1.5Mbps, P_upmax = 1, P_upmin = 0.1, T_th1 = 10^6 tokens, T_th2 = 10^7 tokens, and P_down = 0.3. From the plot we can observe that, initially, the expected amount of tokens increases linearly as a function of time, which represents the first part of Equation (2.5) (where t < t_0 = 84). After the amount of the free-rider's tokens reaches threshold T_th1 = 10^6 (which occurs at time t_0 = 84), the token accumulation rate starts decreasing, since the free-rider now provides uploads to the system with a lower probability. Finally, the user adapts to an equilibrium point, where the token spending rate equals the token earning rate, and the expected amount of tokens settles to a steady state value, which we refer to as T_ss.³

³ Similar results hold for other values of the model parameters.

[Figure 2.2: Expected amount of tokens for a free-rider vs. time.]

Notice that T_ss can be obtained directly by either letting t → ∞ in Equation (2.5), or by setting Equation (2.4) equal to zero, i.e. without the need of solving the differential equation first. It is given by:

    T_ss = (K_on + C_2 − C_3)(T_th2 − T_th1)/(C_1 C_2) + T_th1.    (2.6)

Now, substituting Equation (2.6) into Equation (2.3), one can also get an expression for the steady state upload probability. This probability, denoted by P_upss, is given by:

    P_upss = (K_down P_down C_down − K_on)/(K_up C_up).
    (2.7)

Equations (2.6) and (2.7), which are much simpler than Equations (2.5) and (2.3), can be used for "back-of-the-envelope" calculations, and for gaining a better intuition on the long-run system behavior.

We now proceed to study the impact of different values of K_up and K_on on the token dynamics. First, let us keep K_on and the rest of the parameters fixed, and vary K_up. The results are shown in Figure 2.3.

[Figure 2.3: The effect of different values of K_up.]

From the plots, we observe that as K_up increases, T_ss increases and P_upss decreases. This can be easily justified by looking at Equations (2.6) and (2.7). Intuitively, as K_up increases, free-riders accumulate tokens faster and are able to settle to a larger T_ss. At the same time, they need to provide fewer uploads in order to gain the tokens they need to sustain their download demands, which yields a smaller P_upss.

Now, let us study the impact of the proposed scheme when we vary K_on, keeping the rest of the parameters fixed. The results are shown in Figure 2.4.

[Figure 2.4: The effect of different values of K_on.]

First, from the plots we observe that the effect of choosing different values of K_on is subtle when K_on is much smaller than the corresponding (long-run) token earning rate from uploads, which is K_up C_up P_upss. This is expected, as small K_on values do not have a great influence on the mechanism by which free-riders can accumulate tokens, and thus T_ss and P_upss remain
On the other hand, a comparatively large value of K on can have a great impact on the token accumulation mechanism, and thus on T ss and P upss . In particular, as we can see from the plots, it can significantly increase T ss and decrease P upss .As before, this can be easily justified by looking at Equations (2.6) and (2.7). Intuitively, with larger values of K on , free-riders can accumulate tokens faster and are able to settle to a larger T ss . And this can be accomplished by just staying online, without the need of providing many uploads to the system, which yields a smaller P upss . 2.3.3 Steady State Performance Analysis on Gnutella-like Systems So far, we have seen how one can use Equation (2.2) in order to study the token dynam- ics for a specific class of users (both transient and steady state behavior), and we have demonstrated the effect of using different values of the scheme’s parameters on the to- ken accumulation and the upload probability. We are now interested in using Equation 19 (2.2) to derive a mathematical model that can be used for predicting steady state perfor- mance metrics (such as user download/upload rates, which we will define shortly), and for tuning the scheme’s parameters in order to achieve a target performance. To study the steady state we set dE[T(t)] dt =0 and drop the time dependence from the probabilities in Equation (2.2). Note that the existence of a steady state can be easily jus- tified for a free-rider independently of his/her exact profile, by taking into consideration that in the long run he/she will spend as many tokens as he/she gains. 4 Now, let R up be the long-run average rate of file upload requests per second that the user handles, which we refer to as the upload rate. Also, let R down be the long-run average rate of file download requests per second that the user initiates, which we refer to as the download rate. Last, let S denote the average file size in the system in bytes. 
Then, it is easy to see that P_up = R_up S/C_up and P_down = R_down S/C_down. Equation (2.2) in steady state yields:

    K_on P_on + K_up R_up S − K_down R_down S = 0.

Taking the average over all free-riders yields:

    K_up = K_down E[R_down|FR]/E[R_up|FR] − K_on P_on^FR/(E[R_up|FR] S),    (2.8)

where P_on^FR is the long-run probability that a free-rider is online.⁵ Equation (2.8) relates the parameters of the scheme, K_on, K_up, and K_down, with the average download and upload activity of free-riders. We will later use it to select the parameter values that yield a target performance. But first, we need to compute the average download and upload rates.

⁴ Considering the existence of a steady state for a non-freerider is a bit more involved. As we will shortly see, he/she may or may not have a steady state. Nevertheless, this will not be important for the system dynamics.

⁵ Similarly, we denote the long-run probability that a non-freerider is online by P_on^NF.

2.3.3.1 User Download Rate (R_down)

Let N be the number of peers in the system and let a proportion α of them be free-riders. Assume that free-riders are uniformly distributed over the system. Also, assume that both non-freeriders and free-riders have the same download demands. In particular, they have the same query request rate, denoted by R_q queries/sec, and the same preference over files, that is, each query is for file i with some probability Q_f(i), irrespective of the query issuer.

Now, in order to proceed, we have to define an upload profile for free-riders, which determines how they respond to query requests given the number of tokens they have (i.e. when do they initiate an upload?). To make the analysis tractable, we assume the following simple upload profile: free-riders respond to a query request only if the amount of tokens they currently have is less than the amount required to download the file they currently desire. (Recall that non-freeriders always respond to query requests.)
Let P_ans(i) be the probability that a query request for file i is successfully answered. Now, recall that in the system with the tokens a user can initiate a download only if the amount of tokens he/she has is larger than the amount required to download the file. Let P_tkn^FR and P_tkn^NF denote respectively the probability that a free-rider and a non-freerider have enough tokens to initiate a download. Then, we can express the average download rate of free-riders and non-freeriders as follows:

    E[R_down|FR] = Σ_i R_q · Q_f(i) · P_ans(i) · P_tkn^FR · P_on^FR,    (2.9)
    E[R_down|NF] = Σ_i R_q · Q_f(i) · P_ans(i) · P_tkn^NF · P_on^NF,    (2.10)

where the summation is taken over all files i. Clearly, the average download rate over all users in the system is:

    E[R_down] = E[R_down|FR] · α + E[R_down|NF] · (1 − α).    (2.11)

To complete the calculation of the download rates, note that R_q, Q_f(i), P_on^FR, P_on^NF, and α are given quantities. (There exists a large body of work on measurement studies of P2P systems, e.g. [13, 59], from which one can deduce typical values for these quantities.) Hence, what remains is to compute P_ans, P_tkn^FR, and P_tkn^NF.

We start by deriving a relation between P_tkn^FR and P_tkn^NF. First, recall that in steady state the token earning rate equals the token spending rate for each free-rider. A free-rider responds to a query request only when he/she does not have enough tokens, and certainly only when he/she is online, i.e. with probability (1 − P_tkn^FR) P_on^FR. Since a non-freerider always responds to a query request when he/she is online, it is easy to see that the token earning rate of free-riders over that of non-freeriders equals (1 − P_tkn^FR) P_on^FR / P_on^NF. Now, the token spending rate is proportional to the download rate, and Equations (2.9) and (2.10) imply that the token spending rate of free-riders over that of non-freeriders equals (P_on^FR P_tkn^FR)/(P_on^NF P_tkn^NF).
Assuming that non-freeriders are also in steady state (in which case Equation (2.8) also holds if the average is taken with respect to non-freeriders only), we can equate these two ratios and write P_tkn^NF = P_tkn^FR/(1 − P_tkn^FR). Clearly, this equality is valid for P_tkn^FR ≤ 0.5. In particular, when P_tkn^FR = 0.5, P_tkn^NF = 1, which implies that non-freeriders always have enough tokens to initiate downloads. For P_tkn^FR > 0.5, the last equality no longer holds. In this case the token earning rate of non-freeriders will be larger than their token spending rate, which implies that their amount of tokens will continuously increase. However, this still suggests that P_tkn^NF = 1. We can now write:

    P_tkn^NF = min(1, P_tkn^FR/(1 − P_tkn^FR)).    (2.12)

Now let us find a relation for P_ans(i). First, assume that due to congestion at the overlay layer [25], each message (either a query request or a query response) has a probability p of being dropped at some peer.⁶ Then, if L is the average number of overlay hops until a query is answered, P_drop = 1 − (1 − p)^L is the probability that the query response is lost. Next, observe that if K ≤ N is the average number of peers that a query request can reach if all users were online, the request can be answered by an average of K · (P_on^FR · (1 − P_tkn^FR) · α + P_on^NF · (1 − α)) peers, which we call K_eff. Finally, let P_f(i) be the probability that a peer has file i. We can then write:

    P_ans(i) = 1 − (1 − P_f(i) · (1 − P_drop))^K_eff.    (2.13)

⁶ This assumption is introduced to make the model more general. A well designed system usually has p ≈ 0, which is accomplished by setting the buffer size of the TCP socket sufficiently large.

2.3.3.2 User Upload Rate (R_up)

The total number of downloads equals the total number of uploads in any system, and thus the expected download and upload rates over all nodes are also equal. This does not mean that all peers provide uploads.
For example, in a system that does not implement the proposed scheme, E[R_down] = E[R_up], but we know that only non-freeriders provide uploads, i.e. E[R_up|FR] = 0, and hence E[R_up|NF] = E[R_down]/(1 − α). On the other hand, in the system with the tokens each free-rider answers a query request with probability P_on^FR (1 − P_tkn^FR). As a result, this system behaves as if there are N · (P_on^NF · (1 − α) + P_on^FR · α · (1 − P_tkn^FR)) non-freeriders. It is easy to see that the expected upload rate of each non-freerider is now given by:

    E[R_up|NF] = P_on^NF · E[R_down] / (P_on^NF · (1 − α) + P_on^FR · α · (1 − P_tkn^FR)).    (2.14)

And, since E[R_up] = E[R_down], the expected upload rate of each free-rider equals:

    E[R_up|FR] = P_on^FR · (1 − P_tkn^FR) · E[R_down] / (P_on^NF · (1 − α) + P_on^FR · α · (1 − P_tkn^FR)).    (2.15)

2.3.4 Choosing the right values for K_on, K_up, and K_down

We use P_tkn^FR as the design parameter of our system, since it dictates how often free-riders offer uploads, which, in turn, specifies the average amount of available resources in the system. We are given the query- and file-popularity probability functions Q_f(i) and P_f(i), the query request rate R_q, the user statistics P_on^FR and P_on^NF, and information about the overlay network. (For example, information about the overlay network includes the percentage of free-riders α, the socket buffer sizes that dictate the drop probability p, and the structure of the overlay graph as well as the search algorithm, which dictate the number of peers K that a query reaches and the average path length L between a query issuer and a query responder.) We want to find a set of values for K_on, K_up and K_down that will satisfy a target P_tkn^FR, and, in turn, a target system performance.

First, observe from Equation (2.8) that it is the relative values of K_on, K_up, and K_down that are important for the proper operation of the system.
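The rate model of Equations (2.9)-(2.15), together with Equations (2.12) and (2.13), can be collected into one helper that also carries out the parameter selection described next. This is only an illustrative sketch under our own naming; the per-file lists Q_f and P_f and the overlay statistics K, L, p are assumed inputs:

```python
def choose_parameters(P_tkn_FR, alpha, P_on_FR, P_on_NF,
                      R_q, Q_f, P_f, K, L, p, S, K_down=1.0):
    """Pick K_on and K_up for a target P_tkn_FR (the design parameter).

    Q_f[i], P_f[i]: query popularity / replication probability of file i.
    K: avg. peers reached by a query; L: avg. overlay hops; p: per-hop
    message drop probability; S: avg. file size in bytes."""
    # Equation (2.12): token availability of non-freeriders.
    P_tkn_NF = min(1.0, P_tkn_FR / (1.0 - P_tkn_FR))
    # Equation (2.13): per-file answer probability, summed against Q_f.
    P_drop = 1.0 - (1.0 - p) ** L
    K_eff = K * (P_on_FR * (1 - P_tkn_FR) * alpha + P_on_NF * (1 - alpha))
    hit = sum(qf * (1.0 - (1.0 - pf * (1.0 - P_drop)) ** K_eff)
              for qf, pf in zip(Q_f, P_f))       # sum_i Q_f(i) P_ans(i)
    # Equations (2.9)-(2.11): download rates.
    R_down_FR = R_q * hit * P_tkn_FR * P_on_FR
    R_down_NF = R_q * hit * P_tkn_NF * P_on_NF
    R_down = alpha * R_down_FR + (1 - alpha) * R_down_NF
    # Equation (2.15): free-rider upload rate.
    denom = P_on_NF * (1 - alpha) + P_on_FR * alpha * (1 - P_tkn_FR)
    R_up_FR = P_on_FR * (1 - P_tkn_FR) * R_down / denom
    # Step (iv): K_on one order of magnitude below the spending rate.
    K_on = 0.1 * K_down * R_down_FR * S
    # Step (v): Equation (2.8).
    K_up = (K_down * R_down_FR / R_up_FR
            - K_on * P_on_FR / (R_up_FR * S))
    return {"K_on": K_on, "K_up": K_up, "K_down": K_down,
            "R_down_FR": R_down_FR, "R_up_FR": R_up_FR}
```

By construction, the returned values satisfy the free-rider token balance K_on P_on^FR + K_up E[R_up|FR] S = K_down E[R_down|FR] S of Equation (2.8).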
Recall also that K_on should be sufficiently smaller than the token spending rate of free-riders. This is to prevent them from accumulating enough tokens by just staying online without offering any uploads. Thus, we should have K_on ≪ K_down E[R_down|FR] S.

With the above observations in mind, we proceed as follows in order to satisfy the target P_tkn^FR: (i) fix K_down to some arbitrary value; (ii) use Equation (2.12) to compute P_tkn^NF (to guarantee that cooperative users will not be penalized, P_tkn^NF should be close to 1); (iii) use Equations (2.9) and (2.13) to compute the value of E[R_down|FR], and Equations (2.15), (2.11) and (2.10) to compute E[R_up|FR]; (iv) assign a value to K_on which is one order of magnitude smaller than K_down E[R_down|FR] S (the specific value turns out not to affect the performance sizeably); and (v) use Equation (2.8) to find the right value for K_up.

Conversely, if we are given the values of K_on, K_up, and K_down, we can use our equations to predict quantities like E[R_down|FR], E[R_down|NF], E[R_up|FR], and so on.⁷ In the next Section we verify the accuracy of our analysis via experiments on top of TCP networks, and show the impact of the proposed scheme on the system's performance.

2.4 Experiments

2.4.1 Simulation Setup

For our experiments we use GnutellaSim [49], a packet-level peer-to-peer simulator built on top of ns-2 [48], which runs as a Gnutella system. We implement the file downloading operation using the HTTP utilities of ns-2. We use a 100-node transit-stub network topology as the backbone, generated with GT-ITM [11]. We attach a leaf node to each stub node. Each leaf node represents a peer. The propagation delays assigned to the links of the topology are proportional to their length and are in the order of ms. We assign capacities to the network such that the congestion levels are moderate. The capacity assigned to a peer's access link is 1.5Mbps.
In order to test the algorithm on a general Gnutella-like unstructured P2P network, we use Gnutella v0.4, which uses pure flooding as the search algorithm and does not distinguish between peers. The TTL for a query request message is set to 7 (the default value used in Gnutella).

⁷ Note that we can also use Equations (2.9)-(2.15) to compute upload/download rates in a system that does not implement the scheme, by setting P_tkn^FR = 1.

For simulation purposes we implement the following user behavior: each user initiates query requests at the constant rate of 1 query every 20sec. Once a timeout for a query request occurs, the corresponding query is retransmitted. The maximum number of retransmissions is set to 5,⁸ and the timeout to 60sec. There are 1000 distinct files in the system, i = 1...1000. A query request is for file i with probability proportional to 1/i (Zipf distribution). The number of replicas of a certain file is also described by a Zipf distribution with a scaling parameter equal to 1, and the replicas of a certain file are uniformly distributed among all peers. These settings are in accordance with measurement studies from real P2P networks [13, 59].

We distinguish two systems: (i) the original system, which does not implement the proposed algorithm, and (ii) the system with the tokens. In both systems, 85% of peers are free-riders, in accordance with the percentage reported in [28]. Finally, the file size is set to 1MB.

We first perform simulations for the following two scenarios: (i) when P_on = 1, i.e. when all peers initially join the system and never go offline, and (ii) when P_on < 1, i.e. when peers dynamically join and leave the system according to P_on. Then, we study the impact on the system's performance when some malicious users subvert the proposed scheme.

⁸ This corresponds to the situation where a user quits searching for a certain file, after reattempting an unsuccessful search, e.g.
by using a different keyword, for 5 consecutive times.

2.4.2 Simulation Results for P_on = 1

2.4.2.1 Download and Upload Rates

For various values of the design parameter P_tkn^FR we compute the corresponding values of K_on, K_up and K_down according to the procedure described in the previous Section. We then assign these values to all users of the system and compare the theoretical download and upload rates with the experimental results. Figures 2.5 and 2.6 show respectively the expected download and upload rate over all non-freeriders, over all free-riders, and over all users of the system, as a function of P_tkn^FR.

[Figure 2.5: User's expected download rate.]

The horizontal line in Figure 2.5 represents the expected download rate of a user in the original system. (Clearly, in the original system E[R_down] = E[R_down|FR] = E[R_down|NF].) The horizontal line in Figure 2.6 represents the expected upload rate of a non-freerider in the original system. (Recall that in this system E[R_up|FR] = 0.)

[Figure 2.6: User's expected upload rate.]

It is clear from the plots that the analytical and simulation results match. Further, we can make several interesting observations.
First, notice that as P_tkn^FR increases, the download rate for both classes of users first increases and then starts decreasing until it reaches the value of the original system. Second, observe that while the upload rate of free-riders behaves in a similar manner, the upload rate of non-freeriders continuously increases until it reaches its original value. Based on these observations we divide the plots into three regions.

The first region corresponds to P_tkn^FR < 0.32. In this region, both classes of peers are constrained to a lower download rate compared to the original system, since the probability of having tokens to initiate a new download after a successful query is pretty low. Notice that for P_tkn^FR = 0.32, and hence for P_tkn^NF = 0.47 < 1, cooperative users can at least sustain the same download rate they had in the original system.

The second region corresponds to 0.32 ≤ P_tkn^FR ≤ 0.55. In this region, users accumulate tokens at a higher rate than before. Since there are more responses than in the original network, users can use the extra tokens to initiate more downloads. Notice that cooperative users earn tokens faster than free-riders, since they always respond to query requests. At P_tkn^FR = 0.55, non-freeriders achieve their maximum download rate, which is approximately twice the one they had in the original system.

Finally, the third region corresponds to 0.55 < P_tkn^FR ≤ 1. In this region free-riders accumulate tokens faster than before and reduce their query response rate, since they do not need to provide as many uploads as before. This causes cooperative users to handle more uploads. Further, since the query response rate regulates the download rate, the latter also decreases. At P_tkn^FR = 1, the two systems have approximately the same performance, as expected.
2.4.2.2 Impact On Delays

Figures 2.7 and 2.8 show respectively the average query response time (which includes retransmissions) and the average download delay as a function of P_tkn^FR. The plots can be divided into the same three regions as before. For P_tkn^FR < 0.32, the low user download rate imposes a low load on the network, which yields the low delays. For 0.32 ≤ P_tkn^FR ≤ 0.55, as the user download rate increases, the load in the network, and hence the delays, also increase. Note that the query and download delays are still significantly smaller than in the original system, despite the fact that the download rate, and hence the load, is higher. This is because a significant portion of the load is now handled by the free-riders. For 0.55 < P_tkn^FR ≤ 1, the delays continue to increase even though the download rate decreases. This is because free-riders provide fewer and fewer uploads. As P_tkn^FR approaches 1, the performance of the two systems is approximately the same.

[Figure 2.7: Average query response time.]

To fairly compare the delays between the two systems, we should consider the case where the load is the same, i.e. where E[R_down] = 22 downloads/1000sec. This value corresponds to P_tkn^FR = 0.45, and as we can see from the plots this corresponds to approximately one order of magnitude lower query and file download delays. This is a very significant improvement in the system's performance.

As a final note, the best operating region is the second, where 0.32 ≤ P_tkn^FR ≤ 0.55. In this region, we can either choose to operate the system at P_tkn^FR = 0.32, where cooperative users can sustain the same download demands as in the original system, or sacrifice a bit of the performance improvement with respect to reduced delays in order to support higher user demands.
[Figure 2.8: Average file download delay.]

2.4.3 Simulation Results for P_on < 1

We now study the impact of the proposed scheme in the more realistic scenario where P_on < 1. In particular, we now set P_on^NF = P_on^FR = 0.5.⁹ As before, we first compare the theoretical and simulation results for the expected download and upload rates, and then show the impact of the scheme on system delays.

⁹ P_on^FR and P_on^NF do not have to be equal. Similar results hold for other values of these parameters.

2.4.3.1 Download and Upload Rates

Figures 2.9-(a) and (b) show respectively the expected download and upload rate over all non-freeriders, over all free-riders, and over all users of the system, as a function of P_tkn^FR. As before, the horizontal line in Figure 2.9-(a) represents the expected download rate of a user in the original system, and the horizontal line in Figure 2.9-(b) represents the expected upload rate of a non-freerider in the original system.

[Figure 2.9: (a) User's expected download rate, and (b) User's expected upload rate.]

We again see from the plots that the analytical and simulation results match.
Further, we can divide the plots into regions and justify the behavior in each region, just like we did before.

2.4.3.2 Impact On Delays

Figures 2.10-(a) and (b) show respectively the average query response time and the average download delay as a function of P_tkn^FR. We again observe behavior similar to the case where P_on = 1.

[Figure 2.10: (a) Average query response time, and (b) Average file download delay.]

As we can see from Figures 2.9 and 2.10, the performance improvement from utilizing the proposed scheme can still be quite significant, even if users dynamically join and leave the system. In particular, under the appropriate parameter tuning, one can again reduce the delays by one order of magnitude, while being able to sustain higher user download demands.

2.4.4 The Impact of Malicious Users on System's Performance

[Figure 2.11: (a) User's expected download rate, and (b) Average file download delay with 10% malicious users.]

So far we have seen that the system's performance can be tremendously improved when adopting the proposed scheme. However, this was for the case where all users were well-behaved. We are now interested in studying how much performance degradation is observed if a small percentage of users are malicious.
In particular, let us assume that 10% of free-riders have managed to circumvent the proposed scheme, so that they can download files as long as they receive query responses from the system, independently of the amount of tokens they currently have.

Figures 2.11-(a) and (b) show respectively the user expected download rate and the average file download delay in a system with and without malicious users. From the plots we observe that the performance degradation due to the presence of malicious users is almost negligible. Notice, however, that, in general, the user download rate in the system with malicious users is smaller than the rate in a system without malicious users, and that the average file download delay in the system with malicious users is larger. These observations can be easily justified. First, malicious users do not respond to query requests, which means a lower user download rate. And, since fewer users provide uploads, the file download delay increases.

Chapter 3

Performance Analysis of BitTorrent-like Systems

Among all P2P systems, BitTorrent seems to be the most prevalent one. In particular, more than 50% of all P2P traffic is BitTorrent traffic [51]. The BitTorrent system is designed for efficient large-scale content distribution. The complete BitTorrent protocol can be found in [8]; we summarize the main functionality here.

BitTorrent groups users by the file that they are interested in. In each group there exists at least one user, called a seed, who has the complete file of interest. The seed is in charge of disseminating the file to the other users, called leechers, who do not have the file. When disseminating the file, BitTorrent partitions the whole file into a large number of blocks, and the seed starts uploading blocks to its neighbors. Meanwhile, users of the group exchange the blocks they have with their neighbors. When a user has all the blocks of the file, he/she finishes the download process and becomes a potential seed.
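To make the block-exchange idea concrete, here is a deliberately simplified, round-based toy model (our own illustration, not the actual BitTorrent protocol: real peers select upload targets through choking/unchoking rather than uniformly at random, and rarest-first operates on each peer's local neighborhood):

```python
import random

def dissemination_rounds(n_peers, n_blocks, seed=0):
    """Round-based toy model of block dissemination.

    Peer 0 starts as the seed holding all blocks. In each round, every
    peer that holds at least one block uploads one block to a randomly
    chosen peer, picking the block that is rarest system-wide (a global
    stand-in for BitTorrent's local rarest first policy). Returns the
    number of rounds until every peer holds the complete file."""
    rng = random.Random(seed)
    have = [set(range(n_blocks))] + [set() for _ in range(n_peers - 1)]
    rounds = 0
    while any(len(h) < n_blocks for h in have):
        rounds += 1
        for u in range(n_peers):
            if not have[u]:
                continue                        # nothing to upload yet
            v = rng.randrange(n_peers)          # random upload target
            missing = have[u] - have[v]
            if missing:
                # rarest first: block currently held by the fewest peers
                rarity = {b: sum(b in h for h in have) for b in missing}
                have[v].add(min(missing, key=rarity.get))
    return rounds
```

The takeaway matches the text: once a few leechers hold blocks, they upload in parallel, so the total service capacity grows with the number of participants instead of being limited by the seed alone.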
There are at least three features that make BitTorrent successful. First, BitTorrent breaks a complete file into blocks and disseminates the file by sending blocks instead of sending the complete file. In this way, users who have a partial file can exchange their blocks with their neighbors without the help of the seed. As a result, the service capacity of the system is enlarged, because every participating user can contribute to the system even if he/she only has a partial file. Second, BitTorrent uses the local rarest first (LRF) block selection algorithm to disseminate blocks, which means that users prefer to download the rarest block among their neighbors. After a user downloads the "rarest" block, he/she can disseminate this block to other users and thus increase the availability of this block. It has been shown that the LRF algorithm can efficiently enlarge the service capacity and prevent the last block problem [6].

The most important feature of BitTorrent is its rate-based TFT (Tit-for-Tat) unchoking scheme. In the rate-based TFT unchoking scheme, a user provides uploads to the four neighbors who provide him/her the highest download rates, and to one more, randomly selected neighbor, via a process called optimistic unchoking. This scheme successfully discourages freeriders in the BitTorrent system, because freeriders will keep getting choked if they do not provide uploads to other users. This is important given the significant performance degradation of P2P systems due to free-riding [43]. By successfully discouraging freeriders, BitTorrent provides a fast and efficient infrastructure for large scale content distribution.

Because of the prevalence and the success of BitTorrent, there is a large body of work studying various aspects of the BitTorrent system, such as its performance analysis [21, 56, 65], incentive schemes for it [24, 36], traffic measurements [7, 30, 51, 53], and fairness issues [6, 64].
However, despite this large body of research, there has been no attempt to mathematically model, in a heterogeneous and hence realistic environment, what is perhaps the most important performance metric from an end user's point of view: the average file download delay. Our first contribution in this respect is a simple mathematical model that accurately predicts the average file download delay in a heterogeneous BitTorrent-like system, where users may have different upload/download capacities. In particular, we analyze the steady state performance of BitTorrent under the most common flash crowd scenario. Despite its simplicity, our model is quite general: it has been derived with minimal assumptions, and it requires minimal system information. Our second contribution is an alternative TFT scheme based on the general token idea, which is very simple and flexible. In the proposed scheme users use tokens as a means to trade blocks. Each user maintains a token table which keeps track of the amount of tokens his/her neighbors possess. A user increases his/her neighbor's tokens by K_up for every byte he/she downloads from the neighbor. On the other hand, the user decreases a neighbor's tokens by K_down for every byte he/she uploads to the neighbor under study. A user uploads a block to his/her neighbor only if the neighbor has sufficient tokens to perform the download. We show that, by properly setting its parameters, the proposed scheme can be used to trade off between high overall system performance and fairness to high bandwidth users. As a final note, we revise the existing fluid model, which only considers homogeneous scenarios, to study the performance of BitTorrent in heterogeneous environments. In addition, we also incorporate the proposed alternative TFT scheme in the fluid model. The revised fluid model can be used to study the time dynamics of BitTorrent-like systems.

3.1 Related Work

B.
Cohen, the author of BitTorrent, gives a thorough introduction to the BitTorrent system in [16]. The paper describes the BitTorrent protocol, the system architecture, and the incentive scheme built into the BitTorrent system. There is a large body of work that studies the efficiency and the popularity of BitTorrent via measurements, e.g. [21, 36, 53]. [56] proposes a fluid model to describe how the population of seeds and leechers evolves. [18, 22, 67] extend the above model to study BitTorrent's performance under different user behaviors and different arrival processes. Further, [52] and [15] extend this model to study BitTorrent's performance under heterogeneous environments. Other interesting analytical results that are less relevant to our work include [65], which proposes a model to study the peer distribution and uses a dying process to study the file availability in BitTorrent, [20], which uses fluid models to study the distribution of the file transfer time in generic P2P systems that have some characteristics similar to BitTorrent, and [69], which uses a branching process to study the service capacity of such systems. Despite the large body of work on modeling BitTorrent's performance, the majority of the studies make a number of simplifying assumptions in order to keep the analysis tractable. For example, the studies in [18, 22, 56, 67] consider homogeneous network environments only, where users have the same link capacities. This is clearly an unrealistic assumption given the Internet's heterogeneity. Those studies that do consider heterogeneity make other simplifying assumptions; for example, [52] completely ignores BitTorrent's TFT scheme, and [15] models only some aspects of it. (See Section 3.6 for details.) Since BitTorrent's TFT scheme is one of the main features responsible for its great success, it is essential to accurately account for it.
Our models accurately predict the performance of the BitTorrent system in both steady state and dynamic scenarios, under minimal assumptions, without compromising on the realism of the modeling methodology. In particular, we consider a heterogeneous BitTorrent-like system, where users are grouped into different classes according to their link capacities, and fully model all the important aspects of the system, including the TFT scheme. Although our proposed token-based TFT scheme is inspired by the token based incentive schemes for Gnutella-like P2P systems presented in the previous chapter, notice that the analysis and implementation of the token-based scheme, as well as what the scheme can accomplish in BitTorrent-like systems, differ from those for Gnutella-like systems. Our proposed token-based TFT for BitTorrent degenerates to the block-based TFT scheme proposed in [6] when nodes gain tokens for uploading a byte at the same rate that they use tokens to download a byte. Hence, our scheme is much more general and flexible. Further, while the work in [6] studies the performance of the block-based TFT scheme via simulations, we extend our mathematical models to predict the performance of the token-based scheme, and of the block-based scheme as a special case. Finally, our token based scheme allows us to investigate two recent, interesting problems related to BitTorrent. The first is about the relative performance perceived by lower and higher bandwidth users. We show how our models can be used to decide on the scheme parameters that achieve a target tradeoff between the perceived performance of lower-bandwidth and higher-bandwidth users, by making higher-bandwidth users offer more/less uploads than usual to lower-bandwidth ones. The second is about freeriders. [24, 36, 61] have shown that despite the built-in incentive scheme in BitTorrent, skillful freeriders can still benefit from the system by connecting to many users and relying on optimistic unchoking.
We show how our scheme can be used to block such skillful freeriders, motivate them to offer uploads, and as a result improve the overall performance of the system.

3.2 The BitTorrent System and the Proposed Token Based Scheme

3.2.1 The BitTorrent System

We now describe in detail the main functionality of the BitTorrent system. Recall that BitTorrent groups users by the file that they are interested in. When a user is interested in joining a group, he/she first contacts the tracker, a specific host that keeps track of all the users currently participating in the group. The tracker responds to the user with a list containing the contact information of L randomly selected peers. (Typical values for L are 40-60 [8].) After receiving the list, the user establishes a TCP connection to each of these L peers, which we refer to as the user's neighbors. As mentioned earlier, when disseminating the file, BitTorrent partitions the whole file into a number of blocks. Neighbors exchange block availability information and messages indicating interest in blocks. The BitTorrent protocol uses a rate based TFT scheme to determine to which neighbors a user should upload blocks. The rate based TFT scheme proceeds as follows: time is slotted into 10 second intervals, and each such time interval is called an unchoking period. At the end of each unchoking period a user makes a choking/unchoking decision. The choking/unchoking decision proceeds as follows: First, the user computes, for each of the neighbors that are interested in downloading a block from him/her, the average download rate that he/she received during the last 20 seconds. Then, he/she selects to provide uploads to, i.e. to unchoke, the four neighbors who provided him/her the best download rates, with ties broken arbitrarily. (Similarly, if the user chooses not to provide uploads to a neighbor, we say that the neighbor is choked.) Finally, he/she also randomly selects another neighbor to provide uploads to.
This last (random) selection process is called optimistic unchoking. Hence, at any time instance a user is concurrently uploading to 5 neighbors. The following rules are also adopted by the scheme. Let's call the neighbor that was selected at the last optimistic unchoking the optimistic unchoking neighbor, and suppose that the last optimistic unchoking (and hence the end of the last unchoking period) took place at time t_1 seconds. Now, suppose that the end of another unchoking period occurs at some time t_2 seconds. (Clearly, t_2 >= t_1 + 10 seconds.) Then, if at time t_2 the optimistic unchoking neighbor belongs in the set of the four neighbors who provide the user the best download rates (and hence will be unchoked), the user performs a new optimistic unchoking. Otherwise: (i) if t_2 < t_1 + 30 seconds, the user does not choke the optimistic unchoking neighbor and does not perform a new optimistic unchoking, and (ii) if t_2 >= t_1 + 30 seconds, the user chokes the optimistic unchoking neighbor and performs a new optimistic unchoking. We call this 30 second time interval an optimistic unchoking period. This TFT scheme successfully discourages free-riders because they will keep getting choked if they do not provide uploads to their neighbors. Further, it gives the opportunity to new users to start downloading from the system even if they do not have enough blocks to exchange, in which case the download rate they provide is low. Finally, notice that the scheme allows a user to discover good neighbors, i.e. neighbors who provide him/her with high download rates, and exchange data with them. Therefore, users who have high upload link capacities tend to exchange data with a larger number of high capacity users, and users with low upload link capacities tend to exchange data with a larger number of low capacity users. Hence, in a sense the system is designed to be fair to each class of users.
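The choking/unchoking decision described above can be sketched as follows. This is a simplified, illustrative sketch (function and variable names are hypothetical): it captures the "top four by rate plus one random" selection, while the 30-second optimistic unchoking period is abstracted into a single keep_optimistic flag.

```python
import random

def choking_decision(download_rates, interested, prev_optimistic=None,
                     keep_optimistic=False, rng=random):
    """One unchoking round.  `download_rates` maps neighbor -> average
    download rate observed over the last 20 seconds; `interested` is the
    set of neighbors currently interested in a block.  Returns the four
    regular unchoked neighbors and the optimistic-unchoking neighbor.
    `keep_optimistic` is True while the 30-second optimistic unchoking
    period of `prev_optimistic` has not yet elapsed."""
    ranked = sorted(interested, key=lambda n: download_rates.get(n, 0.0),
                    reverse=True)
    regular = set(ranked[:4])  # four best uploaders, ties broken arbitrarily
    if keep_optimistic and prev_optimistic is not None \
            and prev_optimistic not in regular:
        optimistic = prev_optimistic  # still within the optimistic period
    else:
        pool = [n for n in interested if n not in regular]
        optimistic = rng.choice(pool) if pool else None
    return regular, optimistic
```

With five interested neighbors providing rates 5, 4, 3, 2 and 1, the four fastest are unchoked and the slowest becomes the optimistic unchoking neighbor by elimination.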
3.2.2 The Proposed Token Based Scheme

The process by which a new user discovers neighbors in the proposed token based scheme is exactly the same as in the original BitTorrent system. Further, again, the file is partitioned into blocks and neighbors exchange block availability information and messages indicating interest in blocks. As mentioned earlier, in the token based system users use tokens as a means to trade blocks. In particular, each user maintains a token table which keeps track of the amount of tokens his/her neighbors possess. When the user uploads X_up bytes to a neighbor, he/she decreases the neighbor's tokens by K_down*X_up. On the other hand, the user increases a neighbor's tokens by K_up*X_down if he/she downloads X_down bytes from the neighbor under study. Under the proposed scheme each user decides every 10 seconds to which (of the interested) neighbors he/she will upload blocks. This is equal to the unchoking period in the original BitTorrent system. In particular, every 10 seconds the user first checks which of his/her neighbors have enough tokens to perform the download of a block. If there are more than 5 neighbors having enough tokens, then the user randomly selects 5 of them to upload to, which is equal to the number of peers a user provides uploads to in the original BitTorrent system. If 5 or fewer neighbors have enough tokens, the user provides uploads only to them. If a neighbor runs out of tokens while downloading from the user, then the user stops uploading to the neighbor immediately after the current block transfer is complete, and randomly selects to upload to some other neighbor who has enough tokens. Finally, we initialize the token table of each user with an amount of tokens that suffices to download one block. The reason for giving initial tokens is to allow users to download data when they first join the system. Note that K_up and K_down are relative values.
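The bookkeeping just described can be sketched as follows. This is an illustrative sketch only (the class and method names are hypothetical, and block_size stands for the block length in bytes): a neighbor is credited K_up tokens per byte downloaded from it, debited K_down tokens per byte uploaded to it, and starts with enough tokens for one block.

```python
class TokenTable:
    """Per-user view of neighbors' token balances in the proposed scheme."""

    def __init__(self, k_up, k_down, block_size):
        self.k_up, self.k_down = k_up, k_down
        self.block_size = block_size
        self.tokens = {}  # neighbor id -> token balance

    def add_neighbor(self, nid):
        # initial grant: enough tokens to download one block
        self.tokens[nid] = self.k_down * self.block_size

    def downloaded_from(self, nid, nbytes):
        # credit K_up tokens per byte we download from the neighbor
        self.tokens[nid] += self.k_up * nbytes

    def uploaded_to(self, nid, nbytes):
        # debit K_down tokens per byte we upload to the neighbor
        self.tokens[nid] -= self.k_down * nbytes

    def eligible_neighbors(self):
        # neighbors with enough tokens to pay for one more block
        need = self.k_down * self.block_size
        return [n for n, t in self.tokens.items() if t >= need]
```

Every 10 seconds, a user would then draw up to 5 upload targets uniformly at random from eligible_neighbors().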
Therefore, the proposed scheme actually has only one design parameter. We will show that for K_up = K_down the proposed token based scheme has approximately the same performance, and is as fair, as the original BitTorrent system. Finally, we will also show that as K_up increases, the overall system performance of the token based scheme can get significantly better than that of the original BitTorrent system, by sacrificing some fairness towards high capacity users. In particular, high capacity users will end up providing uploads to the system at a faster rate than the download rate they receive.

3.3 Steady State Performance Analysis of BitTorrent-like Systems

In this section we propose a mathematical model to study the performance of BitTorrent-like systems. In particular, we focus on studying the average file download delay, which is the time difference between the moment that a user joins a group and the moment that the user downloads the complete file. As mentioned earlier, in real P2P systems users have heterogeneous capacities. We incorporate this fact in our analysis in order to make it more realistic and general. In particular, we assume that there exist two classes of users: (i) high bandwidth (H-BW) users, who have a high upload link capacity, and (ii) low bandwidth (L-BW) users, who have a low upload link capacity. We denote by α the percentage of L-BW users in the system. We start our analysis with the original BitTorrent system and then proceed with the proposed token based system.

3.3.1 A Mathematical Model for the Original BitTorrent System

3.3.1.1 Computing the Download Rates of H-BW and L-BW users

Consider a H-BW user and denote by n^d_HH and n^d_HL the steady state average number of H-BW and L-BW neighbors, respectively, that this user is downloading from, and by D_HH and D_HL the corresponding average download rates.
Similarly, consider a L-BW user and denote by n^d_LH and n^d_LL the steady state average number of H-BW and L-BW neighbors, respectively, that this user is downloading from, and by D_LH and D_LL the corresponding average download rates. (Footnote 1: The studies in [6, 59] divide the users of a real P2P system according to their upload link capacities into four classes. Here, we assume two classes of users only, for ease of exposition. Our analysis can be extended along the same lines for more classes of users.) Now, let R_downH and R_downL be the aggregate download rate of a H-BW and a L-BW user respectively. It is easy to see that:

R_downH = n^d_HH * D_HH + n^d_HL * D_HL,  (3.1)
R_downL = n^d_LH * D_LH + n^d_LL * D_LL.  (3.2)

Now, denote by n^u_HH and n^u_HL the steady state average number of H-BW and L-BW neighbors, respectively, that a H-BW user is uploading to, and let U_HH and U_HL be the corresponding average upload rates. Similarly, denote by n^u_LH and n^u_LL the steady state average number of H-BW and L-BW neighbors, respectively, that a L-BW user is uploading to, and by U_LH and U_LL the corresponding average upload rates. Further, let R_upH and R_upL be the aggregate upload rate of a H-BW and a L-BW user respectively. As before, we can write:

R_upH = n^u_HH * U_HH + n^u_HL * U_HL,  (3.3)
R_upL = n^u_LH * U_LH + n^u_LL * U_LL.  (3.4)

In order to be able to predict the download delays we first need to compute R_downH and R_downL. Hence, we need to calculate the values of the parameters n^d_HH, n^d_HL, n^d_LH, n^d_LL, D_HH, D_HL, D_LH, and D_LL. To do so, we will first compute the values of n^u_HH, n^u_HL, n^u_LH, n^u_LL, U_HH, U_HL, U_LH, and U_LL and then relate them to the aforementioned parameters. In order to compute n^u_HH, n^u_HL, n^u_LH, n^u_LL, U_HH, U_HL, U_LH, and U_LL, we first need to find, in addition to Equations (3.3) and (3.4), six more relations. In this way we will have a system comprising eight equations and eight unknowns.
First, recall that at any time instance a user in BitTorrent is uploading to 5 of its neighbors. Hence, we have:

n^u_HH + n^u_HL = 5,  (3.5)
n^u_LH + n^u_LL = 5.  (3.6)

Let C_upH/C_downH and C_upL/C_downL be the upload/download link capacities of H-BW and L-BW users respectively. Further, assume that a user's download link capacity is larger than or equal to his/her upload link capacity. Therefore, the system's bottlenecks are the upload links and we can assume that these are fully utilized. This means that R_upH = C_upH and that R_upL = C_upL. Since peer-to-peer traffic is transferred via TCP connections, we can assume that the upload capacity of a user will be fairly shared among concurrent upload connections, if the maximum possible download rate of each connection is larger than or equal to the fair share. For L-BW users this is always the case since C_downH > C_upL and C_downL >= C_upL, and we can state the following lemma, whose proof is straightforward:

Lemma 1. U_LL = U_LH = C_upL / 5.  (3.7)

We now turn our attention to the upload rates that a H-BW user provides.

Footnote 2: Computing these parameters first is easier. This is because it is the rules according to which a user chooses a neighbor to provide uploads to that are explicitly defined in the BitTorrent protocol.
Footnote 3: In general, if there are n classes of users, one would need to solve a system of (n + n)*n = 2n^2 equations. This is because each class C in {1,...,n} is characterized by n variables dictating the number of users from each class that a member of class C is uploading to on average, and n corresponding upload rates.
Footnote 4: This is not an unrealistic assumption. Common Internet access technologies, such as dial-up, DSL, cable modem, and ethernet, satisfy this assumption [59]. Further, this assumption has also been made in many studies on peer-to-peer networks, e.g. see [22] and references therein, and it is in accordance with measurement studies of BitTorrent systems, e.g. see [21, 53].
At any time instance, a L-BW user is downloading on average from n^d_LL L-BW neighbors. We define the spare download capacity of this user as C_downL - n^d_LL * D_LL. (Footnote 5: Note that, in general, because of BitTorrent's TFT strategy (see Section 3.2), a L-BW user that has been selected by a H-BW user via optimistic unchoking will be downloading from the H-BW user for a time duration equal to the optimistic unchoking period. When the optimistic unchoking period elapses, the H-BW user will choke this L-BW user because he/she provides him/her with a low download rate. Therefore, we will be assuming that the probability that the same L-BW user is concurrently downloading from two or more H-BW users is quite small. This is not an unrealistic assumption if the number of users in the system is large. Recall from Section 3.2 that typical values for L are 40-60.) Therefore, the upload rate that a H-BW user can provide to a L-BW user is given by the following lemma:

Lemma 2. U_HL = min( C_upH/5, C_downL - n^d_LL * D_LL ) = min( C_upH/5, C_downL - n^u_LL * U_LL ).  (3.8)

Proof 1. If the spare capacity of the L-BW user is larger than his/her fair share (C_upH/5), the user will be downloading from the H-BW user at an average rate equal to his/her fair share. Otherwise, the user will be downloading at an average rate equal to his/her spare capacity. Further, since the total download rate from L-BW users to L-BW users equals the total upload rate from L-BW users to L-BW users, n^d_LL * D_LL = n^u_LL * U_LL.

Now, note that once we know the values for n^u_HH and n^u_HL, the value of U_HH will result from Equation (3.3). Further, let L be the total number of a user's neighbors, and assume that all of these neighbors are interested in a block that the user under study possesses. Finally, denote by Binomial(N, p, k) the probability mass function of a Binomial random variable with parameters N and p, that is, Binomial(N, p, k) = (N choose k) * p^k * (1 - p)^(N-k).
Then n^u_HL (the average number of L-BW users that a H-BW user provides uploads to) is given by the following lemma:

Lemma 3.

n^u_HL = sum_{k=0}^{L} n(k) * Prob{have k H-BW neighbors out of L},  (3.9)

where:

n(k) = (L - k)/(L - 4) if k >= 5, and n(k) = 5 - k otherwise,

and:

Prob{have k H-BW neighbors out of L} = Binomial(L, 1 - α, k).

Proof 2. First, recall that α is the percentage of L-BW users in the system. Since the neighbors' list consists of a random selection of H-BW and L-BW users, it is easy to see that Prob{have k H-BW neighbors out of L} = Binomial(L, 1 - α, k). Now, let's consider a H-BW user, say user j, and let k <= L be the number of j's H-BW neighbors. Since j provides uploads to 5 of his/her neighbors, we distinguish two cases: (i) k >= 5, and (ii) k < 5. First, consider case (i) and recall how BitTorrent's TFT scheme works (see Section 3.2). (Footnote 6: It has been demonstrated that file sharing in BitTorrent is very effective, i.e., there is a high likelihood that a node holds a block that is useful to its peers, e.g. see [6]. This is partially due to the local rarest first (LRF) block selection algorithm that BitTorrent uses to disseminate blocks.) It is easy to see that in this case j may be uploading to at most one L-BW user at any time instance. This L-BW user is randomly selected (via optimistic unchoking) with probability (L - k)/(L - 4). Now, consider case (ii). In this case, j is uploading to exactly 5 - k L-BW users at any time instance, as he/she does not have any other H-BW neighbor that he/she could provide uploads to. It is now easy to see that n^u_HL is given by Equation (3.9).

Now, recall from Section 3.2 that the optimistic unchoking period is 30 seconds, the rate observation window is 20 seconds, and users make their choking decisions every 10 seconds. Suppose that H-BW user j selects L-BW user i via optimistic unchoking at time t_0, as shown in Figure 3.1.
[Figure 3.1: Time line of optimistic unchoking and choking decision making. The figure marks the times t_0, t_0+10, ..., t_0+50 and t_1, t_1+10, ..., t_1+50, with the events "j optimistically unchokes i", "j chokes i", "i makes first choking decision", and "i chokes j".]

According to BitTorrent's TFT scheme, at time t_0 + 30 user j will choke i, because i did not provide him/her with a high download rate. Also, suppose that L-BW user i makes his/her first choking decision at time t_1. Clearly, user i will not choke user j at t_1, t_1 + 10, and t_1 + 20, because j provides him/her with a higher download rate compared to U_LL (the rate at which i is downloading from a L-BW neighbor). Further, user i will choke j at time t_1 + 50, because the rate observation window is 20 seconds and user j did not provide anything to i during the period (t_1 + 30, t_1 + 50]. How about t_1 + 30 and t_1 + 40? At t_1 + 30, the average download rate that i observes from j is U_HL * (20 + t_0 - t_1)/20. If this rate is larger than U_LL, i will not choke j. Similarly, at t_1 + 40, the average download rate that i observes from j is U_HL * (10 + t_0 - t_1)/20. If this rate is larger than U_LL, i will not choke j. Therefore, if N_unchoke denotes the number of times that i did not choke j, we can write:

N_unchoke = 3 if U_HL * (20 + t_0 - t_1)/20 < U_LL,
N_unchoke = 5 if U_HL * (10 + t_0 - t_1)/20 >= U_LL,
N_unchoke = 4 otherwise.

Because users are not synchronized, it makes sense to assume that t_1 is uniformly distributed between t_0 and t_0 + 10. Hence, we can compute the average number of times, E[N_unchoke], that i did not choke j. This corresponds to a duration of 10 * E[N_unchoke] seconds. Now, recall that a H-BW user is uploading to n^u_HL L-BW users on average. Therefore, considering the above scenario only, it is easy to see that at any time instance a H-BW user on average downloads from n^u_HL * 10 * E[N_unchoke] / 30 L-BW users.
And hence, the average number of H-BW users that a L-BW user provides uploads to (due to the above scenario only) is ((1 - α)/α) * n^u_HL * 10 * E[N_unchoke] / 30. We refer to this scenario as the optimistic unchoking reward scenario. (Footnote 7: Recall that the probability that two or more H-BW users upload to the same L-BW user at the same time instance is small. Therefore, i will always be downloading from at least one L-BW neighbor.)

Now, n^u_LH (the average number of H-BW users that a L-BW user provides uploads to) is given by the following lemma:

Lemma 4.

n^u_LH = sum_{w=0}^{L} n(w) * Prob{have w L-BW neighbors out of L} + ((1 - α)/α) * n^u_HL * E[N_unchoke] / 3,  (3.10)

where:

n(w) = (L - w)/(L - 4) if w >= 5, and n(w) = 5 - w otherwise,

and:

Prob{have w L-BW neighbors out of L} = Binomial(L, α, w).

Proof 3. As before, since α is the percentage of L-BW users in the system and the neighbors' list consists of a random selection of H-BW and L-BW users, the probability of having w L-BW neighbors out of L is Binomial(L, α, w). Further, the second term on the right hand side of Equation (3.10) corresponds to the optimistic unchoking reward scenario. What about the first term? This term accounts for the number of H-BW users that a L-BW user has chosen to upload to, just like in the proof of Lemma 3. In particular, consider L-BW user i, and let w <= L be the number of i's L-BW neighbors. As before, we distinguish two cases: (i) w >= 5, and (ii) w < 5. In case (i), i may be uploading to at most one H-BW user at any time instance. This H-BW user has been selected via optimistic unchoking, with probability (L - w)/(L - 4), and will be choked after the optimistic unchoking period elapses. This is because the H-BW user, who prefers other H-BW users to upload to, won't be uploading to this L-BW user. In case (ii), i has selected to upload to exactly 5 - w H-BW users, as he/she does not have any other L-BW neighbor to provide uploads to. Notice that in Lemma 3 we have not considered the optimistic unchoking reward scenario.
This is because, if a L-BW user selects via optimistic unchoking a H-BW user to provide uploads to, say at time t_0 (see Figure 3.1), the H-BW user will choke this L-BW user at his/her first choking decision at time t_1 (see Figure 3.1), because the L-BW user does not provide him/her with a high download rate. Therefore, H-BW users do not provide uploads to L-BW users in this case (i.e., L-BW users are not getting any reward for optimistically unchoking H-BW users).

Given Equations (3.3)-(3.10) (and R_upH = C_upH, R_upL = C_upL), we can now compute n^u_HH, n^u_HL, n^u_LH, n^u_LL, U_HH, U_HL, U_LH, and U_LL. We now proceed to relate these parameters to n^d_HH, n^d_HL, n^d_LH, n^d_LL, D_HH, D_HL, D_LH, and D_LL. First, clearly D_HH = U_HH, D_HL = U_LH, D_LH = U_HL, and D_LL = U_LL. Further, notice that in any system the total number of upload connections equals the total number of download connections. For example, the total number of upload connections provided by H-BW users to L-BW users equals the total number of download connections that L-BW users receive from H-BW users. Therefore, we can write n^d_LH * α = n^u_HL * (1 - α). Similarly, we can easily relate n^d_HH, n^d_HL, n^d_LL to n^u_HH, n^u_LH, n^u_LL, as follows:

n^d_HH = n^u_HH,
n^d_HL * (1 - α) = n^u_LH * α,
n^d_LL = n^u_LL.

We can now compute the average download rate of a H-BW user and a L-BW user using Equations (3.1) and (3.2), and of course the average download rate across all users.

3.3.1.2 Computing the Average Download Delay of H-BW and L-BW users

In general, users are selfish in real P2P systems, in the sense that they will leave the system as soon as they finish their downloads [28, 30, 53]. We assume this to be the case here. Thus, since in general H-BW users have higher download rates than L-BW users, they are expected to leave the system earlier.
Further, when a popular file first becomes available, say at some time t_0, all users join the system in a short time period (flash crowd scenario) [30, 53]. Hence, the total number of users in the system evolves according to Figure 3.2. In Figure 3.2 we see that the total number of users sharing the file reaches a steady state value fast, at some time t_1 << t_2. At time t_2, H-BW users start departing the system, and by t_3 only L-BW users are present. To compute the average file download delay we assume that t_1 ≈ t_0 and that t_2 ≈ t_3. As we will show in the next section, these approximations do not yield significant inaccuracies.

[Figure 3.2: Evolution of the number of peers: (i) during (t_0, t_1] new users join the system, (ii) during (t_1, t_2] all users are present in the system, (iii) during (t_2, t_3] H-BW users depart the system, and (iv) during (t_3, t_4] only L-BW users are present in the system.]

Now, let S be the file size and let T_H and T_L be the average file download delay of a H-BW and a L-BW user respectively. It is easy to see that:

T_H = S / R_downH.  (3.11)

Further, let S_d be the amount of data that a L-BW user has downloaded while all H-BW users were present in the system. It is easy to see that S_d = T_H * R_downL. After H-BW users leave the system, the average download rate of L-BW users is just equal to their upload capacity. Hence, the average file download delay of a L-BW user can be expressed as follows:

T_L = T_H + (S - S_d) / C_upL.  (3.12)

Note that going from rates to delays is a relatively easy task; it is the computation of the rates that is quite involved. And, as already mentioned, our model is not only the first to compute the rates and, subsequently, the delays quite accurately, but also the first to do so in the context of heterogeneous users.
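Equations (3.11) and (3.12) translate directly into code. The sketch below (function and parameter names are illustrative) assumes the steady-state rates R_downH and R_downL have already been obtained from the model of Section 3.3.1.1:

```python
def download_delays(S, R_downH, R_downL, C_upL):
    """Average file download delays of H-BW and L-BW users.
    S: file size; R_downH, R_downL: steady-state aggregate download
    rates of H-BW and L-BW users; C_upL: L-BW upload capacity, which
    becomes the L-BW download rate once all H-BW users have left."""
    T_H = S / R_downH              # Eq. (3.11)
    S_d = T_H * R_downL            # data a L-BW user holds at time T_H
    T_L = T_H + (S - S_d) / C_upL  # Eq. (3.12)
    return T_H, T_L
```

For example, with S = 100 MB, R_downH = 10 MB/s, R_downL = 5 MB/s and C_upL = 5 MB/s, this gives T_H = 10 s and T_L = 20 s: the L-BW user has half the file when the H-BW users leave, and finishes the rest at its own upload capacity.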
3.3.2 A Mathematical Model for the Token Based System

The model for the token based system is similar to the model for the original BitTorrent system. In particular, it is easy to see that Equations (3.1)-(3.6) hold for the token based system as well. Now, let's justify why Equation (3.7) holds in this system. As before, we assume again that the download capacity of a user is larger than or equal to his/her upload capacity. Now, recall that a user earns K_up tokens for each byte he/she uploads and spends K_down tokens for each byte he/she downloads. For a L-BW user, his/her L-BW neighbors earn tokens by uploading to him/her at a rate K_up * U_LL, and they spend tokens by downloading from him/her at a rate K_down * D_LL. Clearly, to make the token based system operate properly, we need to have K_up >= K_down. Hence, K_up * U_LL >= K_down * D_LL (since D_LL = U_LL). Now, consider a H-BW user. The rate at which a H-BW user gains tokens by providing uploads to a L-BW user (K_up * U_HL) is larger than the rate at which the user spends tokens by downloading from the L-BW user (K_down * D_HL), since K_up * U_HL >= K_down * U_HL > K_down * U_LH = K_down * D_HL. Therefore, all users will always have enough tokens to download from a L-BW user. Hence, the upload capacity of a L-BW user is fully utilized and Equation (3.7) holds true in this system as well. Before proceeding, consider the scenario where a H-BW user exchanges data with another H-BW user. In this case, the H-BW user's token earning rate (K_up * U_HH) is greater than or equal to the user's token spending rate (K_down * D_HH) (since K_up >= K_down and D_HH = U_HH). However, when a L-BW user exchanges data with a H-BW user, we distinguish two possible cases: (i) K_up * U_LH <= K_down * U_HL, and (ii) K_up * U_LH > K_down * U_HL. For both cases we need to establish new relations for n^u_HL, n^u_LH, and U_HL.
For the second case we also need to establish a new relation for U_HH, since we can no longer compute it as we did in the original BitTorrent system. (The reasons for these will become apparent shortly.)

Case (i): U_HL is given by the following lemma (compare it with Lemma 2):

Lemma 5. U_HL = min( C_upH/5, C_downL - n^d_LL * U_LL, K_up * U_LH / K_down ).  (3.13)

Proof 4. First, the token earning rate of a L-BW user from a H-BW user is K_up * U_LH. Hence, the download rate of a L-BW user from a H-BW user cannot exceed K_up * U_LH / K_down. (Recall that each user keeps track of the amount of tokens that his/her neighbors possess.) Now, C_downL - n^d_LL * U_LL is the spare download capacity of the L-BW user. Clearly, he/she cannot download at a rate faster than this. Finally, as in the proof of Lemma 2, if the spare capacity of the L-BW user is larger than his/her fair share (C_upH/5), the user will be downloading from the H-BW user at an average rate equal to his/her fair share. Combining these facts gives the result.

Now, recall that according to the scheme a user randomly selects to upload to those neighbors who have enough tokens to perform the download. Further, consider a L-BW user. Since, as mentioned earlier, users always have enough tokens to download from a L-BW user, the L-BW user will select every peer to provide uploads to with equal probability. Since the total number of upload connections is 5, the percentage of L-BW users in the system is α, and the neighbors' list consists of a random selection of H-BW and L-BW users, n^u_LL = 5α. Since n^u_LL + n^u_LH = 5, we have that n^u_LH = 5(1 - α). What remains is to compute n^u_HL. Suppose there are N users in the system. By observing that in the long run the token earning rate of all L-BW users from H-BW users (n^u_LH * K_up * U_LH * N * α) equals the token spending rate of all L-BW users to H-BW users (n^u_HL * K_down * U_HL * N * (1 - α)), we can write:

n^u_HL = n^u_LH * K_up * U_LH * α / (K_down * U_HL * (1 - α)).
(3.14) Case (ii): The relation for n u LH in this case is exactly the same as in case (i). However, since K up U LH ≥ K down U HL , now a L-BW user earns tokens by uploading to a H-BW user at a faster rate by which he/she spends tokens by downloading from the H-BW user. This means that a L-BW user always has enough tokens to download from a H-BW user. Since a H-BW user always has enough tokens to download from a H-BW user as well (as K up U HH ≥ K down D HH ), the H-BW user cannot distinguish H-BW neighbors from L- BW neighbors, and thus he/she provides uploads to all of his/her neighbors with the same probability. Hence, n u HL =5α. Further, U HL in this scenario is given in the following lemma: Lemma 6. U HL = L i=0 i k=0 min C upH 5 ,R HL (k) P 1 (k|i)P 2 (i), (3.15) 58 where: R HL (k)= ⎧ ⎪⎪⎪⎨ ⎪⎪⎪⎩ C downL −n d LL U LL k ifk> 0, 0 otherwise. (3.16) and: P 1 (k|i)= Prob{download from k out of i H-BW neighbors} = Binomial(i, 5 L ,k), P 2 (i)= Prob{have i H-BW neighbors out of L} = Binomial(L, 1− α,i). Proof 5. First, as we have said, now a L-BW user always has enough tokens to download from a H-BW neighbor. Hence, her/her download rate is not constrained by the amount of tokens he/she possesses. If a L-BW user is downloading fromk> 0 H-BW users, the average download rate from each H-BW user is equal to R HL (k)= C downL −n d LL U LL k , where C downL − n d LL U LL is the spare capacity of the L-BW user. However, this rate cannot exceed the maximum average rate that a L-BW user can download from a H-BW user, which is C upH 5 . Further, the probability that the L-BW user is downloading from a H-BW neighbor is 5 L because each user randomly selects 5 out of L neighbors to provide uploads to (as every neighbor always has enough tokens). Therefore, given that a L-BW user has i H-BW neighbors, the probability that he/she is downloading from k ≤ i of them is Binomial(i, 5 L ,k). And finally, the probability that the L-BW user has i H-BW neighbors is Binomial(L, 1− α,i). 
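Lemma 6 can be evaluated directly by summing over the two binomial distributions. The sketch below is a transcription of Equations (3.15)–(3.16) into Python; the parameter values in the example call are hypothetical, chosen only to illustrate the computation.

```python
from math import comb

def binom_pmf(n, p, k):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def u_hl_case_ii(C_upH, C_downL, n_d_LL, U_LL, L, alpha):
    """U_HL of Lemma 6: the expected download rate of a L-BW user from a
    H-BW uploader, averaging over the number i of H-BW neighbors and the
    number k of them currently uploading to the user."""
    spare = C_downL - n_d_LL * U_LL        # spare download capacity
    total = 0.0
    for i in range(L + 1):                 # i H-BW neighbors out of L
        p2 = binom_pmf(L, 1 - alpha, i)
        for k in range(i + 1):             # k of the i upload to the user
            r_hl = spare / k if k > 0 else 0.0   # Equation (3.16)
            total += min(C_upH / 5, r_hl) * binom_pmf(i, 5 / L, k) * p2
    return total

# Hypothetical parameters, loosely modeled on Scenario 1:
print(round(u_hl_case_ii(C_upH=300, C_downL=300, n_d_LL=2,
                         U_LL=50, L=40, alpha=0.8), 2))
```

Note that each term of the double sum is capped at the fair share $C_{upH}/5$, so the result can never exceed it.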
Notice that the upload link capacity of a H-BW user may not be fully utilized. This is because, since every neighbor seems identical, a H-BW user may select to provide uploads to several L-BW users who cannot download fast. Hence, we can no longer use Equation (3.3) (with $R_{upH} = C_{upH}$) to compute $U_{HH}$. Instead, we need to find a new relation for $U_{HH}$. (In contrast, in case (i), since L-BW users gain tokens more slowly than H-BW users, H-BW users rarely pick L-BW users to upload to, and thus their upload capacity remains approximately fully utilized.) The new relation is given in the following lemma:

Lemma 7.
$$U_{HH} = \sum_{w=0}^{5} R_{HH}(w)\, \text{Prob}\{\text{upload to } w \text{ L-BW neighbors}\}, \quad (3.17)$$

where:
$$R_{HH}(w) = \begin{cases} \frac{C_{upH} - w U_{HL}}{5-w} & \text{if } w < 5, \\ 0 & \text{otherwise}, \end{cases}$$

and:
$$\text{Prob}\{\text{upload to } w \text{ L-BW neighbors}\} = \text{Binomial}(5, \alpha, w).$$

Proof 6. The average rate at which a H-BW user is uploading to a L-BW user is $U_{HL}$. If a H-BW user is uploading to $w$ L-BW users, then the average upload rate to each H-BW user is equal to $R_{HH}(w) = \frac{C_{upH} - w U_{HL}}{5-w}$, where $C_{upH} - w U_{HL}$ is the spare upload capacity of the H-BW user. Further, a H-BW user randomly selects 5 neighbors to provide uploads to, because users always have tokens. Hence, $\text{Prob}\{\text{upload to } w \text{ L-BW neighbors}\} = \text{Binomial}(5, \alpha, w)$.

Now, notice that the way we related $n^u_{HH}$, $n^u_{HL}$, $n^u_{LH}$, $n^u_{LL}$, $U_{HH}$, $U_{HL}$, $U_{LH}$, $U_{LL}$ to $n^d_{HH}$, $n^d_{HL}$, $n^d_{LH}$, $n^d_{LL}$, $D_{HH}$, $D_{HL}$, $D_{LH}$, $D_{LL}$ in the original system is also valid in the token-based system. Therefore, we can now compute the average download rates using Equations (3.1) and (3.2), and hence the average download delays using Equations (3.11) and (3.12), as we did before.

3.4 System Time Dynamics

In the previous section we have analyzed the performance of heterogeneous BitTorrent-like systems in steady state.
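Lemma 7 is likewise a short expectation over the binomial number of L-BW users among the five upload slots. A sketch, again with hypothetical inputs (the $U_{HL}$ value fed in would in practice come from Lemma 6):

```python
from math import comb

def binom_pmf(n, p, k):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def u_hh(C_upH, U_HL, alpha):
    """U_HH of Lemma 7: the expected upload rate from a H-BW user to each
    H-BW downloader, where w of the 5 upload slots go to L-BW users with
    probability Binomial(5, alpha, w)."""
    total = 0.0
    for w in range(6):
        # spare upload capacity shared by the remaining 5 - w H-BW slots
        r_hh = (C_upH - w * U_HL) / (5 - w) if w < 5 else 0.0
        total += r_hh * binom_pmf(5, alpha, w)
    return total

# Hypothetical inputs for illustration only:
print(round(u_hh(C_upH=300, U_HL=40.0, alpha=0.8), 2))
```

As a sanity check, with $\alpha = 0$ (no L-BW users) the expression collapses to $C_{upH}/5$, i.e. the fully utilized fair share of the original system.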
In particular, we have derived a mathematical model that accounts for all central details of the BitTorrent protocol and predicts the download rates, and hence the delays, of the different classes of users. However, as stated, this model does not capture user dynamics and does not scale well when there exist many classes of users. (Recall that one needs $2n^2$ equations to characterize a system consisting of $n$ classes of users.)

With the above in mind, we propose a second, fluid-based model that can be used for scenarios where users arrive and depart at any time instance, e.g. non-flash crowd scenarios. Another advantage of this model is that it scales well with an increasing number of classes. The model is inspired by prior work on fluid-based analysis of BitTorrent systems [56] (see Section 3.6 and Table 3.1 for a detailed comparison with [56] and other prior work) and by the techniques and rationale of our first model. To keep the analysis tractable, we assume that leechers of a particular class provide uploads to leechers of other classes only via optimistic unchoking. This implies the following: (i) we assume that a leecher of a specific class always has enough neighbors of the same class to which he/she can connect (this is not an unrealistic assumption, since the list of neighbors ($L$) returned by the tracker is usually large), and (ii) we do not consider the optimistic unchoking reward scenario we saw earlier. As we shall see in Section 3.5, these approximations do not significantly affect the model's accuracy, but they do make it slightly less accurate than our first model in steady state. As before, we first consider the original BitTorrent system and then proceed with the token-enhanced BitTorrent system.

3.4.1 The Original BitTorrent System

We say that two users, which can be either leechers or seeds, are in the same class if they have the same link capacities, and we let $G = \{1, \ldots, K\}$ be the set of user classes in the system.
Denote by $x^l_j(t)$ and $x^s_j(t)$ respectively the number of class-$j$ leechers and seeds in the system at time $t$. Let $\mu_j$ be the service rate of a class-$j$ user, defined as the rate at which the user can upload a file to other users. Given the file size $S$ and the upload link capacity $C^j_{up}$ of class-$j$ users, $\mu_j = \frac{C^j_{up}}{S}$. Further, let $\mu_s(t)$ be the aggregate service rate provided by all seeds in the system at time $t$, that is, $\mu_s(t) = \sum_{j=1}^{K} \mu_j x^s_j(t)$. Also, let $\phi_{si}(t)$ and $\phi_{ji}(t)$ be the portion of the aggregate service rate provided by all seeds and by class-$j$ leechers respectively to all class-$i$ leechers at time $t$. Finally, denote by $R_i(t)$ the aggregate service rate that all class-$i$ leechers receive at time $t$. It is easy to see that:

$$R_i(t) = \sum_{j=1}^{K} x^l_j(t)\, \mu_j\, \phi_{ji}(t) + \mu_s(t)\, \phi_{si}(t). \quad (3.18)$$

Notice that $R_i(t)$ is also the departure rate of class-$i$ leechers, and that their average file download rate is $R_i(t) \times S$ ($S$ being the file size). Now, let $\lambda_i(t)$ be the arrival rate of class-$i$ leechers, and $p^i_s$ be the probability that a class-$i$ leecher will stay in the system after he/she downloads the file (i.e. the probability of becoming a seed). Further, let $\gamma_i$ be the rate at which class-$i$ seeds leave the system. Then, the population of class-$i$ leechers and seeds in the system is described by the following differential equations:

$$\dot{x}^l_i(t) = \lambda_i(t) - R_i(t), \quad (3.19)$$
$$\dot{x}^s_i(t) = R_i(t)\, p^i_s - \gamma_i\, x^s_i(t). \quad (3.20)$$

As before, since all leechers in the system are equally likely to be downloading from the seeds, $\phi_{si}(t) = \frac{x^l_i(t)}{\sum_{n=1}^{K} x^l_n(t)}, \forall i \in G$. Further, recall from the previous section that leechers are inclined to exchange file blocks with other leechers that belong to their class, due to the rate-based TFT scheme. Also, by our earlier assumption, a class-$i$ leecher can receive uploads from a leecher of some other class $j$ only via optimistic unchoking.
Since users randomly select a neighbor for optimistic unchoking, the probability that the selected neighbor is of class $i$ equals $\frac{x^l_i(t)}{\sum_{n=1}^{K} x^l_n(t)}$. Further, since a class-$j$ leecher concurrently uploads to $Z$ neighbors, the rate that a class-$i$ leecher can receive from it is $\frac{\mu_j}{Z}$. We can now state the following lemma for $\phi_{ji}(t)$, whose proof follows immediately from the above arguments:

Lemma 8.
$$\phi_{ji}(t) = \begin{cases} \frac{x^l_i(t)}{\sum_{n=1}^{K} x^l_n(t)} \cdot \frac{1}{Z} & \text{if } i \neq j, \\ 1 - \sum_{n=1, n \neq j}^{K} \phi_{jn}(t) & \text{if } i = j. \end{cases} \quad (3.21)$$

For a system with $K$ classes of users we have $2K$ variables ($\{x^l_i(t)\}, \{x^s_i(t)\}, i \in G$) and $2K$ differential equations (two for each class, namely Equations (3.19) and (3.20)) that dictate the evolution of these $2K$ variables. Therefore, we can solve this system of equations using mathematical tools (e.g. Mathematica [68]) to study how the user population ($\{x^l_i(t)\}, \{x^s_i(t)\}, i \in G$) and the leecher departure rates ($\{R_i(t)\}, i \in G$) evolve with time. And, of course, we can compute the corresponding download delays of each class of leechers as a function of time, because the user download delay is the reciprocal of the user departure rate.

3.4.2 The Token-enhanced BitTorrent System

It is easy to see that Equations (3.18)–(3.20) still hold in the token-enhanced system. What changes is the way we compute $\phi_{ji}(t), i, j \in G$. (Clearly the relation for $\phi_{si}(t)$ remains the same.) Recall that a user earns $K_{up}$ tokens for each byte he/she uploads and spends $K_{down}$ tokens for each byte he/she downloads. Now, sort the user classes in accordance with their upload link capacities, with class 1 being the class with the lowest capacity. Along the same lines as the analysis in Section 3.3.1, if leecher $m$ belongs to class 1, we know that all of his/her neighbors always have sufficient tokens to download from him/her. Therefore, they equally share the upload link capacity of leecher $m$. This suggests that $\phi_{1i}(t) = \frac{x^l_i(t)}{\sum_{n=1}^{K} x^l_n(t)}, \forall i \in G$.
Now, when a class 1 leecher exchanges data with a class 2 leecher, the token earning rate of the class 1 leecher from the class 2 leecher is $\phi_{12}(t)\, \mu_1 K_{up}$, and the token spending rate of the class 1 leecher to the class 2 leecher is $\phi_{21}(t)\, \mu_2 K_{down}$. Because in the long run the token earning rate of all class 1 leechers from class 2 leechers ($x^l_1(t)\, \phi_{12}(t)\, \mu_1 K_{up}$) equals the token spending rate of all class 1 leechers to class 2 leechers ($x^l_2(t)\, \phi_{21}(t)\, \mu_2 K_{down}$), we can write $\phi_{21}(t) = \frac{x^l_1(t)\, \phi_{12}(t)\, \mu_1 K_{up}}{x^l_2(t)\, \mu_2 K_{down}}$. However, notice that $\phi_{21}(t)$ cannot exceed the fair share that class 1 leechers can receive (from class 2 leechers) when class 1 leechers always have enough tokens to download. This amount is $\frac{x^l_1(t)}{\sum_{n=1}^{K} x^l_n(t)}$. Therefore:

$$\phi_{21}(t) = \min\left( \frac{x^l_1(t)\, \phi_{12}(t)\, \mu_1 K_{up}}{x^l_2(t)\, \mu_2 K_{down}},\; \frac{x^l_1(t)}{\sum_{n=1}^{K} x^l_n(t)} \right). \quad (3.22)$$

Now, leechers in the other classes share the remaining capacity of the class 2 leecher (because they all have sufficient tokens to exchange file blocks with this leecher, as they have larger upload link capacities, in accordance with our arguments in Section 3.3.1). Hence:

$$\phi_{2i}(t) = \frac{\left(1 - \phi_{21}(t)\right) x^l_i(t)}{\sum_{n=2}^{K} x^l_n(t)}, \quad i \in \{2, \ldots, K\}.$$

Continuing this way, we can derive a general formula for $\phi_{ji}(t)$:

Lemma 9.
$$\phi_{ji}(t) = \begin{cases} \min\left( \frac{x^l_i(t)\, \phi_{ij}(t)\, \mu_i K_{up}}{x^l_j(t)\, \mu_j K_{down}},\; \frac{x^l_i(t)}{\sum_{n=1}^{K} x^l_n(t)} \right) & \text{if } i < j, \\ \frac{\left(1 - \sum_{n=1}^{j-1} \phi_{jn}(t)\right) x^l_i(t)}{\sum_{n=j}^{K} x^l_n(t)} & \text{otherwise}. \end{cases} \quad (3.23)$$

We can now solve the $2K$ differential equations as before, in order to study how the user population and download delays evolve over time in the token-enhanced system.

Remark: Compared to our first model, our second model is much simpler and, at the same time, more general. The key approximation that enables one to significantly simplify the analysis of BitTorrent-like systems is to ignore the optimistic unchoking reward scenario. As we will demonstrate next, this does not lead to large inaccuracies.
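Both fluid models can be integrated with a simple forward-Euler scheme. The sketch below implements Equations (3.18)–(3.20), with the portions computed per Lemma 8 (original system) or Lemma 9 (token-enhanced system; classes sorted by upload capacity, class 0 lowest). All parameter values are illustrative placeholders, not the ones used in our experiments.

```python
# Forward-Euler sketch of the fluid models above. Symbols follow the text:
# phi[j][i] is the portion of class-j leecher service rate given to class i.
# All parameter values are illustrative placeholders.

Z = 5                          # concurrent upload connections per leecher
S = 100 * 8 * 1024             # file size in Kbits (100 MB)
mu = [100.0 / S, 300.0 / S]    # mu_j = C_up^j / S, class 0 = lowest capacity
lam = [0.05, 0.05]             # leecher arrival rates (leechers/second)
p_s, gamma = 0.15, 1.0 / 3000  # seed-staying probability and departure rate

def portions(x_l, token, K_up=2.0, K_down=1.0):
    """phi[j][i] per Lemma 8 (token=False) or Lemma 9 (token=True)."""
    K, tot = len(x_l), sum(x_l)
    phi = [[0.0] * K for _ in range(K)]
    for j in range(K):
        for i in range(K):
            if not token and i != j:                      # Lemma 8, i != j
                phi[j][i] = x_l[i] / tot / Z
            elif token and i < j:                         # Lemma 9, i < j
                bal = x_l[i] * phi[i][j] * mu[i] * K_up / (x_l[j] * mu[j] * K_down)
                phi[j][i] = min(bal, x_l[i] / tot)
            elif token:                                   # Lemma 9, i >= j
                phi[j][i] = (1 - sum(phi[j][:j])) * x_l[i] / sum(x_l[j:])
        if not token:                                     # Lemma 8, i == j
            phi[j][j] = 1 - sum(phi[j][i] for i in range(K) if i != j)
    return phi

def evolve(token, T=20000, dt=1.0):
    """Integrate Equations (3.19)-(3.20); returns leecher populations."""
    K = len(mu)
    x_l, x_s = [1.0] * K, [0.0] * K
    for _ in range(int(T / dt)):
        tot, mu_s = sum(x_l), sum(mu[j] * x_s[j] for j in range(K))
        phi = portions(x_l, token)
        R = [sum(x_l[j] * mu[j] * phi[j][i] for j in range(K))
             + mu_s * x_l[i] / tot for i in range(K)]     # Equation (3.18)
        for i in range(K):
            x_l[i] = max(x_l[i] + dt * (lam[i] - R[i]), 1e-6)   # Eq. (3.19)
            x_s[i] += dt * (R[i] * p_s - gamma * x_s[i])        # Eq. (3.20)
    return x_l

print("original leecher populations:", [round(v, 1) for v in evolve(False)])
print("token    leecher populations:", [round(v, 1) for v in evolve(True)])
```

Because rows are filled in order of increasing $j$, the term $\phi_{ij}(t)$ needed in the $i < j$ branch of Lemma 9 has always been computed already; note also that each row of $\phi$ sums to one, so no service capacity is lost or double counted.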
3.5 Experiments

3.5.1 Simulation Setup

We use an event-driven BitTorrent simulator developed by [5] for our simulations. A detailed description of the simulator can be found in [6]. We now summarize several important characteristics of this simulator.

• The simulator assumes that the bottleneck link of a connection is either a user's upload link or the user's download link, i.e. the simulator assumes the backbone network has infinite bandwidth.

• The simulator simulates flow-level rather than packet-level queueing delay, which implies that all connections traversing a link are assumed to share the link capacity equally, if they are not bottlenecked elsewhere.

• The simulator does not model packet-level TCP dynamics, such as slow start, self-clocking, and packet loss. In addition, the simulator does not simulate the propagation delay.

Notice that these simplifications do not have a significant impact on the results, as argued in [6]. In addition, we implement the proposed token based scheme to study its impact on the system performance.

To validate our model, we simulate a flash crowd scenario where 200 leechers join the system within 20 seconds. Leechers leave the system as soon as they finish their download. We simulate the system until all leechers depart. To avoid the ramp-up period at the beginning of the simulation, we randomly assign each user 5% of the blocks of the file. Other simulation settings are: (i) there is only one seed in the system and the upload link capacity of the seed is 800 Kbps, (ii) the file size is 300 MB and the block size is 512 KB, and (iii) the maximum number of concurrent upload transfers is 5. For the flash crowd scenario, we present simulation results for two scenarios, which, as we shall see, yield qualitatively different results when the token based scheme is used. In both scenarios, we simulate a system with two groups of users.
The percentage of L-BW users is $\alpha = 0.8$, $C_{downH} = 600$ Kbps, and $C_{upH} = 300$ Kbps. For Scenario 1 we have $C_{downL} = 300$ Kbps and $C_{upL} = 100$ Kbps, and for Scenario 2 we have $C_{downL} = 150$ Kbps and $C_{upL} = 50$ Kbps. To validate our fluid model, we incorporate three groups of users: H-BW users ($C_{downH} = 700$ Kbps and $C_{upH} = 700$ Kbps), M-BW users ($C_{downM} = 700$ Kbps and $C_{upM} = 300$ Kbps), and L-BW users ($C_{downL} = 700$ Kbps and $C_{upL} = 100$ Kbps). All groups have the same arrival rate.

Figure 3.3: Average number of L-BW users that a H-BW user is uploading to: (i) Scenario 1, and (ii) Scenario 2.

3.5.2 Steady State Performance Prediction and Flash Crowd Scenarios

3.5.2.1 Simulation Results for the Original BitTorrent System

We first study how $n^u_{HL}$, the average number of L-BW users that are downloading from a H-BW user, behaves as the number of neighbors $L$ increases. This will give us intuition later on, when we show how the download rates and delays change as a function of $L$. Both theoretical and simulation results are shown in Figure 3.3. First, we observe from the plots that Equation (3.9) correctly predicts $n^u_{HL}$ for both cases. Further, we observe that $n^u_{HL}$ decreases as $L$ increases. This is because when $L$ is small H-BW users cannot find enough H-BW peers to upload to, and thus they have to provide uploads to more L-BW users. As $L$ increases there are more H-BW users to upload to, and thus there is less need to upload to L-BW users. The download rates for both H-BW and L-BW users with respect to $L$ are shown in Figure 3.4. Again, we can observe from the plots that our mathematical model is quite accurate.
Figure 3.4: Average download rate for H-BW and L-BW users: (i) Scenario 1, and (ii) Scenario 2.

Notice that the download rate of H-BW users increases and the download rate of L-BW users decreases as $L$ increases. This can be explained in a similar manner as for $n^u_{HL}$: as $L$ increases, H-BW users provide uploads to fewer L-BW users and to more H-BW users. Finally, theoretical and simulation results for the average file download delay are shown in Figure 3.5. We can observe from the plots that our model correctly predicts the average file download delay for H-BW users, L-BW users, and for the whole system.

Figure 3.5: Average file download delay for H-BW users, L-BW users, and for the system: (i) Scenario 1, and (ii) Scenario 2.

3.5.2.2 Simulation Results for the Token Based System

We now let $K_{down} = 1$ and study how the token based system behaves for different values of $K_{up}$, for the scenarios we have considered earlier. We fix $L = 40$, which is a typical value in BitTorrent [8]. Figure 3.6 shows the theoretical and simulation results for the download rate of H-BW and L-BW users.
Figure 3.6: Average download rate for H-BW and L-BW users: (i) Scenario 1, and (ii) Scenario 2.

First, from the plots we see again that theoretical and simulation results match. Further, we make the following interesting observation: the download rate of H-BW users decreases and the download rate of L-BW users increases as $K_{up}$ increases. This is because as $K_{up}$ increases, L-BW users earn tokens at a faster rate and can download more data from H-BW users. This, however, means that H-BW users provide fewer uploads to other H-BW users. Thus, H-BW users now have to download from more L-BW users, and hence their download rate decreases. Further, it is interesting to point out that in the first scenario the two classes of users have the same download rate for large $K_{up}$, whereas in the second scenario the download rates of the two classes are never equal. This is because in the first scenario (for large $K_{up}$) both classes of users are downloading from a similar number of H-BW users, and since $C_{downL} = C_{upH}$ and $C_{downH} > C_{upH}$, both classes of users can fully utilize the H-BW users' upload capacity. In contrast, in the second scenario, while both classes of users are downloading from a similar number of H-BW users (for large $K_{up}$), since $C_{downH} > C_{upH}$ but $C_{downL} < C_{upH}$, only H-BW users can fully utilize the upload capacity of the H-BW users from whom they are downloading.

Figure 3.7 shows theoretical and simulation results for the average file download delay for H-BW users, L-BW users, and for the whole system.
For comparison, the plots also show the corresponding average download delay in the original BitTorrent system. As before, we observe that our model predicts the simulation results quite accurately.

Figure 3.7: Average file download delay for H-BW users, L-BW users, and for the system: (i) Scenario 1, and (ii) Scenario 2.

Further, we observe that when $K_{up} = 1 = K_{down}$, the performance of the token-based system is almost identical to that of the original BitTorrent system. However, as $K_{up}$ increases, the overall system performance improves compared to the original BitTorrent system. This is because in the token-based system L-BW users download from more H-BW users when $K_{up}$ is large, since, as we have mentioned earlier, L-BW users can then gain tokens fast. However, as also mentioned earlier, we are sacrificing the perceived performance of H-BW users. This motivates us to quantify next how "unfair" the token based scheme becomes to H-BW users as $K_{up}$ increases.

3.5.3 Predicting System Time Dynamics and Non-flash Crowd Scenarios

We now simulate a more dynamic BitTorrent system where new leechers keep joining the system at random times according to a Poisson process and, after finishing their downloads, either depart the system or remain there for a while as seeds. This is in order to demonstrate how accurate the model of Section 3.4 is. We again consider three classes of users: H-BW, M-BW, and L-BW users. The arrival rates of the three classes of leechers are different and change during the simulation, as shown in Figure 3.8. The file size is 100 MB.
Finally, 15% of leechers stay in the system for 3000 seconds after they download the file. That is, $p^L_s = p^M_s = p^H_s = 0.15$ and $\gamma_H = \gamma_M = \gamma_L = \frac{1}{3000}$ (recall from Section 3.4 the definitions of these parameters). All other simulation parameters are the same as before.

3.5.3.1 Simulation Results for the Original BitTorrent System

Figure 3.8: Leecher arrival rate.

Figure 3.9: The original BitTorrent system: (i) Number of users in the system, and (ii) Average file download delay for H-BW, M-BW, and L-BW leechers.

Figure 3.9(i) shows how the number of users (seeds and leechers) in the original BitTorrent system evolves over time. From the plot we can observe that the proposed fluid model is quite accurate. Further, note that at the beginning of the simulation the number of leechers in the system is small, and hence the system's service capacity, which is the aggregate service rate of all users in the system, is also small. Therefore, initially the leecher departure rate is smaller than the leecher arrival rate, and this is why the number of leechers in the system initially increases. However, as the number of leechers in the system increases, the system's service capacity and hence the leecher departure rate also increase. After some time elapses, the leecher departure rate catches up with the leecher arrival rate and the system reaches its steady state, where the number of leechers in the system stabilizes.
Clearly, after leechers stop arriving to the system, the number of leechers starts decreasing. The evolution of the total number of seeds follows the evolution of the leechers, as expected, since some leechers, after finishing their downloads, remain in the system as seeds for some time before departing. Before proceeding, notice that we have intentionally stopped the arrivals of L-BW and M-BW leechers and increased the arrival rate of H-BW leechers at time 30000, as shown in Figure 3.8. From Figure 3.9(i) we can see that our fluid model also captures this transition quite accurately. The discrepancies between theoretical and simulation results at the beginning of the simulation arise because the model does not consider the fact that leechers initially require a large amount of time to finish their downloads, and hence to depart the system. In particular, Equation (3.19) does not consider the fact that initially the leecher departure rate may be zero, but instead always assumes that this rate is strictly positive. Since the leecher departure rate is initially overestimated in the fluid model, the rate at which the number of leechers in the system increases is lower than the one in the simulation. Finally, Figure 3.9(ii) shows simulation and theoretical results for the average file download delay for H-BW, M-BW, and L-BW leechers. We can observe again that our fluid model is, in general, quite accurate. The discrepancies between theoretical and simulation results in the plot are primarily due to ignoring the optimistic unchoking reward scenario.
As a result: (i) the download delays of higher bandwidth leechers may be overestimated, since we ignore the download rate rewards that these leechers receive when they optimistically unchoke lower bandwidth leechers, and (ii) the download delays of lower bandwidth leechers may be underestimated, since lower bandwidth leechers no longer spend a portion of their upload capacity on rewarding higher bandwidth leechers (as occurs in reality), but instead use this portion for uploading to other lower capacity leechers. However, as expected, these discrepancies are relatively small. As a final note, the file download delays initially decrease as time increases because the number of seeds in the system initially increases (and thus leechers can receive more downloads from seeds). After the number of seeds in the system stabilizes, the file download delays also stabilize, as expected.

3.5.3.2 Simulation Results for the Token-enhanced BitTorrent System

We fix $K_{down} = 1$ and vary $K_{up}$ to study its effect on the system dynamics. Figure 3.10 presents analytical and simulation results for the peer population in the token-enhanced system. In the figure we show results for $K_{up} = 1$, $K_{up} = 2$, and $K_{up} = 10$. Again we observe that our model is quite accurate. We see that the token based system (Figure 3.10(i)) resembles the original BitTorrent system (Figure 3.9(i)) when $K_{up} = 1 = K_{down}$, in agreement with our earlier arguments. As we increase $K_{up}$, the number of lower bandwidth leechers in the system decreases and the number of higher bandwidth leechers increases. The explanation for this is the same as before. As $K_{up}$ increases, lower capacity leechers can earn tokens at a faster rate, download faster, and hence depart the system earlier. On the other hand, higher bandwidth leechers download slower and hence depart the system after a longer period of time.
Figure 3.10: The token-enhanced system: Number of users in the system: (i) $K_{up} = 1$, (ii) $K_{up} = 2$, (iii) $K_{up} = 3$, and (iv) $K_{up} = 10$.

Finally, when $K_{up} = 10$, the number of leechers present in the system from each class at steady state is proportional to the arrival rate of each class. This is explained as follows: as mentioned earlier, for large $K_{up}$, in particular when $K_{up} > \frac{C_{upH}}{C_{upL}}$, all leechers can download at the same rate, and hence the departure rate of each class of leechers is proportional to the steady state population of the class. Since the departure rate equals the arrival rate at steady state, the steady state population is proportional to the arrival rate. Finally, notice that the system throughput does not change with respect to the value of $K_{up}$, as mentioned earlier. Further, how the download rate and the download delay of each class of users change with respect to $K_{up}$ is similar to the steady state scenario of Section 3.5.2.2, i.e. Figures 3.6 and 3.7 respectively.
3.5.4 Impact of the Proposed Token Based Scheme on Fairness

To quantify "fairness" we use the upload-to-download ratio of a user, defined as the user's upload rate divided by his/her download rate. (This metric has also been used to quantify fairness in other studies, e.g. [6, 64].) Figure 3.11 shows how the upload-to-download ratio behaves as we vary $K_{up}$, for each class of users.

Figure 3.11: Upload-to-download ratio for H-BW and L-BW users: (i) Scenario 1, and (ii) Scenario 2.

From these plots we observe that the upload-to-download ratio is almost the same for both classes of users when $K_{up} = 1 = K_{down}$. This implies that the system is fair. However, as $K_{up}$ increases, the corresponding ratio for H-BW users increases and for L-BW users decreases, as expected. (This suggests that the system becomes unfair.) Looking at Figures 3.7 and 3.11 we can conclude that we can trade off overall system performance and fairness. Using our analytical model we can predict how much "fairness" we are sacrificing and what performance is achieved. For example, one can enforce fairness by setting $K_{up} = 1 = K_{down}$, or minimize the system's average download delay by choosing a large value for $K_{up}$. Further, one can also operate somewhere between these two extremes by setting the appropriate value for $K_{up}$.

3.5.5 Impact of the Proposed Token Based Scheme on Freeriders

As mentioned earlier, the rate-based TFT scheme can in general motivate cooperation in BitTorrent. However, as reported in [24, 36, 61], skillful freeriders can still benefit from the system by exploiting the optimistic unchoking scheme.
In particular, they can connect to more peers than usual to increase the probability of receiving data via optimistic unchoking. In this section we study how the proposed token-based scheme prevents this type of freeriding in BitTorrent-like systems. To this end, we simulate both the original BitTorrent and the token-enhanced system (with $K_{up} = K_{down}$) with two classes of users: freeriders (FR) and non-freeriders (NF). We set the download link capacity of both classes of users to 10 Mbps, the upload link capacity of non-freeriders to 300 Kbps, and the upload link capacity of freeriders to 0 Kbps, since they do not offer any uploads. All other simulation parameters are the same as before. We simulate two different scenarios. In the first scenario, both classes of users connect to $L = 40$ neighbors. In the second scenario, freeriders connect to all available leechers in an effort to maximally exploit optimistic unchoking. Figures 3.12(i) and (ii) show the download rates for freeriders and non-freeriders with respect to the percentage of freeriders in the system for Scenarios 1 and 2 respectively. As before, theoretical results (from our first model) predict the simulation results well.

Figure 3.12: The token-based scheme prevents freeriding: (i) Scenario 1, and (ii) Scenario 2.
Further, in Scenario 1 the download rate of freeriders is quite low under the original BitTorrent system, which implies that the system is working well. However, in Scenario 2 the download rate of freeriders is quite high, and can even exceed that of non-freeriders when a small portion of skillful freeriders steal resources from a large portion of non-freeriders. (This interesting observation has also been made in [61].) Nevertheless, the plots confirm our intuition that the token-enhanced scheme does not suffer from this problem, as it does not allow freeriders to perform any downloads.

3.5.6 Performance Prediction for Large Systems

Having established the accuracy of the analytical models, in this section we use them to study the performance of large yet realistic BitTorrent-like systems whose size makes simulation-based analysis too expensive.

The first system we consider consists of thousands of users. We use our first model to study this system and further argue that BitTorrent's TFT mechanism is not enough to deal with freeriders. We let 100 of those users be freeriders, and the rest be non-freeriders.

[Figure 3.13: Performance prediction for large systems: (i) The proposed token scheme can prevent freeriders from exploiting large systems, (ii) Our fluid model can reproduce the results of a real trace, (iii) The fluid model can further show the distribution of different classes of users, (iv) Using the proposed token scheme we can trade off fairness for overall system performance.]
The upload/download link capacities of freeriders and non-freeriders are the same as before. Figure 3.13(i) plots the download rate of freeriders as the total number of users in the system increases. Based on the prior discussion, it is somewhat expected that freeriders will do really well as the number of non-freeriders increases, but their performance is still astonishing (the Y-axis of the plot is logarithmic). Essentially, freeriders connect to so many users that the aggregate download rate received via optimistic unchoking almost fills their download link capacity. As before, the token-based scheme solves this problem.

The second system we consider is based on a popular trace [53] which measured the number of users downloading The Lord of The Rings DVD. The trace spans a period of time a bit longer than two weeks, and reports on hundreds of thousands of users. The trace has no information about the type of users and their link capacities, so we use the trace-based studies in [6, 58] to decide on the mix of users that we consider. Specifically, we assume four classes of users: H-BW (15%), M-BW (25%), L-BW (40%), and T-BW (20%) users (T-BW represents Tiny Bandwidth users) with capacities C_upT = 128Kbps, C_upL = 384Kbps, C_upM = 1000Kbps, and C_upH = 5000Kbps. We first find a set of arrival rates for each class of users that, when fed to our second, fluid-based model, yields a synthetic trace that is very similar to the original one, as shown in Figure 3.13(ii). We then use our model to predict the population of different classes of users in the system, as shown in Figure 3.13(iii). (No results of this type are available from the original trace.) Finally, we show what would have happened if the token-enhanced scheme (with K_up >> K_down) had been used instead of the original TFT scheme.
The average file download delay would have been significantly reduced, as is evident from the smaller number of users present in the system, at the expense of the performance perceived by higher-bandwidth users, as explained earlier.

Our motivation with these two examples has been to show the wide range of interesting results that one can obtain in no time using the set of equations that constitute our analytical models.

3.5.7 Comparison Between the Two Models

[Figure 3.14: Comparison between our two models. The plot shows the average download rate (Kbps) of H-BW and L-BW leechers versus the number of neighbors (L), for Model 1, Model 2, and simulation.]

[Figure 3.15: Inaccuracy due to the homogeneity assumption. The plot shows the number of users in the system over time for our model, the FPK model, the DR model, and simulation.]

[Figure 3.16: Inaccuracy due to the static resource allocation assumption. The plot shows the number of L-BW and H-BW leechers over time for our model, the FPK model with α = 0.5, 0.65, and 0.75, and simulation.]

As we have seen, our first model (presented in Section 3.3) is tailored for steady state performance analysis, accounts for all details of the BitTorrent protocol, and is very accurate. It is now interesting to examine how accurate our fluid model is in such cases (i.e., for steady state performance analysis) compared with our first model. For this we consider the flash crowd scenario of Section 3.5.2. We then solve our fluid model equations in steady state (i.e., in the interval [t_1, t_2) in Figure 3.2), and compute the corresponding leecher download rates. The results are depicted in Figure 3.14.
As we observe, our first model is more accurate, as expected, since it captures the effect of connecting to a variable number of neighbors (L) and considers the optimistic unchoking reward scenario. Our second model is visibly inaccurate when L is small, because it does not account for leechers that do not have sufficient neighbors of the same class to connect to. However, the difference between the two models becomes small when L is sufficiently large, as expected. This remaining difference is only due to the fact that we do not consider the optimistic unchoking reward scenario in the second model. It is interesting to point out that this difference for H-BW leechers never exceeds (100/Z)%. This is explained as follows. First recall that an H-BW leecher provides uploads to the Z−1 neighbors that provide him/her the highest download rates, and to one other neighbor via optimistic unchoking. The maximum rate the H-BW leecher can provide to the optimistically unchoked neighbor does not exceed (100/Z)% of its upload link capacity. When we consider the optimistic unchoking reward scenario, we account for the fact that the optimistically unchoked neighbor will, in turn, reward the H-BW leecher with a download rate which is at most equal to the one this neighbor receives from the leecher. Therefore, if we do not consider the optimistic unchoking reward scenario we may underestimate the H-BW leecher's download rate by at most (100/Z)%.

As mentioned earlier, the first model requires 2n^2 equations to model a system of n classes of users and does not model the system's time dynamics. In contrast, the second model captures the system dynamics and requires only 2n equations to model a system of n classes of users. In summary, the two models comprise a tradeoff between accuracy and simplicity/generality.
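The (100/Z)% bound argued above can be written out compactly. The derivation below is a sketch under the assumption that the underestimation is expressed relative to the H-BW upload capacity C_up^H:

```latex
% Bound on the download-rate underestimation when the optimistic
% unchoking reward is ignored (H-BW leecher with Z upload slots).
\begin{align*}
\text{rate offered to the optimistically unchoked neighbor}
  &\le \tfrac{1}{Z}\, C_{up}^{H},\\
\text{reward received back from that neighbor}
  &\le \text{rate offered} \le \tfrac{1}{Z}\, C_{up}^{H},\\
\text{relative underestimation}
  &\le \frac{C_{up}^{H}/Z}{C_{up}^{H}} \;=\; \frac{1}{Z} \;=\; \frac{100}{Z}\,\%.
\end{align*}
```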
3.6 Comparison with Other Models

To highlight the contributions of this paper we now compare our models with two of the most representative earlier results in the literature: [56] and [15]. The reason for choosing these two is that [56] presents the first mathematical model for BitTorrent systems and [15] is one of the most representative works that considers network heterogeneity. We refer to the model proposed in [56] as the DR model and to the model proposed in [15] as the FPK model. The comparison is summarized in Table 3.1. (The table also gives a summary of the differences between our two models, which we discussed earlier.)

From Table 3.1 we see that among all models, our first model is the most inclusive one. It models all details of a BitTorrent-like system except for the system's time dynamics. In particular, it considers the number of concurrent upload connections a user may have (Z), the number of neighbors to which a user might be connected (L), network heterogeneity, the performance effect of BitTorrent's TFT scheme and of optimistic unchoking, as well as the optimistic unchoking reward scenario.

On the other hand, the DR and FPK models (as well as our second model) consider the system's time dynamics, but under some approximations/simplifications. The DR model considers system time dynamics in a homogeneous BitTorrent system, where all users have the same capacities. The FPK model incorporates network heterogeneity (two classes of users) into the DR model. It also attempts to model the effect of optimistic unchoking and of the TFT scheme. However, it does not accurately capture how these affect system performance in realistic scenarios. In particular, the model assumes that users provide to other users a static/fixed proportion of their upload link capacity, in the sense that this proportion remains constant over time.
But the actual computation of this proportion is much more involved, as we have shown in this paper. One must consider, for example, the exact number of concurrent upload connections (Z), the peer capacity distribution, the different user arrival rates and how these might change over time, etc. Among the three models that capture system time dynamics, our second model incorporates the most details of a BitTorrent system and is the most general one. In particular, it accounts for all the system details except for the variability in the number of neighbors that a user might be connected to (L) and the optimistic unchoking reward scenario. To the best of our knowledge, this is also the first model that considers an arbitrary number of user classes in heterogeneous environments.

Finally, we highlight scenarios where prior models lead to sizable inaccuracies. We consider a BitTorrent system with two classes of leechers where L-BW and H-BW leechers join with rate 0.075 users/sec and leave the system as soon as they finish their downloads. Further, we have C_upL = 100Kbps, C_upH = 600Kbps, and all other parameters are the same as before. We compare simulation results with theoretical results obtained by our fluid model, the DR model (footnote 10), and the FPK model.

Figure 3.15 shows the total number of users in the system as a function of time. We observe from the plot that, as expected, the homogeneity assumption (made by the DR model) results in significant inaccuracies, while our model and the FPK model accurately capture simulation results. In Figure 3.16 we present the population of H-BW and L-BW users in the system. As mentioned before, the FPK model assumes that users provide to other users a static/fixed proportion of their upload link capacity, denoted in the model by α.
Because [15] only provides a range of feasible α values rather than an exact one (0.5 ≤ α ≤ 1 in our scenario), we pick three representative values from the feasible range and plot the results. We observe from the plot that the static resource allocation assumption (made by the FPK model) leads to sizable inaccuracies for all three values of α, in particular for L-BW users, while our model accurately predicts simulation results.

Footnote 10: Because the DR model assumes user homogeneity, we compute the average link capacity and use this value in our calculations whenever we use this model.

System detail captured                               | Model 1 | Model 2 | FPK model | DR model
number of concurrent uploads (Z)                     | Yes     | Yes     | No        | No
number of neighbors (L)                              | Yes     | No      | No        | No
number of user classes (heterogeneity)               | 3       | many    | 2         | 1
effect of the TFT scheme and of optimistic unchoking | Yes     | Yes     | Partial   | No
optimistic unchoking reward scenario                 | Yes     | No      | No        | No
time dynamics                                        | No      | Yes     | Yes       | Yes

Table 3.1: Model comparison.

Chapter 4

Performance Analysis of P2P Streaming Systems

The success of BitTorrent provides a vision for P2P realtime video broadcast, i.e., P2P video streaming. It is widely reported that P2P streaming systems have become more and more popular [2, 27]. This success has prompted plenty of studies: for example, [40, 50, 72] study how to design such systems, [2, 27] collect traffic measurements, [34, 70, 71, 73] analyze their performance, and [38, 39] propose incentive schemes for them. However, despite this large body of work, to the best of our knowledge there is no prior attempt to analytically study the performance of P2P streaming systems with an MDC incentive scheme. Motivated by this, in this chapter we first propose a mathematical model that predicts the performance of such systems, which can be used to choose system parameters that achieve a target performance.
In addition, there have been few attempts to study the tradeoff between users' interests, usually referred to as social welfare, and the operator's profit, or between social welfare and fairness. To study these problems, we use a stochastic optimization framework. This technique has been widely used in wireless networks, including for resource allocation [35, 44], power allocation [45, 47], and dynamic data compression [46]. Based on this framework, we propose a joint demand control and scheduling algorithm for P2P streaming systems: the algorithm helps users not only decide an optimal download rate (demand control), but also determine where to provide uploads (scheduling). We show that the proposed algorithm can achieve a performance (in terms of users' social welfare) within O(1/V) of the optimal, where V is a control parameter, at the expense of the operator's profit. In addition, we incorporate a fairness constraint into the optimization framework which can maintain fairness among users.

4.1 Related Work

J. Liu et al. give a thorough introduction to Internet video broadcast/streaming systems in [37]. They classify these systems into three categories: IP multicast, Content Distribution Network (CDN), and P2P multicast. Because IP multicast relies on router support for multicasting and CDN imposes severe bandwidth requirements on the streaming servers, P2P multicast has become the most promising technique for Internet video streaming [37]. There are two major approaches proposed in the literature for P2P multicast: the tree based approach [12, 14] and the data driven approach [40, 50, 72]. In the tree based approach, the source node (streaming server) delivers video content (packets) through the multicast tree(s) built by the system.
However, this approach suffers from at least two drawbacks: the overhead of managing multicast trees and low upload link utilization [37, 41]. Because of these limitations, and given the simplicity of the data driven approach, the latter has become the most prevalent implementation for P2P multicast, e.g. [54, 55, 62]. The data driven approach is similar to the most famous file sharing system, BitTorrent, in that users directly "pull" useful data from their neighbors. But it differs from the BitTorrent system in that: (i) it uses a sliding window mechanism to meet the realtime constraint of streaming systems, and (ii) it does not use the TFT incentive scheme adopted in BitTorrent. In the following, we refer to P2P streaming systems as P2P multicast systems that use the data driven approach.

The prevalence of P2P streaming systems has been widely reported [2, 27]. Additionally, [26] shows that network operators can increase their benefit by properly using rewards to motivate user cooperation. However, as mentioned earlier, current systems do not use the efficient TFT incentive scheme because it does not perform well in current streaming systems [60]. Without efficient incentive schemes, unfair resource sharing among users has been observed in some measurement studies [2, 27]. To solve this problem, Z. Liu et al. propose using Multiple Description Coding (MDC) to provide incentives for P2P streaming systems [38, 39]. Using MDC, a user's perceived video quality is proportional to the user's received data rate. Therefore, the reciprocity mechanisms, e.g. TFT, used in P2P file sharing systems can also be used in streaming systems. In fact, the studies in [38, 39] show via simulation that using MDC with TFT can provide strong incentives for P2P streaming systems: cooperative users are rewarded with a high download rate (good video quality) while freeriders are punished.
As a result, it has been widely believed that using MDC can motivate user cooperation in P2P streaming systems [37–39, 70].

Stochastic optimization provides a powerful framework for optimizing a performance metric of a dynamic system given a-priori statistical information. The development of optimal policies relies on a number of advances, including Lyapunov techniques, the introduction of virtual cost queues, etc. [35]. This technique has been widely used in wireless networks to study various problems, e.g. resource allocation [35, 44], power allocation [45, 47], and dynamic data compression [46]. We refer interested readers to [35, 44] for further information.

4.2 Performance Analysis for Data-Driven P2P Streaming Systems

As mentioned earlier, P2P streaming systems have drawn a lot of research interest. However, despite this large body of work, to the best of our knowledge there is no prior attempt to analytically study the performance of P2P streaming systems with an MDC incentive scheme. Motivated by this, we propose a mathematical model that predicts the performance of such systems, which can be used to choose system parameters that achieve a target performance. Additionally, inspired by our token scheme in Chapter 2, we propose a token based MDC scheme to provide incentives for P2P streaming systems. The proposed scheme is more flexible and general than the scheme proposed in [38, 39]. We extend our model to predict the performance of token based MDC systems, and we show how our model can be used to decide on the scheme parameters that achieve a target tradeoff between overall system performance and fairness.

4.2.1 Preliminaries

4.2.1.1 Multiple Description Coding

Multiple Description Coding (MDC) is a coding technique which fragments a single media stream into L ≥ 2 substreams, referred to as descriptions: one base layer and L−1 enhancement layers.
The base layer is necessary for the media stream to be decoded; enhancement layers are applied to improve stream quality. The more layers a user receives, the better the video quality. As a result, this coding technique can provide strong incentives to motivate cooperation in P2P streaming systems. Notice that to use MDC in P2P streaming systems the scheduling algorithm should be slightly amended: users should first download packets from the base layer, so that the video can be decoded, and then request packets from enhancement layers to improve the received video quality.

4.2.1.2 The Proposed Token Based Scheme

In the token based streaming system users use tokens as a means to trade packets. In particular, each user maintains a token table which keeps track of the amount of tokens each of his/her neighbors possesses. When the user uploads X_up bytes to a neighbor, he/she decreases the neighbor's tokens by K_down X_up. On the other hand, the user increases a neighbor's tokens by K_up X_down if he/she downloads X_down bytes from that neighbor. Notice that a user does not have access to his/her own amount of tokens, since this is maintained by his/her neighbors. Under the proposed scheme each user requests packets of interest from his/her neighbors, and users upload packets to those neighbors who have enough tokens to download.

4.2.2 The Model

In this section we propose a mathematical model to study the performance of P2P streaming systems that use the proposed token based scheme and the MDC technique. Before proceeding, let us define the variables used in the model.

We say that two users are in the same class if they have the same upload link capacity, and we let G = {1,...,K} be the set of user classes in the system. Assume that the user classes are sorted in accordance with their upload link capacity, so that class 1 users have the lowest upload link capacity.
Let x_i(t) be the number of class i users in the system at time t, and let C^i_down / C^i_up be the download/upload link capacity of a class i user. Further, let R^i_down(t) / R^i_up(t) be the download/upload rate of a class i user at time t. Assume the video is encoded into L layers using MDC. Let layer 1 be the base layer and layers 2,...,L be enhancement layers. Denote by s_i the rate of layer i and by S the aggregate streaming rate of the video, that is, S = Σ_{i=1}^{L} s_i. Let L_i(t) be the highest layer that a class i user requests at time t. Further, let φ_ij(t) be the proportion of the upload rate provided by class i users to class j users at time t. Denote by ξ_ij(t) the useful factor of class i users to class j users, representing how useful the data that class i users have downloaded is to class j users.

First, assume that a user's download link capacity is larger than or equal to his/her upload link capacity. Therefore, the system's bottlenecks are the upload links and we can assume that these are fully utilized. This means that R^i_up(t) = C^i_up for all i, t. Now the average download rate of class i users is given by the following lemma:

Lemma 10.

    R^i_down(t) = min( (Σ_{j=1}^{K} C^j_up x_j(t) φ_ji(t)) / x_i(t), S, C^i_down ).      (4.1)

Proof. The aggregate download rate of all class i users equals the aggregate rate that all users upload to class i users, which is Σ_{j=1}^{K} R^j_up(t) x_j(t) φ_ji(t) = Σ_{j=1}^{K} C^j_up x_j(t) φ_ji(t). Therefore, the average download rate of class i users is (Σ_{j=1}^{K} C^j_up x_j(t) φ_ji(t)) / x_i(t). Further, notice that a class i user cannot download faster than the streaming rate, S, nor faster than his/her download link capacity, C^i_down. As a result, the average download rate of class i users is given by Equation (4.1).

In order to compute users' download rates via Equation (4.1), we have to find φ_ji(t), which is given by the following lemma:

Lemma 11.
    φ_ji(t) =
      (x_i(t) / Σ_{n=1}^{K} x_n(t)) ξ_ji(t) (1 − Σ_{n=1}^{j−1} φ_jn(t))      if i > j,
      min{ x_i(t) ξ_ji(t) / Σ_{n=1}^{K} x_n(t),
           (C^i_up x_i(t) φ_ij(t) K_up) / (C^j_up x_j(t) K_down),
           (x_j(t) / Σ_{n=i+1}^{K} x_n(t)) C^i_spare }                       if j > i,
      1 − Σ_{n≠i} φ_in(t)                                                    otherwise,      (4.2)

where C^i_spare = C^i_down − Σ_{n=1}^{i} C^n_up φ_nn(t).

Proof. Notice that if i > j, class i users always have sufficient tokens to download from class j users, and the download link is not the bottleneck because C^i_down ≥ C^i_up > C^j_up. Therefore, class i users fairly share the available upload capacity, C^j_up (1 − Σ_{n=1}^{j−1} φ_jn(t)), of class j users. However, we also need to consider how useful the data that class j users have is to class i users. As a result, φ_ji(t) = (x_i(t) / Σ_{n=1}^{K} x_n(t)) ξ_ji(t) (1 − Σ_{n=1}^{j−1} φ_jn(t)) in this case.

If i < j, first, class i users cannot download faster than their fair share, which is x_i(t) ξ_ji(t) / Σ_{n=1}^{K} x_n(t). Further, the download rate of class i users may be constrained by their download link capacity: class i users cannot download faster than their "spare" capacity, C^i_spare = C^i_down − Σ_{n=1}^{i} C^n_up φ_nn(t). All users in classes l > i share this spare capacity; therefore, class j users cannot upload to class i users faster than (x_j(t) / Σ_{n=i+1}^{K} x_n(t)) C^i_spare. Finally, in the long run the token spending rate of class i users, C^j_up x_j(t) φ_ji(t) K_down, should be less than or equal to their token earning rate, C^i_up x_i(t) φ_ij(t) K_up.

Finally, ξ_ji(t) is given by the following lemma:

Lemma 12.

    ξ_ji(t) = 1                                                if i ≥ j,
              min{ 1, (Σ_{n=1}^{L_i(t)} s_n) / R^j_down(t) }   otherwise,      (4.3)

Proof. In the first case, where i ≥ j, class i users request at least as many layers as class j users can download; hence all the data that class j users download is useful to class i users. As a result, ξ_ji(t) = 1. In contrast, when i < j, class i users may be interested in only part of the data that class j users download.
This fraction equals the aggregate rate of the requested layers divided by the download rate, which is (Σ_{n=1}^{L_i(t)} s_n) / R^j_down(t).

4.2.3 Experiments

We use an event driven P2P simulator developed by M. Zhang to validate our analysis; a detailed description of the simulator can be found in [71]. In addition, we implement the proposed token based scheme and the MDC functionality to study their impact on the system. To validate the proposed model, we first simulate a P2P streaming system with 200 nodes. Nodes join the system in a flash crowd manner and stay in the system until the video ends. We simulate the system for 10 minutes to obtain steady state results. There is one streaming server in the system, with upload link capacity 2000Kbps, and the streaming rate of the video is 1000Kbps. Additionally, every user connects to 40 neighbors to exchange the streaming video in the buffer.

We present simulation results for three scenarios which, as we shall see, yield qualitatively different results when the token based scheme is used. In Scenario 1 there are two classes of users: 70% low bandwidth (L-BW) users (C^L_up = 100Kbps and C^L_down = 1000Kbps) and 30% high bandwidth (H-BW) users (C^H_up = 1000Kbps and C^H_down = 1000Kbps). The video is encoded into two layers with s_1 = s_2 = 500Kbps. In Scenario 2 we have three classes of users: 50% L-BW users, 20% H-BW users, and 30% medium bandwidth (M-BW) users, with C^M_up = 300Kbps and C^M_down = 1000Kbps; all other settings are the same as in Scenario 1. In Scenario 3 we have three classes of users: 50% L-BW users, 25% M-BW users, and 25% H-BW users, with C^L_up = 100Kbps, C^L_down = 200Kbps, C^M_up = 200Kbps, C^M_down = 600Kbps, C^H_up = 1000Kbps, and C^H_down = 1000Kbps.

[Figure 4.1: Download rate of L-BW and H-BW users in Scenario 1.]
[Figure 4.2: Download rate of L-BW, M-BW, and H-BW users in Scenario 2.]
Further, we encode the video into three layers with s_1 = 300Kbps, s_2 = 300Kbps, and s_3 = 400Kbps.

We now let K_down = 1 and study how the token based system behaves for different values of K_up in the scenarios described above. Figures 4.1, 4.2, and 4.3 show theoretical and simulation results for Scenarios 1, 2, and 3, respectively.

[Figure 4.3: Download rate of L-BW, M-BW, and H-BW users in Scenario 3.]

First, from the plots we see that theoretical and simulation results match. Further, we make the following interesting observation: as K_up increases, the download rate of H-BW users first decreases and the download rate of L-BW users first increases. This is because as K_up increases, L-BW users earn tokens at a faster rate and can download more data from H-BW users. (Hence their download rate increases.) This, however, means that H-BW users provide fewer uploads to other H-BW users. Thus, H-BW users now have to download from more L-BW users, and hence their download rate decreases. Further, it is interesting to point out that in the first and second scenarios all classes of users have the same download rate for large K_up, whereas in the third scenario the download rates of the three classes are never equal. This is because in the first scenario (for large K_up) both classes of users are downloading from a similar number of H-BW users, and since C^L_down = C^H_down = C^H_up, all classes of users can fully utilize the H-BW users' upload capacity. (A similar observation can be made in the second scenario.) In contrast, in the third scenario, while all classes of users are downloading from a similar number of H-BW users (for large K_up), since C^H_down = C^H_up but C^M_down < C^H_up and C^L_down < C^H_up, only H-BW users can fully utilize the upload capacity of the H-BW users from whom they are downloading.
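Equation (4.1) is straightforward to evaluate numerically. The sketch below is our own illustration, not the thesis's code; the allocation matrix `phi` is assumed as an input here rather than derived from Lemma 11:

```python
def class_download_rates(x, c_up, c_down, phi, S):
    """Average download rate per class via Equation (4.1):
    R_i = min( sum_j C_up[j] * x[j] * phi[j][i] / x[i],  S,  C_down[i] ).
    x: class populations; c_up/c_down: per-user capacities (Kbps);
    phi[j][i]: fraction of class j's upload capacity given to class i;
    S: aggregate streaming rate (Kbps)."""
    K = len(x)
    rates = []
    for i in range(K):
        supplied = sum(c_up[j] * x[j] * phi[j][i] for j in range(K)) / x[i]
        rates.append(min(supplied, S, c_down[i]))
    return rates

# Illustrative two-class example (the even 50/50 split is an assumption):
# 70 L-BW users (100 Kbps up) and 30 H-BW users (1000 Kbps up), S = 1000 Kbps.
rates = class_download_rates(
    x=[70, 30], c_up=[100, 1000], c_down=[1000, 1000],
    phi=[[0.5, 0.5], [0.5, 0.5]], S=1000)
```

With these numbers the aggregate supplied capacity is 18500 Kbps, so L-BW users average about 264 Kbps and H-BW users about 617 Kbps, both below the streaming-rate and download-link caps.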
4.3 Trading Social Welfare, Profit and Fairness in P2P Streaming Systems

Although it is widely reported that P2P streaming systems are very popular, and they have drawn a lot of research interest, there have been few attempts to study the tradeoff between users' interests, usually referred to as social welfare, and the operator's profit, or between social welfare and fairness.

Stochastic optimization provides a powerful framework for optimizing a performance metric of a dynamic system given a-priori statistical information. The development of optimal policies relies on a number of advances, including Lyapunov techniques, the introduction of virtual cost queues, etc. [35]. This technique has been widely used in wireless networks to study various problems, e.g. resource allocation [35, 44], power allocation [45, 47], and dynamic data compression [46]. We refer interested readers to [35, 44] for further information.

In this section we use a stochastic optimization framework to study the aforementioned tradeoffs. Based on this framework, we propose a joint demand control and scheduling algorithm for P2P streaming systems: the algorithm helps users not only decide an optimal download rate (demand control), but also determine where to provide uploads (scheduling). We show that the proposed algorithm can achieve a performance (in terms of users' social welfare) within O(1/V) of the optimal, where V is a control parameter, at the expense of the operator's profit. In addition, we incorporate a fairness constraint into the optimization framework which can maintain fairness among users.
In summary, the contributions of this section are:

• We use the technique of stochastic optimization to design a joint demand control and scheduling algorithm that can outperform traditional P2P streaming systems. Further, we show that the proposed algorithm provides a tradeoff between social welfare and the operator's profit.

• As a by-product of the analysis, we show how the system operator can design a proper pricing scheme to motivate users to operate at the optimal operating point.

• We incorporate a fairness constraint to study how it affects system performance and how it motivates user cooperation. We also show that there is a tradeoff between social welfare and fairness.

4.3.1 The Model

4.3.1.1 Problem Formulation

Consider a P2P streaming system operating in slotted time. Assume there are N receiving nodes, 1,...,N, and one source node, M, in the system. Suppose the source node M, with upload link capacity C^M_up, wants to stream a video, at a rate of r_max packets per time slot, to all receiving nodes. In a data driven P2P streaming system each node maintains a buffer to store packets for future playback. In addition to obtaining packets from the source node, nodes exchange the packets they have with their neighbors. At the beginning of each time slot nodes send requests to their neighbors, and in the meantime they have to decide to which neighbors they will provide uploads (where to send packets). The number of packets that a node can send to a neighbor varies over time, because the content of both nodes' buffers changes over time. Let S_ij(t) be the number of packets that node i can send to node j in time slot t, and let I_ij(t) be the corresponding control decision made by i: if i decides to send packets to j at time t, then I_ij(t) = 1; otherwise, I_ij(t) = 0. As a result, the transfer rate on link (i, j), R_ij(t), is given by

    R_ij(S_ij(t), I_ij(t)) = S_ij(t) if I_ij(t) = 1, and 0 otherwise.
Assume that $S_{ij}(t) \le C_{max}$ for all $i, j$ and for all $t$. Suppose a node can simultaneously transmit packets to $K$ other nodes. That is, the control space $\mathcal{I}$ equals $\{I_{ij}(t) \mid \sum_{j=1}^{N} I_{ij}(t) \le K \text{ for all } i \in \{1,\ldots,N\}\}$. Further, a node can also concurrently receive data from multiple neighbors, and we assume that the download link capacity is infinite, i.e., a node can receive as much as its neighbors provide.

Let $u_i(x)$ represent the concave non-decreasing utility function of node $i$ with respect to the receiving rate $x$. The user's utility can be interpreted as the user's happiness or benefit with regard to the service provided by the streaming system. For example, as shown in Figure 4.4(i), $\log(x)$ is a commonly used utility function: utility increases as the receiving rate $x$ increases, but the marginal utility decreases as the rate increases.

In this section we want to design an algorithm under which, at every timeslot, users observe the current system state and then decide how to request packets and where to send data, in order to maximize social welfare. Denote by $\bar{x}^u_i$ and $\bar{x}^d_i$ the time average upload and download rate of user $i$ respectively. The goal is to achieve a rate vector $(\bar{x}^u_1, \bar{x}^d_1, \ldots, \bar{x}^u_N, \bar{x}^d_N)$ that maximizes social welfare under a given fairness constraint. That is, the rate vector solves:

Maximize: $\sum_{i=1}^{N} u_i(\bar{x}^d_i)$

Subject to: $(\bar{x}^u_1, \bar{x}^d_1, \ldots, \bar{x}^u_N, \bar{x}^d_N) \in \Lambda$, $\quad \bar{x}^d_i \le r_{max}$ for all $i$, $\quad C^M_{up} + \sum_{i=1}^{N} \bar{x}^u_i = \sum_{i=1}^{N} \bar{x}^d_i$,

where $\Lambda$ is the set of all possible time average rate vectors for which a scheduling policy exists that stabilizes the system¹ and satisfies the following fairness constraint:

$$\frac{\bar{x}^d_i}{\bar{x}^u_i} \le F \quad \text{for all } i,$$

where $F > 1$ is a control fairness index. Further, let $r_{min}$ be the minimum packet receiving rate that sustains the video playback and assume that $\mathbf{r}_{min} = (r_{min}, \ldots, r_{min})$ is strictly interior to $\Lambda$.
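To make the formulation concrete, the explicit constraints of the problem can be checked mechanically. The following Python sketch (illustrative only; the function names are ours, and membership of the rate vector in the capacity region $\Lambda$ is not checked here) evaluates the objective and the rate-cap, flow-conservation, and fairness constraints for a candidate rate vector:

```python
import math

def social_welfare(x_down, utilities):
    """Objective of the formulation: sum_i u_i(x_i^d)."""
    return sum(u(x) for u, x in zip(utilities, x_down))

def satisfies_constraints(x_up, x_down, F, r_max, C_M_up, tol=1e-9):
    """Check the explicit constraints on a time-average rate vector:
    per-node rate cap x_i^d <= r_max, flow conservation
    C_up^M + sum_i x_i^u = sum_i x_i^d, and fairness x_i^d / x_i^u <= F."""
    if any(xd > r_max + tol for xd in x_down):
        return False
    if abs(C_M_up + sum(x_up) - sum(x_down)) > tol:
        return False
    return all(xd <= F * xu + tol for xu, xd in zip(x_up, x_down))
```

For example, with two nodes using the utility $u(x) = \log(x/100)$, the rate vector with downloads $(300, 900)$ and uploads $(100, 100)$ conserves flow when the source contributes 1000, but violates the fairness constraint when $F = 5$ (since $900 > 5 \cdot 100$).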
To achieve the goal, we use the technique of stochastic optimization to design an algorithm whose resulting rate vector achieves a performance within $O(1/V)$ of the optimal solution of the above problem, with respect to a policy control parameter $V$, under the given fairness constraint.

4.3.1.2 Algorithm and Analysis

We first create a virtual demand queue, $Q_i(t)$, for node $i$ as follows:

$$Q_i(t+1) = \max[Q_i(t) - x^d_i(t),\, 0] + A_i(t) \quad \text{for } i \in \{1,\ldots,N\}, \tag{4.4}$$

where $A_i(t)$ and $x^d_i(t)$ are the packet requesting and receiving rate of node $i$ at time slot $t$ respectively. We choose $A_i(t)$ at every slot according to the following constraint:

$$A_i(t) \le r_{max} \quad \text{for } i \in \{1,\ldots,N\}. \tag{4.5}$$

¹ Notice that [44] has shown that the capacity region of this type of network is convex. In addition, if we assume $\{S_{ij}(t)\}$ is a Markov chain and the stationary distribution of this Markov chain exists, then for any rate vector in the capacity region there exists a stationary randomized algorithm that stabilizes the system.

In addition, denote by $x^u_i(t)$ the packet uploading rate of node $i$ at time slot $t$. It is easy to see that $x^u_i(t) = \sum_{j=1}^{N} R_{ij}(t)$ and $x^d_i(t) = \sum_{j=1}^{N} R_{ji}(t)$. Notice that the physical meanings of $A_i(t)$ and $x^d_i(t)$ are the demand (the rate that node $i$ asks for) and the supply (the rate that node $i$ receives) of the streaming video respectively, while $Q_i(t)$ represents the pending rate.

To take fairness into consideration, we create a virtual fairness queue for each node as follows:

$$T_i(t+1) = \max[T_i(t) - F x^u_i(t),\, 0] + x^d_i(t) \quad \text{for } i \in \{1,\ldots,N\}, \tag{4.6}$$

where $T_i(t)$, $x^d_i(t)$, and $F x^u_i(t)$ can be interpreted as the debit, expense (download cost), and earning (upload reward) of user $i$ at time $t$ respectively. Let $\bar{a}_i$, $\bar{x}^u_i$, and $\bar{x}^d_i$ represent the time average requesting rate, upload rate, and download rate of node $i$ respectively.
That is, $\bar{a}_i \triangleq \liminf_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} E\{A_i(\tau)\}$, $\bar{x}^u_i \triangleq \liminf_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} E\{x^u_i(\tau)\}$, and $\bar{x}^d_i \triangleq \liminf_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} E\{x^d_i(\tau)\}$. If we can stabilize the system (i.e., the above queues), we know that the rate vector is in the capacity region and that the given fairness constraint is also satisfied. Further, it is easy to see that the flow conservation constraint will also be satisfied by the definition of $x^u_i(t)$ and $x^d_i(t)$. Now the original optimization problem reduces to maximizing $\sum_i u_i(\bar{x}^d_i)$ subject to system stability [35, 44].

To stabilize the system, we proceed as follows. Define $\mathbf{Q}(t) = (Q_1(t), T_1(t), \ldots, Q_N(t), T_N(t))$ as the virtual queue backlog vector during time $t$ and define the Lyapunov function as follows:

$$L(\mathbf{Q}(t)) = \frac{1}{2}\sum_{i=1}^{N} \left( Q_i(t)^2 + T_i(t)^2 \right).$$

Applying the standard bound $(\max[q-b,0]+a)^2 \le q^2 + a^2 + b^2 - 2q(b-a)$ to each queue recursion, the Lyapunov drift is bounded by:

$$\Delta(\mathbf{Q}(t)) \le \frac{1}{2} E\left\{ \sum_{i=1}^{N} \left[ A_i(t)^2 + 2 x^d_i(t)^2 + (F x^u_i(t))^2 \right] \,\Big|\, \mathbf{Q}(t) \right\} - \sum_{i=1}^{N} Q_i(t)\, E\left\{ x^d_i(t) - A_i(t) \mid \mathbf{Q}(t) \right\} - \sum_{i=1}^{N} T_i(t)\, E\left\{ F x^u_i(t) - x^d_i(t) \mid \mathbf{Q}(t) \right\}$$
$$\le B - \sum_{i=1}^{N} Q_i(t)\, E\left\{ x^d_i(t) - A_i(t) \mid \mathbf{Q}(t) \right\} - \sum_{i=1}^{N} T_i(t)\, E\left\{ F x^u_i(t) - x^d_i(t) \mid \mathbf{Q}(t) \right\},$$

where

$$B \triangleq \frac{N r_{max}^2 + 2 N^3 C_{max}^2 + N F^2 K^2 C_{max}^2}{2},$$

and where we use the fact that $A_i(t) \le r_{max}$, $x^d_i(t) \le N C_{max}$, and $x^u_i(t) \le K C_{max}$ for all $t$ and all $i \in \{1,\ldots,N\}$. Thus, for a given control parameter $V \ge 0$, we have

$$\Delta(\mathbf{Q}(t)) - V E\left\{ \sum_{i=1}^{N} u_i(A_i(t)) \,\Big|\, \mathbf{Q}(t) \right\} \le B - \sum_{i=1}^{N} Q_i(t)\, E\left\{ x^d_i(t) - A_i(t) \mid \mathbf{Q}(t) \right\} - V E\left\{ \sum_{i=1}^{N} u_i(A_i(t)) \,\Big|\, \mathbf{Q}(t) \right\} - \sum_{i=1}^{N} T_i(t)\, E\left\{ F x^u_i(t) - x^d_i(t) \mid \mathbf{Q}(t) \right\}. \tag{4.7}$$

To minimize the right-hand side of the above inequality over all possible control policies $\mathbf{I}(t)$ and all possible demand variables $\{A_i(t)\}$ that satisfy Equation (4.5), we perform the following algorithm.
The Algorithm: Every time slot $t$, we observe $\mathbf{Q}(t)$ and $\mathbf{S}(t)$, and do the following:

• For each $i \in \{1,\ldots,N\}$, choose $A_i(t) = x$, where $x$ solves:

Maximize: $V u_i(x) - Q_i(t)\, x$ (4.8)
Subject to: $x \le r_{max}$.

• To achieve the optimal rate found in Equation (4.8), every time slot choose $\mathbf{I}(t) = \mathbf{I}$, where $\mathbf{I}$ solves:

Maximize: $\sum_{i=1}^{N} Q_i(t)\, x^d_i(t)$ (4.9) (without the fairness constraint)

Maximize: $\sum_{i=1}^{N} (Q_i(t) - T_i(t))\, x^d_i(t) + F \sum_{i=1}^{N} T_i(t)\, x^u_i(t)$ (4.10) (with the fairness constraint)

Subject to: $\mathbf{I} \in \mathcal{I}$.

• Update $Q_i(t)$ and $T_i(t)$ for all $i \in \{1,\ldots,N\}$ according to Equations (4.4) and (4.6).

Remarks:

• The pricing scheme: Because users are greedy, if we do not have any pricing scheme they will choose an $x$ that maximizes their own utility $u_i(x)$. Therefore, they will not behave in the manner prescribed by the first part of the proposed algorithm, i.e., Equation (4.8), which tries to maximize social welfare. Now, if we charge users in accordance with the rate they ask for, i.e., $\frac{Q_i(t)}{V} x$, then users maximizing their net benefit, i.e., $u_i(x) - \frac{Q_i(t)}{V} x$, will choose exactly the $x$ that solves Equation (4.8). As a result, using the proposed pricing scheme we can motivate users to download at the optimal rates (from the viewpoint of social welfare). Further, as we shall see later, by controlling the parameter $V$ there is a tradeoff between the profit (from this pricing scheme) of the network operator and the performance of the algorithm (how closely the proposed algorithm approaches the optimal solution). Finally, notice that $\frac{1}{V}$ can be interpreted as the price (per unit rate) that nodes have to pay.

• The fairness constraint: In the scenario without the fairness constraint, as we can observe from Equation (4.9), users have no incentive to follow the second part of the proposed algorithm.
However, after introducing the fairness constraint into the system, rational users are motivated to behave in the manner described by Equation (4.10), for the following reasons. First, users will provide uploads to reduce their debits when they are in debit, i.e., maximize $x^u_i(t)$ when $T_i(t) > 0$. Further, they would like to maximize their download rate if the pending rate (what they have requested) is larger than their debit ($Q_i(t) \ge T_i(t)$), because they want to reduce $Q_i(t)$ to lower their payment. Additionally, if their debits are larger than what they have requested ($T_i(t) > Q_i(t)$), they will minimize their download rate because now they want to reduce their debits. Finally, we will show later that we can also trade off fairness for social welfare through the fairness index $F$.

Stability Analysis: First recall that for any rate vector $\mathbf{x} \in \Lambda$ there exists a stationary randomized control policy for choosing $\mathbf{I}(t)$, by observing the system states $\mathbf{S}(t)$, such that:

$$E\{\mathbf{R}(\mathbf{S}(t), \mathbf{I}(t))\} = \mathbf{x} \quad \text{and} \quad E\left\{ \frac{x^d_i(t)}{x^u_i(t)} \right\} \le F \ \text{for all } i.$$

Because the proposed algorithm minimizes the right-hand side of Equation (4.7) over all alternative policies, we have

$$\Delta(\mathbf{Q}(t)) - V E\left\{ \sum_{i=1}^{N} u_i(A_i(t)) \,\Big|\, \mathbf{Q}(t) \right\} \le B - V E\left\{ \sum_{i=1}^{N} u_i(A^*_i(t)) \,\Big|\, \mathbf{Q}(t) \right\} - \sum_{i=1}^{N} Q_i(t)\, E\left\{ \sum_{j=1}^{N} R_{ji}(\mathbf{S}(t), \mathbf{I}^*(t)) - A^*_i(t) \,\Big|\, \mathbf{Q}(t) \right\} - \sum_{i=1}^{N} T_i(t)\, E\left\{ F \sum_{j=1}^{N} R_{ij}(\mathbf{S}(t), \mathbf{I}^*(t)) - \sum_{j=1}^{N} R_{ji}(\mathbf{S}(t), \mathbf{I}^*(t)) \,\Big|\, \mathbf{Q}(t) \right\}, \tag{4.11}$$

where $A^*_i(t)$ and $\mathbf{I}^*(t)$ are any alternative rates and policy chosen subject to the same constraints. Consider the following alternative control policy for time slot $t$: choose $A^*_i(t) = r_{min}$ for all $i \in \{1,\ldots,N\}$ and choose $\mathbf{I}^*_1(t)$ such that:

$$E\{\mathbf{R}(\mathbf{S}(t), \mathbf{I}^*_1(t))\} = \mathbf{r}_{min} + \boldsymbol{\epsilon},$$

where $\boldsymbol{\epsilon} = (\epsilon, \ldots, \epsilon)$, and $\epsilon$ is the largest value such that $\mathbf{r}_{min} + \boldsymbol{\epsilon} \in \Lambda$. Plugging this alternative control policy into the right-hand side of Equation (4.11) yields:

$$\Delta(\mathbf{Q}(t)) - V E\left\{ \sum_{i=1}^{N} u_i(A_i(t)) \,\Big|\, \mathbf{Q}(t) \right\} \le B - \epsilon \sum_{i=1}^{N} Q_i(t) - V \sum_{i=1}^{N} u_i(r_{min}) - \sum_{i=1}^{N} T_i(t)\,(F-1)(r_{min} + \epsilon) \le B - \epsilon \sum_{i=1}^{N} Q_i(t).$$
Because $u_i(A_i(t)) \le u_i(r_{max})$ for all $i$ and all $t$, we have:

$$\Delta(\mathbf{Q}(t)) \le B + V \sum_{i=1}^{N} u_i(r_{max}) - \epsilon \sum_{i=1}^{N} Q_i(t).$$

By the Lyapunov drift theorem [35], we conclude that:

$$\sum_{i=1}^{N} \bar{Q}_i \le \frac{B + V \sum_{i=1}^{N} u_i(r_{max})}{\epsilon}.$$

Similarly, to show that the proposed algorithm can stabilize $T_i$, consider the following policy: choose $A^*_i(t) = r_{min}$ for all $i \in \{1,\ldots,N\}$ and choose $\mathbf{I}^*_2(t)$ such that:

$$E\{\mathbf{R}(\mathbf{S}(t), \mathbf{I}^*_2(t))\} = \mathbf{r}_{min}.$$

Along the same lines of analysis, it is easy to show that

$$\sum_{i=1}^{N} \bar{T}_i \le \frac{B + V \sum_{i=1}^{N} u_i(r_{max})}{(F-1)\, r_{min}}.$$

Thus, we conclude that the proposed algorithm can stabilize the system. Additionally, by stabilizing the system, we know that $\bar{x}^d_i \ge \bar{a}_i$ and $F \bar{x}^u_i \ge \bar{x}^d_i$ for all $i$.

Performance Analysis: Now, let's analyze how well the proposed algorithm performs. Assume that $\mathbf{x}^* = (x^{u*}_1, \ldots, x^{d*}_N)$ is the rate vector that solves the original optimization problem. Consider another alternative policy: choose $A^*_i(t) = x^{d*}_i$ for all $t$ and choose $\mathbf{I}^*_3(t)$ such that:

$$E\{\mathbf{R}(\mathbf{S}(t), \mathbf{I}^*_3(t))\} = \mathbf{x}^*.$$

Using this policy, we have:

$$\Delta(\mathbf{Q}(t)) - V E\left\{ \sum_{i=1}^{N} u_i(A_i(t)) \,\Big|\, \mathbf{Q}(t) \right\} \le B - \sum_{i=1}^{N} Q_i(t)\,[x^{d*}_i - x^{d*}_i] - V \sum_{i=1}^{N} u_i(x^{d*}_i) - \sum_{i=1}^{N} T_i(t)\,(F-1)\, x^{d*}_i \le B - V \sum_{i=1}^{N} u_i(x^{d*}_i).$$

Using the Lyapunov optimization theorem [35], we have:

$$\liminf_{t\to\infty} \sum_{i=1}^{N} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{u_i(A_i(\tau))\} \ge \sum_{i=1}^{N} u_i(x^{d*}_i) - \frac{B}{V}.$$

Using the facts that $u_i(x)$ is concave non-decreasing and $\bar{x}^d_i \ge \bar{a}_i$, along with Jensen's inequality, we have

$$\sum_{i=1}^{N} u_i(\bar{x}^d_i) \ge \sum_{i=1}^{N} u_i(\bar{a}_i) \ge \sum_{i=1}^{N} u_i(x^{d*}_i) - \frac{B}{V}. \tag{4.12}$$

Further, according to the proposed pricing scheme, the system operator's profit $P$ satisfies

$$P = \liminf_{t\to\infty} \sum_{i=1}^{N} \frac{1}{t} \sum_{\tau=0}^{t-1} E\left\{ \frac{Q_i(\tau)}{V}\, A_i(\tau) \right\} \le \sum_{i=1}^{N} \frac{\bar{Q}_i}{V}\, r_{max} \le \frac{B\, r_{max}}{\epsilon V} + \frac{r_{max} \sum_{i=1}^{N} u_i(r_{max})}{\epsilon}. \tag{4.13}$$

Equations (4.12) and (4.13) show that the proposed algorithm can achieve a performance arbitrarily close to the optimal performance under the given fairness constraint, with a tradeoff of sacrificing the operator's profit.
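Before turning to simulations, the per-slot decisions of the algorithm can be summarized in code. The sketch below is not the implementation used in the thesis; it assumes the log utility $u_i(x) = \log(x/r_{min})$ of the simulation section, for which Equation (4.8) has the closed form $x = \min(r_{max}, V/Q_i)$, and it exploits the fact that, with unbounded download capacities, the scheduling objectives (4.9) and (4.10) decompose into per-link weights, so each sender can pick its $K$ best links greedily (and, here, exactly):

```python
def demand_control(V, Q_i, r_max):
    """Eq. (4.8) for u(x) = log(x / r_min): maximize V*log(x) - Q_i*x
    over 0 < x <= r_max. The stationary point is x = V / Q_i, clipped."""
    if Q_i <= 0:
        return r_max  # no backlog pressure: request at the cap
    return min(r_max, V / Q_i)

def schedule_sender(i, Q, T, S, F, K, with_fairness=True):
    """Eq. (4.9)/(4.10) for sender i. Link (i, j) contributes
    Q_j * S_ij to (4.9), and ((Q_j - T_j) + F*T_i) * S_ij to (4.10);
    keep the K largest positive link weights."""
    weights = []
    for j in range(len(Q)):
        if j == i or S[i][j] <= 0:
            continue
        if with_fairness:
            w = ((Q[j] - T[j]) + F * T[i]) * S[i][j]
        else:
            w = Q[j] * S[i][j]
        if w > 0:
            weights.append((w, j))
    weights.sort(reverse=True)
    return [j for _, j in weights[:K]]  # receivers i uploads to this slot

def update_queues(Q, T, A, x_down, x_up, F):
    """Eqs. (4.4) and (4.6): per-slot virtual queue updates (in place)."""
    for i in range(len(Q)):
        Q[i] = max(Q[i] - x_down[i], 0.0) + A[i]
        T[i] = max(T[i] - F * x_up[i], 0.0) + x_down[i]
```

For instance, with $V = 100$ and backlog $Q_i = 2$ the demand controller requests $V/Q_i = 50$ packets per slot: a large backlog (a high price $Q_i(t)/V$) throttles demand, which is exactly the pricing mechanism discussed above.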
4.3.2 Simulation Results

In this section we present simulation results for a system with 10 nodes: 6 of them are Low-Bandwidth (L-BW) nodes and 4 of them are High-Bandwidth (H-BW) nodes. We let $r_{max} = 1000$, $r_{min} = 100$, and $K = 1$. To keep the exposition simple, we assume that $S_{ij}(t)$ has two states for all $i, j$. (Notice that, to the best of our knowledge, there are no real measurements to guide us in our assumptions about $S_{ij}(t)$.) In particular, we assume that

$$S_{ij}(t) = \begin{cases} C^i_{up} & \text{with probability } 0.5, \\ 0.4\, C^i_{up} & \text{with probability } 0.5, \end{cases} \qquad \text{where} \qquad C^i_{up} = \begin{cases} 1000 & \text{if node } i \text{ is a H-BW node}, \\ 100 & \text{otherwise.} \end{cases}$$

[Figure 4.4: (i) Illustration of the user's utility function $u_i(x)$, and (ii) the tradeoff between social welfare and the operator's profit for Scenario 1; panel (ii) plots welfare and profit versus $1/V$, with the traditional system's welfare as a baseline.]

We consider three different scenarios. In Scenario 1, we assume that all nodes have the same utility function and we do not impose the fairness constraint. In particular, we use $u_i(x) = \log(x / r_{min})$ for all $i$. In Scenario 2, the utility functions are kept the same but we incorporate the fairness constraint. In Scenario 3, in addition to the fairness constraint, we consider a scenario where nodes have different utility functions. In particular, the utility function of H-BW nodes is three times that of the L-BW nodes.

4.3.2.1 Scenario 1

Figure 4.4(ii) shows how social welfare and the operator's profit change with respect to the control parameter $V$ for Scenario 1. First, notice that the tradeoff between maximizing social welfare and maximizing the operator's profit can be easily observed from the plot. Further, recall from Section 4.3.1 that the physical meaning of $\frac{1}{V}$ (the X-axis) is the price (per unit rate) that nodes have to pay.
Therefore, when $\frac{1}{V}$ (the price) decreases, the user download rates (demand) increase and thus social welfare increases. However, the operator's profit decreases as the price goes down.

Before proceeding, notice that, as we can observe from the plot, the optimal performance of the proposed algorithm is 30% better than the traditional system. The performance enhancement comes from the fact that the proposed algorithm can increase nodes' receiving rates and hence utilities. In particular, the algorithm can increase nodes' receiving rates by observing the system states, $\mathbf{S}(t)$, and then sending data to neighbors with high demand, i.e., sending data over links with large $S_{ij}$, whereas in the traditional system nodes randomly choose neighbors to provide uploads. (In the rest of this section we refer to the standard system, in which nodes randomly choose the neighbors to which they provide uploads, as the "traditional" system.)

[Figure 4.5: (i) Tradeoff between social welfare and fairness for Scenario 2, and (ii) tradeoff between social welfare and fairness for Scenario 3; each panel plots welfare and the D/U (download/upload) ratios of L-BW and H-BW nodes versus $F$, with the traditional system's welfare as a baseline.]

4.3.2.2 Scenario 2

Figure 4.5(i) shows the results for Scenario 2. First, notice that the X-axis is now the fairness index $F$ and that the tradeoff is between social welfare and fairness. Further, we can observe from the plot that, if we do not consider the fairness constraint, i.e., when $F$ is large, we can maximize social welfare. However, when we enforce the fairness constraint, i.e., when $F$ is small, we sacrifice social welfare for fairness: now L-BW users cannot download too much from the system (because they do not contribute much) and thus their utilities decrease.
Although the utility of H-BW users increases as their download rates increase, the increase is smaller than the decrease of the L-BW users (recall that the utility function is concave). As a result, social welfare decreases as $F$ decreases. Finally, we can observe that the proposed algorithm outperforms the traditional system at most operating points.

4.3.2.3 Scenario 3

Similar observations can be made in Figure 4.5(ii), where we present the results for Scenario 3. First, we can observe from the plot that the proposed algorithm outperforms the traditional system for all values of $F$. This is because the traditional system does not differentiate nodes with high utility functions, i.e., H-BW nodes, from other nodes. Therefore, all nodes are treated the same and thus they receive the same download rate. In contrast, the proposed algorithm takes users' utility functions into consideration and thus gives H-BW nodes a higher download rate than L-BW users, which is a natural consequence of maximizing social welfare. Further, notice that the tradeoff between fairness and social welfare can also be observed in Figure 4.5(ii). Additionally, compared with Scenario 2 (Figure 4.5(i)), we can observe that the fairness constraint has less impact in Scenario 3. This can be explained as follows: because H-BW nodes have a high utility function in Scenario 3, as mentioned earlier, the proposed algorithm will give H-BW nodes a high download rate in order to maximize social welfare. This implies another form of fairness. (Intuitively, nodes which provide high upload rates will have high utility functions and they should receive high download rates.) As a result, the fairness constraint does not have a significant impact in Scenario 3.

[Figure 4.6: The tradeoff between social welfare and the decision making period (welfare versus decision period for the proposed algorithm and the traditional system).]
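For reproducibility, the two-state link process assumed in the simulation setup above is easy to generate. A possible sketch (our naming, not code from the thesis):

```python
import random

def sample_links(n_nodes, hbw, rng):
    """Draw one slot of the two-state link process assumed in the
    simulations: S_ij(t) = C_up^i w.p. 0.5 and 0.4*C_up^i w.p. 0.5,
    with C_up^i = 1000 for H-BW nodes and 100 otherwise.
    `hbw` is the set of H-BW node ids; self-links stay zero."""
    S = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        c_up = 1000.0 if i in hbw else 100.0
        for j in range(n_nodes):
            if j != i:
                S[i][j] = c_up if rng.random() < 0.5 else 0.4 * c_up
    return S
```

With the parameters of this section one would call, e.g., `sample_links(10, set(range(6, 10)), random.Random(0))` once per timeslot, so that L-BW rows take values in {100, 40} and H-BW rows in {1000, 400}.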
4.3.3 Implementing the algorithm

In this section we address implementation concerns for the proposed algorithm. First notice that a central server is required to monitor the system, measure the statistics, and make decisions. Further, recall that the proposed algorithm consists of two parts: (i) demand control, i.e., Equation (4.8), and (ii) the scheduling decision, i.e., Equation (4.10). In Section 4.3.2 we solved Equations (4.8) and (4.10) every timeslot and showed that the proposed algorithm performs very well. Given that a timeslot in P2P streaming systems is on the order of seconds, it is easy to see that nodes can easily solve Equation (4.8) locally every timeslot. However, taking into consideration the communication and computation overhead, along with the fact that the number of users in a P2P streaming system is on the order of thousands, it is impractical to solve Equation (4.10) at the central server at every timeslot. For example, it takes a machine with an Intel Pentium IV 3GHz CPU about one minute to solve Equation (4.10) for a system with 1000 nodes. Considering that a central server could have much more computational power and use a better/faster solver than the one we used, it is reasonable to assume that the server can easily solve Equation (4.10) and then make decisions every, say, 10 timeslots. Motivated by this discussion, we are interested in how the proposed algorithm performs with respect to the decision period, i.e., the time between two consecutive decisions.

We simulate Scenario 1 described in Section 4.3.2 for different decision periods. The results are shown in Figure 4.6. First, we observe from the plot that the performance of the proposed algorithm decreases as the decision period increases, as expected. Indeed, since the proposed algorithm computes the optimal scheduling policy at the first timeslot of a decision period, this policy may not be optimal for the following timeslots of the decision period.
Finally, notice that even if the proposed algorithm makes decisions only every, say, 10 timeslots, it still performs 20% better than the traditional system.

Chapter 5

Conclusions

In this thesis we study performance and incentive schemes for P2P systems. We analyze the performance of different P2P systems, propose a general token-based incentive algorithm, and study how different incentive schemes affect the performance of P2P systems. We first study the performance of Gnutella-like systems, and then we use a general idea, namely that users use tokens to trade resources, to motivate cooperation in P2P systems. We also derive a mathematical model that describes the system's dynamics and which can be used for parameter tuning and performance prediction. We demonstrate the effectiveness of the algorithm via experiments with TCP networks.

Further, we have proposed a mathematical model to study the performance of heterogeneous BitTorrent-like systems. In particular, we have presented a model that can be used to predict the average file download delay among users with different capacities. Additionally, we use a revised fluid model to capture the time dynamics (transient behavior) of BitTorrent-like systems in heterogeneous environments. We have also proposed an alternative TFT scheme, based on the proposed token algorithm, that can be used to trade off fairness against system performance. We have extended our mathematical models in order to predict the system performance under the proposed scheme and to tune the scheme's parameters. All results have been verified using extensive simulations.

Based on our BitTorrent framework, we first analyze the performance of data-driven P2P streaming systems with the MDC scheme. We show that the proposed model can accurately predict the performance of such systems. Additionally, we propose a token-based MDC scheme to provide incentives for P2P streaming systems. The proposed scheme is flexible and general.
We further extend our model to predict the performance of the token-based MDC systems. We also show how our model can be used to decide on the scheme parameters that achieve a target tradeoff between overall system performance and fairness.

Finally, we present a joint demand control and scheduling algorithm which, by observing the system states, makes decisions to optimize social welfare under a given fairness constraint for P2P streaming systems. We first show that the proposed algorithm can outperform the traditional system, in which nodes choose randomly the neighbors to which they upload data. More interestingly, we show that the proposed algorithm can trade off social welfare against the operator's profit by choosing a proper control parameter. Further, we study how the fairness constraint affects system performance, and we also show that we can trade off system performance against fairness by choosing a proper fairness index.

Reference List

[1] E. Adar and B. Huberman. Free riding on Gnutella. http://www.firstmonday.dk/issues/issue5_10/adar.

[2] Shahzad Ali, Anket Mathur, and Hui Zhang. Measurement of commercial peer-to-peer live video streaming. In Proc. of Workshop in Recent Advances in Peer-to-Peer Streaming, August 2006.

[3] K. Anagnostakis and M. Greenwald. Exchange-based incentive mechanisms for peer-to-peer file sharing. In Proc. of 24th International Conference on Distributed Computing Systems, 2004.

[4] R. Bhagwan, S. Savage, and G. M. Voelker. Understanding availability. In Proc. of 2nd IPTPS, 2003.

[5] A. Bharambe, C. Herley, and V. Padmanabhan. Microsoft Research simulator for the BitTorrent protocol. http://www.research.microsoft.com/projects/btsim.

[6] A. Bharambe, C. Herley, and V. Padmanabhan. Analyzing and improving BitTorrent performance. In Proc. of IEEE INFOCOM, 2006.

[7] BigChampagne. http://www.bigchampagne.com/.

[8] BitTorrent. http://www.bittorrent.com/protocol.html.

[9] H. Bretzke and J. Vassileva.
Motivating cooperation in peer to peer networks. In Proc. of User Modeling UM03, June 2003.

[10] C. Buragohain, D. Agrawal, and S. Suri. A game-theoretic framework for incentives in P2P systems. In Proc. of International Conference on Peer-to-Peer Computing, Sep 2003.

[11] K. Calvert, M. Doar, and E. W. Zegura. Modeling internet topology. IEEE Communications Magazine, 1997.

[12] M. Castro, P. Druschel, A. M. Kermarrec, A. Nandi, A. Rowstron, and A. Singh. SplitStream: High-bandwidth multicast in cooperative environments. In Proc. of ACM SOSP, October 2003.

[13] J. Chu, K. Labonte, and B. N. Levine. Availability and locality measurements of peer-to-peer file sharing systems. In Proc. of SPIE ITCom: Scalability and Traffic Control in IP Networks, July 2002.

[14] Y. H. Chu, S. G. Rao, and H. Zhang. A case for end system multicast. In Proc. of ACM SIGMETRICS, June 2000.

[15] F. Clevenot, P. Nain, and K.W. Ross. Multiclass P2P networks: Static resource allocation for service differentiation and bandwidth diversity. In Proc. of IFIP WG 7.3 PERFORMANCE, 2005.

[16] B. Cohen. Incentives build robustness in BitTorrent. In Workshop on Economics of Peer-to-Peer Systems, 2003.

[17] R. Durrett. Probability: Theory and Examples. Duxbury Press, 2nd edition, 1996.

[18] B. Fan, D.-M. Chiu, and J. Lui. Stochastic differential equation approach to model BitTorrent-like file sharing systems. In Proc. of 14th IEEE International Workshop on Quality of Service, 2006.

[19] M. Feldman, C. Papadimitriou, I. Stoica, and J. Chuang. Free-riding and whitewashing in Peer-to-Peer systems. In SIGCOMM Workshop, 2004.

[20] R. Gaeta, M. Gribaudo, D. Manini, and M. Sereno. Analysis of resource transfers in peer-to-peer file sharing applications using fluid models. In Performance Evaluation, volume 63, pages 149–174. Elsevier, 2006.

[21] L. Guo, S. Chen, Z. Xiao, E. Tan, X. Ding, and X. Zhang. Measurements, analysis, and modeling of BitTorrent-like systems. In Proc.
of Internet Measurement Conference, 2005.

[22] L. Guo, S. Chen, Z. Xiao, E. Tan, X. Ding, and X. Zhang. A performance study of BitTorrent-like peer-to-peer systems. In IEEE Journal on Selected Areas in Communications, volume 25, January 2007.

[23] Hack KaZaA participation level. http://www.davesplanet.net/kazaa/.

[24] D. Hales and S. Patarin. How to cheat BitTorrent and why nobody does. Technical Report UBLCS-2005-12, University of Bologna, Italy, 2005.

[25] Qi He and Mostafa Ammar. Congestion control and message loss in Gnutella networks. In Proc. of Multimedia Computing and Networking, 2004.

[26] Mohamed M. Hefeeda, Ahsan Habib, and Bharat K. Bhargava. Cost-profit analysis of a peer-to-peer media streaming architecture. Technical report, Purdue University, 2003.

[27] X. Hei, C. Liang, J. Liang, Y. Liu, and K.W. Ross. A measurement study of a large-scale P2P IPTV system. IEEE Transactions on Multimedia, 9(8), December 2007.

[28] D. Hughes, G. Coulson, and J. Walkerdine. Free riding on Gnutella revisited: The bell tolls? In IEEE Distributed Systems Online Journal, volume 6, June 2005.

[29] J. Ioannidis, S. Ioannidis, A. Keromytis, and V. Prevelakis. FileTeller: Paying and getting paid for file storage. In Proc. of 6th International Conference on Financial Cryptography, March 2002.

[30] M. Izal, G.U. Keller, E.W. Biersack, P.A. Felber, A.A. Hamra, and L.G. Erice. Dissecting BitTorrent: Five months in a torrent's lifetime. In Proc. of Passive and Active Measurements, 2004.

[31] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina. EigenRep: Reputation management in P2P networks. In Proc. of 12th International World Wide Web Conference (WWW 2003), May 2003.

[32] KaZaA media desktop. http://www.kazaa.com/.

[33] KaZaA participation level. http://www.kazaa.com/us/help/glossary/participation ratio.htm.

[34] Rakesh Kumar, Yong Liu, and Keith Ross. Stochastic fluid theory for P2P streaming systems. In Proc. of IEEE INFOCOM, May 2007.

[35] L. Georgiadis, M. J. Neely, and L. Tassiulas. Resource allocation and cross-layer control in wireless networks. Foundations and Trends in Networking, 1(1), 2006.

[36] Nikitas Liogkas, Robert Nelson, Eddie Kohler, and Lixia Zhang. Exploiting BitTorrent for fun (but not profit). In Proc. of 6th IPTPS, 2006.

[37] Jiangchuan Liu, Sanjay G. Rao, Bo Li, and Hui Zhang. Opportunities and challenges of peer-to-peer internet video broadcast. In Proc. of the IEEE, volume 96, January 2008.

[38] Z. Liu, Y. Shen, S. Panwar, K.W. Ross, and Y. Wang. Using layered video to provide incentives in P2P live streaming. In Proc. of Sigcomm P2P-TV Workshop, August 2007.

[39] Z. Liu, Y. Shen, S. Panwar, K.W. Ross, and Y. Wang. P2P video live streaming with MDC: Providing incentives for redistribution. In Proc. of ICME, July 2007.

[40] Nazanin Magharei and Reza Rejaie. PRIME: Peer-to-peer receiver-driven mesh-based streaming. In Proc. of IEEE INFOCOM, May 2007.

[41] Nazanin Magharei, Reza Rejaie, and Yang Guo. Mesh or multiple-tree: A comparative study of live P2P streaming approaches. In Proc. of IEEE INFOCOM, May 2007.

[42] Mojonation. http://www.mojonation.net/Mojonation.html.

[43] A. Nazareno, B. Francisco, C. Walfredo, and M. Miranda. Discouraging free riding in a peer-to-peer CPU-sharing grid. In Proc. of 13th IEEE Int. Symposium on High Performance Distributed Computing (HPDC'04), 2004.

[44] M. J. Neely. Dynamic power allocation and routing for satellite and wireless networks with time varying channels. Ph.D. dissertation, Massachusetts Institute of Technology, LIDS, 2003.

[45] M. J. Neely. Energy optimal control for time varying wireless networks. IEEE Transactions on Information Theory, 52(7), July 2006.

[46] M. J. Neely. Dynamic data compression for wireless transmission over a fading channel. In Information Sciences and Systems (CISS), March 2008.

[47] M. J. Neely, E. Modiano, and C. E. Rohrs. Dynamic power allocation and routing for time varying wireless networks.
IEEE Journal on Selected Areas in Communications, Special Issue on Wireless Ad-Hoc Networks, 23(1), Jan 2005.

[48] Network simulator. http://www.isi.edu/nsnam/ns.

[49] Packet-level Peer-to-Peer Simulation Framework and GnutellaSim. http://www.cc.gatech.edu/computing/compass/gnutella/.

[50] V. Pai, K. Tamilmani, V. Sambamurthy, and K. Kumar. Chainsaw: Eliminating trees from overlay multicast. In Proc. of IPTPS, February 2005.

[51] A. Parker. The true picture of peer-to-peer filesharing. http://www.cachelogic.com/.

[52] F.L. Piccolo and G. Neglia. The effect of heterogeneous link capacities in BitTorrent-like file sharing systems. In Proc. of Hot-P2P, 2004.

[53] J.A. Pouwelse, P. Garbacki, D.H.J. Epema, and H.J. Sips. The BitTorrent P2P file-sharing system: Measurements and analysis. In Proc. of 5th IPTPS, 2005.

[54] PPLive. http://www.pplive.com/.

[55] PPStream. http://www.ppstream.com/.

[56] D. Qiu and R. Srikant. Modeling and performance analysis of BitTorrent-like peer-to-peer networks. In Proc. of ACM SIGCOMM, 2004.

[57] L. Ramaswamy and L. Liu. Free-riding: A new challenge to peer-to-peer file sharing systems. In Proc. of the 36th Hawaii International Conference on System Sciences, 2003.

[58] S. Saroiu, K. P. Gummadi, R. J. Dunn, S.D. Gribble, and H. M. Levy. An analysis of internet content delivery systems. In Proc. of the Fifth Symposium on Operating System Design and Implementation (OSDI), 2002.

[59] S. Saroiu, K.P. Gummadi, R.J. Dunn, and S.D. Gribble. A measurement study of peer-to-peer file sharing systems. In Proc. of Multimedia Computing and Networking 2002 (MMCN '02), 2002.

[60] Purvi Shah and Jehan-Francois Paris. Peer-to-peer multimedia streaming using BitTorrent. In Proc. of IEEE IPCCC, April 2007.

[61] Michael Sirivianos, Jong Han Park, Rex Chen, and Xiaowei Yang. Free-riding in BitTorrent networks with the large view exploit. In Proc. of IPTPS, 2007.

[62] SopCast. http://www.sopcast.com/.

[63] The eMule project.
http://www.emule-project.net/.

[64] R.W. Thommes and M.J. Coates. BitTorrent fairness: Analysis and improvements. In Proc. Workshop Internet, Telecom. and Signal Proc., December 2005.

[65] Y. Tian, D. Wu, and K.W. Ng. Modeling, analysis and improvement for BitTorrent-like file sharing networks. In Proc. of IEEE INFOCOM, 2006.

[66] V. Vishnumurthy, S. Chandrakumar, and E. G. Sirer. KARMA: A secure economic framework for P2P resource sharing. In 1st Workshop on Economics of Peer-to-Peer Systems, June 2003.

[67] J. Wang, C. Yeo, V. Prabhakaran, and K. Ramchandran. On the role of helpers in peer-to-peer file download systems: Design, analysis, and simulation. In Proc. of IPTPS, 2007.

[68] Wolfram Mathematica. http://www.wolfram.com/.

[69] X. Yang and G. D. Veciana. Service capacity of peer to peer networks. In Proc. of IEEE INFOCOM, 2004.

[70] Meng Zhang, Chunxiao Chen, Yongqiang Xiong, Qian Zhang, and Shiqiang Yang. Measurement of commercial peer-to-peer live video streaming. In Proc. of ACM Multimedia Modeling, January 2007.

[71] Meng Zhang, Qian Zhang, and Shiqiang Yang. Understanding the power of pull-based streaming protocol: Can we do better? IEEE Journal on Selected Areas in Communications, 25(8), December 2007.

[72] Xinyan Zhang, Jiangchuan Liu, Bo Li, and Tak-Shing Peter Yum. CoolStreaming/DONet: A data-driven overlay network for peer-to-peer live media streaming. In Proc. of IEEE INFOCOM, March 2005.

[73] Y. Zhou, D. Chiu, and J. Lui. A simple model for analysis and design of P2P streaming algorithms. In Proc. of ICNP, October 2007.
Asset Metadata
Creator: Liao, Wei-Cherng (author)
Core Title: Performance and incentive schemes for peer-to-peer systems
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Publication Date: 02/25/2009
Defense Date: 12/02/2008
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: Fairness, incentive, OAI-PMH Harvest, P2P systems, performance analysis, token scheme
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Psounis, Konstantinos (committee chair), Golubchik, Leana (committee member), Govindan, Ramesh (committee member)
Creator Email: wcliao@gmail.com, weicherl@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-m1986
Unique Identifier: UC1304752
Identifier: etd-Liao-2582 (filename), usctheses-m40 (legacy collection record id), usctheses-c127-148143 (legacy record id), usctheses-m1986 (legacy record id)
Legacy Identifier: etd-Liao-2582.pdf
Dmrecord: 148143
Document Type: Dissertation
Rights: Liao, Wei-Cherng
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name: Libraries, University of Southern California
Repository Location: Los Angeles, California
Repository Email: cisadmin@lib.usc.edu