USC Computer Science Technical Reports, no. 904 (2009)
On the Tradeoff Between Playback Delay and Buffer Space in Streaming

Alix L.H. Chow ∗, Leana Golubchik ∗†, Samir Khuller ‡, Yuan Yao †

∗ CS Department, USC, Los Angeles, CA 90089, {lhchow,leana}@usc.edu
† EE Department, USC, Los Angeles, CA 90089, yuanyao@usc.edu
‡ CS Department, University of Maryland, College Park, MD 20742, Samir@cs.umd.edu

Abstract

We consider the following basic question: a source node wishes to stream an ordered sequence of packets to a collection of receivers, which are in K clusters. A node may send a packet to another node in its own cluster in one time step and to a node in a different cluster in $T_c$ time steps ($T_c > 1$). Each cluster has two special nodes. We assume that the source and the special nodes in each cluster have a higher capacity and thus can send multiple packets at each step, while all other nodes can both send and receive a packet at each step. We construct two (intra-cluster) data communication schemes, one based on multi-trees (using a collection of d-ary interior-disjoint trees) and the other based on hypercubes. The multi-tree scheme sustains streaming within a cluster with $O(d \log N)$ maximum playback delay and $O(d \log N)$ size buffers, while communicating with $O(d)$ neighbors, where N is the maximum size of any cluster. We also show that this protocol is optimal when d = 2 or 3. The hypercube scheme sustains streaming within a cluster with $O(\log^2(N/d))$ maximum playback delay and $O(1)$ size buffers, while communicating with $O(\log(N/d))$ neighbors, for arbitrary N.

1 Introduction

Continuous media (CM) streaming over a variety of networks is an application which provides a rich source of interesting algorithmic problems.
The specific problem we consider here is that of a source node streaming data to a collection of receiving nodes, where the receivers need to contribute to the delivery process, i.e., due to communication (bandwidth) resource limitations it is not possible for all nodes to receive the stream directly from the source. This is a standard motivation for the use of peer-to-peer (P2P) systems or systems that involve application layer multicast. The streamed data can correspond to either a live source (i.e., the data is produced during the delivery process), or to a pre-recorded stream (i.e., all data is available at the beginning of the delivery process).

In a real environment, we may have nodes that are in different geographical locations and thus communication delays may be significantly different. Assume that the nodes are divided into several clusters (e.g., based on geographic proximity). Each node can transmit one packet to any node within the same cluster in one time step. However, although packets could be sent from one node to any node in another cluster, the transmission delay across clusters is large (this is similar to the model in [9]). As a result, when streaming packets through the network, it is desirable to have a small number of transmissions across clusters. To be more specific, assume that there are K clusters, each containing a sufficiently large number of nodes. Let the delay to send a message between nodes in two different clusters be $T_c$. However, within a cluster we can send a message to another node in the same cluster in one time step. Our packet distribution scheme will distribute the stream of packets using a “super-tree”, τ, on the clusters, constructed by selecting a special node from each cluster. Within each cluster a special (local) root node will be responsible for distributing the stream of packets to the members of that cluster. Details of this are given in Section 2.
For abstraction purposes, we use the following communication model for a cluster (with more details given in Section 2). We view the cluster (logically) as a fully connected graph. That is, any node i can send/receive packets to/from any other node j in the cluster. In a single time slot (as defined in Section 2), each node i can transmit one packet and receive one packet; a number of works use this model [6, 1, 10, 8] as a communication abstraction. The packets can arrive at a node in any order, but they must be played back in a specific order (and at a specific rate), corresponding to the original recording rate of the stream. The subset of the (fully connected) graph edges used for packet delivery forms a mesh. The system’s performance is a function of the algorithms used to construct and maintain this mesh as well as the algorithms used for scheduling packet delivery over the mesh. Hence, our paper focuses on specific approaches to constructing such meshes within a single cluster.

Our primary goal is to develop an understanding of two related quantities: playback delay and buffering requirements. As an example, one could simply chain the receiving nodes of a cluster (of size N) in a list, where the source streams packets to the first node in the list. Each node then simply forwards the packets to the next node in the list, and so on. While the buffering requirements are minimal in this approach, the playback delay is unacceptable for all but a few nodes, particularly since the cluster could be large. Another simple approach might be to arrange the nodes, for instance, in a binary tree, with the source being the root of that tree. Each node would then need to forward the packets to its two children. While this results in constant buffering requirements and $O(\log N)$ delay, it also requires each node to have at least twice as much upload bandwidth as the bandwidth needed for downloading (i.e., streaming).
(Note also that approximately half the nodes, i.e., the leaves of the tree, are not contributing to the streaming process; hence the need for other nodes to make up for the lack of upload capacity in the system.) This is not a reasonable requirement, as typically a node’s upload bandwidth is significantly lower than its download bandwidth. Hence, better approaches to mesh construction are needed, which result in acceptable playback delay and buffering characteristics, while utilizing system resources efficiently.

To this end, in this paper we explore two approaches to mesh construction and subsequent streaming, one based on multi-trees and the other based on hypercubes and a generalization of [5] (which was designed for message broadcast). We use these approaches to explore the resulting playback delay, buffer space, and communication requirements (footnote 1).

Specifically, we first adapt the scheme in [5] to streaming (as discussed in Section 3) with $O(1)$ buffer space requirements and $O(\log N)$ playback delay. However, the more direct adaptation is done in the context of certain values of N (the number of nodes in a cluster). In doing so, we observe that particular care must be taken in such an adaptation in order to limit the number of neighbors with which a node needs to communicate. Our motivation for limiting the number of neighbors with which a node communicates is that such communication requires protocol maintenance overhead, e.g., due to “keep alive” messages, due to nodes joining and departing (under node churn), and so on. Thus, we extend this scheme further, such that (a) it works for arbitrary values of N while (b) limiting the number of neighbors with which each node needs to communicate to $O(\log N)$.

In the case of multi-tree based schemes, we perform streaming on a collection of d interior-disjoint d-ary trees, where we show the playback delay to be at most $d \log_d N$ for all receivers.
By interior-disjoint we mean that each of the receivers does not appear as an interior node in more than one of the d trees. Unlike the hypercube-based scheme described above, the multi-tree-based schemes only require each node to communicate with at most 2d nodes in its cluster (footnote 2).

Our approach has provable quality-of-service (QoS) guarantees, and we provide analysis of corresponding performance characteristics. Specifically, in this work we focus on playback delay and buffer size requirements, as our metrics of QoS, and we study them as a function of d and N.

Footnote 1: By communication requirements we mean the number of neighbors with which a node needs to communicate in a particular scheme, as detailed later.
Footnote 2: As shown later, smaller values of d are more desirable for playback delay and buffer space requirements, and thus should result in a smaller number of neighbors with which a node needs to communicate.

Note that in this paper we focus on a more “structured” approach to mesh construction, i.e., the set of edges used for delivery of packets is fixed by our algorithms. An alternate approach might be to use an “unstructured” approach, i.e., the edges used for delivery are determined on a per packet basis (essentially on the fly when the data is needed). This allows the system to more easily adapt to node churn. However, existing unstructured approaches to streaming are essentially “best effort”, and little exists in the way of formal analysis of resulting QoS guarantees. A number of very nice works have looked at formal analysis of unstructured approaches to file downloads, e.g., as in [6, 1]. Here the file is decomposed into k chunks, and then distributed using a randomized gossip mechanism. However, since we are streaming a very large (potentially infinite) number of packets, the arrival order of packets is important, otherwise they all have to be buffered.
In addition, the chunks need to be all available initially, which is not the case for live streaming. Techniques with provable QoS guarantees, such as ours, may be more suitable for scenarios where QoS is of importance. Independently, [12] focuses on characterizing limits of peer-assisted live streaming (for “structured” and “unstructured” systems) and gives performance bounds (on minimum source capacity and tree depth, and maximum streaming rate) using a fluid flow (rather than packet) model and under different assumptions from ours (e.g., they assume a potentially unlimited source capacity, they do not constrain trees to be interior-disjoint, etc.).

The contributions of this work are as follows:
• We provide algorithms for constructing multiple streaming trees and a corresponding transmission schedule so as to maintain QoS characteristics (see Section 2). We are also able to extend our schemes to dynamic scenarios while maintaining our nice tree properties. Due to lack of space, these results are given in the appendix.
• We analyze the QoS of the multi-tree-based schemes and derive an upper bound on the delay required at each node before it can start playback, and use it to bound the required buffer size; we also prove a lower bound on the average playback delay (see Section 2.3). (Footnote 3: We also evaluated our schemes through simulations; the results are omitted here due to lack of space.)
• Interestingly, our work establishes that it is only useful to consider degree 2 and 3 trees, if one wants to minimize worst-case delay (see Section 2.3).
• We present algorithms for extending the scheme in [5] such that it works for streaming and for arbitrary values of N with a provable limit on the number of neighbors with which a node needs to communicate (see Section 3); we refer to this as a hypercube-based scheme.
• We analyze the hypercube-based schemes to give upper bounds on worst case and average case playback delay (see Section 3.2).
• Our main schemes are presented under the model of each cluster being a fully connected graph, i.e., where we assume that all nodes in a cluster are capable of communicating with each other. Deciding whether two interior-disjoint trees exist is an NP-complete problem for arbitrary graphs. Due to lack of space, the NP-completeness proof is given in the appendix.

Before presenting our algorithms, we give a brief overview of related literature. The original end-system multicast approach [3] explores the use of an application level multicast tree for streaming; however, it suffers from several shortcomings, including: (i) leaf nodes contribute no resources (and hence some system resources are essentially wasted), (ii) less resilience to node failures, (iii) less flexibility in use of bandwidth (e.g., bandwidth is used in units of needed streaming rate), and (iv) internal nodes require significantly higher upload bandwidth (than the streaming rate of a stream) in order to keep a multicast tree shallow (which is desirable as a single deep tree results in long startup delays and large buffer size requirements). Existing literature explores two directions in trying to overcome these shortcomings.

One direction (as in our work) is the use of multiple trees, e.g., as in [2]. However, most such works focus on providing different quality of service to users with different capabilities as well as on adapting to failures and bandwidth heterogeneity, e.g., through the use of Multiple Description Coding (MDC). Our techniques can be combined with MDC as well, but we do not rely on its use. Other works, e.g., [14], address system dynamics at the cost of not maintaining nice tree properties (e.g., balance), whereas our schemes do maintain such properties. One generalization of [2] examines stability properties of the overlay [4] and, unlike this work, considers QoS in a probabilistic setting.
Overall, the distinction of our effort is that we provide provable QoS guarantees (such as startup delay).

Another direction is to abandon the use of trees in favor of unstructured peer-to-peer (P2P) systems, e.g., [15], which allows for greater flexibility in dealing with system dynamics and heterogeneity. However, to date most such solutions have been fairly heuristic in nature and (as a result) difficult to analyze, from the perspective of performance as well as QoS guarantees. One notable exception is the analysis of BitTorrent in [1]. This work gives a very nice analysis which explains BitTorrent’s success for download applications; however, it is not clear how successfully it can be applied to streaming applications, where the data needs to be delivered in time for playback. That is, one can view a stream as a collection of small segments and apply the analysis in [1] (or in other works, e.g., [6]) to each segment. However, given the bound in [1], keeping up with real-time constraints would require an assumption of faster packet transmission than playback, an assumption we do not make here.

Lastly, a comparison study of multi-tree-based approaches and unstructured P2P schemes is given in [13]. One interesting difference, as compared to our results, is that the results in [13] suggest the use of higher degree trees, whereas our work shows optimality of lower degree trees (refer to Section 2.3). However, the main metric in [13] is bandwidth utilization under heterogeneous conditions, whereas we focus on playback delay. Overall, a high level conclusion from [13] is that unstructured aspects of the system allow for better utilization of heterogeneous resources and better adaptivity to node churn, which lead to better QoS. We note that this study (a) is done for specific multi-tree and unstructured schemes, (b) assumes the use of MDC, and (c) is simulation based, i.e., no provable QoS guarantees are given.
By contrast, the work here does derive provable QoS guarantees for a class of multi-tree based schemes, and thus is more useful in scenarios where QoS is of importance.

2 Multi-Tree Construction and Transmission

Let S be the source node of the streamed data (e.g., video and/or audio), which corresponds to a (potentially infinite) sequence of data units (also referred to as “packets”) being delivered to a number of receivers. We assume that the network provides sufficient bandwidth, so that a packet can be delivered within a time slot. This is essential, if we are to have a solution without the use of (potentially) unbounded buffers. (If a receiver cannot receive a packet in each time slot, then it must accumulate a lot of packets before playback can start. Even with such a scheme, when the stream is infinite, eventually the receiver will starve and cause the playback to temporarily stop when the buffer is empty.) This assumption is quite reasonable. For instance, if we stream MPEG-1 video, recorded at the rate of 1.5 Mbps using 1400 byte packets (which is quite common), then each packet would play for ≈ 7.5 msec. If we stream these packets over a 10 Mbps connection, then it would take ≈ 1.1 msec to transmit one packet. If the propagation delay is also significant (e.g., a packet sent across the US might experience a one-way delay on the order of 30 msec, including propagation, queueing, and processing delays), then we could think of transmitting a set of packets as one “large packet”, in order not to waste network resources. Given the above numbers, that would be on the order of 5 packets.

2.1 Construction of tree τ

We assume that there are K clusters, with each cluster having at most N nodes. As mentioned earlier, $T_c$ is the transmission time to send a packet from a node in one cluster to a node in another cluster. Within a cluster we assume that the nodes form a complete subgraph, with the transmission time being $T_i$.
For clarity of presentation (and without loss of generality), we will assume in the remainder (unless otherwise stated) that $T_i$ is 1. The source S has D (D ≥ 3) times the capacity of a receiver node. Moreover, in each cluster i there are two “super nodes”, $S_i$ and $S'_i$, where $S_i$ has the same capacity as the source, S, and $S'_i$ has capacity d times the capacity of a receiver node. Under these assumptions we construct the tree τ using the following algorithm:
Step 1: Build a tree using nodes $S_1$ through $S_K$, with S being the root of this tree. The degree of S is D; other interior nodes have degree at most D − 1. In order to keep the tree tight, at most one interior node can have degree less than D − 1, and this node must be in the next to the last layer.
Step 2: Make $S'_i$ the child of $S_i$ for all i.
Step 3: Within each cluster build degree d interior-disjoint trees with $S'_i$ as the root of the cluster.

The data is distributed (in order) from S to all the nodes $S_i$. Each node $S_i$ forwards the packet it receives from its parent on to its D children (D − 1 children in tree τ and the child $S'_i$). An example of the resulting tree τ using this construction is depicted in Figure 1. The main idea here is to use a set of “super nodes” as the “backbone” of our network. As a result, we have the following theorem:

Theorem 1 The worst case playback delay is on the order of $T_c \cdot \log_{D-1} K + T_i \cdot d(h-1)$, where h is the maximum height of the interior-disjoint trees in all clusters.

Proof: The $\log_{D-1} K$ term is due to the “backbone” tree, while the $d(h-1)$ term is given in Theorem 2. □

What remains is to determine and evaluate tree construction and transmission schemes for each cluster. Thus (unless otherwise stated), in the remainder of the paper we focus our discussion on a single cluster. For clarity of presentation we use the term “source S” (or “root S”) to mean the root $S'_i$ of cluster i.

2.2 Construction of interior disjoint trees

Recall that the packets can be delivered out of order.
However, since our data corresponds to continuous media (such as audio or video), the packets must be played in order and at the rate at which data was recorded; we assume that the playback rate is one packet per time slot. We define a time slot as the playback time of a single packet. We also assume that the receivers are homogeneous and can transmit and receive one packet per time slot to other nodes in their own cluster. Similar models are used, for instance, in [6, 1]. In practice, a node may send and receive more than one packet in a time slot, although still a bounded number (footnote 4). The schemes we propose here work with either model. However, not all schemes which work under the latter model would carry over to the former. Since our goal is to develop an understanding of playback delay and buffer requirements inherent in such schemes, we use the former model.

Ideally, we would like each receiver to receive a new packet in every time slot. However, the source S does not have the capability to stream a new packet to each receiver in each time slot. We assume that S is powerful enough to send packets to up to d receivers in each time slot, where d ≥ 1. (It is reasonable to assume that the source is a server which is slightly more powerful than the clients receiving the stream.)

Thus, the general scheme is based on receivers themselves acting as senders of packets which they have just received. One approach would be to construct a single tree, with S as the root, and deliver the data along that tree, e.g., as in [3]. However, that would waste the upload capacity of the leaf nodes; e.g., in a binary tree approximately half of the upload capacity of the system is potentially wasted. This would also require internal nodes to have significantly higher upload capacity (e.g., twice the leaves’ capacity in a binary tree), whereas most technologies actually provide higher download capacity.
Thus, a more resource-efficient approach (e.g., as in [2]) is to construct multiple transmission trees where receivers can obtain a different fraction of the data stream from each of the trees. This leads to efficient use of resources and reduction in playback start-up delay as well as buffer space requirements. If a node splits its upload bandwidth among its d children, then in fact a node may belong to up to d trees. To maximize efficiency of resource usage, we focus on schemes where each node does belong to d trees. Thus, we choose trees so that each node is an interior node in at most one tree, in which it has exactly d children (a few dummy receivers may be added to ensure this). In the remaining d − 1 trees it needs to be a leaf node.

[Figure 1. Cluster construction with source S, D = 3, d = 4: the ovals represent clusters, where thick and thin arrows represent inter- and intra-cluster transmission, respectively.]

Footnote 4: Essentially, this would correspond to a node splitting its bandwidth between multiple transmissions, each at a slower rate, with a longer time slot.

Constructing the d trees with this property is actually quite easy. What is surprising is that this can be done in a way that enables a schedule where in each time slot, each node will receive a packet from exactly one parent, with no collisions. Note that each node has d parents in the d trees. After receiving a packet j, each node then sends packet j to its d children in the next d time slots. Unlike the N receivers in the system, the source S distributes one packet in each time slot to a receiver in each of the d trees. Thus, below we construct d trees, each being a d-ary tree, where S acts as the root in each of the trees and all N receivers appear in each tree.
The data stream is then split among the d trees (e.g., the first packet might be delivered through the first tree, the second packet might be delivered through the second tree, and so on). The tree construction and transmission schemes we devise must then satisfy the following constraints: (1) each receiver node receives at most one packet in each time slot, (2) each receiver node transmits at most one packet in each time slot, (3) node S can transmit at most d packets in each time slot, and (4) after some finite amount of time, referred to as playback delay, each receiver node should be able to start playback and continue that playback without hiccups (i.e., without reaching a situation where the next packet to be played has not arrived yet).

In what follows, we refer to receiver i, 1 ≤ i ≤ N, as a receiver with node id i, in order to distinguish its “name” from the various positions it might occupy in the d trees. We number the positions in any tree in a breadth first order. Intuitively, we would also like to construct these d trees in such a manner so as to reduce the playback delay. This implies that the trees should be, in some sense, balanced and that they should be interior node disjoint, i.e., that each of the N receivers should not appear as an interior node in more than one of the d trees.

Let I be the number of interior nodes in a tree; then, $I = \lceil N/d \rceil - 1$. Let $G_0 = \{1, 2, 3, \dots, I\}$, $G_1 = \{I+1, I+2, \dots, 2I\}$, ..., $G_{d-1} = \{(d-1)I+1, (d-1)I+2, \dots, dI\}$, $G_d = \{dI+1, dI+2, \dots, N\}$. Node ids in $G_0$ to $G_{d-1}$ correspond to those nodes which will appear as interior nodes in some tree. The node ids in $G_d$ will always be leaf nodes. We refer to the j-th element of $G_i$ as $G_i^j$.

For notational convenience, we would like to assume that each internal node in each of the d trees has exactly d children. We accomplish this by adding dummy receiver nodes.
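The group definitions above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `partition_groups` is hypothetical, and the dummy count is an assumption derived from requiring that S and every interior node have exactly d children, so that a tree holds d(I + 1) receiver positions.

```python
def partition_groups(n, d):
    """Split node ids 1..n into G_0..G_{d-1} (interior candidates) and G_d (always leaves).

    Returns (I, groups, dummies), where groups[g] is the list of ids in G_g.
    """
    interior = -(-n // d) - 1  # I = ceil(N/d) - 1, computed with integer arithmetic
    groups = [list(range(g * interior + 1, (g + 1) * interior + 1)) for g in range(d)]
    groups.append(list(range(d * interior + 1, n + 1)))  # G_d = {dI+1, ..., N}
    # Assumption: padding needed so that each tree has exactly d*(I+1) receiver positions.
    dummies = d * (interior + 1) - n
    return interior, groups, dummies


if __name__ == "__main__":
    print(partition_groups(15, 3))
```

For N = 15 and d = 3 this yields I = 4 and the groups {1,2,3,4}, {5,6,7,8}, {9,10,11,12}, {13,14,15} with no dummy nodes, which agrees with the caption of Figure 3.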
In the constructions that follow, we ensure that the dummy nodes only appear as leaf nodes in our trees (i.e., we add them into $G_d$). Thus, they can simply be removed in the real system. The key point in the construction is that each node appears as the i-th child in only one tree. This is what enables a transmission schedule with no collisions (see Figures 2 and 3). We describe two slightly different construction schemes. The essential properties achieved by both schemes are identical; however, the locations of the nodes themselves can be different. We now give tree construction and transmission schemes.

[Figure 3. Example of interior disjoint tree construction using the two schemes with N = 15, d = 3, where $G_0 = \{1,2,3,4\}$, $G_1 = \{5,6,7,8\}$, $G_2 = \{9,10,11,12\}$, $G_3 = \{13,14,15\}$. Each panel shows the trees for k = 0, 1, 2, with real and dummy nodes distinguished and each link marked by the value of t mod d (0, 1, or 2) at which it carries a packet. Breadth-first node order below the root S:
(a) Structured Construction: k = 0: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; k = 1: 5 6 7 8 9 10 11 12 1 2 3 4 15 13 14; k = 2: 9 10 11 12 1 2 3 4 5 6 7 8 14 15 13.
(b) Greedy Construction: k = 0: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; k = 1: 5 6 7 8 3 1 2 9 4 11 12 10 14 15 13; k = 2: 9 10 11 12 1 2 3 4 5 6 7 8 15 13 14.]

[Figure 2. Receiving and sending schedules of node id 6, for the example in Figure 3: (a) Structured Construction, (b) Greedy Construction.]

2.2.1 Structured Disjoint Tree Construction

Let $P = d / \gcd(I, d)$. We number the d trees $T_0, T_1, \dots, T_{d-1}$. We construct these trees by filling in the nodes, in breadth first order, using the current ordering of the groups $G_i$, where the first group in the current order always corresponds to interior nodes. Specifically, let G be the concatenation (⊕) of d elements.
Step 1: Initialization: Let $G = G_0 \oplus G_1 \oplus \dots \oplus G_{d-1}$. Construct $T_0$ using $G \oplus G_d$. Let k = 0.
Step 2: Construct group sequence for next tree: Let k = k + 1.
Construct a new G by rotating the current G to the left, where $G_i$ takes the place of $G_{i-1}$ and the first element of the current G becomes the last element of the new G. If $k \bmod P \neq 0$, go to Step 4.
Step 3: Adjust groups (after P rotations): Construct new $G_i$s, 0 ≤ i ≤ d − 1, by rotating the elements of each $G_i$ to the right, i.e., $G_i^j$ takes the place of $G_i^{j+1}$ and the last element of the current $G_i$ becomes the first element of the new $G_i$.
Step 4: Construct next tree: Rotate $G_d$ to the right, i.e., $G_d^j$ takes the place of $G_d^{j+1}$ and the last element of the current $G_d$ becomes the first element of the new $G_d$. Construct $T_k$ using $G \oplus G_d$.
Step 5: Loop: If k < d − 1, go to Step 2.

Due to lack of space, we give the correctness proof for this construction in the appendix. The main idea behind the proof is to show that no node will receive more than one packet in one time slot.

2.2.2 Greedy Disjoint Tree Construction

In this scheme, for ease of exposition, we assign a parity to each receiving node, where node id i has parity $p_i = (i - 1) \bmod d$, $i \in \{1, 2, 3, \dots, N\}$. The node’s parity determines which child slot the node occupies in each of the d trees. Specifically, node i with parity $p_i$ occupies child slot $(p_i - k) \bmod d$ in tree k, where 0 ≤ k ≤ d − 1. The scheme can then be described as follows:
Step 1: Initialization: Let $G = G_0, G_1, \dots, G_{d-1}$ and construct $T_0$ using $G, G_d$. Let k = 0.
Step 2: Interior node selection for next tree: Let k = k + 1. All interior nodes of tree $T_k$ are chosen from the set $G_k$, where we fill in the nodes in a breadth first manner, for positions i, i = 1, 2, ..., I, by choosing the smallest node id j which satisfies the following conditions: (a) $j \in G_k$, (b) j has parity $(i + k - 1) \bmod d$, (c) j has not been placed in tree $T_k$ yet.
Step 3: Leaf node selection for next tree: Let $L = \{1, 2, 3, \dots, N\} \setminus G_k$ be the set of nodes that were not yet placed in tree $T_k$ in Step 2.
All leaf nodes are chosen from the set L, where we fill in the nodes in a breadth first manner, for positions i, i = I + 1, I + 2, ..., N, by choosing the smallest node id j which satisfies the following conditions: (a) $j \in L$, (b) j has parity $(i + k - 1) \bmod d$, (c) j has not been placed in tree $T_k$ yet.
Step 4: Loop: If k < d − 1, go to Step 2.

Due to lack of space, we give the correctness proof for this construction in the appendix.

2.2.3 Transmission Schedule

The data streamed from S can either be produced live (e.g., as in a broadcast of a sporting event) or it can be pre-recorded (e.g., as in the case of delivery of a movie). For ease of presentation, we first assume that we are delivering pre-recorded data, i.e., all packets are available at node S at time 0. It is a simple extension to make our schemes work for live streams, and we explain it briefly at the end of this section.

The transmission schedule can be obtained as follows. Let $k \in \{0, 1, 2, \dots, d-1\}$, and let t be a time slot, where $t = m \cdot d + r$ and 0 ≤ r < d. In time slot t, S transmits packet $(k + m \cdot d)$ to its r-th child in tree $T_k$. All other interior nodes in tree $T_k$ send one packet to their r-th child in time slot t. (The children are numbered, left to right, from 0 to d − 1. Thus, the transmission essentially proceeds in a round-robin manner.)

For example, in the multi-tree constructed in Figure 3, in time slot 0, S sends packet 0 to node id 1 in tree $T_0$, packet 1 to node 5 in tree $T_1$, and packet 2 to node 9 in tree $T_2$. Then, in time slot 1, S sends packet 0 to node 2 in tree $T_0$, packet 1 to node 6 in tree $T_1$, and packet 2 to node 10 in tree $T_2$. After receiving packet 0 from S in time slot 0 in tree $T_0$, node 1 will send packet 0 to node 5 in time slot 1, node 6 in time slot 2, and node 4 in time slot 3, etc.

In the case of live streaming, it is not possible to send packet 1 in time slot 0 because it has not been generated yet.
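The structured construction of Section 2.2.1 and the pre-recorded schedule above can be sketched together as follows. This is an illustrative reading of the steps, not the authors' implementation: all function names are hypothetical, each tree is stored as a breadth-first list of node ids below S, dummy leaves are represented by `None`, and `arrival_slot` assumes that an interior node forwards a packet to its child r in the first slot t after receipt with t mod d = r.

```python
from math import gcd


def rotate_left(seq):
    return seq[1:] + seq[:1]


def rotate_right(seq):
    return seq[-1:] + seq[:-1]


def structured_trees(n, d):
    """Return d trees; each is the breadth-first list of node ids below the root S."""
    interior = -(-n // d) - 1  # I = ceil(N/d) - 1
    groups = [list(range(g * interior + 1, (g + 1) * interior + 1)) for g in range(d)]
    leaves = list(range(d * interior + 1, n + 1))   # G_d
    leaves += [None] * (d - len(leaves))            # dummy leaf nodes (assumed padding)
    period = d // gcd(interior, d)                  # P = d / gcd(I, d)
    trees = [sum(groups, []) + leaves]              # Step 1: T_0
    for k in range(1, d):
        groups = rotate_left(groups)                # Step 2: rotate group order left
        if k % period == 0:                         # Step 3: after P rotations
            groups = [rotate_right(g) for g in groups]
        leaves = rotate_right(leaves)               # Step 4: rotate G_d right
        trees.append(sum(groups, []) + leaves)
    return trees


def source_sends(trees, t, d):
    """(tree index, packet number, receiving node) for each packet S sends in slot t."""
    m, r = divmod(t, d)
    return [(k, k + m * d, trees[k][r]) for k in range(d)]


def children(tree, node, d):
    """Children of `node` in `tree`, ordered by child slot 0..d-1 (empty for leaves)."""
    idx = tree.index(node)
    return tree[d * (idx + 1): d * (idx + 1) + d]


def arrival_slot(trees, d, packet, node):
    """Slot in which `node` receives `packet` in the pre-recorded case."""
    k, m = packet % d, packet // d
    pos = trees[k].index(node) + 1      # breadth-first position; S is position 0
    slots = []
    while pos > 0:                      # collect child slots from the node up to S
        slots.append((pos - 1) % d)
        pos = (pos - 1) // d
    t = m * d - 1                       # S starts sending this packet in slot m*d
    for c in reversed(slots):           # walk down from S to the node
        t = t + 1 + (c - (t + 1)) % d   # first slot after t that is congruent to c mod d
    return t


if __name__ == "__main__":
    trees = structured_trees(15, 3)
    for t in range(2):
        print(t, source_sends(trees, t, 3))
    print(children(trees[0], 1, 3))
```

With N = 15 and d = 3, the trees match panel (a) of Figure 3, and the slot 0 and slot 1 transmissions of S match the example in the text (nodes 1, 5, 9 and then nodes 2, 6, 10).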
One approach to address this problem would be to, in a sense, pipeline the packets and thus modify the schedule as follows. Let $r = (t + k) \bmod d$. Then, in time slot t + k, S transmits packet $(k + m \cdot d)$ to its r-th child in tree $T_k$, if $(k + m \cdot d) \le t$. All other interior nodes in tree $T_k$ send one packet to their r-th child in time slot t, as long as they have an appropriate new packet to send to that child. In this case, the transmission schedules of the different trees are not homogeneous; thus, this scheme is not easy to analyze.

Another approach to address this problem is for S to delay the streaming until it accumulates d packets. For ease of illustration we assume that the d packets were “pre-buffered” before time 0, and S can send out d packets at time 0. Thus, all nodes experience d units of additional delay as compared to the above described approach. However, we can then assume the same transmission scheduling procedure as in the case of “pre-recorded” streaming.

2.3 Delay Analysis

In this section we first give an upper bound on the overall playback delay and buffer space requirements, given the construction and transmission schemes of Section 2. We then use this bound to determine an optimal value for the degree of our trees, i.e., for d. Finally, a lower bound on the average delay is given. We do all this in the context of pre-recorded streaming applications.

Worst-case Playback Delay: We derive the upper bound on playback delay under the assumptions that: (1) a packet playback takes one time slot; (2) all d trees are complete, i.e., $d + d^2 + \dots + d^h = N$ for some integer h; here (h + 1) is the depth of our trees; and (3) S begins packet transmission in time slot 0. These assumptions simplify the analysis significantly. (We have also performed a simulation-based evaluation of the delay characteristics without the assumption of complete trees; this is omitted here due to lack of space.) We now state the following theorem.
Theorem 2 Worst-case playback delay, $T$, satisfies the following inequality: $T \le h \cdot d$, where $h = \left\lceil \log_d \left[ N\left(1 - \frac{1}{d}\right) + 1 \right] \right\rceil$ and $h+1$ is the depth of the trees.

Due to lack of space, we give the proof of Theorem 2 in the appendix. Given this worst-case delay, a buffer of size $h \cdot d \cdot$ (size of a packet) is sufficient at every node. Note that this is an upper bound on the buffer space requirements, i.e., not all nodes need that much buffer space. For instance, in the multi-tree system constructed in Figure 3, node 1 will receive packets 0, 1, and 2 in time slots 0, 2, and 1, respectively. Therefore a buffer size of 3 is sufficient for node 1.

Tree Degree Optimization: Given that we would like to minimize worst-case delay (in Theorem 2), we can state the following. Asymptotically, we can show that for large $N$, degree 3 trees are optimal. Moreover, for any $N$, either degree 2 or degree 3 trees are optimal.

Specifically, let us assume that $N$ is large. Then, a reasonable approximation for the upper bound on playback delay is $T \approx \log_d \left[ N\left(1 - \frac{1}{d}\right) \right] \cdot d$. Let $F(d) = \log_d \left[ N\left(1 - \frac{1}{d}\right) \right] \cdot d$. Then, using natural logs and taking the derivative with respect to $d$, we obtain

$$\frac{\mathrm{d}F}{\mathrm{d}d} = \frac{(\ln d - 1)\left[\ln(d-1) + \ln N\right] + \frac{d}{d-1}\ln d}{(\ln d)^2} - 1,$$

where $d$ must be an integer and $d \ge 2$. Note that, when $d = 2$,

$$\frac{\mathrm{d}F}{\mathrm{d}d} = \frac{(\ln 2 - 1)\ln N}{(\ln 2)^2} + \frac{2}{\ln 2} - 1 \approx 1.89 - 0.64 \ln N < 0,$$

which holds once $\ln N > 2.95$, i.e., for $N \ge 20$, and is therefore covered by the assumption that $N$ is large. And, when $d \ge 3$, $\ln d - 1 > 0$, so $\frac{\mathrm{d}F}{\mathrm{d}d} > 0$. Consequently, an optimal value of $d$ should always be either 2 or 3. Note that $F(2) = 2(\log_2 N - 1)$ and $F(3) = 3\left(\frac{\log_2 N}{\log_2 3} - \log_3 (3/2)\right)$. Thus, for sufficiently large $N$ (and these values do not have to be very large), degree 3 trees are optimal.

We note that numerical results depicted in Figure 4 (obtained through simulations) indicate that for small and large values of $N$, the resulting delays for degree 2 and 3 trees are quite close, and they are better than for higher degree trees. Thus, we believe that it is reasonable to use $d = 2$ in practice.
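As a quick numerical check of this discussion, the bound of Theorem 2 can be tabulated for several degrees. The sketch below is illustrative only: the function names are assumptions, and $h$ is computed with integer arithmetic (the smallest $h$ with $d^{h+1} \ge N(d-1) + d$, which is the ceiling in Theorem 2 multiplied through by $d$) to avoid floating-point rounding at exact powers of $d$.

```python
def tree_height(n: int, d: int) -> int:
    """Smallest integer h with d**h >= n*(1 - 1/d) + 1.

    Equivalent to h = ceil(log_d[n(1 - 1/d) + 1]); both sides are
    multiplied by d so that only integers are compared.
    """
    h = 0
    while d ** (h + 1) < n * (d - 1) + d:
        h += 1
    return h


def delay_bound(n: int, d: int) -> int:
    """Upper bound h*d on the worst-case playback delay (Theorem 2)."""
    return tree_height(n, d) * d


if __name__ == "__main__":
    for n in (100, 1000, 2000):
        print(n, {d: delay_bound(n, d) for d in (2, 3, 4, 5)})
```

For example, with $N = 1000$ the bound is 18 for both $d = 2$ and $d = 3$, 20 for $d = 4$, and 25 for $d = 5$, in line with the observation that degrees 2 and 3 are close and better than higher degrees.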
[Figure 4. Worst-case delay: maximum startup delay (number of time steps) versus number of nodes, for trees of degree 2, 3, 4, and 5.]

Average Playback Delay: In addition to the worst-case playback delay, average playback delay, $\frac{\sum_{i=1}^{N} a(i)}{N}$, is also an important metric for evaluating the performance of our scheme; here $a(i)$ corresponds to the playback delay of node id $i$. However, this is a difficult metric to derive analytically. Instead, Theorem 3 gives a lower bound on this metric, under the same assumptions stated above (due to lack of space, the proof is given in the appendix).

Theorem 3 The following inequality gives a lower bound on the average playback delay:

$$\frac{\sum_{i=1}^{N} a(i)}{N} \ge \frac{d^h (d+1)(h-1) - d^2 (h-2) - d(d+1)/2}{N(d-1)}.$$

3 Hypercube-based Scheme

In this section we present another approach to mesh construction for streaming purposes; we refer to this scheme as hypercube-based streaming (for reasons made clear below). For simplicity of presentation we focus on streaming within a single cluster. However, this scheme can be easily adapted to streaming over multiple clusters, using the tree $\tau$, as in the context of multi-trees. All assumptions remain the same as in the context of multi-trees (Section 2), except that, at first, we do not assume that the source $S$ has a greater transmission capability than any other node. Towards the end of the section we comment on how to adjust the results if the same assumption is made as in the case of multi-trees, i.e., if the source $S$ has $d$ times the capacity of a receiver node.

3.1 Hypercube Streaming for Special N

For ease of illustration, we first present a streaming scheme under the assumption that the number of nodes, $N$, is one less than a power of two. That is, let $N = 2^k - 1$, where $k$ is an integer.
We can construct a scheme with $O(1)$ buffer size requirement and $O(k)$ playback delay (using a generalization of [5]) by reaching a state in which $\frac{N}{2^i}$ nodes have the $i$-th packet ($i = 1 \ldots k$). In the next round, we can have all (remaining) nodes receive packet 1 and at the same time double the nodes having packets $i$, $i = 2 \ldots k$, while the source sends out a new packet $(k+1)$. Packet 1 can now be consumed, and the protocol repeats. We take a somewhat different approach than in [5, 11] as we are designing our scheme for streaming rather than for message distributions. Moreover, in [5, 11], the number of messages is limited to a finite number, say $m$; thus after $m$ time slots the source can aid in packet exchange without having to send new packets. However, in streaming (and particularly live streaming, which is potentially infinite) this is often not possible as the source always has new packets to send.

Figure 5 depicts a simple example of our scheme. The main problem with this scheme, as described above, is that it can potentially involve an arbitrary communication pattern, where a node may actually need to communicate with all other nodes. (Due to lack of space, an illustration of such a possible communication pattern is given in the appendix.)

To limit the number of neighbors with which a node communicates, we construct a specific communication pattern as follows. For simplicity let the source $S$ have node ID 0. In order to transmit packets, in each time slot we pair up the $N + 1$ nodes (i.e., including the source) and have them exchange packets as follows. Let $n = 0, 1, 2, 3, \ldots$ and $j = 0, 1, 2, \ldots, k-1$. In time slot $kn + j$, we pair up nodes with IDs $(xx \ldots x0xx \ldots x)_2$ and $(xx \ldots x1xx \ldots x)_2$, where 0 and 1 appear in the $(j+1)$-st position from the right. (Here we use “$()_2$” to indicate a node ID written in binary form.)
Then, the $N + 1$ nodes can be viewed as vertices of a $k$-dimensional hypercube (see footnote 5), where in each time slot, communication between vertices is performed along the same dimension. For instance, suppose we have 7 nodes, plus a source, with node IDs 0 to 7. Then, (a) in time slot $3n + 1$ we pair up nodes with IDs $(xx0)_2$ and $(xx1)_2$, i.e., we pair up node IDs 0, 2, 4, and 6 with node IDs 1, 3, 5, and 7, respectively; (b) in time slot $3n + 2$ we pair up node IDs $(x0x)_2$ and $(x1x)_2$, i.e., we pair up node IDs 0, 1, 4, and 5 with node IDs 2, 3, 6, and 7, respectively; (c) in time slot $3n$ we pair up node IDs $(0xx)_2$ and $(1xx)_2$, i.e., we pair up node IDs 0, 1, 2, and 3 with node IDs 4, 5, 6, and 7, respectively, and so on. (Due to lack of space, we give a corresponding depiction in the appendix.) Then, the performance of our scheme is described by Proposition 1.

Proposition 1 Given $N = 2^k - 1$, under the hypercube streaming scheme, where nodes are arranged as vertices of a $k$-dimensional hypercube, each node only communicates with $k$ other nodes and can begin playback after time slot $k + 1$. Moreover, each node is only required to store 2 packets in its buffer, i.e., this scheme has $O(1)$ buffer space requirements.

3.2 Hypercube Streaming for Arbitrary N

We now extend our hypercube-based streaming approach to arbitrary values of $N$, where the basic idea is to divide the $N$ nodes into multiple hypercubes. Let $k_1 = \lfloor \log_2 (N + 1) \rfloor$. Then, we construct a hypercube from the source $S$ and $N_1 = 2^{k_1} - 1$ nodes; we refer to this hypercube as $HC_1$. In $HC_1$, in each time slot, the node receiving data from node ID 0 (i.e., the source) has nothing to send. Thus, this spare capacity can be utilized to send packets to a node in another hypercube. Specifically, let $n = 0, 1, 2, \ldots$ and $j = 0, 1, 2, \ldots, k_1 - 1$; then in time slot $nk_1 + j$, the node with ID $2^j$ receives a new packet from $S$, while consuming another packet (previously received from another node).
Since this node has nothing to send to a node within $HC_1$ (refer to Figure 5), we let this node send the packet it just consumed to another hypercube. As a result, $HC_1$ as a whole can be viewed as a logical source for the remaining $N - N_1$ nodes, as it has the capacity to send one packet (in appropriate order) in every time slot. The one difference here is that this logical source begins sending packets in time slot $k_1 + 1$. Given the logical source $HC_1$ and $N - N_1$ nodes, we can repeat the above process until all nodes are assigned to some hypercube. The performance of the resulting scheme is described by Proposition 2.

[Figure 5, panels (a) beginning of time slot X and (b) end of time slot X.] Figure 5. (a) depicts the beginning of time slot $X$, where each parallelogram corresponds to a group of nodes which transmit the packet number indicated (by the number inside the parallelogram); the arrows indicate the direction of transmission (i.e., sender/receiver relationship between groups of nodes). The shaded parallelogram indicates an absence of a node. Note that the scheme is depicted for $N = 2^k - 1$ nodes, where one node is receiving the next packet (here, packet 8) from the source. (b) depicts how many nodes have packet $i$, $2 \le i \le 8$, at the end of time slot $X$ (each node also has packet 1, so we omit that from the figure), i.e., we have doubled the number of nodes which have packet $i$ as compared to the beginning of the time slot.

Footnote 5: We would like to thank Matt McCutchen for suggesting that we consider hypercube communication.

Proposition 2 Given an arbitrary $N$, in the hypercube-based scheme each node communicates with at most $O(\log N)$ other nodes. The corresponding worst case playback delay is at most $O(\log^2 N)$, where only two packets need to be stored in each node's buffer.

In scenarios where we stream over multiple hypercubes, nodes start playback in different time slots.
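The decomposition into hypercubes, and the different start times it produces, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the per-cube delay follows the recurrence used in the appendix proof of Theorem 4 (nodes of the first cube start after $k_1$ slots, and each later cube adds its own dimension on top of the cubes before it), not an independent derivation.

```python
def decompose(n: int) -> list[tuple[int, int]]:
    """Split n receivers into hypercubes.

    Returns a list of (k_i, size_i) with size_i = 2**k_i - 1 and
    k_i = floor(log2(remaining + 1)), repeated until no nodes remain.
    """
    cubes = []
    while n > 0:
        k = (n + 1).bit_length() - 1  # floor(log2(n + 1)) in integer arithmetic
        size = 2 ** k - 1
        cubes.append((k, size))
        n -= size
    return cubes


def average_delay(n: int) -> float:
    """Average playback start time, assuming cube i starts after k_1 + ... + k_i slots."""
    total, start = 0, 0
    for k, size in decompose(n):
        start += k            # each cube is fed by the cubes before it
        total += size * start
    return total / n


if __name__ == "__main__":
    print(decompose(10))       # cube dimensions and sizes for N = 10
    print(average_delay(10))   # average start time under the assumed delays
```

For $N = 10$ this yields one cube of dimension 3 with 7 nodes and one of dimension 2 with 3 nodes, so, under the assumed delays, 7 nodes start after 3 slots and 3 nodes after 5 slots.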
Thus, the average delay (average playback start time) is also of interest and is characterized in Theorem 4.

Theorem 4 Hypercube-based streaming over $N$ nodes (for arbitrary values of $N$) results in an average delay of no more than $2 \log N$.

Due to lack of space, the proof of Theorem 4 is given in the appendix.

Lastly, in Section 2, we assumed that the source can send $d$ packets in one time slot. Under this assumption, the hypercube-based scheme can be adjusted as follows. We can divide $N$ nodes (as evenly as possible) into $d$ groups, and then apply hypercube-based streaming to these groups individually. Since each group has at most $\lceil \frac{N}{d} \rceil$ nodes, the worst case and average playback delay would be bounded by $O(\log^2 (\frac{N}{d}))$ and $2 \log \lceil \frac{N}{d} \rceil$, respectively, with each node communicating with $O(\log \lceil \frac{N}{d} \rceil)$ other nodes.

4 Summary and Future Work

We summarize the comparison between hypercube-based and multi-tree-based streaming in Table 1. As can be seen from this table, the multi-tree-based scheme provides better worst case playback delay but requires larger size buffers. Moreover, in the multi-tree-based scheme each node communicates with a constant number of nodes, while in the hypercube-based scheme communication is needed with $O(\log N)$. This is particularly important, given that small values of $d$ are preferable in the multi-tree scheme.

Schemes | Max Delay | Ave Delay | Buffer Size | Num of Neighbors
Multi-tree | $O(d \log N)$ | $O(d \log N)$ | $O(d \log N)$ | $O(d)$
Hypercube for special $N$ | $O(\log N)$ | $O(\log N)$ | $O(1)$ | $O(\log N)$
Hypercube for arbitrary $N$ | $O(\log^2 (\frac{N}{d}))$ | $O(\log (\frac{N}{d}))$ | $O(1)$ | $O(\log (\frac{N}{d}))$

Table 1. Comparison between multi-tree based streaming and hypercube based streaming.

Our current and future efforts are focused on several directions. In this paper, our algorithms are given in a static context, i.e., where all $N$ nodes are present in the system initially and for the duration of the data delivery process.
However, in a real world streaming system nodes arrive and depart throughout the streaming process, i.e., there is node churn. Effective algorithms are needed for multi-tree-based and hypercube-based schemes to adjust to node dynamics with as little effect as possible on the remaining participating nodes. Due to lack of space, we give our adaptation of the multi-tree-based scheme to node dynamics in the appendix. Our ongoing efforts include constructing algorithms for dealing with node dynamics in the context of the hypercube-based scheme, which would work well for arbitrary values of $N$.

In addition, the maximum playback delay of the hypercube-based scheme is $O(\log^2 N)$, for arbitrary values of $N$. As $N$ grows, this can be quite large. Our current efforts are focused on determining whether there exists an algorithm, such that for arbitrary $N$, it has the following characteristics: a maximum playback delay of $O(\log N)$, buffer space requirements of $O(1)$, and communication requirements of $O(\log N)$ (i.e., each node only needs to communicate with at most $O(\log N)$ other nodes in the system).

References

[1] D. Arthur and R. Panigrahy. Analyzing the efficiency of bittorrent and related peer-to-peer networks. In ACM-SIAM SODA, 2006.
[2] M. Castro, P. Druschel, A. Kermarrec, A. Nandi, A. Rowstron, and A. Singh. Splitstream: High-bandwidth multicast in cooperative environments. In SOSP, 2003.
[3] Y. Chu, S. Rao, S. Seshan, and H. Zhang. A case for end system multicast. IEEE JSAC, 20(8), 2002.
[4] G. Dan, V. Fodor, and I. Chatzidrossos. On the performance of multi-tree-based peer-to-peer live streaming. In IEEE INFOCOM, 2007.
[5] A. M. Farley. Broadcast time in communication networks. SIAM Journal on Applied Mathematics, 39(2):385–390, 1980.
[6] C. Fernandess and D. Malkhi. On collaborative content distribution using multi-message gossip. In IEEE IPDPS, 2006.
[7] J. Hastad. Some optimal inapproximability results. JACM, 48:798–859, 2001.
[8] D. Kempe, A. Dobra, and J. Gehrke. Gossip-based computation of aggregate information. In IEEE FOCS, 2003.
[9] S. Khuller, Y. Kim, and Y.-C. Wan. Broadcasting on networks of workstations. In ACM Symp. on Parallel Algs. and Arch., 2005.
[10] S. Khuller, Y. Kim, and Y.-C. Wan. On generalized broadcasting and gossiping. Journal of Algorithms, 59(2):81–106, 2006.
[11] A. L. Liestman, T. C. Shermer, and M. J. Suderman. Broadcasting multiple messages in hypercubes. In ACM ISPAN, 2000.
[12] S. Liu, R. Zhang-Shen, W. Jiang, J. Rexford, and M. Chiang. Performance bounds for peer-assisted live streaming. In SIGMETRICS, 2008.
[13] N. Magharei, R. Rejaie, and Y. Guo. Mesh or multiple-tree: A comparative study of live p2p streaming approaches. In IEEE INFOCOM, 2007.
[14] V. Venkataraman and P. Francis. Chunkyspread: Multi-tree unstructured peer-to-peer multicast. In 5th IPTPS, 2006.
[15] X. Zhang, J. Liu, B. Li, and T. P. Yum. Coolstreaming/donet: A data-driven overlay network for efficient live media streaming. In IEEE INFOCOM, 2005.

APPENDIX

Structured Disjoint Tree Construction, Proof of Correctness:

Note that only nodes in $G_k$ act as the interior nodes in tree $T_k$. Thus all $d$ trees are indeed interior node disjoint. Hence, what remains to be proven is that no node will receive more than one packet in one time slot. According to the transmission schedule described in Section 2, the time slots during which one node, say node ID $x$, receives packets in a certain tree are determined by its position in that tree modulo $d$. Therefore, it is sufficient to show that among all positions of node ID $x$, no two positions are congruent modulo $d$.

If $x \in G_d$, then according to the algorithm it will be placed in positions $N-d+1$ through $N$ once without repetition. Thus no two positions are congruent modulo $d$.

If $x \in G$, then let $g = \gcd(d, I)$. Thus, $d = P \cdot g$ and $I = I_1 \cdot g$, where $I_1$ and $P$ are relatively prime.
According to the algorithm, the positions of node $x$ in all $d$ trees are equivalent to:

$T_0$ to $T_{P-1}$: $x,\ x-I,\ \ldots,\ x-(P-1)I$
$T_P$ to $T_{2P-1}$: $x-PI+1,\ x-(P+1)I+1,\ \ldots,\ x-(2P-1)I+1$
......
$T_{(g-1)P}$ to $T_{gP-1}$: $x-(g-1)PI+g-1,\ \ldots,\ x-(Pg-1)I+g-1$.

And, we want to show that no two of these $d$ numbers are congruent modulo $d$. Note that $N$ is ignored in some of the above terms since $d \mid N$. Also, $PI = PgI_1 = dI_1$; thus, $d \mid PI$. We can further refine the positions to:

$T_0$ to $T_{P-1}$: $x,\ x-I,\ \ldots,\ x-(P-1)I$
$T_P$ to $T_{2P-1}$: $x+1,\ x-I+1,\ \ldots,\ x-(P-1)I+1$
......
$T_{(g-1)P}$ to $T_{gP-1}$: $x+g-1,\ \ldots,\ x-(P-1)I+g-1$.

Now, assume that $x - a_1 I + b_1 \equiv x - a_2 I + b_2 \pmod{d}$, where $a_1, a_2 \in \{0, 1, 2, \ldots, P-1\}$ and $b_1, b_2 \in \{0, 1, 2, \ldots, g-1\}$. We have $d \mid (a_1 - a_2)I + (b_2 - b_1)$, where $(a_1 - a_2) \in \{1-P, \ldots, P-1\}$ and $(b_1 - b_2) \in \{1-g, \ldots, g-1\}$. Thus, $Pg \mid (a_1 - a_2)gI_1 + (b_2 - b_1)$. So $g \mid (b_2 - b_1)$, which indicates that $b_2 = b_1$. Then, $P \mid (a_1 - a_2)I_1$. Since $P$ and $I_1$ are relatively prime, $P \mid (a_1 - a_2)$. As a result, $a_1 = a_2$. This proves that no two distinct positions are congruent. ⊠

Greedy Disjoint Tree Construction, Proof of Correctness:

The following facts can be easily observed from the algorithm. Firstly, according to the construction, only nodes in $G_k$ act as interior nodes in tree $T_k$. Thus, all $d$ trees are indeed interior node disjoint. Secondly, suppose that node id $i$ is in position $P_a$ and $P_b$ in trees $T_a$ and $T_b$, respectively. Then according to the construction algorithm, $P_a \equiv a + i \pmod{d}$ and $P_b \equiv b + i \pmod{d}$. Thus, $P_a \not\equiv P_b \pmod{d}$. This shows that node id $i$ can receive at most one packet in one time slot. Finally, since $d \mid N$, the number of nodes with parity $j$ is $\frac{N}{d}$, for all $j$. On the other hand, all $d$ trees are filled from position 1 through position $N$. Thus, the number of positions in tree $T_k$ to be filled with parity $j$ nodes is exactly $\frac{N}{d}$, which indicates that before all positions are filled, we cannot exhaust our supply of nodes with appropriate parities.
Given all of the above observations, we conclude that the construction algorithm is correct. ⊠

Proof of upper bound on worst case playback delay (Theorem 2):

For ease of exposition, we first assume that $T$ is a complete tree and then comment on what happens when that is not the case. We claim that one (achievable) upper bound on the playback delay is $h \cdot d$ time slots. That is, any node can begin playback after time slot $h \cdot d$ and be guaranteed not to experience hiccups due to lack of data. This claim can be proven by considering the following two observations:

Observation 1: It takes $h \cdot d$ time slots to transmit packet 0 to node id $N$, i.e., the node in the last position of the first tree. So node id $N$ cannot start playback prior to time slot $h \cdot d$.

Observation 2: Given our round robin transmission schedule, if one node receives packet $j$ in time slot $t$, then it will definitely receive packet $(j + d)$ in time slot $(t + d)$. Consequently, if by time slot $t_0$ a node has received packets 1 through $d$, then by time slot $(t_0 + d)$ it will receive packets $(d + 1)$ through $2d$, by time slot $(t_0 + 2d)$ it will receive packets $(2d + 1)$ through $3d$, and so on. Therefore, it is safe for this node to start playback at time slot $t_0$ without being concerned with running out of data and experiencing hiccups.

Note that, by time slot $h \cdot d$ each node would have received at least one packet in each of the $d$ trees. And, after that, each node continues to receive packets every $d$ time slots. Since nodes do not receive redundant packets, $h \cdot d$ is a safe value for $t_0$. Thus, given our claim above, the playback delay $T$ satisfies the following inequality: $T \le h \cdot d$. For general values of $N$, these trees may not be complete; hence, it is possible for $T$ to be strictly less than $h \cdot d$. ⊠

Proof of Lower Bound on Average Delay (Theorem 3):

Let $A(i,k)$, $i \in \{1, 2, 3, \ldots, N\}$ and $k \in \{0, 1, 2, \ldots, d-1\}$, denote the delay of node id $i$ in tree $T_k$, e.g., $A(1,1) = 1$ and $A(d,1) = d$.
Also let $a(i)$, $i \in \{1, 2, 3, \ldots, N\}$, denote the playback delay of node id $i$, and let $a'(i)$, $i \in \{1, 2, 3, \ldots, N-d\}$, denote the delay which node id $i$ experiences as interior node only. For node ids $i \in \{N-d+1, \ldots, N\}$ (i.e., nodes which are leaves in all trees), let $a'(i) = A(i,1)$. (For example, in the multi-tree constructed in Figure 3(b), $a'(1) = 1$ and $a'(6) = 2$.) Then, similarly to the argument we made in case of worst-case playback delay, we have: $a(i) = \max\{A(i,0), A(i,1), A(i,2), \ldots, A(i,d-1)\}$.

Note that $(d-1)\,a(i) \ge \sum_{j=1}^{d} A(i,j) - a'(i)$. Indeed, for $i \in \{1, 2, 3, \ldots, N-d\}$, the right hand side of this inequality is the sum of the delays of node id $i$ over the trees in which it is a leaf node, and $a(i)$ is at least as large as each of them. Then,

$$(d-1) \sum_{i=1}^{N} a(i) \ge \sum_{j=1}^{d} \sum_{i \in L_j} A(i,j) - d^2(h-2) - \frac{d(d+1)}{2},$$

where the right hand side is the sum of the delays of all leaves in all $d$ trees minus the delay of node ids $N-d+1$ through $N$ in tree $T_0$. These $d$ nodes are in positions $N-d+1, N-d+2, \ldots, N$ in tree $T_0$; thus, the corresponding delay is $d(h-2)+1, d(h-2)+2, \ldots, d(h-1)$.

Now we prove that $\frac{1}{|L_k|} \sum_{i \in L_k} A(i,k) = \frac{(d+1)(h-1)}{2}$. For $k \in \{0, 1, 2, \ldots, d-1\}$, let $L_k = \{1, 2, 3, \ldots, N\} \setminus G_k$. $L_k$ denotes the set of leaf nodes in tree $T_k$. We first prove the following lemma:

Lemma 1 In $L_k$, the number of nodes with delay $j$ is equal to the number of nodes with delay $(d+1)(h-1) - j$.

Proof: Let $X_1, X_2, \ldots, X_{h-1} \in \{1, 2, 3, \ldots, d\}$ be the delay (in number of time slots) between each layer. Then each vector $(X_1, X_2, \ldots, X_{h-1})$ corresponds to a unique node, say $i$, in $L_k$, and $X_1 + X_2 + \ldots + X_{h-1} = A(i,k)$. Thus, the number of nodes with delay $j$ is equal to the number of solutions of the equation $X_1 + X_2 + \ldots + X_{h-1} = j$. Also, from symmetry (replace each $X_l$ by $d + 1 - X_l$), the number of solutions of $X_1 + X_2 + \ldots + X_{h-1} = j$ is equal to the number of solutions of $X_1 + X_2 + \ldots + X_{h-1} = (d+1)(h-1) - j$, which is the number of nodes with delay $(d+1)(h-1) - j$. This indicates that the number of nodes with delay $j$ is equal to the number of nodes with delay $(d+1)(h-1) - j$.
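The counting argument of Lemma 1 can be checked by brute force for small parameters. The sketch below is illustrative only (the function names are assumptions): it enumerates every vector $(X_1, \ldots, X_{h-1})$ with entries in $\{1, \ldots, d\}$, tallies the sums, and tests the symmetry around $(d+1)(h-1)/2$.

```python
from collections import Counter
from itertools import product


def leaf_delay_counts(d: int, h: int) -> Counter:
    """Count vectors (X_1..X_{h-1}) with X_l in {1..d} by their sum (the leaf delay)."""
    return Counter(sum(x) for x in product(range(1, d + 1), repeat=h - 1))


def is_symmetric(d: int, h: int) -> bool:
    """Check that delay j occurs as often as delay (d+1)(h-1) - j."""
    counts = leaf_delay_counts(d, h)
    top = (d + 1) * (h - 1)
    return all(counts[j] == counts[top - j] for j in counts)


def mean_delay(d: int, h: int) -> float:
    """Average leaf delay over all enumerated vectors."""
    counts = leaf_delay_counts(d, h)
    return sum(j * c for j, c in counts.items()) / sum(counts.values())


if __name__ == "__main__":
    print(sorted(leaf_delay_counts(3, 3).items()))
    print(is_symmetric(3, 3), mean_delay(3, 3))
```

For $d = 3$ and $h = 3$ the enumeration covers $d^{h-1} = 9$ vectors, with sums 2 through 6 occurring 1, 2, 3, 2, and 1 times.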
According to Lemma 1, $\frac{1}{|L_k|} \sum_{i \in L_k} A(i,k) = \frac{1}{|L_k|} \cdot \frac{|L_k|}{2} (d+1)(h-1) = \frac{(d+1)(h-1)}{2}$. Also we have $|L_k| = d^{h-1}$. Putting it all together gives:

$$(d-1) \sum_{i=1}^{N} a(i) \ge d \cdot d^{h-1} \cdot \frac{(d+1)(h-1)}{2} - d^2(h-2) - \frac{d(d+1)}{2}.$$

Thus, the average delay is

$$\frac{\sum_{i=1}^{N} a(i)}{N} \ge \frac{d^h (d+1)(h-1) - d^2 (h-2) - d(d+1)/2}{N(d-1)}.$$ ⊠

Not fully connected networks:

In this paper, we modeled each cluster as a fully connected graph and hence constructed the multiple trees within this fully connected graph. Consider now a network which is represented by an arbitrary undirected graph, $G$, where an edge exists between a pair of nodes if one packet can be transmitted between these nodes in a single time slot. An interesting question then is, for instance, can we construct two interior disjoint spanning trees using $G$, each rooted at a node $S$, i.e., as before the root is permitted to be an interior node in both trees. We refer to this problem as the Two Interior-Disjoint Tree problem. As stated in Section 1, this problem is NP-complete; the corresponding proof is given below.

NP-completeness Proof: Recall the following known NP-complete problem, i.e., the E-4 Set Splitting problem [7]: Given a collection of elements $V$ and a collection of sets $R_i$, such that for all $i$, $R_i$ contains exactly four elements of $V$, is there a way of splitting the set $V$ into $V_1$ and $V_2$ such that for each $i$, $R_i$ has at least one element in both sets. We now reduce this problem to the Two Interior-Disjoint Tree problem.

Construct a bipartite graph with a node for each element in $V$; call it set $V'$. For each $R_i$ we have a node $x_i$. The collection of nodes $x_i$ will be called $X$. Add a root $r$ and add edges from $r$ to all nodes in $V'$. Also connect $x_i$ to the nodes contained in $R_i$.

Suppose there exist two interior-disjoint trees in this graph, $T_1$ and $T_2$. Now do this operation: for any $i$, if $x_i$ is an interior node, then connect all its children directly to the root and remove all the edges between $x_i$ and its children.
This is possible because all the children of $x_i$ must be in $V'$. Note that after completing the operations for all $i$, the two trees are still interior disjoint (we did not add any interior nodes at all). Also all the $x_i$ nodes are leaves in both trees now.

Let $V_1, V_2$ be a solution for the E-4 splitting problem. Take the interior nodes of two trees as $V_1$ and $V_2$. Connect each $x_i$ as a leaf to each tree, since each $x_i$ has an element from each partition. This completes the proof. ⊠

Proof of upper bound on average delay in hypercube scheme (Theorem 4):

We prove this claim by induction. Let $ave(N)$ denote the average playback delay of $N$ nodes. When $N$ is small, we can verify that $ave(N) \le 2 \log N$. When $N$ is large, according to the scheme above, we take $N_1 = 2^{k_1} - 1$ nodes to form the first cube. The playback delay of all nodes in this cube is $k_1$. Then

$$ave(N) = \frac{k_1 (2^{k_1} - 1) + (N - 2^{k_1} + 1)\left(k_1 + ave(N - 2^{k_1} + 1)\right)}{N} = k_1 + \frac{(N - 2^{k_1} + 1)\, ave(N - 2^{k_1} + 1)}{N}.$$

Note that $2^{k_1} - 1 \ge N/2$, so $(N - 2^{k_1} + 1)\, ave(N - 2^{k_1} + 1) \le N \log N$; also $k_1 < \log N$, thus $ave(N) \le 2 \log N$, which completes the proof. ⊠

Arbitrary communication pattern of O(1) buffer space scheme:

Figure 6 depicts in more detail the communication pattern of the $O(1)$ buffer space scheme, which was depicted at a higher level in Figure 5.

Dynamics: node addition and deletion in multi-trees:

Our tree construction schemes were described under static conditions. In a real streaming system, there is node churn. That is, it is quite likely that some nodes will arrive and some nodes will depart after $S$ has begun the streaming process, i.e., after the original trees are constructed and are in use for data streaming. Thus, we must also be able to add new nodes to and delete existing nodes from our trees “on-the-fly”, ideally without having to reconstruct the trees from scratch.
Below, we describe how this can be done for our schemes, when the node churn is due to the arrival and departure of “regular” nodes (i.e., not the “super nodes” forming the “backbone” of Figure 1). It is reasonable for us to assume that the “super nodes” will not exhibit significant churn (and thus focus our attention on the “regular” node churn) for the following reasons. It is common for real systems to provide some infrastructure which is static over long periods of time, and it is common in real P2P systems to adopt the use of “super nodes”, e.g., as in Skype, Kazaa, and Gnutella. In real streaming systems, “super nodes” could be provided, e.g., with the aid of content distribution companies, such as Akamai; this is an approach taken by a popular P2P streaming system, Joost.

Note that the tree construction schemes given in Section 2 have a nice property that the nodes in the set $G_d$ are all leaf nodes (i.e., they appear as leaves in all $d$ trees), and moreover, the nodes in $G_d$ are always at the end of our trees (i.e., when considered in the breadth-first order). Thus, it is always easy for us to find nodes which are not transmitting any data to anyone and thus can be used to (a) take on the role of interior nodes when they are deleted and (b) take on children when new nodes are added to the system. We now give the details of the node deletion and addition algorithms.

Deletion: Suppose our system has $N$ (real) nodes, and node id $i$ has decided to leave. Let node id $x$ be the last all leaf node in tree $T_0$. Then the deletion of $i$ can be done as follows:

Step 1: Find replacement: Swap $i$ with $x$ in all $d$ trees.

Step 2: Restore property: If $d \mid (N-1)$ (i.e., if $G_d$ only has one node), then let $P(i)$ be the set of the (new) parents of $i$ in all $d$ trees (i.e., the nodes which became its parents after it was swapped with $x$); thus $|P(i)| = d$. In each tree $T_k$, swap the nodes in $P(i)$ with the nodes in positions $N-d$ to $N-1$ in tree $T_k$.
Step 3: Remove node: Delete $i$ from all trees.

Note that Step 2 is executed only when $x$ was originally the only child of its parents in all $d$ trees. Thus, after deleting $i$, all nodes in $P(i)$ will become all leaf nodes. Hence, the purpose of Step 2 is to make sure that the nodes in $P(i)$ end up in positions $N-d$ through $N-1$, i.e., at the end of all $d$ trees, in breadth-first order. (Another minor detail is that, for consistency of presentation, $x$ takes on $i$'s node id.)

[Figure 6: buffer contents of nodes $N_1$ through $N_7$ and the source $S$ over three consecutive time slots.] Figure 6. Example of the $O(1)$ buffer space scheme with $N = 7$: here $S$ is the source, each oval represents a node, $N_j$, with rectangles inside an oval representing buffer occupancy of the corresponding node; the shaded rectangle depicts the packet number being consumed, and the clear rectangle depicts the packet number being transmitted to another node, where the arrow indicates to which node it is being transmitted.

Addition: Suppose our system has $N$ nodes, and a new node id $i$ arrives. Let $N \equiv r_1 \pmod{d}$ and $\lfloor \frac{N}{d} \rfloor \equiv r_2 \pmod{d}$. If $d \mid N$, then all nodes in $G_0$ through $G_{d-1}$ are full, i.e., have $d$ (real) children in some tree, and thus nodes in $G_d$ will have to become interior nodes in some trees. Otherwise, nodes not in $G_d$ still have vacancies and $i$ can simply be added in appropriate positions as a child of those nodes in all trees. The addition algorithm is as follows:

Step 1: Make room for growth: If $d \mid N$, swap the node in position $\lfloor \frac{N}{d} \rfloor$ with the node in position $N - d + (r_2 - 1)$ in each tree.

Step 2: Grow trees: Add $i$ to position $N+1$ in tree $T_0$, position $N+2$ in tree $T_1$, ..., position $N + d - r_1$ in tree $T_{d-r_1-1}$, position $N - r_1 + 1$ in tree $T_{d-r_1}$, ..., position $N$ in tree $T_{d-1}$.
Intuitively, the main purpose of $r_1$ in Step 2 above is to count the number of children of an interior node with fewer than $d$ (but more than 0) children, i.e., such nodes have $d - r_1$ vacancies. Intuitively, the main purpose of $r_2$ is to determine the parity of the node where tree growth will occur. Note that the growth in all trees has to occur in position $\lfloor \frac{N}{d} \rfloor$. And in Step 1 we ensure that the node in that position is an all leaf node (from $G_d$) in each of the trees, before adding $i$ as its child. Thus, in Step 1 we swap the node in position $\lfloor \frac{N}{d} \rfloor$ with one of the nodes in $G_d$; specifically, we swap it with an all leaf node of the same parity (to ensure that it continues to receive data in that tree in appropriate time slots).

Note that, when swapping occurs, either during addition or deletion, nodes participating in the swapping process may suffer from hiccups. This may occur, for example, because they lose data which was delivered before they were moved up a tree, or perhaps because they wait longer than originally planned for some data because they were moved down a tree. In the case of deletion, if a node being deleted is an all leaf node, then the number of resulting swaps is between 0 and $d^2$ (where the higher value occurs when $d \mid (N-1)$). If an interior node is deleted, then the number of resulting swaps is between $d$ and $d^2 + d$ (where the higher value occurs when $d \mid (N-1)$). In the case of addition, the number of resulting swaps is between 0 and $d$ (where the higher value occurs when $d \mid N$). Thus, up to $d^2$ nodes may suffer from hiccups. Appropriate evaluation of the resulting QoS (due to hiccups) is a complex issue. We performed an empirical evaluation of such effects (using simulation); the results are omitted here due to lack of space.

One inefficiency of the above algorithms is that when $d \mid (N-1)$ and deletion occurs, up to $d^2$ swaps have to be made to keep all the $G_d$ nodes in appropriate positions in all trees.
However, these swaps are not really necessary, if the next event is an addition of a new node, i.e., the addition of a new node will force us to “undo” swaps made during the deletion process. Thus, in such situations, if we had this sequence of deletions/additions, it would have been better to just replace the deleted node with the newly added one, i.e., thus saving $d^2 + d$ swaps. Given this observation, we explore “lazy” versions of the deletion and addition algorithms where we wait until a new event occurs before deciding whether swapping is needed.

Lazy Deletion:

Step 1, Restore property: If $d \mid N$, swap nodes in $G_d$ with nodes in positions $N-d$ to $N-1$ in all trees.

Step 2, Find replacement: If $i$ is not in $G_d$, swap it with node $x$ in all trees.

Step 3, Remove node: Delete $i$.

Lazy Addition:

Step 1, Make room for growth: If $d \mid N$, check if nodes in position $\lfloor \frac{N}{d} \rfloor$ in all trees are in $G_d$. If not, swap the node in position $\lfloor \frac{N}{d} \rfloor$ with the node in position $N - d + (r_2 - 1)$ in each tree.

Step 2, Grow trees: Add $i$ to position $N+1$ in tree $T_0$, position $N+2$ in tree $T_1$, ..., position $N + d - r_1$ in tree $T_{d-r_1-1}$, position $N - r_1 + 1$ in tree $T_{d-r_1}$, ..., position $N + d$ in tree $T_{d-1}$.

[Figure 7. Hypercube-based communication, showing node IDs 0 $(000)_2$ through 7 $(111)_2$.]

Note that the difference with these “lazy” schemes is that now we wait until a new event occurs before deciding whether swapping is needed in order to keep the all leaf nodes in appropriate positions in the trees. We have evaluated the difference between the original and the “lazy” versions of these algorithms empirically, using simulations; the results are omitted here due to lack of space.

Hypercube-based communication pattern of O(1) buffer space scheme:

As an example, suppose we have 7 nodes, plus a source, with node IDs 0 to 7. The resulting communication pattern is depicted in Figure 7.
In time slot $3n+1$ we pair up nodes with IDs $(xx0)_2$ and $(xx1)_2$, i.e., we pair up node IDs 0, 2, 4, and 6 with node IDs 1, 3, 5, and 7, respectively. Here, the nodes communicate along the dashed lines. In time slot $3n+2$ we pair up node IDs $(x0x)_2$ and $(x1x)_2$, i.e., we pair up node IDs 0, 1, 4, and 5 with node IDs 2, 3, 6, and 7, respectively. Here, the nodes communicate along the dotted lines. In time slot $3n$ we pair up node IDs $(0xx)_2$ and $(1xx)_2$, i.e., we pair up node IDs 0, 1, 2, and 3 with node IDs 4, 5, 6, and 7, respectively. Here, the nodes communicate along the solid lines.
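The pairing rule in this example amounts to flipping one bit of the node ID per time slot. The following is a minimal sketch of that rule, not the authors' implementation. The function names `partner` and `slot_pairs` and the `dims` parameter are hypothetical, and the generalization beyond the 8-node (3-bit) example is an assumption.

```python
def flipped_bit(slot: int, dims: int = 3) -> int:
    """Bit position toggled in a given time slot.

    For dims = 3 this maps slots 3n+1, 3n+2, and 3n to bits 0, 1, and 2,
    matching the dashed, dotted, and solid lines of the example.
    """
    return (slot - 1) % dims


def partner(node_id: int, slot: int, dims: int = 3) -> int:
    """Node that node_id is paired with in the given slot (hypothetical helper)."""
    return node_id ^ (1 << flipped_bit(slot, dims))


def slot_pairs(slot: int, dims: int = 3) -> list[tuple[int, int]]:
    """All pairs active in a slot, listed from the endpoint whose bit is 0."""
    mask = 1 << flipped_bit(slot, dims)
    return [(u, u | mask) for u in range(1 << dims) if not u & mask]


if __name__ == "__main__":
    for slot in (1, 2, 3):
        print(slot, slot_pairs(slot))
```

Because the partner is obtained with an exclusive-or, the relation is symmetric: applying `partner` twice in the same slot returns the original node ID.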
Description
Alix L.H. Chow, Leana Golubchik, Samir Khuller, Yuan Yao. "On the tradeoff between playback delay and buffer space in streaming." Computer Science Technical Reports (Los Angeles, California, USA: University of Southern California. Department of Computer Science) no. 904 (2009).
Asset Metadata
Creator
Chow, Alix L.H. (author), Golubchik, Leana (author), Khuller, Samir (author), Yao, Yuan (author)
Core Title
USC Computer Science Technical Reports, no. 904 (2009)
Alternative Title
On the tradeoff between playback delay and buffer space in streaming (title)
Publisher
Department of Computer Science, USC Viterbi School of Engineering, University of Southern California, 3650 McClintock Avenue, Los Angeles, California, 90089, USA (publisher)
Tag
OAI-PMH Harvest
Format
17 pages (extent), technical reports (aat)
Language
English
Unique identifier
UC16270236
Identifier
09-904 On the Tradeoff Between Playback Delay and Buffer Space in Streaming (filename)
Legacy Identifier
usc-cstr-09-904
Rights
Department of Computer Science (University of Southern California) and the author(s).
Internet Media Type
application/pdf
Copyright
In copyright - Non-commercial use permitted (https://rightsstatements.org/vocab/InC-NC/1.0/)
Source
20180426-rozan-cstechreports-shoaf (batch), Computer Science Technical Report Archive (collection), University of Southern California. Department of Computer Science. Technical Reports (series)
Access Conditions
The author(s) retain rights to their work according to U.S. copyright law. Electronic access is being provided by the USC Libraries, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
USC Viterbi School of Engineering Department of Computer Science
Repository Location
Department of Computer Science. USC Viterbi School of Engineering. Los Angeles, CA, 90089
Repository Email
csdept@usc.edu
Inherited Values
Title
Computer Science Technical Report Archive
Description
Archive of computer science technical reports published by the USC Department of Computer Science from 1991 - 2017.
Coverage Temporal
1991/2017