SCADDAR: An Efficient Randomized Technique to Reorganize Continuous Media Blocks

Ashish Goel, Cyrus Shahabi, Shu-Yuen Didi Yao, and Roger Zimmermann
Computer Science Department
University of Southern California
Los Angeles, California 90089

This research has been funded in part by NSF grants EEC-9529152 (IMSC ERC) and ITR-0082826, NASA/JPL contract nr. 961518, DARPA and USAF under agreement nr. F30602-99-1-0524, and unrestricted cash/equipment gifts from NCR, IBM, Intel and SUN.

Abstract

Scalable storage architectures allow disks to be added in order to increase storage capacity and/or bandwidth. This is an important requirement for continuous media servers for two reasons. First, multimedia objects are ever increasing in size, number, and bandwidth requirements. Second, magnetic disks are continuously improving in capacity and transfer rate. In its general form, disk scaling also covers disk removals, when either capacity needs to be conserved or old disk drives are retired. There are two basic approaches to scattering the blocks of a continuous media object across multiple disk drives: random and constrained placement. Assuming random placement, our optimization objective is to redistribute a minimum number of media blocks after disk scaling. This objective should be met under two restrictions. First, a uniform distribution, and hence a balanced load, must be ensured after redistribution. Second, a redistributed block must be retrievable in the normal mode of operation with one disk access and a low-complexity computation. We propose a technique that meets the objective, and we prove that it also satisfies both restrictions. The SCADDAR approach is based on a series of REMAP functions which can derive the new location of a block using only its original location as a basis.

1 Introduction

Continuous media (CM) servers are becoming more and more popular as networks become increasingly able to deliver high-quality media. At the same time, the amount of media such as video, audio, and interactive virtual reality is growing rapidly. Today's CM servers may not be well suited to handle tomorrow's ever-increasing media sizes and bandwidth requirements. Thus, it is necessary to build a scalable CM server which allows the addition of new disks, the replacement of older-model disks, and quite possibly the removal of older-model disks. Adding disks to a CM server increases overall server capacity and/or bandwidth without the need to predict, during the initial setup of the server, how much capacity and/or bandwidth will eventually be needed. Moreover, adding newer-generation disks (with higher bandwidth and more capacity) to a CM server may cause the existing disks to become bottlenecks during data transfer [18]. These existing disks may eventually need to be replaced with newer disks or simply removed. Note that removing a disk is different from a disk failure: disk removal is known a priori, while disk failure is unpredictable. Hence, before a disk is removed, the necessary steps can be taken so that its data can be properly redistributed to the other disks, allowing the CM server to continue functioning in its normal mode of operation after the removal. While the focus of this paper is on disk scaling (including disk removal), in Section 6 we explain a simple extension of our proposed technique that addresses fault tolerance as well.
The need for a scalable continuous media server is clear, but it creates a need for the data residing on the CM server to be efficiently redistributed to newly added disks without interrupting the server's activity. Continuous media service providers, such as video-on-demand services, cannot afford to, or may not be willing to, stop service to their customers in order to add, remove, or upgrade disks. The downtime would not only prevent customers from connecting to the CM server to order video streams, but might also break existing connections, causing billing problems and customer complaints. In addition to minimizing downtime, a CM server cannot be subjected to an inefficient disk-scaling method during uptime, since this might affect its services. A CM service provider needs to deliver high-quality, uninterrupted service even during maintenance periods. This motivates the need for an efficient, online method to scale disks in a continuous media server.

Due to the large size and bandwidth requirements of CM objects, and the large number of simultaneous users a CM server must support, a CM object is broken into fixed-size blocks which are distributed across all the disks. Popular placement techniques include round-robin striping (e.g., [2]), RAID striping (e.g., [17]), and hybrid approaches (e.g., [7, 16]). All of these can be categorized as constrained placement approaches, where the location of a block is fixed and determined by the placement algorithm. An alternative is the random placement of blocks on disks. For example, the RIO (Randomized I/O) project at the University of California, Los Angeles, has demonstrated the many advantages of a randomized data placement technique when performing data accesses [14]. These advantages include: 1) load balancing by the law of large numbers, 2) no need for synchronous access cycles, 3) a single traffic pattern, and 4) support for unpredictable access patterns as generated, for example, by interactive applications or VCR-style operations on CM streams. In this paper, we also assume random placement. One disadvantage of random placement, however, is that some sort of directory system is required at retrieval time to keep track of the location of every block. This is in contrast to constrained placement, where the same algorithm used to place the blocks can be used to retrieve them. To address this shortcoming, we assume pseudo-random placement, where the random sequence of an object can always be regenerated for retrieval using a standard pseudo-random number generator and the same seed.

Redistributing blocks placed in a random manner requires much less overhead than redistributing blocks placed using a constrained placement technique. For example, with round-robin striping, when a disk is added or removed almost all the data blocks need to be moved to another disk. With randomized placement, only a fraction of all data blocks need to be relocated: just enough blocks are moved to fill an appropriate fraction of the new disks. For disk removal, trivially, only the data blocks on the removed disk must be moved. Redistribution with random placement must ensure that the blocks are still randomly placed after disk scaling, so that the load remains balanced across the disks.
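To make pseudo-random placement concrete, the following minimal sketch (ours, not the paper's implementation; the 64-bit generator and all names are illustrative) shows how a block's location is reproduced from the object's seed alone, so that no directory is needed:

```python
import random

def block_disk(seed_m, i, num_disks):
    """Disk holding block i of the object with seed seed_m: D(i) = X(i) mod N.

    A seeded PRNG stands in for the paper's p_r(s_m).  Re-seeding and drawing
    i+1 values makes x the i-th element X(i) of the object's reproducible
    random sequence.  (A real server would iterate the sequence once while
    scanning an object, not once per lookup.)
    """
    rng = random.Random(seed_m)
    for _ in range(i + 1):
        x = rng.getrandbits(64)
    return x % num_disks
```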
With pseudo-random placement, a random number X(i) is generated for each block i and the block is placed on disk (X(i) mod N), where N is the total number of disks. When adding or removing disks, one approach is to simply redistribute each block i to disk (X(i) mod N_j), where N_j is the new number of disks. The main disadvantage of this approach is that potentially all the blocks may need to be redistributed to new locations.

We propose an approach termed SCADDAR: SCAling Disks for Data Arranged Randomly. With SCADDAR, we are able to use pseudo-random placement without redistributing all the blocks after each scaling operation. Besides minimizing block movement, SCADDAR does not require a directory for storing block locations; it needs only a storage structure recording the scaling operations, which is significantly smaller than a record of all block locations. In addition, SCADDAR computes the new locations of blocks on-the-fly for each block access, using a series of inexpensive mod and div operations. SCADDAR also maintains the randomized block placement for a certain number of successive scaling operations, which in turn keeps the disks load balanced.

Term        Definition
B           Total number of CM object blocks
R           2^b - 1, where b is the bit length of p_r(s)'s return value
s_m         Seed used by p_r() to retrieve the block locations of object m
p_r(s)      Function that returns a unique random sequence for each unique seed s; each iteration returns the next element, in the range 0...R, of the sequence
n_d         Number of blocks on disk d
N_0         Initial number of disks, before any scaling operations
X_0^(i)     i-th iteration of p_r(s_m)
D_0^(i)     Disk on which block i of object m initially resides; D_0^(i) = X_0^(i) mod N_0
N_j         Total number of disks after j scaling operations
REMAP_j     Function that remaps X_{j-1}^(i) to X_j^(i), where REMAP_0 = p_r(s_m)
X_j^(i)     Derived from the series of REMAP functions REMAP_0, ..., REMAP_j
D_j^(i)     Disk on which block i of object m resides after j scaling operations; D_j^(i) = X_j^(i) mod N_j

Table 1: Parameters used repeatedly in this study and their definitions.

The remainder of this paper is organized as follows. Section 2 reviews related work. In Section 3, we provide a formal definition of the problem. Section 4 describes the SCADDAR algorithm and provides examples of disk additions and removals. Section 5 contains our performance evaluation of SCADDAR. Finally, Section 6 concludes the paper and presents future directions for this work.

2 Related work

Several studies have focused on data placement and retrieval scheduling of continuous media objects; for collections see [9, 13, 6, 10, 4, 5]. Among these, only a few have studied the random placement of data blocks on continuous media servers [3, 12, 11, 14]. Traditional constrained placement techniques such as round-robin data placement allow for deterministic service guarantees, while random placement techniques are modeled statistically. Overall, random placement increases the flexibility to support various applications while maintaining competitive performance [15]. We assume a slight variation of random placement, pseudo-random placement, in order to locate a block at retrieval time without the overhead of maintaining a directory. This is achieved by exploiting the fact that we can regenerate the sequence of numbers produced by a pseudo-random generator function. One study has addressed the redistribution of data blocks under round-robin data striping [8].
Inherently, such a technique requires that almost all the blocks be relocated. The overhead of this block movement (bandwidth is consumed on both the source and the target disk drives) may be amortized over a period of time but is, nevertheless, significant and wasteful. Work has also been done on writing continuous media data to a server [1]. This work is orthogonal to our approach, since we also need a similar technique to write blocks during redistribution. To our knowledge, no previous work has identified and quantified the benefits of randomly placing data blocks on a scalable continuous media server when block redistribution is desired or required. Additionally, no prior work has addressed such redistribution while the CM server is online.

3 Problem Statement

Pseudo-random placement

We use randomized placement of CM object blocks as the technique by which blocks are assigned to disks, so that at retrieval time a block has roughly equal probability of residing on each disk. Continuous media objects are split into fixed-size blocks and pseudo-randomly distributed over a group of homogeneous disks such that each disk carries an approximately equal load. With pseudo-random distribution, blocks are placed onto disks in a random, but reproducible, sequence.

Definition 3.1: Pseudo-random placement of CM object blocks is a random placement of blocks whose random sequence can be reproduced.

Definition 3.2: X_0^(i) is the random number, in the range 0...R, generated by a pseudo-random number generator for a specific block i, before any scaling operations. D_0^(i) = (X_0^(i) mod N_0) determines the disk on which block i resides, where N_0 is the total number of disks.

The disk location of block i of object m is disk D_0^(i) = (X_0^(i) mod N_0), where X_0^(i) is returned by the function p_r(s_m) after i calls, N_0 is the total number of disks, and s_m is the unique seed of object m. The function p_r(s_m) is assumed to return a random number in the range 0...R, where R is 2^b - 1 and p_r(s_m) returns a b-bit random number. Iterating p_r(s_m) reproduces an identical random sequence for a given seed s_m. Table 1 lists all the parameters and definitions used throughout this paper.

Scaling operation

We use the notion of a disk group: a group of k disks that is added or removed during a scaling operation. Hence:

Definition 3.3: A scaling operation on a CM server with N disks is either the addition or the removal of one disk group.

The initial number of disks in a CM server is denoted N_0 and, after j scaling operations, the number of disks is denoted N_j. During scaling operation j, a redistribution function RF() redistributes the blocks residing on N_{j-1} disks onto N_j disks. Consequently, after scaling operation j, a new access function AF() is needed to identify the location of a block, since its location might have changed due to the scaling operation. Scaling operations are dynamic and uncontrolled: any number of them can occur at any time. A scaling operation alters the total number of disks. During disk addition, an (N_j - N_{j-1}) / N_j fraction of all blocks must be moved onto the added disks in order to maintain load balancing across the disks; during disk removal, all blocks on the removed disks must be moved and randomly redistributed across the remaining disks. (The sketch below captures these minimum movement fractions.)
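As a tiny helper (ours; Def. 3.4 below names this fraction z_j), the minimum amount of data movement just described is:

```python
def min_fraction_moved(n_prev, n_new):
    """Minimum fraction of all blocks that must move when scaling from
    n_prev to n_new disks while keeping the load balanced."""
    if n_new > n_prev:
        return (n_new - n_prev) / n_new    # addition: just enough to fill the new disks
    return (n_prev - n_new) / n_prev       # removal: exactly the blocks on the removed disks

# e.g. growing 4 -> 5 disks moves 1/5 of all blocks; shrinking 6 -> 5 moves 1/6.
```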
The number of block movements just described is the fewest needed to maintain an even load. However, after a scaling operation the original seed can no longer be used, by itself, to reproduce the blocks' new sequence of locations. The problem is how to represent this new sequence using a simple computation, the least possible movement of blocks, and the same seed per object no matter how many operations are performed, all while maintaining load balancing and a random distribution of data blocks. We can formally state the redistribution problem as:

Definition 3.4: Given j scaling operations on N_0 disks, find RF() such that:

[RO1] Block movement is minimized during redistribution. Only z_j * B blocks should be moved, where

    z_j = (N_j - N_{j-1}) / N_j          if N_j > N_{j-1}
          (N_{j-1} - N_j) / N_{j-1}      if N_{j-1} > N_j          (1)

and B is the total number of object blocks.

[RO2] Randomization of all object blocks is maintained. Randomization leads to load balancing of all blocks across all disks, where E[n_0] ≈ E[n_1] ≈ E[n_2] ≈ ... ≈ E[n_{N_j - 1}], and E[n_k] is the expected number of blocks on disk k.

and find the corresponding AF() such that:

[AO1] Disk access is minimized by using a low-complexity function to compute a block's location.

Definition 3.5: X_j^(i) is the random number remapped from X_0^(i) for a specific block i after j scaling operations. A function REMAP_j remaps X_{j-1}^(i) to X_j^(i).

Initial Approaches

Several initial approaches to block redistribution might be considered, such as a directory storage structure, complete redistribution after each scaling operation, and extendible hashing. While none of these approaches satisfies all the objectives of Def. 3.4, we use specific principles from each to formulate SCADDAR. Interested readers may turn to Appendix A for details on each approach.

4 SCADDAR: SCAling Disks for Data Arranged Randomly

The main element of the SCADDAR approach is a function, REMAP_j, which takes X_{j-1}^(i) as input and generates X_j^(i) as output, where D_k^(i) = (X_k^(i) mod N_k) determines the disk on which block i resides and N_k is the total number of disks after k scaling operations. Note that REMAP_0 is the original pseudo-random generator function. REMAP functions are used within both the RF() and AF() functions. In particular, during scaling operation j:

- if disks are added, then RF() applies a sequence of REMAP functions (REMAP_0 through REMAP_j) to compute X_j^(i) for every block i residing on all the disks;
- if disks are removed, then RF() applies the sequence of REMAP functions (REMAP_0 through REMAP_j) to compute X_j^(d) for every block d residing on the removed disks only.

Similarly, after scaling operation j, to find the location of block i, AF() applies the sequence of REMAP functions (REMAP_0 through REMAP_j) to compute X_j^(i). Subsequently, RF() and AF() compute the location of block i as D_j^(i) = (X_j^(i) mod N_j), where N_j is the total number of disks after scaling operation j. That is, the sequence X_0^(i), X_1^(i), ..., X_j^(i) can be used to determine the location of block i after each scaling operation.
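In code form, AF() is simply a replay of the REMAP chain. The following sketch is ours; the operation-log record format is an assumption, and remap_add / remap_remove are the Eq. 5 and Eq. 3 sketches given in Section 4.2 below:

```python
def access_function(x0, n0, op_log):
    """AF(): locate a block after j scaling operations by replaying the
    REMAP chain on X_0 (REMAP_0's output for the block).

    op_log is the small structure SCADDAR stores instead of a directory:
    one ('add' | 'remove', disks_after, removed_disks) record per operation.
    """
    x, n = x0, n0
    for kind, n_after, removed_disks in op_log:
        if kind == "add":
            x = remap_add(x, n, n_after)                    # Eq. 5, Section 4.2.2
        else:
            x = remap_remove(x, n, n_after, removed_disks)  # Eq. 3, Section 4.2.1
        n = n_after
    return x % n                                            # D_j = X_j mod N_j
```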
Now we can restate the objectives of Def. 3.4 (RO1, RO2 and AO1) for SCADDAR as follows. The REMAP functions should be designed such that:

RO1: X_{j-1}^(i) mod N_{j-1} and X_j^(i) mod N_j result in different disk numbers for only z_j * B blocks (see RO1 in Def. 3.4) and not more.

RO2: For those X_j^(i) with D_{j-1}^(i) ≠ D_j^(i), D_j^(i) is equally likely to be any of the newly added disks (in the case of an addition operation) or any of the non-removed disks (in the case of a removal operation).

AO1: The sequence X_0^(i), X_1^(i), ..., X_j^(i), and hence D_j^(i), can be generated with low complexity.

The design of the REMAP functions hence becomes a challenging problem. In the remainder of this section we first explain our initial attempt, in Section 4.1, at the design of REMAP and show that, while it satisfies RO1 and AO1, it fails to satisfy RO2 after more than one scaling operation. Subsequently, in Section 4.2, we discuss a superior approach which satisfies all the objectives for up to k scaling operations. Finally, in Section 4.3, we compute the upper bound on k and prove that RO2 is again violated after more than k scaling operations; in that case, we suggest a complete redistribution of all the blocks.

4.1 A naive approach

We first describe a scheme that allows a single scaling operation to be performed. Although random distribution is preserved after a single operation of disk additions, we show that a truly random distribution is not achieved after subsequent operations. If scaling operation j is an addition of disks:

    REMAP_j = X_0^(i) mod N_j    if (X_0^(i) mod N_j) >= N_{j-1}    (i.e., the block lands on a newly added disk)
              REMAP_{j-1}        otherwise                                                                    (2)

This scheme allows one scaling operation, but successive operations violate RO2. Fig. 1 shows this case with an initial group of 4 disks. At first, the blocks appear to be placed in round-robin fashion; however, the numbers below the disks are the random numbers X_0^(i) generated by REMAP_0, and their ordering is not significant. When the next scaling operation is an addition of one disk, 1/5 of all blocks are randomly moved onto the new disk, disk 4. However, with another scaling operation adding one more disk, the 1/6 of all blocks moved onto the new disk, disk 5, are not chosen randomly. Fig. 1c shows that only certain blocks from disks 1, 3 and 4 are moved onto disk 5, while disks 0 and 2 are ignored. Even though RO1 and AO1 are upheld, this does not guarantee load balancing and disk 5 does not contain randomly chosen blocks. The reason is that we keep using the same random number, X_0^(i), in the range 0...2^b - 1 (where b is 64), to compute REMAP_j. To guarantee random placement, we need to draw from a new set of random numbers at each successive scaling operation. The same behavior arises when the scaling operation is the removal of a disk group, so further discussion of disk removals for the naive approach is omitted.

[Figure 1: Initial approach. (a) Initial state with four disks; (b) after the 1st 1-disk add operation; (c) after the 2nd 1-disk add operation. Figure omitted; only the caption and panel labels are recoverable.]
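For concreteness, here is a sketch of the naive scheme (our code, with illustrative names; it tracks the disk number directly rather than the X value, which is equivalent for this discussion). Replaying it on the block numbers of Fig. 1 reproduces the skewed third panel:

```python
def naive_disk(x0, disk_counts):
    """Disk of a block under the naive scheme (Eq. 2), given its original
    random number x0 and the disk counts [N_0, N_1, ..., N_j] of a series
    of addition-only scaling operations."""
    disk = x0 % disk_counts[0]                 # initial placement, D_0
    for n_prev, n_next in zip(disk_counts, disk_counts[1:]):
        candidate = x0 % n_next                # re-drawn from the SAME random number X_0
        if candidate >= n_prev:                # lands on one of the newly added disks,
            disk = candidate                   # ... so the block moves there
    return disk

# 4 -> 5 -> 6 disks: disk 5 receives only blocks with x0 = 5 (mod 6), which
# previously sat on disks 1, 3 and 4; disks 0 and 2 never contribute.
print(sorted(x for x in range(44) if naive_disk(x, [4, 5, 6]) == 5))
# -> [5, 11, 17, 23, 29, 35, 41]
```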
4.2 SCADDAR approach

SCADDAR extends the naive approach by allowing successive scaling operations to be performed while maintaining RO2, unlike the naive approach. We first show REMAP_j for deriving X_j^(i) after a disk-group removal during the j-th operation, and then REMAP_j for deriving X_j^(i) after a disk-group addition. In each case, X_j^(i) results from remapping X_{j-1}^(i). For easier readability, we will now write X_j and D_j instead of X_j^(i) and D_j^(i) for the random number and disk location of block i after j scaling operations. We restate that the random number of block i before any scaling operations is REMAP_0 = X_0 = p_r(s_m).

To simplify our notation, we use Def. 4.1 as the underlying basis for computing REMAP_j.

Definition 4.1: Let q_j = (X_j div N_j) and r_j = (X_j mod N_j), so that X_j = q_j * N_j + r_j.

4.2.1 Block location after disk removal

In this section we focus on disk removals as the scaling operation. Fig. 1 shows that reusing the same range of random numbers, 0...R, to generate D_j may not lead to a random distribution of all blocks, thus violating RO2. By RO1, all blocks on a removed disk must be moved to a random, non-removed disk. REMAP_{j-1} was used previously to compute D_{j-1}, assigning blocks across the N_{j-1} disks. Now that a disk has been removed, REMAP_j should return an X_j drawn from a new range to find D_j for the blocks in transit. Instead of using the same source of random numbers as REMAP_{j-1}, REMAP_j must draw on a new source of randomness, namely (X_{j-1} div N_{j-1}), or q_{j-1}, even though this has a smaller range. The shrinking range suggests that there is a threshold on the number of scaling operations we can perform before the range becomes too small, as shown in Section 4.3. Using q_{j-1}, the range shrinks to 0...floor(R / N_{j-1}).

If a disk removal occurs during operation j, then block i either moves off a removed disk or remains on a non-removed disk D_{j-1}, i.e., r_{j-1}. Notice that D_k always equals r_k for any k-th operation. Eq. 3 defines REMAP_j if scaling operation j is a removal of disks:

    REMAP_j = X_j = q_{j-1} * N_j + new(r_{j-1})    if disk r_{j-1} is not removed    (3a)
                    q_{j-1}                          otherwise                         (3b)

Upon first inspection, for blocks on a non-removed disk we want D_j to be the block's current disk, which Eq. 3a satisfies since (X_j mod N_j) = new(r_{j-1}), the new index of that same physical disk. However, in case block i is moved by a future scaling operation, we must embed the new source of randomness, q_{j-1}, in REMAP_j. Eq. 3a incorporates q_{j-1} by setting REMAP_j = q_{j-1} * N_j + new(r_{j-1}), where q_{j-1} is a random number from the new range, instead of simply REMAP_j = new(r_{j-1}); later, q_{j-1} can be recovered as (X_j div N_j). In addition, r_{j-1} denotes the r_{j-1}-th disk during operation j-1, which may no longer be the r_{j-1}-th disk during operation j. For example, if disk 1 is removed from the disk set {0, 1, 2, 3} and r_{j-1} = 2, then new(r_{j-1}) should be 1 during scaling operation j, since disk 2 is now at index 1 among the survivors; the function new() maps r_{j-1} from 2 to 1. This yields REMAP_j = q_{j-1} * N_j + new(r_{j-1}) for Eq. 3a. In the other case, when r_{j-1} is a removed disk during operation j, we set REMAP_j = q_{j-1} (Eq. 3b), and D_j gives a new random block location.

Example of disk removal. We show how a block's location is affected when one disk is removed. Assume the server has disks 0 through 5, so N_{j-1} = 6 and N_j = 5, and disk 4 is removed. The first case shows the new disk location of a block that must move; the second shows the location of a block that remains in place. SCADDAR also handles disk-group removals, but a single-disk removal is shown in this example for simplicity. (A code sketch of Eq. 3 follows; its assertions anticipate the two worked cases below.)
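The sketch below is ours (names are illustrative, and passing the removed disks as a set is an assumption about how the operation log is kept); it implements Eq. 3 directly:

```python
def remap_remove(x_prev, n_prev, n_next, removed):
    """REMAP_j for a removal operation (Eq. 3).

    x_prev  -- X_{j-1} for the block
    n_prev  -- N_{j-1}, the disk count before the operation
    n_next  -- N_j, the disk count after the operation
    removed -- removed disk numbers, in the pre-operation numbering
    """
    q, r = divmod(x_prev, n_prev)                   # q_{j-1}, r_{j-1} of Def. 4.1
    if r in removed:
        return q                                    # Eq. 3b: q_{j-1} is the fresh randomness
    new_r = r - sum(1 for d in removed if d < r)    # new(): rank among surviving disks
    return q * n_next + new_r                       # Eq. 3a: block stays on the same disk

# The worked example: remove disk 4 from disks 0..5.
assert remap_remove(28, 6, 5, {4}) == 4    # moved block: D_j = 4 mod 5 = 4 (old disk 5)
assert remap_remove(41, 6, 5, {4}) == 34   # staying block: D_j = 34 mod 5 = 4 (old disk 5)
```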
First, we consider the case where a block residing on the removed disk, disk 4, is moved to another disk. Assume that for this block REMAP_{j-1} = X_{j-1} = 28, so (X_{j-1} mod N_{j-1}) = 4. (We refer to this random number as X_{j-1} rather than X_0, since there may have been prior scaling operations.) We now need REMAP_j to determine whether D_j is disk 0, 1, 2, 3, or 5. We use a random number q_{j-1} = (X_{j-1} div N_{j-1}) = 4, chosen from the new range, so that REMAP_j = q_{j-1} = 4 as stated in Eq. 3b. D_j is then 4, since (X_j mod N_j) = 4. However, 4 here denotes the 4-th disk (counting from 0) among the remaining disks, so a final mapping step yields disk 5.

For the second case, we verify that a block on disk 5 remains on disk 5 after disk 4 is removed. Assume REMAP_{j-1} = X_{j-1} = 41 for this block, so (X_{j-1} mod N_{j-1}) = 5. Using Eq. 3a we find REMAP_j = X_j = 34. Notice that X_j embeds q_{j-1} = 6, a fresh random number used for determining future block locations. The function new() maps r_{j-1} = 5 to the 4-th remaining disk, i.e., new(5) = 4, which is exactly the block's current location: D_j = (34 mod 5) = 4, and the 4-th remaining disk is disk 5, the original disk.

4.2.2 Block location after disk addition

Now suppose that disks are added during operation j. Again, each block should be located on D_j after operation j, and the task is to derive a REMAP_j function in order to find D_j. Block i either moves to one of the added disks or remains on disk D_{j-1}, i.e., r_{j-1}. We restate that D_k always equals r_k for any k-th operation. Eq. 4 defines REMAP_j if scaling operation j is an addition of disks:

    REMAP_j = X_j = (q_{j-1} div N_j) * N_j + r_{j-1}              if (q_{j-1} mod N_j) < N_{j-1}    (4a)
                    (q_{j-1} div N_j) * N_j + (q_{j-1} mod N_j)    otherwise                          (4b)

We use (X_{j-1} div N_{j-1}), or q_{j-1}, as the new source of randomness to maintain RO2; using q_{j-1}, the range shrinks to 0...floor(R / N_{j-1}). After the addition operation j, we use (q_{j-1} mod N_j) to randomly choose a disk. To uphold RO1, we move a block only if it moves to an added disk. In other words, if (q_{j-1} mod N_j) >= N_{j-1} for a particular block, then that block moves to an added disk during operation j; in fact, (q_{j-1} mod N_j) is the target disk for that block, so Eq. 4b makes D_j equal to (q_{j-1} mod N_j). In the other case, if (q_{j-1} mod N_j) < N_{j-1}, block i remains on its current disk, so Eq. 4a makes D_j equal to r_{j-1}.

We have now partially computed REMAP_j by fixing D_j, which, to restate, is (X_j mod N_j). We still need to embed the new source of randomness as (X_j div N_j) to complete REMAP_j, since REMAP_j = (X_j div N_j) * N_j + (X_j mod N_j). Again, we want REMAP_j to return an X_j that carries a random number from the new range for future scaling operations, so we set (X_j div N_j) = (q_{j-1} div N_j), giving X_j = (q_{j-1} div N_j) * N_j + r_{j-1} for Eq. 4a and X_j = (q_{j-1} div N_j) * N_j + (q_{j-1} mod N_j) for Eq. 4b. We can further simplify (q_{j-1} div N_j) * N_j to (q_{j-1} - q_{j-1} mod N_j), thus:

    REMAP_j = X_j = (q_{j-1} - q_{j-1} mod N_j) + r_{j-1}              if (q_{j-1} mod N_j) < N_{j-1}    (5a)
                    (q_{j-1} - q_{j-1} mod N_j) + (q_{j-1} mod N_j)    otherwise                          (5b)

(A code sketch of Eq. 5 follows.)
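Again as a sketch of ours (illustrative names), Eq. 5 in code:

```python
def remap_add(x_prev, n_prev, n_next):
    """REMAP_j for an addition operation (Eq. 5).

    x_prev -- X_{j-1} for the block
    n_prev -- N_{j-1}; n_next -- N_j, with n_next > n_prev
    """
    q, r = divmod(x_prev, n_prev)      # q_{j-1}, r_{j-1} of Def. 4.1
    t = q % n_next                     # candidate target drawn from the new randomness
    if t < n_prev:
        return (q - t) + r             # Eq. 5a: block stays on disk r_{j-1}
    return q                           # Eq. 5b: (q - t) + t = q_{j-1}; block moves to added disk t
```

Note that D_j = remap_add(x_prev, n_prev, n_next) % n_next reproduces both cases, and since t is (nearly) uniform over 0...N_j - 1, the probability of a move is (N_j - N_{j-1}) / N_j, exactly the fraction z_j of RO1.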
Example of disk addition. Disk addition operations create a slight degradation in randomized placement and load balancing, because the random number range shrinks as the number of operations increases. An example of successive disk additions is examined further in Section 5.

All the objectives of RF() and AF() are met by the SCADDAR approach. RO1 is satisfied since only those blocks which need to be moved are moved: blocks either move onto an added disk or off a removed disk. RO2 is satisfied since REMAP_j always uses a new source of randomness to compute X_j; the shrinking of the random number range is analyzed in Section 4.3. AO1 is satisfied since each block access requires only one disk access: a block's location is determined by computation, using inexpensive mod and div operations, instead of a disk-resident directory. The following section quantifies the fairness of our randomization method after successive scaling operations. Random distribution leads to a load-balanced placement of blocks across all disks.

4.3 Analysis: Bounding the reduction in randomness

Define the unfairness coefficient of a random load distribution scheme as

    (the largest expected load on any machine) / (the smallest expected load on any machine) - 1.

If we pick an integer x uniformly at random from the range [0 ... R - 1] and then use (x mod N_k) to compute the disk to which the block must be assigned after k scaling operations, then the unfairness coefficient is given by

    f(R, N_k) = 1 / (R div N_k).

We will pretend in this analysis that the pseudo-random number generator in fact generates a truly random number in the range 0...2^b - 1 (i.e., we have b truly random bits). Let R_i denote the range of the random number space that we have after i operations, and let ε > 0 be the largest unfairness coefficient we are willing to tolerate. Then f(R_k, N_k) must be at most ε.

Lemma 4.2: R_k div N_k >= R_0 div (N_0 * N_1 * N_2 * ... * N_k).

Proof: We first observe that after the i-th addition/removal operation, the random number range is at least R_{i-1} div N_{i-1}. Thus, after k operations, the random number range is at least (((R_0 div N_0) div N_1) ... div N_{k-1}), and hence R_k div N_k >= (((R_0 div N_0) div N_1) ... div N_k). Given any three positive integers x, a, and b, it is easy to verify that (x div a) div b = x div (ab). Hence (((R_0 div N_0) div N_1) ... div N_k) = R_0 div (N_0 * N_1 * N_2 * ... * N_k), and therefore R_k div N_k >= R_0 div (N_0 * N_1 * N_2 * ... * N_k).

Let Π_k represent the product N_0 * N_1 * N_2 * ... * N_k.

Lemma 4.3: If Π_k <= R_0 * ε / (1 + ε), then f(R_k, N_k) < ε.

Proof: By Lemma 4.2, R_k div N_k >= R_0 div Π_k. But R_0 div Π_k > R_0 / Π_k - 1. Therefore f(R_k, N_k) < 1 / (R_0 / Π_k - 1). By the precondition in the lemma, R_0 / Π_k >= (1 + ε) / ε, which implies R_0 / Π_k - 1 >= 1 / ε. Now R_k div N_k > 1 / ε, or f(R_k, N_k) < ε.

Let N̄_k represent the average number of disks during the first k scaling operations. Then Π_k <= N̄_k^(k+1), since the geometric mean of a set of positive numbers can never be larger than the arithmetic mean. Hence, the above condition results in the following approximate rule of thumb:

    We can continue to use the same random sequence, without redistributing the load, as long as R_0 is larger than N̄_k^(k+1) / ε.

Taking logarithms (base two) on both sides, the rule of thumb translates into

    k + 1 <= (b - log(1/ε)) / log(N̄_k).

For example, if we have an average of sixteen disks, desire ε <= 1%, and are using a 64-bit random number generator, then k + 1 <= (64 - log 100) / 4, i.e., k + 1 <= 57/4, i.e., k <= 13. Therefore, a total of 13 disk addition/removal operations can be supported. (A sketch of this computation follows.)
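As a one-line calculator (ours; the function name and use of floor rounding are our choices), the rule of thumb becomes:

```python
import math

def max_scaling_ops(b, avg_disks, epsilon):
    """Largest number k of scaling operations supportable by a b-bit random
    range, given an average disk count avg_disks and unfairness tolerance
    epsilon, per the rule of thumb k + 1 <= (b - log2(1/epsilon)) / log2(avg_disks)."""
    return math.floor((b - math.log2(1 / epsilon)) / math.log2(avg_disks)) - 1

print(max_scaling_ops(64, 16, 0.01))   # -> 13, the example above
print(max_scaling_ops(32, 8, 0.05))    # -> 8, the setting used in Section 5
```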
The above rule of thumb should only be used to obtain a good a priori estimate of how well the system will perform. In an implementation of this scheme, we can keep track of the quantity Π_k explicitly and determine whether the next operation would violate the precondition of Lemma 4.3.

5 Performance Evaluation

We have performed several experiments showing that SCADDAR maintains load balancing; due to lack of space we omit the figures here and will include them in the final submission. We show that SCADDAR maintains load balancing of blocks across disks after several scaling operations. From the rule-of-thumb equation at the end of Section 4.3, we find k <= 8 for ε ≈ 5%, N̄_k = 8, and b = 32. Using simulation, we find that after eight scaling operations performed on 20 different objects, the load fluctuation reaches the threshold at which redistribution of all blocks is recommended. We compute the coefficient of variation, i.e., the standard deviation divided by the average number of blocks across all disks, after each successive scaling operation. As the number of scaling operations increases, the load on each disk remains fairly even, although we observe a slight increase in the variation of the number of blocks across the disks. This is due to the shrinking range of random numbers after each operation. For comparison, this curve grows at a higher rate than the corresponding curve for complete redistribution of all blocks after each scaling operation.

6 Conclusions and Future Research Directions

The SCADDAR approach meets all the objectives required for redistributing randomly placed blocks. The objectives of minimizing block movement, maintaining randomness, and requiring few disk accesses hold for up to k scaling operations, where the upper bound on k can be computed. We would like to apply SCADDAR to redistributing randomly placed blocks on heterogeneous disk arrays. Currently, SCADDAR is applicable to both homogeneous physical disks and logical disks. By applying previous work on mapping homogeneous logical disks to heterogeneous physical disks [18], SCADDAR may naturally evolve to allow block redistribution on heterogeneous physical disks. As for fault tolerance, data mirroring may be a simple solution with SCADDAR: mirrored blocks could be placed at a fixed offset determined by a function f(N_j); for example, f(N_j) could return N_j / 2 as an offset. We also plan to investigate using data parity bits to handle faults with less required storage space. SCADDAR can also be extended to online disk scaling, where scaling operations are performed during the normal mode of operation. We are currently implementing SCADDAR on our own continuous media server to test the practicality of its load balancing, low complexity, and preservation of block randomness.
References

[1] W. Aref, I. Kamel, Niranjan, and S. Ghandeharizadeh. Disk Scheduling for Displaying and Recording Video in Non-Linear News Editing Systems. In Multimedia Computing and Networking Conference, February 1997.
[2] S. Berson, S. Ghandeharizadeh, R. Muntz, and X. Ju. Staggered Striping in Multimedia Information Systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1994.
[3] S. Berson, R. R. Muntz, and W. R. Wong. Randomized Data Allocation for Real-Time Disk I/O. In COMPCON, pages 286–290, 1996.
[4] J. F. Buford, editor. Multimedia Systems. Addison-Wesley, 1994.
[5] S. M. Chung, editor. Multimedia Information Storage and Management. Kluwer Academic Publishers, 1996.
[6] A. Ghafoor. Special Issue on Multimedia Database Systems. ACM Multimedia Systems, 3(5-6), November 1995.
[7] S. Ghandeharizadeh and C. Shahabi. Management of Physical Replicas in Parallel Multimedia Information Systems. In Proceedings of the Foundations of Data Organization and Algorithms (FODO) Conference, October 1993.
[8] S. Ghandeharizadeh and D. Kim. On-line Reorganization of Data in Scalable Continuous Media Servers. In 7th International Conference and Workshop on Database and Expert Systems Applications (DEXA'96), September 1996.
[9] S. Ghandeharizadeh and C. Shahabi. Distributed Multimedia Systems. In J. G. Webster, editor, Wiley Encyclopedia of Electrical and Electronics Engineering. John Wiley and Sons, New York, 1999.
[10] W. Grosky, R. Jain, and R. Mehrotra, editors. The Handbook of Multimedia Information Management. Prentice-Hall, 1997.
[11] S. H. Kim and S. Ghandeharizadeh. Design of Multi-user Editing Servers for Continuous Media. In IEEE International Workshop on Research Issues in Data Engineering (RIDE'98), Orlando, Florida, February 1998.
[12] R. Muntz, J. Santos, and S. Berson. RIO: A Real-time Multimedia Object Server. In ACM Sigmetrics Performance Evaluation Review, volume 25, September 1997.
[13] K. Nwosu, B. Thuraisingham, and P. B. Berra. Multimedia Database Systems: A New Frontier. IEEE Multimedia, 4(3):21–23, July-September 1997.
[14] J. R. Santos and R. R. Muntz. Performance Analysis of the RIO Multimedia Storage System with Heterogeneous Disk Configurations. In ACM Multimedia, pages 303–308, 1998.
[15] J. R. Santos, R. R. Muntz, and B. Ribeiro-Neto. Comparing Random Data Allocation and Data Striping in Multimedia Servers. In SIGMETRICS, Santa Clara, California, June 17-21, 2000.
[16] C. Shahabi, S. Ghandeharizadeh, and S. Chaudhuri. On Scheduling Atomic and Composite Continuous Media Objects. To appear in IEEE TKDE, 2001.
[17] F. A. Tobagi, J. Pang, R. Baird, and M. Gang. Streaming RAID: A Disk Array Management System for Video Files. In First ACM Conference on Multimedia, August 1993.
[18] R. Zimmermann and S. Ghandeharizadeh. Continuous Display Using Heterogeneous Disk-Subsystems. In Proceedings of the Fifth ACM Multimedia Conference, pages 227–236, Seattle, Washington, November 9-13, 1997.

A Initial Approaches

This problem could be treated as a simple book-keeping problem, in which RF() keeps track of the location of every block of every CM object after each scaling operation. The access function AF() is then simply a directory lookup, where a directory entry contains the disk location for that directory index. For each scaling operation, new random locations must be determined for every block, and these new random numbers must then be written back to the directory for later block retrieval. Adding and deleting CM objects also causes directory updates, in addition to the updates needed after scaling operations. Hence, AF() and RF() become functions of the number of blocks and the number of objects in the CM server.
Considering that a typical CM server can contain on the order of thousands of CM objects, and that each CM object contains tens of thousands of blocks, the directory can potentially grow to millions of entries. Moreover, a centralized directory can become a bottleneck, since multiple directory accesses and modifications require concurrency control; a distributed directory, as in a distributed CM server, requires integrity checks to ensure that all directories remain consistent. Therefore, we need an RF() and AF() that are independent of the number of CM objects and that use data structures needing updates only when scaling operations occur (which are assumed to be infrequent events).

An alternative initial approach to redistributing blocks is to use RF() = AF() = (X_0^(i) mod N_j). Here, a new initial state arises after each scaling operation, and only the seed of each object needs to be stored, just as before any scaling operation occurred. However, this requires all blocks to be moved to new locations, thus violating RO1, which is to minimize block movement.

The method for finding a block location before any scaling operations occur is similar to using a hash function, X mod N. We wish to extend this idea so that rehashing occurs at every scaling operation. Extendible hashing seems to be a viable approach that uses hashing and a directory structure. Hashing assigns blocks to a specific directory entry, which may point to a bucket. Each directory entry is labeled with a d-bit binary number. Rehashing occurs when disks overflow and new disks must be added. Disk additions cause the directory size to double, since each entry then carries a (d+1)-bit binary label, which is required for hashing. At first, extendible hashing seems closely analogous to reorganizing blocks on disks, with buckets representing disks. The important difference is that, in order to ensure load balancing and randomized placement of blocks, each directory entry can point to only one disk, since each entry has an equal probability of being hashed into. This forces the number of disks to double on each addition operation, or halve on each removal operation, which is neither a feasible nor a flexible solution.

Although the above approaches do not offer complete solutions, useful principles from each can be used to formulate a better approach. From the directory scheme, we learn that a new random number sequence, X_j^(i), is needed to track new block locations. From extendible hashing, we learn that rehashing can be used to remap a random number sequence to a new sequence. We therefore devise a scheme that remaps the random numbers X_0^(i), generated for each block i, to a new sequence of random numbers X_j^(i), indicating where block i should reside after the j-th scaling operation. We restate that this location is derived from (X_j^(i) mod N_j), which may be either a new location for block i or its previous location. If a new sequence of X_j^(i)'s can be found for each scaling operation, then we can simply take (X_j^(i) mod N_j) to find the block location after the j-th operation. Therefore, AF() and RF() need to compute the new random numbers X_j^(i) for every block while maintaining the objectives RO1, RO2 and AO1. Taking these methodologies into consideration, we introduced our approach for finding X_j^(i), a remapping of X_0^(i) for each block, called SCADDAR.