Page 219 
On the other hand, for any set $V_i$ with $i = N_e(d)$, there is a unique set in $\mathcal{V}_F$ that includes $V_i$, which is $V_i$ itself. Hence, the base cases for $i = N_e(d)$ are directly provided by the available messages as follows:

$Y_{A,V_i} = Y'_{B \setminus V_i}.$   (F.63)

Using dynamic programming, we jointly compute the above steps for all needed $Y_A$'s. Note that some of the intermediate variables $Y_{A,V_i}$ are identical for different sets $A$, so they do not need to be recomputed.[2] Precisely, each $Y_{A,V_i}$ is uniquely determined by only two parameters: the integer $i$ and the set[3] $A \cup ([i] \setminus V_i)$. Thus, we can define a set of variables, denoted $Y_{S,i}$, where each $Y_{S,i}$ is identical to any $Y_{A,V_i}$ with $S = A \cup ([i] \setminus V_i)$. Note that each $Y_{S,i}$ is only needed if $S \cap \{1, \ldots, k\} \neq \emptyset$, $|S| = t+1$, and $S \cap \{i+1, \ldots, N_e(d)\} = \emptyset$. For brevity, we denote the collection of such sets $S$ for each $i = 0, 1, \ldots, N_e(d)$ by $\mathcal{S}_{F,i}$. Moreover, when $i$ is decremented to $i-1$, only a subset of the $Y_{S,i}$'s needs to be updated. We denote the corresponding family of sets $S$ by $\mathcal{S}'_{F,i}$, which is given by $\{S \in \mathcal{S}_{F,i-1} \mid \exists\, b \in S \ \text{s.t.} \ d_b = i\}$. The intermediate variables $Y_{S,i}$ and the needed messages $Y_A$ can be computed using the following algorithm.

Algorithm F.1 Example Algorithm for Computing Needed Messages
1: initialize $\tilde{Y}_S \leftarrow Y'_S$ for any $S \in \mathcal{S}_{F,N_e(d)}$    ▷ $\tilde{Y}_S$ stores $Y_{S,N_e(d)}$
2: for $i = N_e(d), \ldots, 1$ do
3:     for $S \in \mathcal{S}'_{F,i}$ do    ▷ compute $Y_{S,i-1}$ at $\tilde{Y}_S$
4:         for $b \in S$ with $d_b = i$ do
5:             $\tilde{Y}_S \leftarrow \tilde{Y}_S \oplus \tilde{Y}_{S \cup \{i\} \setminus \{b\}}$
6: return $Y_A \leftarrow \tilde{Y}_A$ for any $A \in \mathcal{S}_{F,0}$    ▷ $Y_A = Y_{A,0}$

The decoding complexity for recovering the $Y_A$'s can be computed as follows. The complexity of the initialization is linear in the size of $\mathcal{S}_{F,N_e(d)}$ multiplied by the length of a single $Y_A$. Because $|\mathcal{S}_{F,N_e(d)}| \leq 2\binom{K-1}{t}$, the complexity of the initialization is $O\big(\binom{K-1}{t} \cdot F / \binom{K}{t}\big) = O\big(\frac{K-t}{K} F\big)$. In the subsequent computation stage, the complexity equals the number of XOR operations multiplied by the length of a $Y_A$. Note that each XOR operation can be injectively mapped to a tuple $(b, S)$ that characterizes the operation,[4] where the elements satisfy $b \in S \in \mathcal{S}_{F,N_e(d)}$. The number of such tuples is upper bounded by $|S| \cdot |\mathcal{S}_{F,N_e(d)}| = O\big(\binom{K-1}{t}(t+1)\big)$. Hence, the complexity of this stage is $O\big(\frac{(K-t)(t+1)}{K} F\big)$. The complexity of returning the needed messages is no greater than that of the initialization, because $\mathcal{S}_{F,0} \subseteq \mathcal{S}_{F,N_e(d)}$; hence it does not increase the overall complexity. Finally,

[2] The required storage overhead can be made negligible for large $F$ by partitioning the subfiles and messages into smaller fractions and decoding them separately using the same steps.
[3] $[i] \triangleq \{1, 2, \ldots, i\}$.
[4] The injectivity is based on the fact that the iteration number $i$ is determined by $b$.

204
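The dynamic program of Algorithm F.1 can be illustrated with a minimal, runnable sketch. Here `recover_messages`, the demand map `d`, the update families `s_prime` (standing in for $\mathcal{S}'_{F,i}$), and the toy byte-string messages are all hypothetical stand-ins, not the dissertation's actual coded-caching construction; the sketch only demonstrates the in-place XOR accumulation of line 5.

```python
# Sketch of Algorithm F.1: recovering needed messages Y_A from the
# available messages Y'_S via in-place XOR updates (dynamic programming).
# The families S'_{F,i} and the demand map d are *inputs* here; in the
# dissertation they are derived from the caching scheme.

def xor(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    assert len(x) == len(y)
    return bytes(a ^ b for a, b in zip(x, y))

def recover_messages(initial, s_prime, d, n):
    """initial: {frozenset S: Y'_S} for S in S_{F, N_e(d)} (line 1).
    s_prime: {i: [S, ...]} giving S'_{F,i}, the sets updated at step i.
    d:       {b: d_b} mapping each element to its level.
    n:       N_e(d).
    Returns the buffers; entries for S in S_{F,0} hold Y_{S,0}."""
    y_tilde = dict(initial)               # line 1: Ỹ_S <- Y'_S
    for i in range(n, 0, -1):             # line 2: i = N_e(d), ..., 1
        for s in s_prime.get(i, []):      # line 3: compute Y_{S,i-1} at Ỹ_S
            for b in sorted(s):           # line 4: b in S with d_b = i
                if d.get(b) == i:
                    # line 5: Ỹ_S <- Ỹ_S xor Ỹ_{S ∪ {i} \ {b}}
                    other = (s | frozenset({i})) - frozenset({b})
                    y_tilde[s] = xor(y_tilde[s], y_tilde[other])
    return y_tilde                        # line 6: read off Y_A = Ỹ_A

# Toy instance (hypothetical): N_e(d) = 2, with d_1 = 1 and d_3 = 2.
initial = {
    frozenset({1, 3}): b"\x01",
    frozenset({1, 2}): b"\x02",
    frozenset({3}): b"\x04",
}
s_prime = {2: [frozenset({1, 3})], 1: [frozenset({1, 3})]}
result = recover_messages(initial, s_prime, {1: 1, 3: 2}, 2)
# {1,3} absorbs {1,2} at i=2 (via b=3) and {3} at i=1 (via b=1):
print(result[frozenset({1, 3})])   # b'\x07'  (= 0x01 ^ 0x02 ^ 0x04)
```

Since each update is a single XOR of equal-length buffers, the total work is the number of XOR operations times the message length, matching the analysis above; the initialization bound likewise follows from $\binom{K-1}{t}/\binom{K}{t} = (K-t)/K$.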
Object Description
Title  Coded computing: a transformative framework for resilient, secure, private, and communication efficient large scale distributed computing 
Author  Yu, Qian 
Author email  qyu880@usc.edu;qianyu0929@gmail.com 
Degree  Doctor of Philosophy 
Document type  Dissertation 
Degree program  Electrical Engineering 
School  Viterbi School of Engineering 
Date defended/completed  2020-03-24 
Date submitted  2020-08-04 
Date approved  2020-08-05 
Restricted until  2020-08-05 
Date published  2020-08-05 
Advisor (committee chair)  Avestimehr, Salman 
Advisor (committee member)  Luo, Haipeng; Ortega, Antonio; Soltanolkotabi, Mahdi 
Abstract  Modern computing applications often require handling massive amounts of data in a distributed setting, where significant issues of resiliency, security, or privacy could arise. This dissertation presents new computing designs and optimality proofs that address these issues through coding and information-theoretic approaches. ❧ The first part of this thesis focuses on a standard setup, where the computation is carried out using a set of worker nodes, each of which can store and process a fraction of the input dataset. The goal is to find computation schemes that provide the optimal resiliency against stragglers, given the computation task, the number of workers, and the functions computed by the workers. The resiliency is measured in terms of the recovery threshold, defined as the minimum number of workers to wait for in order to compute the final result. We propose optimal solutions for broad classes of computation tasks, from basic building blocks such as matrix multiplication (entangled polynomial codes), Fourier transform (coded FFT), and convolution (polynomial code), to general functions such as multivariate polynomial evaluation (Lagrange coded computing). We develop optimal computing strategies by introducing a general coding framework called “polynomial coded computing”, which exploits the algebraic structure of the computation task and creates computation redundancy in a novel coded form across workers. Polynomial coded computing allows for order-wise improvements over the state of the art and significantly generalizes classical coding-theoretic results to go beyond linear computations. The encoding and decoding processes of polynomial coded computing designs can be mapped to polynomial evaluation and interpolation, which can be computed efficiently. ❧ Then we show that polynomial coded computing can be extended to provide unified frameworks that also enable security and privacy in the computation. We present the optimal designs for three important problems: distributed matrix multiplication, multivariate polynomial evaluation, and gradient-type computation. We prove their optimality by developing information-theoretic and linear-algebraic converse bounding techniques. ❧ Finally, we consider the problem of coding for communication reduction. In the context of distributed computation, we focus on a MapReduce-type framework, where the workers need to shuffle their intermediate results to finish the computation. We aim to understand how to optimally exploit extra computing power to reduce communication, i.e., to establish a fundamental tradeoff between computation and communication. We prove a lower bound on the needed communication load for general allocations of the task assignments, by introducing a novel information-theoretic converse bounding approach. The presented lower bound exactly matches the inverse-proportional coding gain achieved by coded distributed computing schemes, completely characterizing the optimal computation-communication tradeoff. The proposed converse bounding approach strictly improves conventional cut-set bounds and can be widely applied to prove exact optimality results for more general settings, as well as for more classical communication problems. We also investigate a problem called coded caching, where a single server is connected to multiple users in a cache network through a shared bottleneck link. Each user has an isolated memory that can be used to prefetch content. Then the server needs to deliver the users’ demands efficiently in a following delivery phase. We propose caching and delivery designs that improve the state-of-the-art schemes under both centralized and decentralized settings, for both peak and average communication rates. Moreover, by developing information-theoretic bounds, we prove that the proposed designs are exactly optimal among all schemes that use uncoded prefetching, and optimal within a factor of 2.00884 among schemes with coded prefetching. 
Keyword  coding theory; information theory; distributed computing; security; privacy; matrix multiplication; caching 
Language  English 
Part of collection  University of Southern California dissertations and theses 
Publisher (of the original version)  University of Southern California 
Place of publication (of the original version)  Los Angeles, California 
Publisher (of the digital version)  University of Southern California. Libraries 
Provenance  Electronically uploaded by the author 
Type  texts 
Legacy record ID  uscthesesm 
Contributing entity  University of Southern California 
Rights  Yu, Qian 
Physical access  The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository email address given. 
Repository name  University of Southern California Digital Library 
Repository address  USC Digital Library, University of Southern California, University Park Campus MC 7002, 106 University Village, Los Angeles, California 90089-7002, USA 
Repository email  cisadmin@lib.usc.edu 
Filename  etdYuQian8883.pdf 
Archival file  Volume13/etdYuQian8883.pdf 