DISTRIBUTED INDEXING AND AGGREGATION TECHNIQUES
FOR PEER-TO-PEER AND GRID COMPUTING
by
Min Cai
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2006
Copyright 2006 Min Cai
Object Description
| Title | Distributed indexing and aggregation techniques for peer-to-peer and grid computing |
| Author | Cai, Min |
| Author email | mincai@usc.edu |
| Degree | Doctor of Philosophy |
| Document type | Dissertation |
| Degree program | Computer Science |
| School | Viterbi School of Engineering |
| Date defended/completed | 2006-10-25 |
| Date submitted | 2006 |
| Restricted until | Unrestricted |
| Date published | 2006-11-19 |
| Advisor (committee chair) | Hwang, Kai |
| Advisor (committee member) | Kuo, C.-C. Jay; Neuman, Clifford |
| Abstract | Peer-to-Peer (P2P) systems and Grids are emerging as two novel paradigms of distributed computing for wide-area resource sharing on the Internet. In both paradigms, it is essential to discover resources by their attributes and to acquire global information in a fully decentralized fashion. This dissertation proposes a multi-attribute addressable network (MAAN) for resource indexing, a distributed aggregation tree (DAT) for information aggregation, and a distributed counting scheme for estimating global cardinalities. MAAN indexes resources by their attribute values on Chord via uniform locality-preserving hashing. It resolves multi-attribute range queries with a single-attribute-dominated algorithm that scales with both the network size and the number of attributes. The DAT scheme implicitly builds an aggregation tree from Chord routing paths without membership maintenance. It also improves the Chord routing algorithm to build a balanced DAT tree when nodes are evenly distributed in the identifier space. Furthermore, the distributed counting scheme digests large sets into small cardinality summaries and merges them through a DAT tree. The global cardinality is estimated with an adaptive counting algorithm that scales to both large and small cardinalities. Based on these techniques, this dissertation has developed three applications: a P2P replica location service (P-RLS), a distributed RDF repository (RDFPeers), and a worm signature generation system (WormShield). P-RLS extends the Globus RLS system with self-organization, fault tolerance, and improved scalability. It also reduces query hotspots by using a predecessor replication scheme. RDFPeers stores, searches, and subscribes to RDF triples in a MAAN network. In RDFPeers, the number of routing hops for triple insertion, for most query resolution, and for triple subscription is logarithmic in the network size. WormShield automatically generates worm signatures by using distributed fingerprint filtering and aggregation at multiple edge networks. Because fingerprints follow a Zipf-like distribution, the filtering scheme reduces the amount of aggregation traffic by several orders of magnitude. The global fingerprint statistics are computed through DAT trees in a scalable and load-balanced fashion. The experimental results demonstrate that the indexing and aggregation schemes perform well in different P2P and Grid applications. |
| Keyword | distributed indexing; distributed aggregation; peer-to-peer computing; grid computing; worm signature generation |
| Language | English |
| Part of collection | University of Southern California dissertations and theses |
| Publisher (of the original version) | University of Southern California |
| Place of publication (of the original version) | Los Angeles, California |
| Publisher (of the digital version) | University of Southern California. Libraries |
| Type | texts |
| Legacy record ID | usctheses-m172 |
| Rights | Cai, Min |
| Repository name | Libraries, University of Southern California |
| Repository address | Los Angeles, California |
| Repository email | http://www.usc.edu/isd/libraries/services/ask_a_librarian/email/ |
| Filename | etd-Cai-20061119 |
| Archival file | uscthesesreloadpub_Volume26/etd-Cai-20061119.pdf |

