MODELING, SEARCHING, AND EXPLAINING ABNORMAL INSTANCES IN MULTI-RELATIONAL NETWORKS by Shou-de Lin A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE) December 2006 Copyright 2006 Shou-de Lin
Object Description
Title | Modeling, searching, and explaining abnormal instances in multi-relational networks |
Author | Lin, Shou-de |
Author email | sdlin@isi.edu |
Degree | Doctor of Philosophy |
Document type | Dissertation |
Degree program | Computer Science |
School | Viterbi School of Engineering |
Date defended/completed | 2006-08-24 |
Date submitted | 2006 |
Restricted until | Unrestricted |
Date published | 2006-09-27 |
Advisor (committee chair) | Knight, Kevin |
Advisor (committee member) | Chalupsky, Hans; Hovy, Eduard; Rosenbloom, Paul S.; O'Leary, Daniel E. |
Abstract | An important research problem in knowledge discovery and data mining is to identify abnormal instances. Finding anomalies in data has important applications in domains such as fraud detection and homeland security. While there are several existing methods to identify anomalies in numerical datasets, there has been little work aimed at discovering abnormal instances in large and complex relational networks whose nodes are richly connected with many different types of links. To address this problem we designed a novel, unsupervised, domain-independent framework that utilizes the information provided by different types of links to identify abnormal nodes. Our approach measures the dependencies between nodes and paths in the network to capture what we call "semantic profiles" of nodes, and then applies a distance-based outlier detection method to find abnormal nodes that are significantly different from their closest neighbors. In a set of experiments on synthetic data about organized crime, our system can almost perfectly identify the hidden crime perpetrators, and it outperforms by a significant margin several other state-of-the-art methods that have been used to analyze the 9/11 terrorist network. To facilitate validation, we designed a novel explanation mechanism that can generate meaningful and human-understandable explanations for abnormal nodes discovered by our system. Such explanations not only facilitate the verification and screening out of false positives, but also provide directions for further investigation. The explanation system uses a classification-based approach to summarize the characteristic features of a node together with a path-to-sentence generator to describe these features in natural language. In an experiment with human subjects we show that the explanation system allows them to identify hidden perpetrators in a complex crime dataset much more accurately and efficiently.
We also demonstrate the generality and domain independence of our system by applying it to find abnormal and interesting instances in two representative natural datasets in the movie and bibliography domains. Finally, we discuss our solutions to several related applications including abnormal path discovery, local node discovery, automatic node description, and explanation-based outlier detection. |
Keyword | knowledge discovery; data mining; semantic network; anomaly detection; natural language generation; interestingness; semantic graph; artificial intelligence |
Language | English |
Part of collection | University of Southern California dissertations and theses |
Publisher (of the original version) | University of Southern California |
Place of publication (of the original version) | Los Angeles, California |
Publisher (of the digital version) | University of Southern California. Libraries |
Type | texts |
Legacy record ID | usctheses-m41 |
Contributing entity | University of Southern California |
Rights | Lin, Shou-de |
Repository name | Libraries, University of Southern California |
Repository address | Los Angeles, California |
Repository email | cisadmin@lib.usc.edu |
Filename | etd-Lin-20060927 |
Archival file | uscthesesreloadpub_Volume4/etd-Lin-20060927.pdf |