SEMANTIC KNOWLEDGE ACQUISITION FOR INFORMATION EXTRACTION FROM TEXTS ON PARALLEL MARKER-PASSING COMPUTER

by

Jun-Tae Kim

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Engineering)

August 1993

Copyright 1993 Jun-Tae Kim

This dissertation, written by Jun-Tae Kim under the direction of his Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY.

Dedication

To my parents Kyounghwan Kim and Seunghyun Oh, my sister Junhee, my wife Kyeonah, and my lovely daughter Hyesoo.

Acknowledgements

I wish to thank my advisor, Professor Dan Moldovan, for his continuous support, guidance, and encouragement during my entire years of graduate study. He provided me with many great ideas and insights on the various subjects presented in this dissertation. His care and understanding went beyond those of an academic advisor, and made my years at USC more memorable.

I would like to thank Professor Jean-Luc Gaudiot and Dr. Kevin Knight for being on my dissertation committee. I sincerely appreciate the time and the guidance they provided for the completion of my dissertation. I also would like to thank Dr. Eduard Hovy, Professor Peter Danzig, and Professor Cauligi Raghavendra for their valuable comments and discussions as my guidance committee members. Dr. Hovy, especially, always encouraged me not only in my research but also in other matters whenever difficulties arose.

Many thanks to everyone who worked as a group on the SNAP project during the past three years. Wing Lee, Ron DeMara, Eric Lin, Adrian Moga, Chinyew Lin, Traian Mitrache, Hirendu Vaishnav, and Tara Poppen provided a nice environment for the various experiments on the simulators and the SNAP-1 prototype. Sanghwa Chung, Seungho Cha, Minhwa Chung, Ken Hendrickson, Steve Kowalski, Tony Gallippi, Haigaz Farajian, and Sanda Harabagiu worked closely with me on the development of various software and the information extraction system for MUC-4 and MUC-5. I also thank our secretary Dawn Ernst for her help with publications and various events.

I would like to thank my parents Kyounghwan Kim and Seunghyun Oh, and my sister Junhee Kim for their love, care, and support. Finally, special thanks to my wife Kyeonah, who has always been my best friend, unchanging supporter, and a devoted wife.
Contents

Dedication
Acknowledgements
List of Figures
Abstract

1 Introduction
  1.1 Need of Lexical Knowledge Acquisition
  1.2 Organization of Dissertation

2 Background
  2.1 Parallel Knowledge Processing
    2.1.1 Semantic networks, frames and concept hierarchy
    2.1.2 Marker passing for knowledge processing
    2.1.3 Memory-based natural language processing
    2.1.4 The SNAP project
  2.2 Information Extraction from Texts
    2.2.1 The Message Understanding Conference
    2.2.2 Overview of the SNAP system for MUC-4
    2.2.3 Discussion on the task
  2.3 Acquisition of Linguistic Knowledge
    2.3.1 Expanding lexicon through human interaction
    2.3.2 Learning word meanings
    2.3.3 Learning grammar rules
    2.3.4 Acquisition of collocational information
    2.3.5 Acquisition of semantic patterns

3 Automatic Lexical Acquisition
  3.1 The Acquisition Task
    3.1.1 Representation of semantic patterns
    3.1.2 The knowledge sources
    3.1.3 Basic approach to the acquisition
  3.2 Acquisition of New FP-Structures
    3.2.1 Frame definition and sentence extraction
    3.2.2 Conversion to simple clauses
    3.2.3 FP mapping module
    3.2.4 FP-structure construction
  3.3 Summary

4 Generalization
  4.1 The Problem
  4.2 The Algorithms
    4.2.1 Single-step generalization
    4.2.2 Parallel implementation
    4.2.3 Incremental generalization
  4.3 Summary

5 Classification
  5.1 Background
    5.1.1 Inheritance hierarchy and classification
  5.2 Parallel Classification Algorithm on SNAP
    5.2.1 Parallel subsumption test against all concepts
    5.2.2 Parallel searching for MSS
    5.2.3 Parallel searching for MGS
    5.2.4 An example
  5.3 Performance of the Algorithm
    5.3.1 Time complexity
    5.3.2 Strategies for performance improvement
  5.4 Summary

6 Experimental Results
  6.1 The Environments
  6.2 The Experiments
    6.2.1 Time to construct the knowledge base
    6.2.2 Acquisition rate
    6.2.3 Generalization
    6.2.4 Classification
  6.3 Summary

7 Conclusion and Future Work
  7.1 Contributions
  7.2 Future Research
  7.3 Summary

Appendix A  Example training texts and templates from MUC-4
  A.1 A sample text
  A.2 Corresponding templates

Appendix B  Example training texts and templates from MUC-5
  B.1 A sample text
  B.2 Corresponding templates

Appendix C  Acquired FP-structure examples
  C.1 FP-structures for BOMBING
  C.2 FP-structures for KILLING
  C.3 FP-structures for TIE-UP

Bibliography

List of Figures

1.1 This thesis work as an intersection of three fields
2.1 A simple semantic network
2.2 The memory-based parsing mechanism
2.3 The SNAP-1 prototype [25]
2.4 The instruction set of SNAP-1 [73]
2.5 The example text and template
2.6 SNAP information extraction system
2.7 An example of a concept sequence structure
2.8 The parsing result example
3.1 The frame-phrasal pattern representation
3.2 Conceptual diagram of acquisition as a feedback to the parser
3.3 The functional structure of PALKA
3.4 The results of each step for FP-structure acquisition
3.5 An example of the knowledge base created by PALKA
4.1 Determination of semantic constraints
4.2 Propagation of markers in single-step generalization
4.3 Example of single-step generalization
4.4 Propagation of markers in incremental generalization
5.1 Description of the input concept
5.2 Propagation of markers at Phase 1 and 2
5.3 Propagation of markers at Phase 3
5.4 Propagation of markers to find out the MGS
5.5 The resulting hierarchy after adding the new concept
5.6 Propagation of cancel-marker without space reduction
5.7 Propagation of cancel-marker in restricted space
6.1 Result of the acquisition from 500 MUC-4 and MUC-5 texts
6.2 Example sentences and corresponding semantic patterns acquired
6.3 Average number of patterns created for BOMBING frame
6.4 Average number of patterns created for KILLING frame
6.5 Average number of patterns created for TIE-UP frame
6.6 Recognition accuracy vs. training set size for BOMBING frame
6.7 Recognition accuracy vs. training set size for KILLING frame
6.8 Recognition accuracy vs. training set size for TIE-UP frame
6.9 Effect of generalization on the parsing performance with various percentages of negative examples
6.10 Effect of generalization on the parsing performance for the patterns with various semantic constraint group lengths
6.11 Effect of fan-out on the processing time of classification
6.12 Processing time of classification for different knowledge base sizes
6.13 Processing time of property retrieval for different knowledge base sizes

Abstract

Today is known as the information age. The amount of available on-line text is rapidly increasing, and the need for computerized information processing is greater than ever before. For information extraction and retrieval from texts, the knowledge-based natural language processing approach has been studied for a long time, and has been applied successfully to selected tasks. However, knowledge-based text processing always faces the difficulty of knowledge base construction when a practical, large-scale application is considered. A large knowledge base of domain-dependent semantic and phrase patterns is needed, and manual encoding of such semantic patterns is a major obstacle for a real-world application. To overcome the scalability problem, an automated acquisition of knowledge should be provided.
This thesis deals with two important issues for practical textual information processing: scalability and speed. For scalability, an automatic knowledge acquisition system for information extraction is developed. For speed, marker-passing based massively parallel pattern matching for parsing is introduced. The efficiency of parallel marker-passing is also demonstrated through a parallel classification algorithm. A semantic pattern knowledge representation for information extraction, suitable for both automated acquisition and parallel processing, is provided. Through experiments with a set of news articles, the feasibility of the representation and the acquisition method is demonstrated: the time to construct semantic patterns is significantly reduced, and the saturation of the knowledge base is clearly shown. This thesis shows that automated semantic pattern acquisition, together with an appropriate representation, can provide scalability to knowledge-based information extraction by overcoming the knowledge engineering bottleneck.

Chapter 1

Introduction

Today is known as the information age, and our society is often called the information society. More and more information is produced and collected every day on various topics and in different forms, such as natural language texts, databases, speech, and graphical data. As the size of the available databanks of information grows, it becomes very difficult to find necessary data in the databanks (information retrieval), and to extract relevant information from data (information extraction) that has a natural form of existence, like text. Therefore, the need for computerized information processing is greater than ever before.

Among the various forms of information, textual information is of primary importance. The number of available on-line databases is rapidly increasing, including books, journals, news articles, reports, instruction manuals, stock quotations, and so on, and the majority of that information is in the form of text. Although texts can be processed more easily than other forms of information, the linguistic complexity of natural language prevents them from being fully understood or interpreted by computer. Natural language texts also have one important characteristic: their semantics are not well represented by surface features. For example, a user of a large text database of news articles may want to find all the news regarding a "bombing" event by terrorists. Searching the database with keywords like "bomb" or "explode" can fail very easily: an irrelevant text like "The foreign debt crisis exploded in Andean countries" could be retrieved, and a relevant text like "An unidentified man hurled dynamites to the truck, and three soldiers died" may not be retrieved. The accuracy of information retrieval and extraction can be improved by using a better representation of content and more intelligent processing. However, more sophisticated representations are very difficult to produce, and the understanding of natural language by machine still has a long way to go before reaching practical applications.
A more intelligent, but at the same time more practical and feasible, method is necessary for efficient processing of textual information. Much research in natural language processing has focused on developing theories that can be applied to difficult linguistic phenomena, or only to certain selected texts. Although this type of research provides interesting new concepts and approaches, in many cases it does not have practical value, a problem frequently and generally observed in Artificial Intelligence research. One of the main difficulties lies in providing the large, robust and complete knowledge base that AI programs (or theories) require in order to achieve reasonable performance in a practical application. Simply speaking, there are too many things to be known a priori; more intelligent tasks require more knowledge. For a textual information system to be practical, the knowledge base required by the system should have a form that can be constructed from available knowledge sources, and the process of knowledge base construction must be automated.

In this thesis, two main issues for textual information processing are addressed: scalability and speed. The goal of this research is not to provide a cognitively plausible model for language processing, or to build a language processing system that can handle several difficult linguistic problems. The major concern of this research is practicality. Scalability is necessary to apply a text processing method to a large real-world application domain. Speed is necessary to handle a large amount of available textual data in a reasonable amount of time. For scalability, a knowledge representation for direct mapping from texts to an information structure is suggested, and a system for automatic acquisition of the necessary knowledge in that form is developed. For speed, marker-passing based massively parallel pattern matching for parsing is introduced. The efficiency of parallel marker-passing is also demonstrated through the parallel classification algorithm.

This thesis work has been done in the context of the MUC (Message Understanding Conference) and the SNAP (Semantic Network Array Processor) project at USC, in which a marker-passing parallel machine has been developed and various AI applications have been studied, including natural language processing. Given the background of this work and the issues it touches, the work described in this thesis can be regarded as an intersection between natural language processing, machine learning, and parallel processing, as shown in Figure 1.1.

Figure 1.1: This thesis work as an intersection of three fields

1.1 Need of Lexical Knowledge Acquisition

In knowledge-based text processing, domain-specific semantic patterns have been widely used for information extraction and retrieval. By using domain-specific semantic patterns, one can achieve fast and efficient text processing by directly mapping a surface linguistic pattern to its meaning, without full syntactic analysis and without applying conversion rules from syntactic structure to semantic interpretation [84] [44] [59] [73]. Although the knowledge-based approach has proven very effective for information extraction and retrieval, especially in specific domains, one significant problem of the approach is that it is necessary to construct a large number of domain-specific semantic patterns.
Manual creation of semantic patterns is very time consuming and error prone, even for a small application domain. To solve the scalability and portability problems, automatic acquisition of semantic patterns must be provided. Previous research on lexical acquisition has focused on the acquisition of the meaning of unknown words [30] [14] [43], the acquisition of collocative information using statistics [21] [94], and the acquisition of surface semantics [100] [79] [46] [86]. The goal of the acquisition of word meaning is to improve a parser by providing flexibility or a self-extending capability, and is different from the goal addressed in this thesis. The statistical approach is efficient and can be easily implemented, but the collocative knowledge acquired by a statistical method usually does not provide semantics. The acquisition of surface semantics is most closely related to the goal and the approach of knowledge acquisition described in this thesis.

For semantic pattern acquisition, a practical acquisition system, PALKA (Parallel Automatic Linguistic Knowledge Acquisition), has been developed. The major goal of this system is to facilitate the construction of a large knowledge base of semantic patterns. Our approach has many features in common with the above-mentioned work on the acquisition of surface semantics, but has unique features: 1) the acquisition is performed as a feedback to the parser, 2) knowledge sources that are either available on-line or easily constructed for specific domains are used, 3) a pair of a meaning frame and a full phrase pattern is used to represent semantic knowledge, and 4) the acquired patterns are generalized through induction.

PALKA acquires semantic patterns from a set of domain-specific sample texts and their desired output representations. The acquisition system performs as a feedback to the parser: when parsing fails due to the lack of an appropriate semantic pattern, the acquisition system constructs a new pattern. When constructing a new semantic pattern, a surface phrasal pattern is acquired from a sample text, and the semantic information is acquired from a corresponding output representation. The acquired patterns are further tuned through a series of generalizations of the semantic constraints of each element in the phrasal pattern. An inductive learning mechanism [70] [68] is applied to the generalization steps.

1.2 Organization of Dissertation

This research addresses the issues of how to acquire linguistic knowledge from real text examples and how to construct a consistent knowledge base of semantic patterns in a systematic way. It provides a new methodology of knowledge acquisition for practical natural language applications, and a way of maintaining the consistency of the hierarchical knowledge base. This thesis is organized as follows:

Chapter 1  In this introductory chapter, the brief background and motivation of this research, and the core points of this thesis, are introduced.

Chapter 2  Reviews of previous research in the three closely related fields (parallel knowledge processing, information extraction from texts, and lexical knowledge acquisition) are given.

Chapter 3  The knowledge representation and the basic approach to lexical acquisition are described. The acquisition procedure of PALKA is presented in detail with examples.
Chapter 4  The determination of semantic constraints is described as a generalization problem. Two generalization algorithms and their implementation on the marker-passing machine are presented.

Chapter 5  The classification of a new piece of knowledge in a frame-based system is presented. Implementation as a marker-passing algorithm is also presented, and the performance of the algorithm is discussed.

Chapter 6  Several experimental results are discussed. The experiment environments are explained, and the performance of the acquisition system and several algorithms is discussed.

Chapter 7  In this last chapter, the contributions of this thesis are discussed. Also, several problems in the current implementation are analyzed, and future research directions are suggested.

Chapter 2

Background

The core of this research resides at the intersection of parallel processing, natural language processing, and lexical knowledge acquisition. More precisely, it includes parallel marker passing on semantic networks, information extraction from natural language texts, and acquisition of semantic patterns from texts. In this chapter, brief introductions to these subjects are presented as background for this research. First, a knowledge representation and inferencing model appropriate for parallel processing is explained. Second, the information extraction task is discussed with a description of the Message Understanding Conference (MUC). Third, research related to lexical acquisition is discussed.

2.1 Parallel Knowledge Processing

For a computer system to show intelligent behavior, the system needs to know a large number of facts, the objects which those facts concern, and the relationships between those objects. It should have knowledge about the world. To provide such knowledge to a computer system, the two main issues to be considered are: 1) how to represent knowledge, and 2) how to perform inference using knowledge. In this section, semantic networks and marker passing are presented as a representation and an inference mechanism suitable for parallel processing.

Figure 2.1: A simple semantic network

2.1.1 Semantic networks, frames and concept hierarchy

Semantic network representation was first introduced to the AI community as a semantic memory by Quillian in 1968 [81]. It is a paradigm for structured object representation. In a semantic network, the knowledge about different objects and their relationships is encoded in a directed graph. A node in the graph represents an object (frequently referred to as a concept), and a link shows a relation between nodes. Figure 2.1 shows a simple example of a semantic network. It contains the concepts mammal, bird, elephant, and so on, and facts such as a mammal has legs and a bird has wings. The hierarchical relations are represented by the isa links between concepts. The semantic network therefore gives us a static representation of some knowledge of the world, and of the concepts used to describe that knowledge.

Semantic network representation has its origin in efforts to build systems which understand natural language, and is motivated by a desire to correspond to human understanding. The goal of the semantic network is to provide the relationships between objects described in the sentences of the story under consideration. Since much of the information about these relationships is implicit, the knowledge in semantic networks provides common sense knowledge for various inferences regarding the meaning of a sentence.
Frames, first suggested by Marvin Minsky [69], represent another form of structured object, which can be seen as a development from semantic networks. In semantic networks, nodes are not structured, so a large network becomes very complicated and difficult to understand. Frames differentiate various types of nodes according to the relation types they have, and attach that information within the node. For example, the fact that "John gave Mary a red car" and the knowledge about a car can be represented by using the following three frames:

    give-event#1
      isa: give-event
      giver: John
      recipient: Mary
      object: car#1

    car#1
      isa: car
      color: red

    car
      isa: vehicle
      has: engine

In the above representation, 'car' is a class or concept representing a general object description, and 'car#1' is an instance representing a specific object. Each object has a number of relations, which are called slots of the frame. These slots are filled by values that are called fillers of the slots. A filler can be another frame or a single value like an integer. Because of these features, the frame-based representation is often called slot-and-filler representation.

One important feature of the frame representation is that objects (classes) can be organized according to their generality by using isa relations, and the properties of a more general object (class) are inherited by a more specific object (class). For example, if a 'vehicle' has a slot 'made-in', then so does a 'car', which is a more specific concept than 'vehicle'. A value can be inherited too, such that a 'vehicle' has an engine and so does a 'car'. The isa relation and inheritance provide two nice properties. First, the knowledge can be organized in a consistent way: the most general object is placed at the top, and more specific concepts are placed under more general concepts by adding more properties. Second, the representation is very economic, since information is specified only once, at the highest-level object, and is applied to more specific objects by inheritance.

When we show only the isa relations between objects, they form an inheritance hierarchy, which is also called a class hierarchy or a concept hierarchy. The isa relation is also called the subsumption relation, and the nodes at the higher level are called the superconcepts or the subsumers of the nodes at the lower level. The nodes at the lower level are called the subconcepts or the subsumees of the nodes at the higher level.
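The slot-and-filler machinery and isa inheritance can be made concrete in a few lines of code. The following is a minimal sketch, assuming a static node layout and string-valued fillers; the type names, field names, and lookup routine are invented for illustration and are not the representation used by SNAP or PALKA.

    /* A minimal slot-and-filler sketch (illustrative only).
       Each concept node carries an isa pointer and a small slot table;
       inherit() walks the isa chain until a slot is found. */
    #include <stdio.h>
    #include <string.h>

    #define MAX_SLOTS 4

    struct slot { const char *name; const char *filler; };

    struct concept {
        const char *name;
        struct concept *isa;           /* superconcept, NULL at the root */
        struct slot slots[MAX_SLOTS];
        int nslots;
    };

    /* Look up a slot locally, then follow isa links upward (inheritance). */
    const char *inherit(const struct concept *c, const char *slot_name)
    {
        for (; c != NULL; c = c->isa) {
            for (int i = 0; i < c->nslots; i++)
                if (strcmp(c->slots[i].name, slot_name) == 0)
                    return c->slots[i].filler;
        }
        return NULL;
    }

    int main(void)
    {
        struct concept vehicle = { "vehicle", NULL, { { "has", "engine" } }, 1 };
        struct concept car     = { "car", &vehicle, { { 0 } }, 0 };
        struct concept car1    = { "car#1", &car, { { "color", "red" } }, 1 };

        /* car#1 inherits "has: engine" from vehicle through two isa links. */
        printf("car#1 has: %s\n", inherit(&car1, "has"));
        return 0;
    }

Running it prints "car#1 has: engine": the value is specified once on 'vehicle' and recovered by chasing isa links, which is exactly the economy argued for above.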
The semantic networks, frames and the concept hierarchy form the basis of the knowledge base structure for language processing. In the following sections, a parallel paradigm for processing knowledge within these representations, and an approach to natural language understanding based on that paradigm, are presented.

2.1.2 Marker passing for knowledge processing

Marker passing was first introduced by Quillian [81] as spreading activation on a semantic memory. It is an inference mechanism developed originally to find connections (paths) between concepts in semantic networks in order to distinguish different word meanings. The mechanism works as follows. First, the nodes corresponding to the input objects are marked by placing tags (or any other formatted data). Second, the tags are propagated to the neighboring nodes through connected links. Third, the nodes that are marked by tags from two different origins are collected, and form a path between the two nodes. The path information is used for inference by inspecting the relationship between the objects under concern. This approach to word sense disambiguation was used for natural language understanding in Quillian's Teachable Language Comprehender [82].
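The path-finding step itself is small enough to sketch in code. The two-origin intersection search below is a minimal illustration under simplifying assumptions (an unlabeled adjacency matrix, synchronous waves, and detection of the first collision node only); it is not the SNAP implementation.

    /* Two-origin marker propagation on a small graph (illustrative only).
       Marker A spreads from one origin and marker B from another; a node
       holding both markers lies on a path between the two origins. */
    #include <stdio.h>

    #define N 6

    static const int adj[N][N] = {   /* a chain 0-1-2-3-4-5 */
        {0,1,0,0,0,0}, {1,0,1,0,0,0}, {0,1,0,1,0,0},
        {0,0,1,0,1,0}, {0,0,0,1,0,1}, {0,0,0,0,1,0},
    };

    /* One propagation wave: every node carrying `mark` passes it on. */
    static void spread(int mark[N])
    {
        int next[N] = {0};
        for (int u = 0; u < N; u++)
            if (mark[u])
                for (int v = 0; v < N; v++)
                    if (adj[u][v]) next[v] = 1;
        for (int u = 0; u < N; u++) mark[u] |= next[u];
    }

    int main(void)
    {
        int a[N] = {0}, b[N] = {0};
        a[0] = 1;   /* origin of marker A */
        b[5] = 1;   /* origin of marker B */

        /* Alternate waves until some node holds both markers: a collision. */
        for (int wave = 0; wave < N; wave++) {
            spread(a); spread(b);
            for (int u = 0; u < N; u++)
                if (a[u] && b[u]) {
                    printf("collision at node %d after wave %d\n", u, wave + 1);
                    return 0;
                }
        }
        return 0;
    }

Collecting the marked nodes on each side of the collision then reconstructs the path between the two input concepts.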
W hen a sentence is input, the linguistic p attern in the input sentence is m atched against stored tem plates in the knowledge base by a series of m arker-passing 10 Concept Sequence (Frame) Concept hierarchy CSR, isa isa slot3 isa CSE constraint (a) The knowledge base structure for memory-based parsing Input phrase pattern: [A][B][C][D] <Prediction> P <Movement of prediction> <Activation and AP collision> ♦ • • / " \ \ s m n h n^ 9» <AP collision at the last CSE> E : slo tl: A slot2: C slot3: D <Interpretation> (b) The prediction-activation parsing mechanism Figure 2.2: The m em ory-based parsing m echanism 11 commands. Once a tem plate is recognized, an interpretation for th e sentence is generated, and stored in th e sem antic network knowledge base. Figure 2.2 shows how a linguistic knowledge is represented as a concept se quence p attern , and how an input tex t is interpreted by m em ory search. As shown in Figure 2.2-(a), the knowledge base consists of a concept hierarchy and a set of concept sequences. Surface linguistic patterns are stored as concept sequences which are represented as frames. A concept sequence is represented as a cluster of a concept sequence root (CSR) for representing the m eaning of a phrase, and a set of concept sequence elem ents (CSEs) w ith specific ordering constraints. Each CSE is linked to one or m ore concepts in th e concept hierarchy to specify condi tions for m atching. It is called a constraint of th at elem ent. A concept sequence m aps a surface linguistic p attern to its m eaning representation by m atching an input sentence to the CSE pattern. T he basic parsing m echanism is shown in Figure 2.2-(b). T he parsing algo rith m relies on m arking concepts w ith prediction (P) and activation (A) m arkers, and moving prediction m arkers when a prediction m arker collides w ith an ac tivation m arker. At first, all concept sequence roots are expected as potential hypotheses, and accordingly, th eir first concept sequence elem ents and all sub sum ed concepts in the knowledge base are also expected to be hypotheses as well. W hen an input word is read, a bottom -up activation is propagated up to the corresponding subsum ing nodes in th e knowledge base. Even though m any nodes are initially expected, only a few receive activations as m ore input words are processed. Therefore, candidate hypotheses are quickly narrow ed down to correct ones. Once a whole concept sequence is recognized, an interpretation for th e input sentence is generated and stored in the knowledge base. T he m em ory-based parsing approach is suitable to parallel processing, since parsing becomes a p attern m atching by m em ory search. By im plem enting memory- based parsing on parallel machines, one can achieve large speed im provem ents, which is essential for certain applications like speech processing or bulk tex t pro cessing th a t requires fast processing. In th e next section, the SNAP architecture is presented as an im plem enta tion of above m entioned knowledge representation (sem antic netw ork), knowledge 12 processing paradigm (marker-passing), and th eir application to tex t processing (m em ory-based parsing). 2 .1 .4 T h e S N A P p ro ject Since the first com putational m odel of the spreading activation mechanism was introduced by M. Quillian [81], various models of m arker passing have been stud ied and developed for knowledge processing [15] [27] [37] [106]. SNAP (Sem antic Network A rray Processor) [71] [72] is one of these models. 
It is a highly paral lel, m arker propagation architecture targeted to AI applications, specifically to n atural language processing. For th e perform ance evaluation and the testing of the original SNAP design a prototype m achine had been built and used for various experim ents. T he SNAP- 1 prototype consists of a processor array th a t perform s th e actual com putation, and a controller which interfaces the array processors w ith the host and controls th e operation of the array. T he processor array consists of 144 Texas Instrum ent TMS320C30 DSP chips. Each TMS320C30 manages 256 sem antic network nodes and has access to 64K nodes of sem antic network. The processing elem ents are interconnected via a modified hypercube network. T he array controller interfaces the processor array w ith a SUN3/280 host, and broadcasts instructions to control the operation of the array. The instructions for th e array are distributed to th e array through a global bus by the controller. Instructions are provided in SIMD fashion, but each processor in the array executes instructions asynchronously and the m arker propagation is perform ed as a background process in MIMD fashion. Therefore, th e propagation of m arkers and th e execution of other instructions can be processed simultaneously. Figure 2.3 shows the architecture of the SNAP-1 prototype. A set of high-level instructions specific to knowledge processing is im plem ented directly in hardw are, including associative search, m arker setting, m arker prop agation, retrieval and logical/arithm etic operations on m arkers. T he instruction set can be called from C language so th a t users can develop applications w ith an extended version of C language. From th e program m ing level, SNAP provides data-parallel program m ing environm ent sim ilar to C* of th e Connection M achine, 13 Host Computer Hardware Environment Host Software Environment Program development using SNAP instruction set Physical Design SUN 4/280 SNAP-1 Controller SNAP-1 Array VME Bus Controller Compiled SNAP code Custom Backplane 144 Processor Array Program Control — Processor Sequence Control Processor Knowledge base SNAP instruction execution . Qj m.2t bazs .bs .9i s L.. Eight 9U-size boards Four clusters per board our to five processors per cluster Figure 2.3: T he SNAP-1 prototype [25] b u t specialized for sem antic network processing w ith m arker passing. T he SNAP-1 instructions are listed in Figure 2.4. There are several m arker propagation rules th a t govern how m arkers are passed. M arkers are passed from one node to others via relations, and travel through sev eral relation types at the same tim e. M ultiple m arkers can be propagated through th e network simultaneously. T he propagation rules have th e form at rule, relation 1, and relation 2, where relation 1 and relation 2 are th e relations affected by rule. T he list of propagation rules used are as follows: • Seq(rl,r2) : T he Seq (sequence) propagation rule allows th e m arker to prop agate through r l once then through r2 once. • Spread(rl,r2) : T he Spread propagation rule allows th e m arker to travel through a chain of r l links and then r2 links. • C om b(rl,r2) : T he Comb (combine) propagation rule allows th e m arker to propagate through all r l and r2 links w ithout lim itation. 
Figure 2.4: The instruction set of SNAP-1 [73]

    Instruction        Arguments                                    Action
    CREATE             s-node, <relation>, <weight>, e-node         Creates a new <relation> with <weight> between s-node and e-node
    DELETE             s-node, <relation>, e-node                   Deletes <relation> between s-node and e-node
    MARKER-CREATE      marker, <s-relation>, e-node, <e-relation>   Creates a new <s-relation> between nodes with marker and e-node;
                                                                    <e-relation> is the relation type for e-node
    MARKER-DELETE      marker, <s-relation>, e-node, <e-relation>   Opposite of MARKER-CREATE
    TEST               marker-1, <marker-2>, <value>, <cond>        All nodes with marker-1 set marker-2 if the marker-1 <value> meets <cond>
    AND                marker-1, marker-2, <marker-3>, <func>       All nodes with marker-1 and marker-2 set <marker-3>;
                                                                    marker values are handled by <func>
    OR                 marker-1, marker-2, <marker-3>, <func>       All nodes with either marker-1 or marker-2 set <marker-3>
    NOT                marker-1, <marker-2>                         All nodes with marker-1 invert <marker-2>
    SEARCH             node, <marker>, <value>                      Sets <marker> with <value> in node
    MARKER             marker-1, <marker-2>, <rule>, <func>         All nodes with marker-1 begin propagating <marker-2> with <rule> and <func>
    SET-MARKER-VALUE   marker, <value>                              All nodes with marker set <value>
    CLEAR-MARKER       marker                                       Clears marker in all nodes
    FUNC-MARKER        marker, <func>                               Assigns a node function for dealing with duplicate markers
                                                                    in all nodes with marker
    COLLECT-MARKER     marker                                       Gets results from all nodes with marker
    COLLECT-RELATION   marker, relation                             Gets relation information from all nodes with marker

2.2 Information Extraction from Texts

The rapid increase in available on-line information sources has made information extraction and retrieval more and more important. Since most of the on-line sources are in the form of text, research in information extraction and retrieval has focused on developing text analysis techniques. In this section, the task of MUC-4, a conference that evaluates natural language processing systems, is described as a typical information processing task. Our group at USC participated in the conference for two years. The information extraction system developed by our group is described briefly, and the knowledge engineering bottleneck is discussed as one of the motivations of this thesis.

2.2.1 The Message Understanding Conference

The Message Understanding Conference (MUC), sponsored by DARPA, is one of the latest series of conferences that concern the evaluation of natural language processing systems. The overall objective of the evaluations is to advance our understanding of current text analysis techniques, as applied to the performance of an information extraction task.

One should note that although information extraction requires understanding texts in part, it is different from producing an in-depth representation of the content of a complete text. The formats of inputs and outputs are predefined for a certain application domain. In MUC, the inputs are newswire texts in electronic form, and the outputs of the extraction process are a set of templates or semantic frames, which form a partially formatted database. An example text and the corresponding template from MUC-4 are shown in Figure 2.5.
TST1-MUC3-0073

SANTIAGO, 29 DEC 89 (EFE) - [TEXT] POLICE TODAY REPORTED THAT DURING THE LAST FEW HOURS TERRORISTS STAGED THREE BOMB ATTACKS AGAINST U.S. PROPERTIES.

AT 0115 THIS MORNING (0415 GMT) INCENDIARY BOMBS WERE HURLED AT A MORMON TEMPLE AT NUNOA DISTRICT, IN SANTIAGO. THE BOMBS CAUSED MINOR DAMAGE. AT THE TIME OF THE ATTACK THE BUILDING WAS EMPTY, ACCORDING TO THE SOURCES. THE ATTACKERS PAINTED THE WALLS AND LEFT PAMPHLETS WITH ULTRA-LEFTIST MESSAGES OF THE LAUTARO YOUTH FRONT.

THE MANUEL RODRIGUEZ PATRIOTIC FRONT (FPMR), WHICH THE PINOCHET REGIME CONSIDERS TO BE THE COMMUNIST PARTY'S ARMED BRANCH, ANNOUNCED FOLLOWING THE U.S. INVASION OF PANAMA THAT IT WOULD ATTACK "U.S. INTERESTS IN CHILE."

(a) The sample text

    0.  MESSAGE: ID                     TST1-MUC3-0073
    1.  MESSAGE: TEMPLATE               1
    2.  INCIDENT: DATE                  29 DEC 89
    3.  INCIDENT: LOCATION              CHILE: SANTIAGO (CITY): NUNOA (DISTRICT)
    4.  INCIDENT: TYPE                  BOMBING
    5.  INCIDENT: STAGE OF EXECUTION    ACCOMPLISHED
    6.  INCIDENT: INSTRUMENT ID         INCENDIARY BOMBS
    7.  INCIDENT: INSTRUMENT TYPE       BOMB
    8.  PERP: INCIDENT CATEGORY         TERRORIST ACT
    9.  PERP: INDIVIDUAL ID             -
    10. PERP: ORGANIZATION ID           "MANUEL RODRIGUEZ PATRIOTIC FRONT"
    11. PERP: ORGANIZATION CONFIDENCE   POSSIBLE
    12. PHYS TGT: ID                    "MORMON TEMPLE"
    13. PHYS TGT: TYPE                  OTHER: "MORMON TEMPLE"
    14. PHYS TGT: NUMBER                1
    15. PHYS TGT: FOREIGN NATION        UNITED STATES: "MORMON TEMPLE"
    16. PHYS TGT: EFFECT OF INCIDENT    SOME DAMAGE: "MORMON TEMPLE"
    17. PHYS TGT: TOTAL NUMBER          -
    18. HUM TGT: NAME                   -

(b) The sample template

Figure 2.5: The example text and template

In developing information extraction systems, each participating site was given a training corpus of 1,500 texts, along with associated answer keys (instantiated templates). The domain of the MUC-4 texts is terrorist incidents in Latin America, and there is a small number of incident types, such as bombing and killing. The task was to extract information about terrorist events from new texts, and to fill out a template for each event mentioned in a text. The MUC-5 domain is joint-venture events between companies, and the templates to be filled are in the form of an object-oriented representation.

The evaluation of each system was based on two principal measures: recall and precision. Recall is the number of correct answers produced by the system divided by the number of total possible correct answers; it measures how comprehensive the system is. Precision is the number of correct answers produced by the system divided by the number of all answers provided by the system; it measures the accuracy of the system. For example, if there are 100 possible answers and the system provided 60 answers of which 30 were correct, then the recall is 30% and the precision is 50%. Since it is difficult to compare systems based on two different measures, a combined measure called the F-measure was also used. It is defined as follows:

    F = ((β² + 1.0) · P · R) / (β² · P + R)

where P is precision, R is recall, and β is a parameter specifying the relative importance of recall and precision. If β = 1.0, they are weighted equally; with this formula, β > 1.0 weights recall more heavily, and β < 1.0 weights precision more heavily. In the MUC-4 evaluation, the average scores of the 17 participants were 35% recall and 33% precision.
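These definitions translate directly into a few lines of code. The sketch below recomputes the worked example above; the counts are the hypothetical ones from the text, and this is of course not the MUC scoring software.

    /* Recall, precision and F-measure for the worked example above. */
    #include <stdio.h>

    static double f_measure(double beta, double p, double r)
    {
        return (beta * beta + 1.0) * p * r / (beta * beta * p + r);
    }

    int main(void)
    {
        double possible = 100.0, answered = 60.0, correct = 30.0;
        double recall    = correct / possible;   /* 0.30 */
        double precision = correct / answered;   /* 0.50 */

        printf("R=%.2f P=%.2f F(beta=1)=%.3f\n",
               recall, precision, f_measure(1.0, precision, recall));
        return 0;
    }

With β = 1.0 the example yields F = 0.375, between the two component scores, as the harmonic-style combination intends.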
2.2.2 Overview of the SNAP system for MUC-4

The information extraction system on SNAP was developed as part of the three-year SNAP project funded by NSF. As described in the previous sections, a massively parallel computer was built for natural language processing, and marker-passing on semantic networks, which exhibits massive parallelism, was selected as an appropriate paradigm for language processing. The marker-passing approach was directly applied to build the information extraction system for MUC-4. Figure 2.6 shows the SNAP system architecture: it consists of the preprocessor, the parser, the template generator and the knowledge base. In this section, the SNAP system developed at USC for MUC-4 is described briefly.

Figure 2.6: SNAP information extraction system

Knowledge base

The knowledge base is distributed over the SNAP processor array. Parsing and template filling are performed within the knowledge base by propagating markers through the network. The knowledge base consists of concept sequences encoding linguistic knowledge, and a concept hierarchy encoding domain knowledge. Parsing is performed by matching a surface linguistic pattern in the input text against the concept sequence patterns stored in the knowledge base. A concept sequence consists of a concept sequence root (CSR) and a set of concept sequence elements (CSEs) with specific ordering connections. Each CSE has syntactic and semantic constraints specifying its acceptance condition.

The domain knowledge is represented in a concept hierarchy with isa relations and various property links between concepts. The concept hierarchy contains concepts related to the terrorist domain, such as the different event types, target types and instrument types. It also contains knowledge about the different locations in South America and the various terrorist organization names. The semantic constraints of basic concept sequence elements are mapped to one or more concepts in the concept hierarchy. Figure 2.7 illustrates the layered organization of the knowledge base and how surface linguistic patterns are mapped to a concept sequence through the syntactic and semantic constraints, using the basic concept sequence BCS: [agent, ACCUSE, beneficiary, OF, object].

Figure 2.7: An example of a concept sequence structure
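The layered structure of Figure 2.7 can also be pictured as data. Below, the ACCUSE sequence is written out as a constant table in C; the type and field names are invented for this sketch, the constraint values are plausible placeholders rather than the exact network contents, and the real SNAP knowledge base encodes the same information as nodes and links rather than structs.

    /* The BCS [agent, ACCUSE, beneficiary, OF, object] as a table
       (illustrative; SNAP stores this as network nodes and links). */
    #include <stdio.h>

    struct cse {
        const char *role;       /* position in the concept sequence   */
        const char *syntax;     /* syntactic constraint (phrase type) */
        const char *semantics;  /* semantic constraint (concept)      */
    };

    static const struct cse accuse_bcs[] = {
        { "agent",       "NP",          "animate"      },
        { "predicate",   "Verb",        "accuse-event" },
        { "beneficiary", "NP",          "animate"      },
        { "of",          "Preposition", "of"           },
        { "object",      "NP",          "event"        },
    };

    int main(void)
    {
        for (int i = 0; i < 5; i++)
            printf("%-11s %-11s %s\n", accuse_bcs[i].role,
                   accuse_bcs[i].syntax, accuse_bcs[i].semantics);
        return 0;
    }

A phrase such as "SALVADORAN PRESIDENT-ELECT" satisfies the first element by matching NP syntactically and by activating a concept (government-official) subsumed by the semantic constraint.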
A nother 1000 nodes are used for dom ain-phrases including organization names, and th e rest are used for the 20 geographical concepts containing all th e names of countries, cities and regional nam es in South America. P reprocessor and diction ary T he preprocessor reads the input texts, perform s dictionary look-up, and produces a list of dictionary entries for each word. It also recognizes dom ain specific noun phrases and proper nouns, and concatenate several words such as “PR E SID E N T E L E C T ” as one dictionary entry PRESIDENT-ELECT. T he detection of message headers, end of sentences, and paragraph boundaries, is also th e job of the pre processor. For each word, one or more dictionary entries are provided. Each dictionary entry contains the part of speech, the root form of th e word, num ber inform ation for nouns, tense inform ation for verbs and a sem antic definition of the word. T he parser uses th e sem antic definition to m ark nodes in the knowledge base and propagate m arkers from them . An exam ple of th e preprocessor outp u t for “PR E SID E N T ELEC T ALFRED O CRISTIA N I ACCUSED FM LN” is: PRESIDENT.ELECT 1 common-noun PRESIDENT.ELECT sg GOVERNMENT-OFFICIAL ALFRED O.CRISTIANI 1 proper-noun ALFREDO.CRISTIANI sg HUMAN ACCUSED 1 verb ACCUSE ppl FMLN 2 proper-noun FMLN sg FMLN proper-noun FMLN sg ORGANIZATION Parser T he parser processes each sentence in two stages: phrasal parsing and m emory- based, parsing. T he phrasal parser takes a sequence of dictionary definitions of words generated by the preprocessor, and groups th e words into phrasal segments 21 Basic Concept Sequence [agent, ATTACK, object] attack-event »<^redicatel2^> ►<objectl239^ agentl238 syn\const sem./const. "Verb const sem. const animate human) j instance f instance instance \ j instance attack#3 guerrila#0 np_adjective .------4 urban#! attack-event#7 syn ^const. sem. (const., buildin weapon instance i j instance ^home#4^)np_noun instance pp_modifier *Mwith#8 marino#6 pp_object ^xplosive#9^ Basic Concept Sequence Instance interpretation Figure 2.8: T he parsing result exam ple such as noun-group, verb-group, preposition, punctuation, etc. T he result se quence of phrasal segments drives the m em ory-based parser to get the m eaning interpretation. M ost words can have m ultiple syntactic categories and sem antic m eanings. To resolve th e lexical am biguity as much as possible, about 30 disam biguation rules based on the syntactic inform ation of neighboring words were used. Most of the phrasal segments are correctly assigned by these rules. In case am biguities can not be resolved w ith th e rules, the phrasal parser generates m ultiple segment definitions, and th e m em ory-based parser perform s the disam biguation. G eneration of m eaning representation is done by m em ory-based parser by us ing the concept sequence patterns in th e knowledge base. T he parsing algorithm 22 is based on repeated application of expectations, activations, and verifications. A prediction m arker (P-m arker) identifies nodes th a t are expected to occur next. An activation m arker (A-m arker) identifies the node th a t actually occurs. An expected node in th e CSE is verified only if it receives A-m arkers through both syntactic and sem antic constraint links. Only verified nodes perform actions spe cific to the node type. In the beginning of parsing, all concept sequence roots are expected as pos sible hypotheses, and accordingly their first concept sequence elem ents are also expected. 
W hen a phrasal input is read, a concept instance is created and a bottom -up activation is propagated from th e syntactic and sem antic definitions of th e phrase through i s a links. If the last CSE of a concept sequence is veri fied, th e CSR of th e concept sequence is verified, and the corresponding concept sequence instance (CSI) is constructed as a m eaning of the input sentence. For complex sentences, several CSR’s are verified and corresponding m ultiple CSI’s are com bined dynam ically to generate one CSI. Figure 2.8 shows th e result of parsing the sentence “URBAN GUERRILLAS ATTACKED M ER IN O ’S HOM E W ITH EX PLO SIV ES.” T e m p la te g e n e r a to r T he parser generates one or m ore concept sequence instances (C SI’s) for each sentence. These CSI’s are collected and processed by the tem plate generator. One tem plate is generated for each relevant CSI, and the split and m erge of tem plates are perform ed a t th e end. T he tem plate generator im plem ents complex slot filling rules which are based on th e occurrence of key verbs and nouns. In th e current system , only 118 rules are present. Much m ore inform ation can be extracted from th e parser o u tp u t by using m ore rules. Some of the rules im plem ented for BOMBING event are: R u le 1 : IF [there is a <bom bing-event> th a t is not the object of a <denial- event> or a < stop-event>] AND [the tense is not future or modal] TH EN [fill slot 4 (incident type) w ith “BO M BIN G ”] R u le 1.1 : IF [the object of <bom bing-event> is not a < hum an> ] T H E N [fill slot 12 (physical target id) w ith th e object of the <bom bing-event>] 23 R u le 1.2 : IF [the agent of th e <bom bing-event> is an <organization>] TH EN [fill slot 10 (perp organization) w ith th e agent of the <bom bing-event>] 2 .2 .3 D isc u ssio n on th e ta sk The inform ation extraction task on a lim ited dom ain is quite different from a gen eral n atural language understanding task. For n atu ral language understanding, one m ust be able to process full com plexity of language, and produce a target representation th a t presents all th e m eanings (including th e im plicit meanings like common sense) of the input sentence. A classical approach is: F irst, a syn tactic analysis m odule parses th e input into a syntactic parse structure by using a syntactic gram m ar. T hen a syntax to sem antics m apping m odule converts the syntactic structure to a m eaning representation by using a set of sem antic inter p retation rules. To produce a com plete m eaning representation, other processes like reference resolution are necessary. It is well recognized th a t for inform ation extraction or retrieval in a lim ited dom ain, a full syntactic analysis or a comprehensive sem antic in terpretation is not necessary. In these tasks, there is only a small num ber of event categories to which each tex t can be m apped. Various expressions need to be m apped into one event category. Also, only a small am ount of pre-defined types of inform ation is to be extracted. Therefore, even for a relevant sentence, only several term s th a t carry relevant inform ation are need to be interpreted. These observations are the basis for th e representation of patterns presented in later chapters and th e p attern m atching based parsing m echanism described in an earlier section of this chapter. 
2.3 Acquisition of Linguistic Knowledge

Due to the complexity of the human effort needed to build a complete lexical knowledge base for natural language processing, and the need for portability and scalability in practical language processing systems, there has been much research in the field of language learning. The following is a brief survey of previous research in this field. According to their various goals and approaches, they can be categorized as: 1) expanding the lexicon through interaction with a human expert [82] [103] [66] [5] [31] [38], 2) learning word meanings [30] [91] [89], 3) learning grammar [93] [2] [9], 4) acquisition of statistical information on collocations [21] [22] [94], and 5) acquisition of semantic patterns [109] [110] [100] [79] [86].

In this section, these previous approaches to linguistic knowledge acquisition are briefly reviewed, and some of them are compared to the goal and approach presented in this thesis.

2.3.1 Expanding the lexicon through human interaction

One of the original works on language understanding and language learning can be found in Quillian's work [82]. In the late 60's, Quillian developed an experimental program for natural language understanding, called the Teachable Language Comprehender (TLC). One of the main goals of this program was to be teachable: it was designed to be capable of being taught to understand English text. When a text which the program has not seen before is given to it, the program comprehends the text by relating each assertion to the semantic network. To facilitate the acquisition of new information, it was designed to work in close interaction with a human monitor. The monitor watches the program's attempt to understand the text, approves or disapproves of each step, and provides factual information as it is required. The information that the monitor provides is generalized so that it can be applied to understanding many other sentences in the future, and is retained permanently in memory.

The original plan of this research was to start with several children's books dealing with similar subjects and let the program read all the sentences under the supervision of the monitor. The hope was that, over a long period of reading, the program might no longer need the help of a human monitor to understand texts in a given subject area. But the program dealt with only a limited set of problems, and no serious attempt was made to actually build a large knowledge base and test it on large corpora.

Another similar approach to the acquisition of linguistic knowledge by teaching can be seen in the natural language interface UC, developed by Wilensky et al. [17] [103]. UC is a natural language consultant system which advises users in using the UNIX operating system. The system is composed of several components, including a parser, a generator, a planner, a highly extensible knowledge base, and a system for the acquisition of new knowledge through instructions in English.

The knowledge about language is represented as pattern-concept pairs. A phrasal pattern is a description of a sentence that can sit at many different levels of abstraction. It may be just a literal string; it may be a pattern with some flexibility, such as "(nationality) restaurant" or "(person) kick the bucket"; it may be a very general phrase such as "(person) (give) (person) (object)"; or it may be a productive syntactic pattern like "(sentence) and (sentence)".
Associated with each phrasal pattern is a conceptual template. It is a piece of meaning representation with possible references to the phrasal pattern. Each pattern-concept template encodes one piece of knowledge about the semantics of the language.

One of the primary goals of this system was to make it relatively straightforward to extend the system. They made the knowledge base about the language declarative, so that the system has a clean separation of knowledge about language meaning from language processing strategies. By doing this, new pieces of knowledge can be added easily without reprogramming. The knowledge base of pattern-concept pairs can be extended by telling the system facts about the language in English.

However, the ability of the system to acquire new knowledge about the language is highly restricted. The user must talk (give sentences) to the system in a pre-defined form, and the system does not have any strategies to extract patterns from general sentences or to build more appropriate patterns through generalization with other patterns in the knowledge base.

There are systems that extend the lexicon by asking questions of the user. The VOX system [66] asks the user questions about new words in order to extend its vocabulary. The user provides the syntactic information for the word, and also specifies slots of a frame that should be filled with the roles present in the sentence. It adds new words to the lexicon, together with uses for the words and their meanings. The LDC system [5] also asks questions about new words. The user enters information about the syntactic categories of the words, the semantic class in a semantic hierarchy, the verbs associated with the word, the phrases used, and the database query access associated with the phrases. The system provides lexical acquisition and semantic grammar extension, and assumes a simple mapping from words and phrases to concepts.

There are also portable systems which try to facilitate the construction of a natural language processing system for a new domain [31] [38]. The basic approach of these systems is to provide syntactic grammar rules, a parser, and software tools to make it easy to move to a new domain. For each new domain, the basic syntactic structures are the same, and the concepts and databases are newly developed. The user enters new words, and the system asks many questions regarding the usage of the words. These systems basically provide knowledge engineering tools to create a natural language interface. However, the answers to the system's questions usually require a linguistic background, and the process of creating new semantic interpretation rules is still a time-consuming job.

The goals of the above-mentioned systems are different from the goal of the acquisition in this thesis. The approaches are also different since, basically, the system described here is not a user-interactive knowledge acquisition tool. Although some parts of the system introduce user interaction according to the availability of knowledge sources, the primary knowledge source is not a human, but texts and databases.

2.3.2 Learning word meanings

There are systems that are more Machine Learning oriented, in contrast to the knowledge engineering approaches of the previous section, which ask questions to acquire linguistic knowledge.
These systems try to acquire the meanings of new words from other knowledge sources such as scripts, word-concept pairs, or changes of world states. Granger [30] used knowledge-based approaches to learn new words. His system, FOUL-UP, inferred the meaning of a word from script expectations (context) [14], and the parser gradually built a set of semantic and syntactic constraints for the unknown word. Learning was invoked by parsing failure, and the meanings were figured out from the currently active scripts.

Selfridge [91] used pairs of words and concepts. His CHILD system modeled the first language acquisition of a child. The system is presented with a concept and a corresponding sentence, and based on the common parts of sentences and their concepts, the concepts are associated with individual words. Learning is accomplished by these associations. Since it is intended as a cognitive model, the learning of complex sentence structures and concepts is not presented.

The approaches described in this section are focused on modeling the learning process, which is different from the approach of the system in this thesis, which emphasizes practicality for real-world applications.

2.3.3 Learning grammar rules

Learning syntactic rules using a semantics-directed approach was tried by Siklossy [93]. His system is provided with pairs of sentences and desired output schemata, and produces a semantic grammar by learning the associations. The output schema is the semantic representation that the parser should produce using the grammar. It learned to parse simple Russian phrases, starting with simple sentences and progressing to more complex sentences.

Anderson [2] presented a learning system based on the association of a sentence with its conceptual-graph meaning representation. His system learns syntactic grammar rules in the form of recursive phrase structures. In his system, a different syntactic structure implies a different conceptual representation. It is one of the cognitive models which deal with human inference and memory access mechanisms, but it is impractical since the number of necessary input pairs is very large.

An explicit computational model of language acquisition has been developed by Berwick [9]. His program inductively learns rules of English syntax, given a sequence of grammatical sentences. The target grammar is the grammar of the deterministic parser by Marcus [65]. His acquisition model is a system which builds a series of parsers of increasing sophistication. The initial state of the system's knowledge contains a basic ability to classify a word's syntactic category, a simple representation of the semantic content of sentences, and the constraints on possible parsing rules. The acquisition procedure is failure-driven. If the parser fails because none of the existing rules apply or known rules fail, then the acquisition module tries to build a single new parsing rule that will fill the gap in the parser's knowledge. If it succeeds, a new rule is added to the parser's database after generalization with the existing rules.

In contrast to other approaches, his system does not require knowledge sources other than the sentences themselves. He also tried to uncover the computational constraints on the acquisition of syntactic knowledge, and compared the learnability and parsability constraints.
2.3.4 Acquisition of collocational information

One of the emerging fields in language processing is the statistical approach. In these approaches, statistical probability information on word collocations is acquired by analyzing a very large corpus. The acquired information can be used for various tasks in lexicography, information retrieval, and language generation.

Church [21] [22] used statistics to compute word associations in large corpora. He used the mutual information statistic as a way to identify a variety of interesting linguistic phenomena, ranging from semantic relations to syntactic co-occurrence preferences between verbs and prepositions. The mutual information, I(x; y), compares the probability of observing word x and word y together (the joint probability) with the probabilities of observing x and y independently. It is defined as:

    I(x; y) = log2 [ P(x, y) / (P(x) P(y)) ]

If there is a strong association between x and y, then I(x; y) >> 0. If there is no such relationship between x and y, then I(x; y) ≈ 0. The word probabilities, P(x) and P(y), are estimated by counting the number of observations of x and y in a corpus and normalizing by the size of the corpus. The joint probability, P(x, y), is estimated by counting the number of times that x is followed by y, and normalizing by the size.

The acquired information can be used for lexicography and information retrieval. As an example, the usage of the similar words strong and powerful can be distinguished by computing the mutual information of those words with others. General English expressions are "strong tea" and "powerful car", not "powerful tea" or "strong car". Similar information can also be detected for relations between verbs and prepositions, etc.

Smadja [94] addressed a similar approach in the automatic extraction of collocational relations to enrich word-based lexicons. His program, Xtract, retrieves lexical relations from the analysis of a large corpus. His work focused on the use of those relations for language generation, where co-occurrence knowledge is important to prevent the generation of awkward or incorrect sentences.

The statistical analysis is efficient and easily implemented. The information on collocations of words is also important and useful, since those relations cannot be identified by using syntax or semantics. However, the collocative knowledge acquired by a statistical method usually does not provide semantics, and therefore does not contribute much to extracting the "meaning" of a sentence.

2.3.5 Acquisition of semantic patterns

The systems and approaches described in this section are most closely related to the work of this thesis. The goal of these systems is the automated acquisition of semantic patterns that are used to map the surface patterns in an input sentence to its meaning.

Zernik's program RINA [109] [110] learns new word patterns and idioms as a phrasal lexicon. It learns new lexical entries from examples in context, through a dialog with a human user. He factored out the aspects of world knowledge acquisition by presenting second language acquisition as his learning model. As a consequence, world knowledge and the information on the context are assumed to be given to the system. When a sentence with an idiomatic expression which is not known to the system is read, it first tries to understand the sentence based on each word's meaning.
If the user disapproves the system's understanding and rephrases the input sentence, the system tries other possibilities, and finally extracts the correct meaning by using the context. After the meaning of the sentence is approved by the user, the system extracts an appropriate phrasal pattern from the sentence by deciding the scope of the pattern and generalizing it.

His program shows good modeling of the cognitive process of an adult's second language acquisition. However, from a practical and computational point of view, his model seems inappropriate for actual application to a practical natural language processing system, since it assumes that all the contextual knowledge is given to the system a priori. To be more practical, the semantic meaning of the sentence should be acquired from existing knowledge or from the user.

A more practical approach to the acquisition of semantics can be found in the work of Velardi et al. [100] [79]. They presented a methodology for the extensive acquisition of a case-based semantic dictionary. Their system analyzes a large sample of sentences including a given word, and produces one or more entries for that word in the semantic dictionary. A target word sense definition is presented by a detailed list of use-types, called surface semantic patterns. By doing this, the word sense is given by its associations with other concepts, not by single conceptual categories. The acquisition system has knowledge concerning general rules for sentence interpretation, and the acquired knowledge reflects real language patterns, idioms, regularities, and irregularities in concept associations. The background knowledge includes syntactic-to-semantic rules (SS rules), which associate each syntactic pattern with a possible semantic interpretation, and conceptual relation rules (CR rules), which are the selectional restriction rules on conceptual relations. The system first derives syntactic phrasal patterns from a real sentence, and then derives an interpretation of each phrasal pattern by using the SS rules and CR rules. The derived semantic patterns are generalized, and then checked by the human user before being added as new entries.

The semantic patterns it acquires are similar to the concept sequence patterns in memory-based parsing. However, the knowledge it acquires is restricted to the semantic relations of concepts, and it does not include general sentence patterns and their selectional restrictions. The SS rules and CR rules are assumed to exist as background knowledge for the system, but encoding all these rules is also a difficult and time-consuming job.

Riloff and Lehnert have presented a system called AutoSlog [86], which constructs a domain-specific semantic dictionary for an information extraction system for the MUC-4 domain. The goal of the system is to alleviate the knowledge engineering bottleneck that causes the portability and scalability problems when a knowledge-based system is ported to a new domain. Their information extraction system is based on a set of domain-specific dictionary entries called concept nodes. Concept nodes are triggered by specific lexical items relevant to the domain. When a concept node is activated, it acts as a case frame that picks up relevant information from the sentence.
What a concept node expresses is something like "if the word attacked is found and the class of its direct object is a phys-target, then fill the target slot with the direct object". The AutoSlog system uses texts and the corresponding answer keys to automatically construct the concept nodes. Given a word from a target template, the system searches for the first sentence in the text that contains the word. By analyzing the sentence, the relevant clause is identified, and a specific pattern is made using several linguistic rules. After a pattern is identified, a concept node is built by specifying a triggering word, enabling conditions, a set of slots, and constraints for each slot. The AutoSlog system demonstrates its feasibility by significantly reducing the time needed to construct a domain-specific semantic dictionary. It shares features with the system presented in this thesis, such as: 1) it uses the extracted information as a knowledge source, and 2) it does not require a linguist to provide information. However, the patterns it acquires are very simple (and can therefore be too general), and they are specific to their own use.

Jacobs [44] [46] has applied a statistical method to automatic pattern acquisition for knowledge-based news categorization. The acquired pattern is called a lexicosemantic pattern, and is used by a knowledge-based system, NLDB, that automatically assigns categories to news stories. The patterns are combinations of lexical categories represented as regular expressions with lexical features, logical combinations, and variable assignments. An example of a pattern is "C1...announced...acquisition of...C2", and the rule says: "if, in this pattern, C1 and C2 are company names, then it represents a Takeover, and the agent is C1 and the target is C2."

The goal of the acquisition is to use a large set of training data to enter simple lexicosemantic patterns, and then merge the results with the manually developed knowledge base. He used a training set of 11,500 news stories with human-assigned categories for each story. In analyzing the training set, individual terms are weighted by statistical means, and the heavily weighted terms are used as the building blocks of the patterns. The statistical method helps to find words and phrases that are good indicators of each category. By using this method, the knowledge base was provided with about 7,000 automatically acquired rules along with the 800 hand-coded patterns. His work shows a good example of how a statistical method can be successfully applied to semantic pattern acquisition.

Chapter 3

Automatic Lexical Acquisition

In knowledge-based text processing, domain-specific semantic patterns have been widely used for information extraction and retrieval. By using domain-specific semantic patterns, one can achieve fast and efficient text processing by directly mapping a surface linguistic pattern to its meaning, without full syntactic analysis, and without applying conversion rules from syntactic structure to semantic interpretation [84] [44] [59] [73].

Although the knowledge-based approach has proven to be very effective for information extraction and retrieval, especially for specific domains, one significant problem of the approach is that it is necessary to construct a large number of domain-specific semantic patterns.
Manual creation of semantic patterns is very time consuming and error prone, even for a small application domain. To solve the scalability and portability problems, automatic acquisition of semantic patterns must be provided.

In this chapter, a practical approach to semantic pattern acquisition and the acquisition system prototype PALKA are presented. The major goal of this system is to facilitate the construction of a large knowledge base of semantic patterns. PALKA acquires semantic patterns from a set of domain-specific sample texts and their desired output representations. The acquisition system performs as a feedback to the parser: when parsing fails due to the lack of an appropriate semantic pattern, the acquisition system constructs a new pattern. When constructing a new semantic pattern, a surface phrasal pattern is acquired from a sample text, and the semantic information is acquired from a corresponding output representation. The acquired patterns are further tuned through a series of generalizations of the semantic constraints of each element in the phrasal pattern. An inductive learning mechanism [70] [68] is applied to the generalization steps.

Since our acquisition method is closely related to the parser, we use example sentences and target representations from the MUC-4 corpus to describe the PALKA system. However, PALKA can easily be adapted to other domains if appropriate knowledge sources are provided.

3.1 The Acquisition Task

The task of an acquisition system is to build a knowledge base of domain-specific semantic patterns. There are three main issues which ought to be discussed in order to present the acquisition system - the representation of semantic knowledge, the knowledge sources used, and the acquisition method. They are presented in this section, and a detailed description of the acquisition procedure is presented in the next section.

3.1.1 Representation of semantic patterns

The information extraction task on a narrow domain is quite different from a general semantic interpretation task. First, there is only a small number of event categories to which each text should be mapped, and only a small amount of pre-defined types of information to be extracted. Various expressions need to be mapped into one event category. Second, the information to be extracted can be found anywhere in the sentence - not only in the subject or the object of the sentence, but also in the prepositional phrases or in a modifier. For example, all the following sentences contain the same information: the category of the event is bombing, the instrument is dynamite, and the target is the administration office.

1. The dynamite exploded inside the administration office.
2. The dynamite destroyed the windows of the administration office.
3. The dynamite was hurled to the administration office.
4. The dynamite explosion caused serious damage to the administration office.
5. The administration office was damaged by the dynamite explosion.
6. An attack with a dynamite in front of the administration office has left one person injured.

Mapping the above sentences to an explode-event, hurl-event, or leave-event is of no use when our intent is to map them to a bombing-event. In sentence 2, the target of the event is not the window (the object of the sentence). In sentence 4, the instrument is not the explosion (the subject of the sentence). An efficient representation should map various expressions to one of the desired categories, such as bombing, and detect the information-carrying words or phrases anywhere in the sentence.

Based on these observations, we represent a semantic pattern as a pair of a meaning frame defining the necessary information to be extracted, and a phrasal pattern describing the surface syntactic ordering. We call this representation the FP-structure (Frame-Phrasal pattern structure) [51]. The knowledge base is organized as a network of FP-structures and a concept hierarchy.

The meaning frame:
    (BOMBING
      agent:      ANIMATE
      target:     PHYSICAL-OBJ
      instrument: PHYSICAL-OBJ
      effect:     STATE)

The phrasal pattern:
    ((BOMB) BE HURL AT (PHYSICAL-OBJ))

The FP-structure:
    (BOMBING
      target:     PHYSICAL-OBJ
      instrument: BOMB
      pattern:    ((instrument) BE HURL AT (target)))

Abbreviation:
    BOMBING: [ (instrument: BOMB) BE HURL AT (target: PHYSICAL-OBJ) ]

Figure 3.1: The frame-phrasal pattern representation

Figure 3.1 shows an example of an FP-structure. A semantic frame is represented by a set of slots and their semantic constraints on fillers. A phrasal pattern is an ordered combination of concepts and lexical entries. The phrasal pattern of an FP-structure maps a surface linguistic pattern to the root concept of a frame that represents the meaning of that phrase. To combine a phrasal pattern and a meaning frame, each slot of the frame is linked to the corresponding element in the phrasal pattern. The input words are connected to each element in the FP-structure through the isa hierarchy of concepts.

Similar representations can be found in phrasal lexicons [6] [110], pattern-concept pairs [103], phrasal patterns [41], concept nodes [59], and lexicosemantic patterns [46]. The FP-structure representation differs from these representations in several ways: 1) a meaning is represented as a domain-specific frame, 2) a full phrase is specified as a surface pattern, and 3) a semantic constraint is specified for each element in the phrasal pattern. As one can see in the previous examples, the meaning of a phrase, or the category of an event, cannot be recognized simply by the main verb. There can be various domain-dependent expressions for the BOMBING event, and such patterns can only be acquired by looking at actual texts on a specific domain. Representing a full phrase as a pattern is feasible only when an application domain is specific and narrow. The finiteness of expressions on a specific domain is discussed in Chapter 6.

Semantic patterns are used by the parser to recognize input texts. Our parser is based on the marker-passing paradigm [15] [73], in which the parsing algorithm consists of repeated applications of top-down expectations and bottom-up activations. When an input word is read, a bottom-up activation is propagated from the lexical entry through the concept hierarchy. If an expected element receives an activation from its constraint, it is verified, and the expectation moves to the next element. Parsing succeeds if all elements in a pattern are verified. When the parsing succeeds, an instance of the FP-structure is generated as a result of the recognition. More details of the parsing procedure can be found in [73].
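For concreteness, the following is a minimal sketch of an FP-structure and of the constraint check a parser could perform against it. The class layout, the toy isa table, and the function names are assumptions made for illustration; on SNAP the structure is distributed over the network and matched by marker propagation.

    # A minimal sketch of the FP-structure of Figure 3.1 and of the check
    # a parser might perform against a clause. All data is illustrative.
    ISA = {"dynamite": "bomb", "bomb": "explosive", "explosive": "weapon",
           "weapon": "physical-obj", "house": "building",
           "building": "physical-obj"}

    def subsumes(general, concept):
        while concept is not None:
            if concept == general:
                return True
            concept = ISA.get(concept)
        return False

    class FPStructure:
        def __init__(self, frame, pattern):
            self.frame = frame      # slot -> semantic constraint (a concept)
            self.pattern = pattern  # slot names mixed with literal tokens

    BOMBING = FPStructure(
        frame={"instrument": "bomb", "target": "physical-obj"},
        pattern=["instrument", "BE", "HURL", "AT", "target"])

    def match(fp, clause):
        # Return the slot fillers if the clause instantiates the pattern.
        if len(clause) != len(fp.pattern):
            return None
        fillers = {}
        for element, token in zip(fp.pattern, clause):
            if element in fp.frame:                   # slot: check constraint
                if not subsumes(fp.frame[element], token):
                    return None
                fillers[element] = token
            elif element != token:                    # literal must match
                return None
        return fillers

    # "THREE DYNAMITES WERE HURLED AT A HOUSE", after phrasal parsing:
    print(match(BOMBING, ["dynamite", "BE", "HURL", "AT", "house"]))
    # -> {'instrument': 'dynamite', 'target': 'house'}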
M ore details of the parsing procedure can be found in [73]. 37 3 .1 .2 T h e k n o w led g e so u rces i T he sem antic knowledge cannot be acquired solely from tex ts. Possible knowledge ; sources are hum an, contextual knowledge, on-line dictionary, tagged tex ts, or filled j tem plates. Using hum an knowledge introduces a fully interactive system , which ! I is out of our interests. Using contextual knowledge is unrealistic, since it is not | always available and is difficult to provide. An on-line dictionary can be a good source, but the inform ation in th e dictionary is not uniform and not consistent. In our system , training texts and corresponding database templates (as illus tra te d in Figure 2.5) are used as th e two m ajor knowledge sources. A tex t is a set of news articles on a specific domain. T he dom ain currently used is concerned w ith terrorist events in Latin America. PALKA uses th e te x t to acquire phrasal ' patterns. A tem plate is a desired outp u t representation of a sam ple tex t which is generated by hand. It contains all th e inform ation th a t should be extracted from th e text. C urrently, 1400 news articles and th eir corresponding tem plates are available on line for a system training purpose. PALKA uses th e tem plates to m ap a phrasal p attern to a corresponding fram e. W hen tem plates are not avail- , able, PALKA acquires th e m apping inform ation through user interaction. O ther knowledge sources used by PALKA are: • T he concept hierarchy contains general classification of objects, events, states and dom ain specific concepts. It is used to specify a sem antic constraint of each elem ent in an F P -structure. It is also used for th e generalization of 1 p atterns. i • T he dictionary m aps an input word to one or m ore concepts in the concept 1 hierarchy. For exam ple, “ DYNAMITE” is m apped to th e concept BOMB, and | “ HOUSE” is m apped to th e concept BUILDING, which is linked to the concept PHYSICAL-OBJECT through isa relations. ! • T he fram e definition represents the type of inform ation to be extracted from th e dom ain texts. T he fram e definition of BOMBING is described in the next section. 38 ! 3 .1 .3 B a sic ap p roach to th e a cq u isitio n 1) Knowledge acquisition as a feedback to the parser. In our approach, th e ac quisition process perforins as a feedback to th e parser. D epending on th e current statu s of the knowledge base, th e parser m ay produce one of th e following results: 1) correct interpretation, 2) no in terpretation, or 3) incorrect interpretation. Ex am ples of each case and corresponding actions by th e acquisition system are as follows: • Case 1: Correct in terpretation ( desired output — parser output). 1. P attern : A ppropriate 2. Action: None • Case 2: No in terpretation ( desired output ^ 0, parser output — 0). 1. P attern : BOMBING: [(instrument: DYNAMITE) EXPLODE] (or no p attern ) 2. Sentence: “A POWERFUL BOMB EXPLODED IN FRONT OF THE BUILDING” 3. Interpretation: None 4. Action: C reate a new p a tte rn and generalize it • Case 3: Incorrect interpretation ( desired output — 0, parser output ^ 0). 1. P attern : BOMBING: [(instrument: THING) EXPLODE] 2. Sentence: “THE FOREIGN DEBT CRISIS EXPLODED IN ANDEAN COUNTRIES” 3. Interpretation: BOMBING-EVENT, instrument = “FOREIGN DEBT CRISIS" 4. Action: Specialize the p attern In case 2, the parser produces no interpretation since th e sem antic constraint , DYNAMITE is too specific to be m atched by th e input sentence. 
In this case, a new ; p attern is created from the input sentence, and generalized w ith previous p attern s. , In case 3, th e input sentence is m isinterpreted as a BOMBING event since th e se m antic constraint is overgeneralized as THING. In this case, the sem antic constraint is specialized to an appropriate level. Through th e acquisition process described above, the system creates a consistent and sem antically correct knowledge base. 2) Knowledge acquisition as a reverse process o f parsing. W hen a new knowl edge is needed, th e creation of a sem antic p a ttern is regarded as a reverse process , of parsing. Parsing process is to ex tract inform ation (/) contained in a n atural 39 Text Input . ( Knowledges \ _ B a s e \ J - ► Output ~ 7 ~ \ e + 1 Acquisition!*— -----( g H — Desired Output (Template) Figure 3.2: Conceptual diagram of acquisition as a feedback to the parser language tex t (L ) by using a sem antic knowledge (K). T he acquisition process is to ex tract a necessary sem antic knowledge (K) from a language (L) by using an inform ation (I) which is available. So, th e relationship betw een parsing and acquisition can be represented as follows: Parsing: (L , K) — > I Acquisition: (L, I) — > K In our system , the MUC-4 corpus of news articles is used as L, and the desired outp u t (tem plate) of each article is used as I. From L and I, th e system generates a knowledge base K, which is a collection of dom ain dependent sem antic patterns. 3.2 Acquisition of New FP-Structures PALKA is an autom atic sem antic p attern acquisition tool. It acquires dom ain dependent sem antic pattern s for a given frame. P hrasal p attern s (collocational inform ation) are acquired from texts, and m appings to th e fram e (sem antic infor m ation) are acquired from tem plates. Figure 3.3 shows th e functional stru ctu re of the PALKA system . For a given fram e definition, th e acquisition system selects candidate sentences which m ay have relevancy, and converts them into sim ple clauses. A fter trial parsing, the user determ ines th e correctness of the parsing output. If there is no outp u t for a relevant sentence, a new F P -stru ctu re is cre ated through FP-m apping, FP-structure construction, and generalization. If the 40 correct matching TEXT incorrect matching no matching Knowledge . Base TEMPLATE OR USER 'Concept JLierarchy Frame Merging Conversion to simple clauses Knowledge- based Parser FP-mapping Module Keyword-based sentence extraction Generalization Module Specialization Module Figure 3.3: T he functional structure of PALKA o u tp u t is incorrect, the m atched p attern in th e knowledge base is modified through specialization. In this section, th e acquisition procedure is described in detail w ith exam ples. 3 .2 .1 F ram e d e fin itio n an d se n te n c e e x tr a c tio n T he acquisition of sem antic pattern s is perform ed for one fram e at a tim e. For exam ple, the system first acquires all th e pattern s for th e BOMBING event fram e, and then for the KILLING fram e, and so on. In w hat follows, th e acquisition proce dure in PALKA is described by using the BOMBING fram e exam ple. T he BOMBING fram e is defined in Figure 3.4-(a). The first slot isa points to a m ore general fram e in the knowledge base to which this fram e is connected. In th e second slot keyword, several keywords are specified. Relevant sentences are extracted from sam ple texts by using these keywords. 
T he other 4 slots - agent, target, effect, and instrument - indicate the types of inform ation used in this dom ain. For each slot, a general sem antic constraint is specified. By 41 (BOMBING isa: (TERRORIST-ACTION) (bom b bombing dynam ite explode explosion explosive) (ANIMATE) keyword: agent: target: (PHYSICAL-OBJECT) (PHYSICAL-OBJECT) (STATE)) instrument: effect: (a) The frame definition of BOMBING event THE PERUVIAN POLICE HAVE REPORTED THAT THREE DYNAMITES W ERE HURLED AT A HOUSE IN THE SAN BOR JA NEIGHBORHOOD, WHERE U .S. MARINE RESIDE. (b) The sentence extracted from a text [THE PERUVIAN POUCEJnoun-group [HAVE REPORTED]verb-group {THATJrel-pronoun [THREE DYNAMITESJnoun-group [WERE HURLEDJverb-group [A7]preposition [A HOUSEJnoun-group [INJpreposition [THE SANBORJA NEIGHBORHOOD]noun-group [,[punctuation [WHEREJrel-pronoun [U.S. MARINE]noun-group [RESIDE]verb-group [.[punctuation (c) The result of grouping words 1. [PERUVIAN POLICE) [HAVE REPORTED] [IT] 2. [THREE DYNAMITES] [WERE HURLED] [AT] [HOUSE] [IN] [SANBORJA NEIGHBORHOOD] 3. [U.S. MARINE] [RESIDE] [IN] [SANBORJA NEIGHBORHOOD] (d) The result of decomposition [THREE (instrument: DYNAMITES)] [WERE HURLED] [AT] [(target: HOUSE)] [IN] [SANBORJA NEIGHBORHOOD] (e) The result of FP mapping (BOMBING target: (BUILDING) instrument: (BOMB) pattern: ((instrument) BE HURL AT (target))) (f) The result of FP-structure construction Figure 3.4: T he results of each step for F P -stru ctu re acquisition 42 using th e keyword “dynamite” , the sentence in Figure 3.4-(b) is extracted from the tex t, as a possible relevant one. 3 .2 .2 C o n v ersio n to sim p le cla u ses T he original tex t consists of complex sentences which contain relative clauses, nom inal clauses, conjunctive clauses, etc. Since sem antic p attern s are acquired from simple clauses, it is necessary to convert a complex sentence to a set of sim ple clauses. A sim ple phrasal parser converts th e extracted sentence into simple clauses through th e following steps. 1) Step 1. Grouping words: The phrasal parser groups words based on each w ord’s syntactic category and ordering rules for noun-groups and verb-groups. Basic syntactic disam biguation of word category is perform ed at this stage. T he result of grouping words for th e exam ple sentence is shown in Figure 3.4-(c). 2) Step 2. Sim plification and decomposition: A fter grouping is perform ed, the phrasal parser first simplifies the sentence by elim inating several unnecessary elem ents such as determ iners, adverbs, quotations, brackets and so on. Then it converts th e simplified sentence into several sim ple clauses by using conversion rules. T he conversion rules include separation of relative clauses, nom inal clauses and conjunctive clauses. T he three simple clauses shown in Figure 3.4-(d) are the results of th e phrasal parsing. Based on the keywords specified in the BOMBING fram e, only the second clause is selected for further processing. To describe th e F P mapping module and the FP-structure construction, we assume th a t th e ou tp u t of th e parser of clause 2 is NIL (i.e., no m atching). 3.2 .3 F P m a p p in g m o d u le At this point, th e definition of th e BOMBING fram e is available (th e meaning fram e), and the simple clause p attern was extracted (the phrasal pattern). To construct an F P -stru ctu re from these, links between the fram e slots and the phrasal p a t tern elem ents should be established. There are two different m odes of operation according to th e availability of tem plates. 
43 1) Autom atic mapping mode: If th e tem plates are available PALKA finds out th e m apping by using the inform ation in th e corresponding tem p late1. Each slot of th e fram e definition corresponds to one or m ore slots in th e tem plate. For exam ple, th e target slot of th e fram e corresponds to the PHYS TGT: slot and HUM | TGT: slot of th e tem plate. For each slot of th e fram e, th e system searches through I th e tem plate to pick up fillers for th a t slot. Then each elem ent in the phrasal ; p a ttern is com pared w ith th e fillers collected. If an elem ent is m atched w ith a filler, th en a link between the corresponding slot and the m atched elem ent is m ade. 2) Interactive mapping mode: In case tem plates are not available, PALKA first finds out candidates for each slot by using th e general sem antic constraint specified for each slot in th e fram e definition, and then establishes th e m apping through th e user interaction. For exam ple, th e general sem antic constraint of the target slot is PHYSICAL-OBJECT, and so th e candidate elem ents for target slot are “THREE DYNAMITES” and “HOUSE” , since th eir sem antic categories are under the concept PHYSICAL-OBJECT in th e concept hierarchy. The candidates for each slot are presented to the user, and th e user selects one for each slot. T he experim ents on th e acquisition of the TIE-UP p attern s for MUC-5 dom ain are perform ed in this mode. T he m apping shown in Figure 3.4-(e) is obtained after th e F P m apping pro cedure. T he agent and effect slots are not linked, since either 1) no elem ent in the phrasal p atte rn is m atched to th e corresponding fillers, or 2) no candidates which satisfy the general sem antic constraint are found. 3 .2 .4 F P -str u c tu r e c o n str u c tio n ! A fter all th e links are established, PALKA constructs an F P -stru ctu re based on ; th e m apping inform ation. T he basic strategy for constructing an F P -stru ctu re is to include the m apped elem ents and th e m ain verb, and discard th e unm apped elem ents. Some basic rules for th e F P -stru ctu re construction are: ru le 1. All m apped elem ents are replaced by their sem antic categories. 1Also, semantically tagged texts can be used to provide necessary mapping information. 44 ru le 2. If a m apped elem ent in the noun group is a head noun, the whole group is replaced by th a t elem ent. If it is not, th e rem aining elem ents are included too. , ru le 3. All unm apped prepositional phrases, except the phrases containing key- j words, are discarded. ! ru le 4. An unm apped noun group is also included if it is not a p art of a prepo- j sitional phrase, after replaced by th e sem antic category of its head noun. i ru le 5. All verbs are replaced by th eir root form s, and all auxiliary verbs except the be-verb in passive form are discarded. By applying rules 1 and 2, th e noun groups “THREE DYNAMITES” and “HOUSE” are replaced by th e concepts BOMB and BUILDING, respectively. A fter applying rules 3 and 5, th e prepositional phrase “IN SANBORJA NEIGHBORHOOD” is dis carded, and th e verb group “WERE HURLED” is replace by BE HURL. T he final form of the F P -stru ctu re acquired from th e exam ple sentence is shown in Figure 3.4-(f) Figure 3.5 shows an exam ple knowledge base generated by PALKA. T he knowl edge base consists of a set of F P -structures created by PALKA, the dom ain con cept hierarchy, and th e connections betw een them (th e sem antic constraints). 
Figure 3.5 shows an example knowledge base generated by PALKA. The knowledge base consists of a set of FP-structures created by PALKA, the domain concept hierarchy, and the connections between them (the semantic constraints). The problem of how to establish correct connections is presented in the next chapter as a generalization problem.

[Figure 3.5: An example of the knowledge base created by PALKA]

3.3 Summary

This chapter presented a knowledge representation for information extraction, called the FP-structure, and the PALKA system, which automatically acquires FP-structures from domain texts. The semantic pattern in the FP-structure directly maps a surface sentence pattern into the predefined information structure. Our approach to the acquisition shares many features with the other approaches to the acquisition of surface semantics mentioned in the previous chapter, but has unique features: 1) the acquisition is performed as a feedback to the parser, 2) knowledge sources that are either available on-line or easily constructed for specific domains are used, 3) a pair of a meaning frame and a full phrase pattern is used to represent semantic knowledge, and 4) the acquired patterns are generalized through induction.

The acquired pattern is further tuned through the generalization of the semantic constraint of each element. The generalization problem is addressed in the next chapter. The acquisition experiments were performed with MUC-4 and MUC-5 articles, and the results are discussed in Chapter 6.

Chapter 4

Generalization

As new patterns are generated through the acquisition process, patterns with the same structure are encountered frequently; the only difference is in the semantic constraints determined by the original sentences from which the patterns were extracted. As described in the previous chapter, the semantic constraints should be properly determined to prevent missing relevant sentences or matching irrelevant sentences. In this chapter, the procedure for the generalization of acquired patterns is presented. Two different modes of generalization are described, and the implementation of the algorithm in marker-passing instructions is given.

4.1 The Problem

The goal of generalization is to determine an optimal level of generality for each element's semantic constraint in the phrasal pattern. Figure 4.1 shows this problem. The semantic constraint of each element is given by a concept or a disjunction of concepts in the concept hierarchy. It determines the coverage of a phrasal pattern. If a constraint is too specific, it may miss a sentence. If it is too general, it may be matched incorrectly.

[Figure 4.1: Determination of semantic constraints]

Since the semantic category of a newly created pattern is determined to be the most specific one, it should be generalized if possible. The acquired FP-structure is compared with existing ones for further generalization.
semantic constraint Figure 4.1: D eterm ination of sem antic constraints are generalized. W hen an overgeneralized p attern is found (incorrect m atching), th e corresponding sem antic constraint is specialized. In this section, two different approaches to generalization are presented - single- step and incremental. In both cases, an inductive learning m echanism [70] [68] is applied to the m odification of sem antic constraints. For induction, th e sem antic constraint of a newly created p attern is used as a positive exam ple, and th e se m antic constraint of an incorrectly m atched p atte rn is used as a negative example. 4.2 The Algorithms 4 .2 .1 S in g le -ste p g e n e r a liz a tio n In th e single-step approach, th e algorithm keeps lists of exam ple sem antic con straints for each elem ent of an F P -stru ctu re during th e acquisition process, and com putes an appropriate sem antic constraint at th e end. W hen a positive exam ple is encountered, th e exam ple concept is added to the positive list (P ). W hen a negative exam ple is encountered, the exam ple concept is added to th e negative list (N ). T he generalization is perform ed at th e end of the acquisition process by using th e positive and negative exam ple lists. G eneralization is to detect th e m ost general concepts am ong th e consistent concepts, which subsum e th e positive exam ples and do not subsum e th e negative 49 C o n c e p t H iera rch y C Concept Hierarchy -e Neg-markers Pi Ni Positive examples : P j, P 2 , P3 Negative examples : Nj A v P3 Figure 4.2: Propagation of m arkers in single-step generalization exam ples. Let S u p (S ) be a set of all subsum ers (parents) of th e concepts in th e set S , C S be a set of consistent sem antic constraints, and M G C S be a disjunction of the m ost general concepts among C S . T hen, S u p (P ) is the set of all hypotheses for consistent sem antic constraints. Among them , S u p (N ) m ust be elim inated since they produce incorrect interpretations. Therefore, th e set S u p (P ) — S u p (N ) rep resents th e set of consistent sem antic constraints C S . The m ost general concepts are selected from C S , and combined w ith disjunctions to form a final sem antic constraint. T he m ost general concepts in a set are th e concepts which do not have th eir subsum ers in the set. Figure 4.2 shows an exam ple of single-step generalization. P i, P 2 and P3 are positive exam ples, and N i is a negative exam ple. In this exam ple, th e S u p (P ) and th e S u p (N ) are: S u p (P ) = S u p {{P i, P2, P3» = {P i, P 2, P3, A, B , C } S u p (N ) = S u p ({ N i} ) = { N U B ,C } Therefore, th e consistent constraint set is: C S = S u p (P ) - S u p (N ) = {P u P2, P3, A } Since A and P 3 are the m ost general concepts, th e sem antic constraint is deter m ined as: 50 T H I N G S I T U A T I O N O B J E C T P H Y S I C A L A B S T R A C T negative example (“political crisis exploded...”) A N I M A T E I N A N I M A T E V E H I C L E S T R U C T U R E W E A P O N negative example (“an individual exploded a bom b ...' ^ _ negative example l | | | | | | | ^ ‘the airplane exploded...”) l l l i i l i k s s s s s i S E X P L O S I V E ^ G R E N A D E f B O M B D Y N A M I T E l g r V E H I C L E - B O M S ' Range o f positive examples Figure 4.3: Exam ple of single-step generalization Result sem antic constraint = M G C S = {A V P 3 ). Figure 4.3 shows an exam ple of th e determ ination of th e sem antic constraint for the BOMBING: [(instruments) EXPLODE] p attern . 
In this exam ple, all the positive exam ples are under th e concept EXPLOSIVE, and th e negative exam ples are under SITUATION, ANIMATE and VEHICLE. Therefore, th e sem antic constraint of the instru ment slot is determ ined as the concept WEAPON, after com puting S u p (P )—S u p (N ). 4 .2 .2 P a ra llel im p le m e n ta tio n T he SNAP procedure for th e single-step generalization is as follows: 51 P r o c e d u r e G E N E R A L IZ A T IO N ; \* propagate markers from positive and negative examples *\ W h ile Pi exists do set pos-marker on node P,-; p ro p a g a te pos-marker :rule spread (f-isa)\ W h ile Ni exists do set neg-marker on node N{\ p ro p a g a te neg-marker :rule spread (f-isa)‘ , Wait until end of propagation. \* delete the subsumers of negative examples *\ F or all nodes if (exist neg-marker) d e le te pos-marker, if (exist pos-marker) set cancel-marker, p ro p a g a te cancel-marker :rule seq (r-isa); Wait until end of propagation. \* find out most general one *\ F or all nodes if (exist cancel-marker) d elete pos-marker, \* returns most general one among Sup(P{) — Sup(N { ) *\ R e tu rn (collect pos-marker); 4 .2 .3 In c r e m e n ta l g en er a liza tio n W hen the size of a training set is large, changing sem antic constraints during acquisition m ay speed up th e acquisition process, since th e num ber of th e parsing failure decreases. In the increm ental approach, th e algorithm modifies sem antic constraints as it sees a new exam ple sentence. G eneralization and specialization is perform ed im m ediately when a new positive or negative exam ple is encountered during th e acquisition process. W hen a positive exam ple is encountered, the 52 Concept Hierarchy P i Ai Pi Concept Hierarchy c Neg-markers A f B ' . A Ai Aj A3 j Nj Ni N i Aj v A2 (a) generalization using a positive example (b) specialization using a negative example Figure 4.4: Propagation of m arkers in increm ental generalization sem antic constraint is generalized, and when a negative exam ple is encountered, th e sem antic constraint is specialized. Let S be a current sem antic constraint, and I n f ( S ) be a set of all subsum ee (children) of the concepts in the set S . W hen a new positive exam ple P,- is found, a new set of consistent sem antic constraint C S is determ ined by com puting S u p ( S U {Pt -}) — S u p (N ). T he generalization of corresponding sem antic constraint is perform ed by replacing the current set S to a disjunction of th e m ost general concepts, M G C S , am ong C S . W hen a new negative exam ple N i is found, a new set of consistent sem antic constraint C S is determ ined by com puting I n f ( S ) — S u p (N U {N i}). T he specialization is perform ed by replacing th e current set S to a disjunction of the m ost general concepts, M G C S , among C S . Figure 4.4 shows exam ples of increm ental generalization. In Figure 4.4-(a), the current sem antic constraint is A \, and the sem antic constraint of a new p atte rn (positive exam ple) is P,-. In this exam ple, the S u p (S U P,) and the S u p (N ) are: S u p iS U { P i} ) = S u p ({ A 1,P i}) = { A u A a,P i t A t C } S u p (N ) = { N x ,B ,C } Therefore, th e new consistent constraint set is: 53 C S = S u p (S U {Pt » - S u p (N ) = { A u A3, Pi, A } Since A is th e m ost general concept, the new sem antic constraint is determ ined as: Modified sem antic constraint = M G C S = A . 
In Figure 4.4-(b), the current semantic constraint is A, and the semantic constraint of the incorrect interpretation (negative example) is Ni. In this example, Inf(S) and Sup(N ∪ {Ni}) are:

    Inf(S) = {A, A1, A2, A3, Ni}
    Sup(N ∪ {Ni}) = Sup({N1, Ni}) = {N1, Ni, A3, A, B, C}

Therefore, the new consistent constraint set is:

    CS = Inf(S) - Sup(N ∪ {Ni}) = {A1, A2}

Since both A1 and A2 are the most general concepts, the semantic constraint is determined as:

    Modified semantic constraint = MGCS = (A1 ∨ A2).

In case there are no negative examples (no specialization occurs), a semantic constraint would be generalized to the highest-level concept in the concept hierarchy according to the above procedure. There are two possible ways to prevent this: 1) select the most specific concept among those which subsume all the positive examples, or 2) put a limit on the maximum generalizable level. In our example in Figure 3.4, the semantic constraint of the instrument slot is generalized from BOMB to EXPLOSIVE, and the target slot is generalized from BUILDING to PHYSICAL-OBJECT, by using method 1).
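A set-based sketch of the incremental step follows, using the formulas above; the toy hierarchy reproduces Figure 4.4-(a), sup() and most_general() are as in the single-step sketch, and Inf(S) is computed by inverting the parent links. The specialization case of Figure 4.4-(b) works analogously on its own hierarchy.

    # A set-based sketch of the incremental step. Data is illustrative.
    PARENT = {"A1": "A", "A2": "A", "A3": "A", "Pi": "A3",
              "N1": "B", "A": "C", "B": "C"}

    def sup(concepts):
        result = set()
        for c in concepts:
            while c is not None:
                result.add(c)
                c = PARENT.get(c)
        return result

    def inf(concepts):
        # All subsumees, found by repeatedly inverting the parent links.
        result, frontier = set(concepts), set(concepts)
        while frontier:
            frontier = {x for x, p in PARENT.items() if p in frontier}
            result |= frontier
        return result

    def most_general(concepts):
        return {c for c in concepts if PARENT.get(c) not in concepts}

    def generalize_step(current, negatives, positive):
        return most_general(sup(current | {positive}) - sup(negatives))

    def specialize_step(current, negatives, negative):
        return most_general(inf(current) - sup(negatives | {negative}))

    # Figure 4.4-(a): S = {A1}, positive example Pi, N = {N1}.
    print(generalize_step({"A1"}, {"N1"}, "Pi"))   # -> {'A'}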
In m ost hierarchical knowledge representation system s, th e subsum ption relation constructs a p ar tial ordering betw een concepts in th e knowledge base. Classification is a process of constructing a concept hierarchy in which m ore general concepts are located j above m ore specific ones according to the subsum ption order. T he classification . procedure simplifies the task of creating knowledge bases and perform s a class of ; inference on th e concept hierarchy which is useful for m any AI applications. ; A lthough th e classification is an im portant and useful process in knowledge representation system s, the processing tim e increases rapidly as th e size of th e j knowledge base increases. It was shown th a t any fram e-based description language 1 w ith reasonable expressive power implies the intractability of com plete subsum p- ' tion algorithm [12]. T he tractab ility of th e algorithm m ay be achieved by reducing th e expressiveness of th e language or by using an incom plete algorithm [64] [76]. Even w ith these restrictions, however, the subsum ption test is still tim e consum ing. Brachm an and Levesque showed th a t th e subsum ption test betw een two concepts for restricted description language takes 0 ( n 2) com putation tim e, where n is the length of th e description [12]. Furtherm ore, a classification of one con cept on th e taxonom y of M concepts needs O (M ) subsum ption tests in th e worst case. Therefore, th e overall tim e com plexity of the classification on a sequential m achine will be 0 ( M n 2) in the worst case, where M is th e num ber of concepts in th e taxonom y and n is th e average length of th e description of one concept. Since th e classification process plays a central role in knowledge processing sys- j terns, th e above tim e com plexity m ay cause problem s as the size of the knowledge base grows. In this chapter, a parallel classification algorithm of tim e complex- 1 ity 0(log M ), which can be achieved by m assive parallelism on a m arker-passing architecture and parallel knowledge representation, is presented. 5.1.1 In h e r ita n c e h ierarch y an d c la ssific a tio n In m any knowledge representation system s, th e generalization relation betw een concepts and their superconcepts forms an inheritance hierarchy. Concepts are connected to th eir superconcepts (subsum ing concepts) by is-a or subsum ption 57 ; links. All th e concepts in th e knowledge base are organized hierarchically ac cording to th e subsum ption ordering. M ore general concepts are placed above less general concepts, and properties are inherited through th e subsum ption links from superconcepts to th eir subconcepts. A prim ary benefit of this subsum ption ordering is th a t it organizes a large num ber of concepts in th e knowledge base in such a way th a t properties are shared by m any concepts. P roperties need to be presented only once at th e highest level superconcept and can be accessed efficiently by subconcepts. T he subsum ption relation betw een two concepts is defined such th a t concept C subsum es concept C ' only if th e set denoted by C necessarily includes the set denoted by C '. T he subsum ption algorithm described by Schmolze and Lipkis [90] perform s a piece-by-piece com parison of th e properties of C including inherited ones w ith those of C ' to determ ine th e subsum ption relation as follows : 1. All prim itive concepts th a t subsum e C also subsum e C '. 2. For each roleset of C , some roleset of C ' denotes th e sam e relation. 3. 
One important aspect which must be considered in constructing the concept hierarchy is that implicit relations can exist between concepts in addition to those relations explicitly given. Those implicit subsumption relations must always be identified to maintain a correct subsumption ordering. When the subsumption relation is not explicitly present, it is necessary to determine whether one concept subsumes the other or not. Even if there is a given superconcept, we must find the Most Specific Subsumers (MSSs) of a given concept to maintain the consistency of the concept hierarchy. Classification is a process which identifies the implicit subsumption relations between a new concept and the other concepts, and establishes subsumption links between them. It consists of steps which determine the subsumption relationship between two concepts and identify the MSSs among all subsumers. It also has to find the most general concepts among the subsumees (concepts which are subsumed) of a new concept, in order to change their subsumption links from their previous MSSs to the new concept. Those concepts are called the Most General Subsumees (MGSs). When a new concept is given with a set of properties, the classification process identifies the MSSs and MGSs, and places the concept at the proper location in the concept hierarchy to maintain the correct subsumption ordering.

5.2 Parallel Classification Algorithm on SNAP

This section introduces a massively parallel, marker-passing based classification algorithm on a distributed knowledge base. The algorithm consists of several inner loops for injecting markers through the semantic network array in which the knowledge base is distributed. Boolean operations involving markers are used to calculate the subsumption relation. The input of the algorithm is a description of a concept, and the output of the algorithm is the set of concepts in the knowledge base which are the MSSs and the MGSs of the given concept. Although this algorithm is based on the SNAP instructions, it can also be used in other parallel machines such as the Connection Machine [39].

The speedup in the parallel classification algorithm is achieved by parallel associative search, parallel logical operations on markers, and simultaneous propagation of multiple markers. It performs a subsumption test not by sequential comparison, but by parallel marker passing. Multiple subsumption tests between the input concept and all other concepts in the knowledge base can be done in parallel. Finding the MSSs among the subsumers can be done in constant time.

5.2.1 Parallel subsumption test against all concepts

Let C be a new concept which has superconcepts S1, S2, ..., Sn, and roleset relations R1, R2, ..., Rm with value descriptions V1, V2, ..., Vm. Let Sup(C) be the set of all superconcepts of C, SR(C) be the set of all superconcepts of the roleset relations of C, and SV(C) be the set of all superconcepts of the value descriptions of C's rolesets. Then the subsumption relation between C and its subsumer C', which was defined in the previous section, can be re-defined by using set relations:

1. Sup(C') ⊆ Sup(C)
2. SR(C') ⊆ SR(C)
3. SV(C') ⊆ SV(C)
Based on these relations, the algorithm for finding all subsumers of C can be described by the following three phases. (In the algorithm descriptions, 'sub' refers to the subsumption relation, 'role' refers to the roleset relation, and f- and r- denote the forward and reverse directions of marker propagation on each relation link.)

1) Phase 1: For some concept C' to be a subsumer of C, all the superconcepts that subsume C' must also subsume C, i.e., Sup(C') ⊆ Sup(C). In Phase 1, we filter out the concepts which violate this condition.

    While Si exists do
        set sub-marker on node Si;
        set ind-marker on node Si;
        propagate sub-marker :rule spread (f-sub);
        propagate ind-marker :rule spread (r-sub);
    Wait until end of propagation.
    For all nodes
        if (not (or sub-marker ind-marker)) set cancel-marker;
    propagate cancel-marker :rule spread (r-sub);
    Wait until end of propagation.
    For all nodes
        if (exist cancel-marker) delete ind-marker;

After Phase 1, all the nodes which do not satisfy the above condition are filtered out by the cancel-marker. The nodes which are marked by ind-markers are the possible subsumers of C.

2) Phase 2: Among those concepts marked in Phase 1, those which have rolesets that are not in SR(C) must also be filtered out. Those concepts which have value descriptions that are not in SV(C) have to be filtered out too. To do this, we first mark all the nodes which are members of the sets SR(C) or SV(C). Since there are rolesets not only explicitly given by the description, but also implicitly given as rolesets of the given superconcepts, we must also propagate markers from the superconcepts of C. By doing this, all the rolesets which must be inherited by concept C can be marked.

    While Si exists do
        set sub-marker on node Si;
        propagate sub-marker :rule comb (f-sub, f-role);
    While Ri exists do
        set sub-marker on node Ri;
        propagate sub-marker :rule spread (f-sub);
    While Vi exists do
        set sub-marker on node Vi;
        propagate sub-marker :rule spread (f-sub);
    Wait until end of propagation.

After Phase 2, all the nodes marked by the sub-marker are one of the following: first, explicitly given superconcepts of C or their subsumers; second, explicitly given rolesets and value restrictions of C or their subsumers; third, rolesets and value descriptions of C which are found by inference. These are properties inherited from the superconcepts of C.

3) Phase 3: In this phase, we filter out those concepts (denoted as C') which have SR(C') or SV(C') such that SR(C') ⊄ SR(C) or SV(C') ⊄ SV(C), by propagating a cancel-marker.

    For all nodes
        if (not (or sub-marker ind-marker)) set cancel-marker;
    propagate cancel-marker :rule comb (r-sub, r-role);
    Wait until end of propagation.
    For all nodes
        if (exist cancel-marker) delete ind-marker;

After Phase 3, the remaining nodes marked by the ind-marker are the actual subsumers of concept C.
5.2.2 Parallel searching for MSSs

After finding all the subsumers of C, the MSSs of C must be found among them to properly place C in the hierarchy. The MSSs are the concepts which are located at the lowest level in the subsumer hierarchy. On a sequential machine, this procedure takes O(n) time, where n is the number of subsumers of C. By using parallel marker passing, however, this procedure can be done in constant time, by propagating a cancel-marker one step from all the subsumers to their own superconcepts.

    For all nodes
        if (exist ind-marker) set cancel-marker;
    propagate cancel-marker :rule seq (f-sub);
    Wait until end of propagation.
    For all nodes
        if (exist cancel-marker) delete ind-marker;

After this procedure is done, only those concepts which are located at the lowest level of the subsumer hierarchy remain. These concepts are the MSSs of the concept C.

5.2.3 Parallel searching for MGSs

After finding the MSSs of C, we have to find the MGSs of C and modify the inheritance hierarchy to maintain the consistency of the subsumption ordering. The only possible candidates for MGSs are the children of the MSSs of the new concept. Therefore, we only need to search for those concepts which are subsumed by C among the direct children of the MSSs of C. We first mark all the direct children of the MSSs (denoted as C'') and then find those concepts which satisfy the following conditions: {Ri} ⊆ SR(C'') and {Vi} ⊆ SV(C'').

    clear all markers
    set ind-marker on node MSS;
    propagate ind-marker :rule seq (r-sub);
    While Ri exists do
        set R-marker i on node Ri;
        propagate R-marker i :rule comb (r-sub, r-role);
    While Vi exists do
        set V-marker i on node Vi;
        propagate V-marker i :rule comb (r-sub, r-role);
    Wait until end of propagation.
    While R-marker i exists do
        For all nodes
            if (not R-marker i) delete ind-marker;
    While V-marker i exists do
        For all nodes
            if (not V-marker i) delete ind-marker;

After this procedure is done, all the concepts marked by the ind-marker are the MGSs of the concept C. After finding all the MSSs and MGSs, subsumption links must be created between the MSSs and C and between C and the MGSs. The subsumption links between the MSSs and the MGSs must be deleted to maintain a correct subsumption ordering.
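Sequentially, the MSS/MGS selection and the re-linking step amount to simple set manipulations. The sketch below emulates them in Python under the same assumptions as the earlier sketch; the helper names, and the parent/child maps, are illustrative assumptions rather than the SNAP procedure itself.

    # Sequential emulation of the MSS/MGS selection and re-linking steps.
    # 'children' and 'parents' map a concept name to the set of its direct
    # sub-/superconcepts; 'concepts' maps names to Concept records, and
    # 'subsumes' is the set-based test sketched in Section 5.2.1.

    def most_specific_subsumers(subsumers, children):
        """An MSS is a subsumer none of whose direct children also subsumes
        C (the effect of the one-step cancel-marker propagation)."""
        return {s for s in subsumers
                if not any(ch in subsumers for ch in children[s])}

    def most_general_subsumees(c, mss, concepts, children, subsumes):
        """Only direct children of the MSSs can be MGSs; keep those C subsumes."""
        candidates = {ch for s in mss for ch in children[s]}
        return {ch for ch in candidates if subsumes(c, concepts[ch])}

    def splice(new_name, mss, mgs, parents, children):
        """Insert the new concept between its MSSs and MGSs, dropping the
        now-redundant direct MSS-to-MGS links."""
        parents[new_name] = set(mss)
        children[new_name] = set(mgs)
        for s in mss:
            children[s] = (children[s] - mgs) | {new_name}
        for g in mgs:
            parents[g] = (parents[g] - set(mss)) | {new_name}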
In this exam ple, those m arked nodes are Sup(C ) — { a r t i s t }, S R (C ) = { p r o f e s s io n , p la y , d e a l- w ith } and S V (C ) — { a r t , s t r i n g - i n s t r u m e n t , m u s ic a l-in s tru m e n t }. In Phase S, th e cancel-marker is set on all the nodes which are not m arked by ind-m arker or sub-m arker, and propagated along th e subsum ption link and roleset link in reverse direction. In our exam ple, the ind-m arkers at v i o l i n i s t and p i a n i s t nodes are canceled by th e cancel-marker since ‘p la y v i o l i n ’ and ‘p la y p ia n o ’ are not th e property of s tr i n g - p l a y e r . Finally, there rem ains th e a r t i s t and th e m u sic ia n , which are th e subsum ers of th e concept s tr i n g - p l a y e r . A fter finding out all th e subsum ers, we m ust search for th e MSSs am ong those subsum ers. To find out th e MSSs, a cancel-marker is set on the subsum ers and propagated one step up through th e subsum ption link. By doing this, all the subsum ers except th e lowest level node are canceled. In our exam ple, th e cancel- m arker is propagated from th e a r t i s t and th e m u sic ia n . T he m arker from m u sic ia n cancels a r t i s t , and finally the m u sic ia n is identified as a MSS of the s tr i n g - p l a y e r . 64 S n i a a i l n » - . i i i i i i i i i i |||I « < artist J ------------------- *"( ^ ) deal-with ~ / musical musician — vinstramen string instrument/ vinstrumen m rf pf, - \ sub violinist violin pianist piano Figure 5.2: Propagation of m arkers at P hase 1 and 2 'm u s ic a l'' .instalment, musician 'string'X /iceyboaw ' instrument/ vmstrumeni cancel violinist violin pianist piano Figure 5.3: Propagation of m arkers a t P hase 3 1 I person profession ^ ------- \ o — deal-with artist M SS ind 'm usical^ instrument, musician ''s tr in g '''' .instrument in stru m e n t iM. violinist violin play M G S •m l' Ind pianist piano play Figure 5.4: Propagation of m arkers to find out th e M G S M G S person profession deal-with musical instrument musician string ^ /Tceybo instrument/ vinstrumen violinist violin pianist Figure 5.5: T he resulting hierarchy after adding th e new concept t I 6 6 I A fter finding MSSs, we m ust find out MGSs and modify th e hierarchy. Fig ure 5.4 shows th e propagation of ind-m arker, R -m arker and V-marker. T he ind- m arker is propagated from th e node m u sic ia n (M SS) and its children v i o l i n i s t and p i a n i s t are m arked. A fter propagating R -m arker and V-m arker from p la y (R) and s tr in g - in s tr u m e n t( V), the ind-m arker at p i a n i s t is canceled, and v i o l i n - , i s t rem ains as a MGS of s t r i n g - p l a y e r . Finally, th e subsum ption link m ust be created betw een m usician(M 5'5) and s tr in g - p la y e r ( C ) , and betw een s tr in g - p la y e r ( C ) and v i o l i n i s t (M G S). T he subsum ption link betw een m usician(M 5'5) and v io lin ist(M (7 .S ) m ust be deleted. Figure 5.5 shows th e resulting hierarchy. 5.3 Performance of the Algorithm 5 .3.1 T im e c o m p le x ity It was already m entioned th a t sequential classification algorithm takes 0 ( M n 2) tim e, w here M is th e num ber of concepts in th e taxonom y and n is th e num ber . of properties of a concept. In order to analyze th e tim e com plexity of th e parallel classification algorithm , following param eters are defined. • Fin : Average num ber of fan-in for a concept in th e hierarchy (average num ber of direct superconcepts for one concept). 
In Phase 1 of the algorithm, we must propagate markers through all the superconcepts and subconcepts of the given concept. Therefore, it takes O(Fout·Dave + Fin·Dave) time to finish the propagation. Generally, Fout is much greater than Fin, so we can simplify the time for Phase 1 to O(Fout·Dave). In Phase 2 and Phase 3, markers are propagated through the subsumption links and the roleset links together. Therefore, the propagation time is O((Fin + Rave)·Lave), and it can likewise be simplified to O(Rave·Lave). As a result, the overall time for the parallel subsumption test is O(Fout·Dave + Rave·Lave). But the Fout and Rave factors only affect the marker-injection time, which is relatively small and negligible. Moreover, these factors are not related to the number of concepts in the concept hierarchy. In other words, Fout and Rave do not grow as the size of the knowledge base grows, and they can be regarded as constants. Furthermore, if we assume that most roleset relations are not transitive (which is generally true: e.g., A likes B and B likes C do not imply A likes C), Lave is strictly related to Dave, and it is less than Dave. Therefore, the time complexity of the parallel subsumption test can be simplified to O(Dave). The search for the MSSs can be done in constant time, and the time complexity of searching for the MGSs is the same as that of the parallel subsumption test. Consequently, O(Dave) is the overall time complexity of the algorithm.

Since we want to know the time complexity of classification as a function of the size of the knowledge base, Dave must be replaced by some function of M (the number of concepts in the taxonomy). Generally, a concept taxonomy has a tree-like structure, and the average depth of a tree can be represented as log N, where N is the number of nodes in the tree and the base of the logarithm is the branching factor. Therefore, the time complexity of the parallel classification algorithm as a function of the taxonomy size is finally O(log_Fout M).

It was already shown that sequential classification needs O(M) subsumption tests and that each subsumption test takes O(R'ave²) time, where R'ave represents the number of all roleset relations of one concept, including the inherited ones. This can be calculated as Dave·Rave, or (log M)·Rave. Therefore, the overall time complexity of sequential classification is O(M (log M)² Rave²). Thus, if we regard Rave as a constant, the speedup factor achieved by the parallel algorithm is O(M log M), and the parallel algorithm greatly improves the performance as the knowledge base size grows.
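For reference, the bounds derived in this subsection can be collected in one place (in LaTeX notation; the last line treats Rave as a constant, as argued above):

    \begin{align*}
      T_{\mathrm{parallel}}   &= O(F_{out}D_{ave} + R_{ave}L_{ave})
                               = O(D_{ave}) = O(\log_{F_{out}} M),\\
      T_{\mathrm{sequential}} &= O\bigl(M(\log M)^2 R_{ave}^2\bigr),\\
      \text{speedup}          &= T_{\mathrm{sequential}}/T_{\mathrm{parallel}}
                               = O(M \log M) \quad (R_{ave}\ \text{constant}).
    \end{align*}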
5.3.2 Strategies for performance improvement

Since the communication time to propagate markers generally dominates the overall processing time, the physical distance of propagation must be reduced to improve performance. Also, the number of markers which are propagated simultaneously must be reduced, because additional markers cause more collisions during propagation, which results in increased propagation delays. Therefore, the following factors can be considered in general.

1) Allocation of concepts in the network: Efficient allocation of concepts on the network can improve the performance of marker-passing algorithms. Since there is a restricted number of neighbors for each chip, we must optimize the communication distance between linked concepts to minimize the marker propagation time. This can be achieved by analyzing the whole knowledge base at the time it is loaded. To maintain the optimal communication distance, however, we would have to re-allocate the whole network every time a new concept is added or a change to the knowledge is made. To avoid this problem, a dynamic allocation scheme based on simple heuristics is used to reduce the communication distances.

2) Reducing the search space vs. reducing the marker propagation space: In many AI algorithms, performance can usually be improved by reducing the search space using various heuristics. In a classification algorithm, the search space for the MSSs can be reduced by considering the subsumption relations between the concepts. When one concept is found to be a subsumer of an input concept, its parent concepts do not need to be tested; they are also subsumers. When none of the parents of a concept are subsumers, the concept does not need to be tested; it cannot be a subsumer. With these rules the search space for finding subsumers can be reduced significantly.

In a parallel implementation, the performance of the algorithm can be improved in a similar way, by reducing the marker propagation space. This can be done in two ways. First, restrict the marker-passing route by putting conditions on the relation nodes, which prevent the markers from propagating further beyond those relation nodes. Second, reduce the number of source nodes (starting nodes) for certain markers. This can be done by pre-propagating a dummy marker and filtering out the unnecessary source nodes before the initiation of the real markers.

In the parallel classification algorithm, there can be a large number of redundant cancel-markers in Phase 3, depending on the input description and the structure of the concept hierarchy. A large number of redundant markers causes heavy marker traffic, which may degrade the performance of the overall algorithm. The number of cancel-markers in this phase can be reduced by restricting the sources of the cancel-markers. As an example, Figures 5.6 and 5.7 show this restriction. In the concept hierarchy, S is the known (explicitly given) superconcept. The parent concepts of S are already known to be subsumers, and the MSSs can be found among their children. To filter out those concepts which are not subsumers of the input concept, the cancel-markers start from all the nodes which are not in the sets Sup, SR and SV, as in Figure 5.6. In Figure 5.6, all the marker sources which are not connected to the children nodes of S are redundant: the markers started from those nodes do not affect the result. To prevent markers from starting at the unrelated nodes, in Figure 5.7 a dummy marker is pre-propagated from the candidate nodes for the MSSs, restricting the source nodes of the cancel-marker. The shaded area in Figure 5.7 shows the reduced marker propagation space resulting from the pre-propagation.

[Figure 5.6: Propagation of the cancel-marker without space reduction]
[Figure 5.7: Propagation of the cancel-marker in restricted space]

5.4 Summary

Classification is a process of constructing a concept hierarchy according to the subsumption ordering of the concepts in the knowledge base. In this chapter, a massively parallel classification algorithm, which can be performed on a parallel marker-passing architecture, was presented. The parallel classification algorithm is tailored to the SNAP marker-passing architecture and its knowledge representation scheme. It deals with a generalized structure for a knowledge base and is focused mainly on an efficient parallel implementation.
T he m arkers started from those nodes do not affect the result. To prevent th e m arkers from starting from th e unrelated nodes, in Figure 5.7, a dum m y m arker is pre-propagated from th e candidate nodes for MSS, and restrict th e source nodes of cancel-m arker. T he shaded area in Figure 5.7 shows the reduced m arker propagation space as a result of pre-propagation. 5.4 Summary Classification is a process of constructing a concept hierarchy according to th e sub sum ption ordering of concepts in the knowledge base. In this chapter, a m assively parallel classification algorithm , which can be perform ed on a parallel m arker- passing architecture, was presented. T he parallel classification algorithm is ta i lored for th e SNAP m arker passing architecture, and its knowledge representation scheme. It deals w ith a generalized stru ctu re for a knowledge base and is focused 71 m ainly on an efficient parallel im plem entation. It does not deal w ith m ore com plex cases such as m ultiple inheritance or exceptions. However, th e algorithm can be im plem ented on other parallel m achines, and can also be expanded or modified for m ore complex cases or other knowledge representation schemes. SNAP supports m ultiple m arker passing and associative processing on the knowledge base which is distributed in th e array. By using these capabilities, the classification and the property retrieval of a concept on SNAP can be perform ed in 0(logpoutM ) tim e, w here M represents the to ta l num ber of concepts in the knowledge base and Fout represents th e average fan-out of concepts. By reducing the propagation space, the perform ance of th e algorithm can be im proved even further. However, even w ithout these im provem ents, th e processing tim e of th e parallel classification algorithm , using th e m arker-passing, achieves high perform ance com pared to th e sequential algorithm . T he actu al experim ental results are presented in th e next chapter. 72 Chapter 6 Experimental Results This chapter describes several experim ental results on p atte rn acquisition, general ization and classification algorithm s. T he PALKA system and parallel algorithm s presented in previous chapters are used for these experim ents. T he experim ental results on th e acquisition rate and the recognition accuracy dem onstrate th e feasi bility of th e representation and acquisition approach described in this thesis. T he experim ental results on th e effect of generalization show how th e recognition ac curacy improves as generalization is perform ed. The efficiency of parallel m arker- passing is dem onstrated by th e perform ance of th e classification algorithm . T he experim ent environm ents are described first, and then th e experim ental results are discussed in detail. 6.1 The Environments T he dom ain texts used for the experim ents are the M UC (Message U nderstanding Conference) tex t sets used for MUC-4 and MUC-5. These tex ts are not sam ple sentences m ade for th e experim ental purpose, b u t real-world news articles. T he sentences are lim ited in term s of their m eaning (sem antics), b u t unlim ited in term s of th eir form (syntax). T he M UC-4 dom ain is a terrorist incident dom ain, which is a set of news a rti cles describing terrorist events in Latin America. 500 texts, each of which contains 14 sentences on average, were used for th e experim ent on the acquisition. 
The MUC-5 domain is a joint-venture domain: also a set of news articles, describing cooperative associations between different companies. 500 news articles were also used for the experiments. For the MUC-4 domain, there exist matching templates for 1,400 texts, and those templates were also used for the acquisition. For the MUC-5 domain, only texts were used, as described in earlier chapters.

The PALKA prototype was implemented in C on a SUN workstation, and all the experiments were performed with the SNAP simulator. The SNAP simulator provides a standard C programming environment and a special-purpose instruction set for parallel marker-passing within a distributed knowledge base. Several parts of the PALKA system and the classification algorithm were implemented using those SNAP instructions. For the classification experiments, a version of the simulator that can measure the performance of the algorithm in terms of the number of machine cycles was used.

The knowledge base (domain concept hierarchy) used for the acquisition and generalization contains 12K semantic concepts. The dictionary contains more than 16K words, including domain-specific terms and 1.7K entries of domain-specific noun phrases. All the noun entries have one or more connections to the concepts in the concept hierarchy. The domain-specific terms and noun phrases include organization names, geographic names, names of weapons, etc. for MUC-4, and company names, government organization names, product names, etc. for MUC-5. These domain-specific terms were used to detect information-carrying words when a pattern is constructed.

6.2 The Experiments

Four different groups of experimental results are described in this section. First, the usefulness of the PALKA system is discussed in terms of the time needed to construct the knowledge base of semantic patterns. Second, the feasibility of the approach is demonstrated by observing the change of the acquisition rate while processing texts, and the change of the recognition accuracy as semantic patterns are generated. Third, the efficiency of generalization is shown through the improvement of recognition accuracy. Finally, the efficiency of parallel marker-passing in terms of processing speed is demonstrated with the classification algorithm.

6.2.1 Time to construct the knowledge base

One of the major goals of the development of the automated semantic pattern acquisition tool is to reduce the time to analyze domain texts and the time to construct a set of properly structured patterns. This gives portability when a text processing system moves from one domain to another, and scalability when the application domain keeps growing. For the MUC-4 domain, the time to acquire 114 patterns for the BOMBING and KILLING frames from 500 news articles was less than 5 hours, including manual post-processing such as minor corrections to the phrasal pattern forms. For the MUC-5 domain, the time to acquire 122 patterns for the TIE-UP frame from 500 texts was less than 8 hours.
For the same tasks, manual creation of the patterns took more than 1 person-month.

6.2.2 Acquisition rate

Figure 6.1 shows the total number of sentences extracted for each frame from the 500 texts, the number of new FP-structures acquired, the number of generalizations and specializations performed, the number of final FP-structures, and the average number of FP-structures created per sentence.

    Frame    Sentences  Patterns  Generali-  Speciali-  FP-structures  Average      Average
             extracted  acquired  zations    zations    created        acquisition  creation
    BOMBING  220        89        22         5          67             40.5%        75.3%
    KILLING  601        108       71         12         37             18.0%        34.3%
    TIE-UP   703        263       -          -          122            37.4%        46.4%

    Figure 6.1: Result of the acquisition from 500 MUC-4 and MUC-5 texts

The result shows that 75.3% of the acquired patterns created a new FP-structure for the BOMBING frame, 46.4% created a new FP-structure for the TIE-UP frame, and only 34.3% created a new FP-structure for the KILLING frame. This shows that a relatively smaller number of different expressions is used to describe the KILLING event in this domain. As an example, the pattern "[(target:X) BE KILL]" was found 97 times during the acquisition process. Several examples of collected sentences and the FP-structures acquired from them are shown in Figure 6.2.

THE BOMB, MADE UP OF DYNAMITE AND A FUSE, EXPLODED JUST BEFORE DAWN IN THE HONDUTEL OFFICE IN SAN PEDRO SULA, 190 KM NORTH OF THIS CAPITAL.
BOMBING: [ (INSTRUMENT: BOMB) explode in (TARGET: BUILDING) ]

POLICE SOURCES HAVE REPORTED THAT THE EXPLOSION CAUSED SERIOUS DAMAGE TO THE SALVADORAN EMBASSY BUILDING IN THE ELEGANT PROVIDENCIA NEIGHBORHOOD IN EASTERN SANTIAGO.
BOMBING: [ (EXPLOSION) cause (EFFECT: DAMAGE) to (TARGET: BUILDING) ]

GUERRILLAS ATTACKED MERINO'S HOME IN SAN SALVADOR 5 DAYS AGO WITH EXPLOSIVES.
BOMBING: [ (AGENT: HUMAN) attack (TARGET: BUILDING) with (INSTRUMENT: EXPLOSIVE) ]

TERRORISTS THREW A BOMB AT CIVIL DEFENSE MEMBERS IN NEJAPA, NORTH OF SAN SALVADOR, WHILE HARASSING THE PARAMILITARY GROUP'S OUTPOST.
BOMBING: [ (AGENT: HUMAN) throw (INSTRUMENT: BOMB) at (TARGET: PHYSICAL) ]

THE OFFICIAL REPORT POINTS OUT THAT THE POLICEMEN WERE KILLED BY THE EXPLOSION OF DYNAMITE THAT HAD BEEN PLACED IN THE VEHICLE TRANSPORTING THEM.
BOMBING: [ (TARGET: HUMAN) be kill by (EXPLOSION) of (INSTRUMENT: BOMB) ]

TWENTY EIGHT POLICEMEN HAVE BEEN KILLED IN THIS CITY OVER THE LAST TWO WEEKS.
KILLING: [ (TARGET: HUMAN) be kill ]

NEVERTHELESS, ACCORDING TO MILITARY REPORTS ISSUED EIGHT HOURS AFTER THE CLASHES IN THAT TOWN, THE_ARMED_FORCES KILLED 11 GUERRILLA MEMBERS AND SUSTAINED SIX CASUALTIES.
KILLING: [ (AGENT: ANIMATE) kill (TARGET: HUMAN) ]

YESTERDAY SHINING_PATH TERRORISTS ARRIVED IN THE VILLAGE OF DUNCHIPE AND SHOT 16 PEASANTS WHO WERE MEMBERS OF THE PEASANT PATROLS.
KILLING: [ (AGENT: HUMAN) shoot (TARGET: HUMAN) ]

KYOCERA AGREES TO FORM JOINT_VENTURE FOR FINANCING AND LEASING WITH SANWA BANK.
TIE-UP: [ (ENTITY1: COMPANY) agree to form (JOINT_VENTURE) with (ENTITY2: COMPANY) ]

CIBA-GEIGY LIMITED OF BASEL, SWITZERLAND, AND CIBA-GEIGY CORPORATION HAVE SET UP A JOINT_VENTURE IN MICROELECTRONIC MATERIALS.
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) set up (JOINT_VENTURE) ]

MITSUBISHI CORP_ IS ENTERING INTO A JOINT_VENTURE WITH THE AYALA CORP_ TO CONVERT 300 HECTARES OF REAL_ESTATE IN LAGUNA PROVINCE INTO ANOTHER INDUSTRIAL PARK.
TIE-UP: [ (ENTITY1: COMPANY) enter into (JOINT_VENTURE) with (ENTITY2: COMPANY) ]

CINCINNATI BELL INFORMATION SYSTEMS FORMED A JOINT_VENTURE WITH KINGSTON COMMUNICATIONS PLC TO MARKET SOFTWARE PRODUCTS AND SERVICES IN EUROPE.
TIE-UP: [ (ENTITY1: COMPANY) form (JOINT_VENTURE) with (ENTITY2: COMPANY) ]

Figure 6.2: Example sentences and corresponding semantic patterns acquired
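To make the mapping in Figure 6.2 concrete, the following Python sketch matches a phrasal pattern of that kind against a tokenized sentence. It is a simplified sequential stand-in for the SNAP parallel matcher: the lexicon is illustrative, tokens are assumed already morphologically normalized (e.g., "were killed" as "be kill"), and a full implementation would test subsumption in the concept hierarchy (e.g., EXPLOSIVE subsumes BOMB) rather than exact concept equality.

    # Simplified matcher for phrasal patterns such as
    #   [ (TARGET: HUMAN) be kill by (EXPLOSION) of (INSTRUMENT: BOMB) ]
    # A pattern element is either a literal word or a (slot, concept) pair.

    # Illustrative lexicon: word -> semantic concept.
    ISA = {"policemen": "HUMAN", "explosion": "EXPLOSION", "dynamite": "BOMB"}

    def matches(pattern, tokens, isa=ISA):
        """Greedy left-to-right match; returns the filled slots or None."""
        slots, i = {}, 0
        for elem in pattern:
            if i >= len(tokens):
                return None
            if isinstance(elem, tuple):          # (slot, concept) element
                slot, concept = elem
                if isa.get(tokens[i]) != concept:
                    return None
                if slot is not None:
                    slots[slot] = tokens[i]
            elif tokens[i] != elem:              # literal word element
                return None
            i += 1
        return slots if i == len(tokens) else None

    pattern = [("TARGET", "HUMAN"), "be", "kill", "by",
               (None, "EXPLOSION"), "of", ("INSTRUMENT", "BOMB")]
    print(matches(pattern, ["policemen", "be", "kill", "by",
                            "explosion", "of", "dynamite"]))
    # -> {'TARGET': 'policemen', 'INSTRUMENT': 'dynamite'}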
A basic assum ption of our approach to sem antic p attern acquisition is th a t only a finite num ber of expressions is frequently used in a specific dom ain to represent a specific event. In other words, the p attern s acquired from a relatively small num ber of sam ple tex ts can cover a m uch larger num ber of texts from the same dom ain. The growth of th e knowledge base eventually becomes satu rated . Figures 6.3, 6.4 and 6.5 show th e changes of acquisition rates for th e BOMB ING, KILLING and TIE-UP frames while processing 500 texts. Since the acquisition rate varies depending on th e order of sentences processed, 100 experim ents were perform ed w ith random re-ordering of sentences, and th e results were averaged. Also, to observe th e effect of generalization to th e acquisition rate, th e increm en tal algorithm was used in these experim ents. In Figure 6.3, the acquisition rate decreases w ithout being satu rated yet. This is because: 1) there were only 200 related sentence exam ples found, and 2 ) as m entioned earlier, a relatively large num ber of expressions are used to describe th e BOMBING event. M ore exam ple sentences are needed to reach the saturation. In Figures 6.4 and 6.5, th e acqui sition rate for the KILLING fram e is alm ost satu rated when 200 sentence exam ples are processed (only 2 /3 of to tal processing is shown), and th e acquisition ra te for th e TIE-UP fram e approaches the saturation point w ith 600 sentence exam ples. In all cases, it is clear th a t th e acquisition rate strictly decreases, which m eans th a t the size of th e knowledge base approaches th e saturation point. Also, in the experim ents w ith the BOMBING and th e KILLING fram e, th e effect of generalization on the acquisition rate is clearly shown. W ith generalization, the acquisition rate decreases m ore rapidly because some of th e p attern s are not actually created but generalized w ith others. 77 0.6 1 0.5 0.4 * 0.3 without generalization 0.2 with generalization Z 0.1 100 200 Processed sentence number Figure 6.3: Average num ber of p attern s created for BO M BIN G fram e 78 0.5 0.4 without generalization I 0.3 of p < with generalization £ 0.2 0.1 300 400 100 200 Processed sentence number Figure 6.4: Average num ber of p attern s created for KILLING fram e 79 0.4 i 0.3 without generalization S ' 0.2 0.1 0 100 200 500 600 300 400 Processed sentence number Figure 6.5: Average num ber of pattern s created for T IE -U P fram e 80 1 00 90 80 70 60 50 40 30 20 10 0 20 40 60 80 100 120 140 160 180 200 Training set size Figure 6.6 : Recognition accuracy vs. training set size for BO M BIN G fram e 81 100 90 80 70 60 50 40 30 20 10 0 20 40 60 80 100 120 140 160 180 200 Training set size Figure 6.7: Recognition accuracy vs. training set size KILLING fram e 82 1 00 90 80 70 60 50 40 30 20 10 0 40 80 120 160 200 240 280 320 360 400 Traning set size Figure 6 .8 : Recognition accuracy vs. training set size for T IE -U P fram e 83 Figures 6 .6 , 6.7 and 6.8 show th e im provem ents of parsing perform ances as m ore sentences are used to acquire pattern s for the three frames. For these exper im ents, various sizes of training sentence sets were used to generate p attern s, and 200 sentences corresponding to each fram e were random ly selected from MUC-4 and MUC-5 texts as a test set. As th e previous experim ents on th e acquisition rates, th e results from 100 experim ents were averaged. 
In Figures 6.6 and 6.8, the recognition accuracy reached 60% using the patterns acquired from 200 and 400 training sentences, respectively. This shows that many more sample sentences are required to cover the BOMBING and TIE-UP frames. In Figure 6.7, the recognition accuracy for the KILLING frame reached 90% using the patterns from the same number of training sentences.

6.2.3 Generalization

To observe the effect of generalization on the parsing performance, excluding the effect of the growth of the knowledge base, experiments with a single pattern were performed. For these experiments, 100 sentences which can be matched to the pattern "[(instrument:X) EXPLODE]" were collected from the MUC-4 corpus, and tagged as positive or negative examples. For example, as shown in Figure 4.3, "three dynamites exploded" is a positive example, and "the airplane exploded" is a negative example for the "[(instrument:X) EXPLODE]" pattern. From the sentence set, n (the training set size) sentences were randomly selected, and single-step generalization was performed on the semantic constraint X of the instrument slot. The generalization result was then used to parse the original 100 sentences to compute the recognition accuracy. Since the generalization result may vary according to the selection of the training sentences, 100 experiments were performed for each training set size, and the resulting recognition accuracies were averaged.

The results in Figures 6.9 and 6.10 show that the recognition accuracy increases monotonically when the generalization is performed with a larger training set. However, it is affected by: 1) how many negative examples are in the training sentence set, and 2) how many concepts are necessary to represent the optimal semantic constraint.

[Figure 6.9: Effect of generalization on the parsing performance with various percentages of negative examples]
[Figure 6.10: Effect of generalization on the parsing performance for patterns with various semantic constraint group lengths]

Figure 6.9 shows the effect of generalization on the recognition accuracy for various ratios of negative examples in the sentence set. The result shows that if there is a larger number of negative examples, the recognition accuracy improves more rapidly as the training set size increases. It also implies that it is better to select more negative examples for the training set. Figure 6.10 shows the effect of generalization on the recognition accuracy for patterns with various semantic constraint group lengths. The semantic constraints of different patterns may have descriptions of different lengths. For example, a 3-group semantic constraint means that the optimal semantic constraint is represented by a disjunction of 3 concepts, as (A1 ∨ A2 ∨ A3). In this experiment three patterns with different semantic constraint group lengths were used. The result shows that more complex descriptions of semantic constraints need more examples to produce a correct generalization.
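The evaluation protocol used in these generalization experiments is easy to state schematically. The following Python sketch shows the averaging loop; 'generalize' and 'accuracy' are hypothetical stand-ins for the single-step generalization of Chapter 4 and the parser-based scoring over the 100 tagged sentences, and the sketch reproduces the procedure, not the measured numbers.

    # Schematic of the generalization evaluation protocol (Section 6.2.3).
    import random

    def evaluate(tagged_sentences, n, generalize, accuracy, trials=100):
        """Average recognition accuracy over random training sets of size n."""
        total = 0.0
        for _ in range(trials):
            training = random.sample(tagged_sentences, n)
            constraint = generalize(training)        # single-step generalization
            total += accuracy(constraint, tagged_sentences)
        return total / trials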
In all cases, the average parsing recognition accuracy reached 90% with generalization, using far less than 20% of the entire set of texts related to a specific pattern.

6.2.4 Classification

The classification algorithm was simulated on the SNAP simulator to demonstrate the efficiency of parallel marker-passing. Knowledge bases of arbitrary size were generated for the simulation with different values of Fout. The structure of the knowledge bases used in the simulation is similar to the one shown in the examples in Chapter 5, which has a tree-like pattern. The multiple-inheritance case was not considered, for simplicity of simulation.

The effect of fan-out on the processing time was simulated with a fixed number of roleset relations and a fixed size of the knowledge base. The simulation results with a knowledge base of 256 concepts and one local roleset relation per concept are shown in Figure 6.11 for Fout from 2 to 8. The graph shows that Fout does not significantly affect the computation time, as previously mentioned in Chapter 5. Rather, the computation time decreases slightly as Fout increases. This is because the depth of the concept hierarchy decreases as Fout increases when the number of concepts in the knowledge base is fixed. Even though the marker-injection time may increase, it cannot significantly affect the processing time, since the decrease in marker propagation time due to the smaller depth of the hierarchy dominates the overall processing time.

[Figure 6.11: Effect of fan-out on the processing time of classification]

Figures 6.12 and 6.13 show the results of simulations with various sizes of the knowledge base. The fan-out was fixed at 2, and the number of (locally presented) roleset relations was fixed at 1. As shown in the figures, the computation time increases almost logarithmically when the knowledge base size is small. As the knowledge base size grows, the computation time of classification becomes slightly larger than the predicted execution time. This is caused by the communication overhead due to the heavy traffic of cancel-markers. If several markers are received by a node simultaneously during the propagation, the propagation of markers is delayed through buffering. As the number of nodes in the network increases, heavy marker traffic causes more delays, and this may affect the processing time. This can be alleviated by increasing the granularity and reducing the propagation space of the markers, as described in the previous chapter.

[Figure 6.12: Processing time of classification for different knowledge base sizes]
[Figure 6.13: Processing time of property retrieval for different knowledge base sizes]

6.3 Summary

In this chapter, several experimental results on the pattern acquisition, generalization, and classification algorithms have been discussed. The change of the acquisition rate and the recognition accuracy shows that saturation occurs with a relatively small amount of training text, if the domain is reasonably limited. The experimental results on the effect of generalization show that the parsing accuracy strictly improves as generalization is performed.
The analysis of the time complexity of the classification algorithm has also been verified through experiments. Most importantly, the experimental results on the acquisition rate show the feasibility of the representation and acquisition approach described in this thesis. What the results demonstrate is that only a relatively small number of expressions is frequently used to represent certain events or facts in a limited domain, and consequently, the phrasal-pattern based knowledge representation presented in this thesis, together with an automatic acquisition tool, can be successfully applied to information extraction from texts, providing scalability, portability, and practicality.

Chapter 7
Conclusion and Future Work

This thesis addressed the issue of automatic semantic pattern acquisition for practical information extraction from natural language texts. For scalability and speed, a new representation of semantic patterns was presented, and marker-passing was introduced as a suitable paradigm for massively parallel processing. This research was performed in the context of the MUC task and the SNAP project. The special features of the information extraction task in the MUC, together with the parallel pattern matching capability of SNAP, motivated the work in this thesis. In this final chapter, the contributions of this thesis are reviewed, and directions for future research are discussed.

7.1 Contributions

The goal of this research is to provide scalability and speed to information extraction systems, so that they can be used for practical applications and can be adapted easily to a new domain. The major contributions of this work are as follows.

The Frame-Phrasal pattern representation

Based on observations of the unique features of the information extraction task, a semantic pattern representation called the FP-structure has been developed. In this structure, the semantic pattern is represented as a pair of a meaning frame and a surface phrasal pattern. By using this representation, the mapping from input text to information structure can be performed directly. It provides scalability, since the structure is regular and can be acquired from texts, and it provides speed, since the interpretation of input becomes a pattern matching for which parallel processing is feasible and effective.

The automatic pattern acquisition

A practical approach to lexical (semantic) acquisition has been developed. The construction of the semantic pattern knowledge base is automated by the acquisition system PALKA. It acquires domain-specific semantic patterns from training texts, by using corresponding templates (databases) and a domain concept hierarchy. The saturation of the knowledge base size (convergence of the acquisition rate) in several experiments shows the feasibility of this approach. The significantly reduced knowledge base construction time demonstrates the usefulness of the acquisition tool.

Inductive generalization of semantic constraints

An inductive algorithm for the generalization of semantic constraints within a concept hierarchy has been developed. This algorithm finds the optimal level of generalization for semantic constraints from positive and negative examples. The single-step generalization was also implemented as a parallel algorithm on a distributed concept hierarchy.
Application of the marker-passing parallel model

The parallel marker-passing model has been applied to the major parts of the system: pattern matching for information extraction, single-step generalization, and classification on a distributed concept hierarchy. These algorithms have been implemented by using the SNAP instructions. The experimental results on the classification algorithm show the efficiency of marker-passing. By distributing the semantic patterns in the memory of SNAP, and by performing pattern matching through massively parallel marker-passing, the SNAP system could achieve fast performance in information extraction.

7.2 Future Research

There are limitations on the current representation of patterns, the scope of acquisition, and knowledge source availability. In this section, possible future work for improving the acquisition method described in this thesis and possible applications to other tasks are discussed.

Providing flexibility to the pattern

In the FP-structure representation, the phrasal pattern form is very rigid. Even though one element represents the semantic meaning of a whole noun group, the sequence of elements is strictly specified. In some cases, a more flexible specification of a pattern may produce a better recognition result. For example, if a pattern for BOMBING is specified as "[(instrument: EXPLOSIVE) ... (target: BUILDING)]" (which means there is a word whose semantic category is EXPLOSIVE followed by a word whose category is BUILDING), then various sentences can be matched to this pattern. One problem is that if a pattern is too flexible, it not only produces many correct matchings, but also produces many incorrect results. There is a trade-off between flexibility and recognition accuracy. A carefully chosen flexibility can cover a wide range of sentence patterns while maintaining reasonable accuracy. It depends entirely on the characteristics of each domain.

Acquisition of noun phrase patterns

Currently, only the semantic patterns for verb-oriented clauses are acquired. To provide more detailed information for each slot of the frame, semantic patterns for noun phrases should also be acquired. For example, in the sentence "The bombing of the PRC embassy by urban guerrillas was reported ...", the subject noun phrase contains all the necessary information, and the main verb phrase "be reported" need not be represented as a pattern.

Investigation of relations between patterns

The current system merges several patterns only when those patterns have exactly the same phrasal structure with different semantic constraints. The merge in that case is performed through generalization. Merging of patterns that have different structures but common parts can also be accomplished. Also, relations between different patterns can be investigated. Merging different patterns and establishing relations are desirable for both the efficiency of representation and the flexibility of interpretation.

Use of other knowledge sources

Since one of the knowledge sources, the templates, is not always available, other knowledge sources should also be considered. Although the semantic information in the template can be easily provided, it still requires extra work. There is the user-interactive mode of acquisition, but it takes more time.
One of the possible choices for knowledge sources is the use of a machine-readable on-line dictionary such as the Longman Dictionary of Contemporary English (LDOCE). Machine-readable dictionaries can be a good source of grammatical subcategorization and semantics (although the latter is not well defined).

Incorporation with statistical methods

Statistical probability information on word collocations, acquired by analyzing a very large corpus, is a good candidate for improving the accuracy of a pattern and for acquiring simple noun phrase patterns. As shown in Jacobs' work [46], statistical information can help in selecting the individual terms used as the building blocks of a pattern.

Application to other tasks

The acquisition scheme described in this thesis can be applied to any task that has a large amount of sample texts and desirable output representations. For the information extraction task, only a small amount of change is necessary for a new domain. The PALKA system used for the terrorist domain of MUC-4 could be directly applied to the joint-venture domain of MUC-5 after defining new frames with new keywords. The same approach can also be applied to other tasks like natural language interfaces to databases and natural language help systems. For these tasks the FP-structure can be used for direct mapping from queries to their interpretations. The direct mapping is possible since the domain is very narrow. One possible innovative application can be found in speech recognition and dialog translation in a limited domain. Pattern-based approaches for speech can be found in [35] [52]. Given a large set of utterance (or dialog) examples in a specific domain, PALKA can be used to generate a domain-specific semantic grammar.

7.3 Summary

This thesis dealt with the scalability and speed issues in textual information processing. The primary concern was practicality. All the work was presented in the context of the MUC tasks and the SNAP project. The scalability needed for practical applications was provided by a new knowledge representation for direct semantic mapping and by the automated pattern acquisition system. The speed performance was discussed in conjunction with the SNAP marker-passing parallel machine. The marker-passing based parallel pattern matching scheme, together with the patterns distributed in the memory, enables fast processing of a large amount of texts. The semantic pattern acquisition scheme described in this thesis can also be used for other applications in which a large number of semantic patterns are required. Using the approach described here, one of the bottlenecks in knowledge-based information processing can be resolved through automatic pattern acquisition.

Appendix A

Example training texts and templates from MUC-4

The following shows an example text and its corresponding templates from the terrorist incidents domain of MUC-4.

A.1 A sample text

TST1-MUC3-0099

LIMA, 25 OCT 89 (EFE) -- [TEXT] POLICE HAVE REPORTED THAT TERRORISTS TONIGHT BOMBED THE EMBASSIES OF THE PRC AND THE SOVIET UNION. THE BOMBS CAUSED DAMAGE BUT NO INJURIES.

A CAR-BOMB EXPLODED IN FRONT OF THE PRC EMBASSY, WHICH IS IN THE LIMA RESIDENTIAL DISTRICT OF SAN ISIDRO. MEANWHILE, TWO BOMBS WERE THROWN AT A USSR EMBASSY VEHICLE THAT WAS PARKED IN FRONT OF THE EMBASSY LOCATED IN ORRANTIA DISTRICT, NEAR SAN ISIDRO.
POLICE SAID THE ATTACKS WERE CARRIED OUT ALMOST SIMULTANEOUSLY AND THAT THE BOMBS BROKE WINDOWS AND DESTROYED THE TWO VEHICLES. NO ONE HAS CLAIMED RESPONSIBILITY FOR THE ATTACKS SO FAR. POLICE SOURCES, HOWEVER, HAVE SAID THE ATTACKS COULD HAVE BEEN CARRIED OUT BY THE MAOIST "SHINING PATH" GROUP OR THE GUEVARIST "TUPAC AMARU REVOLUTIONARY MOVEMENT" (MRTA) GROUP. THE SOURCES ALSO SAID THAT THE SHINING PATH HAS ATTACKED SOVIET INTERESTS IN PERU IN THE PAST.

IN JULY 1989 THE SHINING PATH BOMBED A BUS CARRYING NEARLY 50 SOVIET MARINES INTO THE PORT OF EL CALLAO. FIFTEEN SOVIET MARINES WERE WOUNDED. SOME 3 YEARS AGO TWO MARINES DIED FOLLOWING A SHINING PATH BOMBING OF A MARKET USED BY SOVIET MARINES.

IN ANOTHER INCIDENT 3 YEARS AGO, A SHINING PATH MILITANT WAS KILLED BY SOVIET EMBASSY GUARDS INSIDE THE EMBASSY COMPOUND. THE TERRORIST WAS CARRYING DYNAMITE.

THE ATTACKS TODAY COME AFTER SHINING PATH ATTACKS DURING WHICH AT LEAST 10 BUSES WERE BURNED THROUGHOUT LIMA ON 24 OCT.

A.2 Corresponding templates

0. MESSAGE ID                    TST1-MUC3-0099
1. TEMPLATE ID                   1
2. DATE OF INCIDENT              24 OCT 89 - 25 OCT 89
3. TYPE OF INCIDENT              BOMBING
4. CATEGORY OF INCIDENT          TERRORIST ACT
5. PERPETRATOR: ID OF INDIV(S)   "TERRORISTS"
6. PERPETRATOR: ID OF ORG(S)     "SHINING PATH" "TUPAC AMARU REVOLUTIONARY MOVEMENT" / "MRTA"
7. PERPETRATOR: CONFIDENCE       POSSIBLE: "SHINING PATH" POSSIBLE: "TUPAC AMARU REVOLUTIONARY MOVEMENT" / "MRTA"
8. PHYSICAL TARGET: ID(S)        "EMBASSIES" / "EMBASSIES OF THE PRC"
9. PHYSICAL TARGET: TOTAL NUM    1 / PLURAL
10. PHYSICAL TARGET: TYPE(S)     DIPLOMAT OFFICE OR RESIDENCE: "EMBASSIES" / "EMBASSIES OF THE PRC"
11. HUMAN TARGET: ID(S)
12. HUMAN TARGET: TOTAL NUM
13. HUMAN TARGET: TYPE(S)
14. TARGET: FOREIGN NATION(S)    PEOPLES REP OF CHINA: "EMBASSIES" / "EMBASSIES OF THE PRC"
15. INSTRUMENT: TYPE(S)
16. LOCATION OF INCIDENT         PERU: LIMA (CITY): SAN ISIDRO (NEIGHBORHOOD)
17. EFFECT ON PHYSICAL TARGET(S) SOME DAMAGE: "EMBASSIES" / "EMBASSIES OF THE PRC"
18. EFFECT ON HUMAN TARGET(S)    NO INJURY: "-"

0. MESSAGE ID                    TST1-MUC3-0099
1. TEMPLATE ID                   2
2. DATE OF INCIDENT              24 OCT 89 - 25 OCT 89
3. TYPE OF INCIDENT              BOMBING
4. CATEGORY OF INCIDENT          TERRORIST ACT
5. PERPETRATOR: ID OF INDIV(S)   "TERRORISTS"
6. PERPETRATOR: ID OF ORG(S)     "SHINING PATH" "TUPAC AMARU REVOLUTIONARY MOVEMENT" / "MRTA"
7. PERPETRATOR: CONFIDENCE       POSSIBLE: "SHINING PATH" POSSIBLE: "TUPAC AMARU REVOLUTIONARY MOVEMENT" / "MRTA"
8. PHYSICAL TARGET: ID(S)        "EMBASSIES" / "EMBASSIES OF THE PRC AND THE SOVIET UNION" / "EMBASSY"; "VEHICLE" / "EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE THAT WAS PARKED IN FRONT OF THE EMBASSY"
9. PHYSICAL TARGET: TOTAL NUM    2 / PLURAL
10. PHYSICAL TARGET: TYPE(S)     DIPLOMAT OFFICE OR RESIDENCE: "EMBASSIES" / "EMBASSIES OF THE PRC AND THE SOVIET UNION" / "EMBASSY"; TRANSPORT VEHICLE: "VEHICLE" / "EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE THAT WAS PARKED IN FRONT OF THE EMBASSY"
11. HUMAN TARGET: ID(S)
12. HUMAN TARGET: TOTAL NUM
13. HUMAN TARGET: TYPE(S)
14. TARGET: FOREIGN NATION(S)    USSR: "VEHICLE" / "EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE THAT WAS PARKED IN FRONT OF THE EMBASSY"; USSR: "EMBASSIES" / "EMBASSIES OF THE PRC AND THE SOVIET UNION" / "EMBASSY"
15. INSTRUMENT: TYPE(S)          *
16. LOCATION OF INCIDENT         PERU: LIMA (CITY): ORRANTIA (DISTRICT)
17. EFFECT ON PHYSICAL TARGET(S) SOME DAMAGE: "EMBASSIES" / "EMBASSIES OF THE PRC AND THE SOVIET UNION" / "EMBASSY"; DESTROYED: "VEHICLE" / "EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE" / "USSR EMBASSY VEHICLE THAT WAS PARKED IN FRONT OF THE EMBASSY"
18. EFFECT ON HUMAN TARGET(S)

Appendix B

Example training texts and templates from MUC-5

The following shows an example text and its corresponding templates from the joint venture domain of MUC-5.

B.1 A sample text

<DOC>
<DOCNO> 881103-0100 </DOCNO>
<HL> FTC Moves to Block a Joint Venture On Silicones Proposed by GE, Carbide </HL>
<AUTHOR> Andy Pasztor (WSJ Staff) </AUTHOR>
<SO> </SO>
<CO> GE UK </CO>
<IN> CHM </IN>
<G> FTC </G>
<DATELINE> WASHINGTON </DATELINE>
<TXT>
The Federal Trade Commission moved to block a proposed joint venture between General Electric Co. and Union Carbide Corp. in the more than $3 billion-a-year market for silicones world-wide.

The commission, which authorized its staff to seek a preliminary injunction blocking the proposal, said the joint venture could substantially lessen competition in the production and sale of silicone products. Silicones and silicone-based products are widely used in industrial and consumer products, ranging from deodorants and carpet fibers to certain types of insulation and plastic materials.

One agency official said the proposed joint venture, announced in May as a way of expanding research and development efforts, "amounts to an outright merger of the silicones businesses" of the giant companies.

The companies previously said the joint venture would have initial annual sales of about $750 million, with GE slated to receive 70% of the venture's profits and Carbide 30%.
</TXT>
</DOC>

B.2 Corresponding templates

<TEMPLATE-8811030100-1> :=
  DOC NR: 8811030100
  DOC DATE: 031188
  CONTENT: <TIE_UP_RELATIONSHIP-8811030100-1>
           <TIE_UP_RELATIONSHIP-8811030100-2>

<TIE_UP_RELATIONSHIP-8811030100-1> :=
  TIE-UP STATUS: Existing
  ENTITY: <ENTITY-8811030100-3>
          <ENTITY-8811030100-4>
  ACTIVITY: <ACTIVITY-8811030100-1>
            <ACTIVITY-8811030100-2>

<TIE_UP_RELATIONSHIP-8811030100-2> :=
  TIE-UP STATUS: Existing
  ENTITY: <ENTITY-8811030100-5>
          <ENTITY-8811030100-6>
  JOINT VENTURE CO: <ENTITY-8811030100-7>

<ENTITY-8811030100-1> :=
  NAME: General Electric CO
  ALIASES: "GE"
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-3>

<ENTITY-8811030100-2> :=
  NAME: Union Carbide CORP
  ALIASES: "Union Carbide" "Carbide"
  LOCATION: Danbury (CITY) Connecticut (PROVINCE 1) United States (COUNTRY)
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-2>
  PERSON: <PERSON-8811030100-1>

<ENTITY-8811030100-3> :=
  NAME: General Electric Silicones
  ALIASES: "GE Silicones"
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-1>
                       <ENTITY_RELATIONSHIP-8811030100-3>

<ENTITY-8811030100-4> :=
  NAME: Union Carbide Silicones
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-1>
                       <ENTITY_RELATIONSHIP-8811030100-2>

<ENTITY-8811030100-5> :=
  NAME: Dow Chemical CO
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-4>

<ENTITY-8811030100-6> :=
  NAME: Corning Glass Works
  LOCATION: Corning (CITY) New York (PROVINCE 1) United States (COUNTRY)
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-4>

<ENTITY-8811030100-7> :=
  NAME: Dow Corning CORP
  TYPE: Company
  ENTITY RELATIONSHIP: <ENTITY_RELATIONSHIP-8811030100-4>

<INDUSTRY-8811030100-1> :=
  INDUSTRY-TYPE: Production
  PRODUCT/SERVICE: <PROD_SERV-8811030100-1>

<INDUSTRY-8811030100-2> :=
  INDUSTRY-TYPE: Sales
  PRODUCT/SERVICE: <PROD_SERV-8811030100-2>

<ENTITY_RELATIONSHIP-8811030100-1> :=
  ENTITY1: <ENTITY-8811030100-3>
           <ENTITY-8811030100-4>
  REL OF ENTITY2 TO ENTITY1: Partner
  STATUS: Current

<ENTITY_RELATIONSHIP-8811030100-2> :=
  ENTITY1: <ENTITY-8811030100-2>
  ENTITY2: <ENTITY-8811030100-4>
  REL OF ENTITY2 TO ENTITY1: Subordinate
  STATUS: Current

<ENTITY_RELATIONSHIP-8811030100-3> :=
  ENTITY1: <ENTITY-8811030100-1>
  ENTITY2: <ENTITY-8811030100-3>
  REL OF ENTITY2 TO ENTITY1: Subordinate
  STATUS: Current

<ENTITY_RELATIONSHIP-8811030100-4> :=
  ENTITY1: <ENTITY-8811030100-5>
           <ENTITY-8811030100-6>
  ENTITY2: <ENTITY-8811030100-7>
  REL OF ENTITY2 TO ENTITY1: Child
  STATUS: Current

<ACTIVITY-8811030100-1> :=
  INDUSTRY: <INDUSTRY-8811030100-1>

<ACTIVITY-8811030100-2> :=
  INDUSTRY: <INDUSTRY-8811030100-2>
  ACTIVITY-SITE: <SITE-8811030100-1>
  REVENUE: <REVENUE-8811030100-1>

<PERSON-8811030100-1> :=
  NAME: H.W. Lichtenberger
  PERSON'S ENTITY: <ENTITY-8811030100-2>
  POSITION: SREXEC

<REVENUE-8811030100-1> :=
  TYPE: Gross
  RATE: 750000000 USD / Year

<SITE-8811030100-1> :=
  SITE_LOC_OR_FACILITY: "world" (UNKNOWN)

<PROD_SERV-8811030100-1> :=
  PS_CODE: 28
  PS_TEXT: "silicones"

<PROD_SERV-8811030100-2> :=
  PS_CODE: 51
  PS_TEXT: "silicones"

Appendix C

Acquired FP-structure examples

The following shows the FP-structures acquired from 500 MUC-4 and MUC-5 texts for the BOMBING, KILLING, and TIE-UP frames.
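Before the listings, it may help to see the notation operationally. The sketch below is an illustrative reconstruction in Python, not the actual SNAP/PALKA code: the toy concept hierarchy, the flat list representation for patterns, and the simple slot-filling match procedure are all assumptions introduced only for this example.

    # Illustrative sketch only (assumed representation, not the dissertation's
    # implementation). An FP-structure pairs a meaning frame with a phrasal
    # pattern whose slots carry semantic constraints from a concept hierarchy.

    # Toy is-a hierarchy; the real system used a full semantic network.
    ISA = {
        "TERRORIST": "HUMAN", "HUMAN": "ANIMATE", "ANIMATE": "THING",
        "EMBASSY": "BUILDING", "BUILDING": "OBJECT", "OBJECT": "THING",
        "BOMB": "EXPLOSIVE", "EXPLOSIVE": "OBJECT",
    }

    def isa(concept, ancestor):
        # True if concept equals ancestor or inherits from it via ISA links.
        while concept is not None:
            if concept == ancestor:
                return True
            concept = ISA.get(concept)
        return False

    # One acquired FP-structure from Section C.1: literals are strings,
    # slots are (role, semantic constraint) pairs.
    BOMBING_FP = ("BOMBING",
                  [("AGENT", "HUMAN"), "BOMB", ("TARGET", "BUILDING")])

    def match(fp, clause):
        # Match a lemmatized clause (a mix of words and (concept, filler)
        # pairs) against an FP-structure; return the filled frame or None.
        frame, pattern = fp
        if len(pattern) != len(clause):
            return None
        slots = {}
        for pat_elem, cl_elem in zip(pattern, clause):
            if isinstance(pat_elem, str):        # literal word must match
                if cl_elem != pat_elem:
                    return None
            else:                                # constrained slot
                if not isinstance(cl_elem, tuple):
                    return None
                role, constraint = pat_elem
                concept, filler = cl_elem
                if not isa(concept, constraint):
                    return None
                slots[role] = filler
        return (frame, slots)

    # "Terrorists bombed the PRC embassy", parsed and lemmatized:
    clause = [("TERRORIST", "terrorists"), "BOMB", ("EMBASSY", "PRC embassy")]
    print(match(BOMBING_FP, clause))
    # -> ('BOMBING', {'AGENT': 'terrorists', 'TARGET': 'PRC embassy'})

Matching this clause against the acquired pattern BOMBING: [ (AGENT: HUMAN) BOMB (TARGET: BUILDING) ] fills the AGENT and TARGET slots of a BOMBING frame, which is the behavior the listings below abbreviate.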
C.1 FP-structures for BOMBING

BOMBING: [ (AGENT: HUMAN) ATTACK (TARGET: BUILDING) WITH (INSTRUMENT: EXPLOSIVE) ]
BOMBING: [ (AGENT: HUMAN) BE INJURE IN (INSTRUMENT: BOMB) (EXPLOSION) ]
BOMBING: [ (AGENT: HUMAN) BOMB (TARGET: BUILDING) ]
BOMBING: [ (AGENT: HUMAN) DETONATE (INSTRUMENT: BOMB) NEAR (TARGET: BUILDING) ]
BOMBING: [ (AGENT: HUMAN) HURL (INSTRUMENT: EXPLOSIVE) AT (TARGET: BUILDING) ]
BOMBING: [ (AGENT: HUMAN) PLACE (INSTRUMENT: BOMB) ]
BOMBING: [ (AGENT: HUMAN) PLANT (INSTRUMENT: BOMB) IN_FRONT_OF (TARGET: BUILDING) ]
BOMBING: [ (AGENT: HUMAN) PLANT (INSTRUMENT: BOMB) ]
BOMBING: [ (AGENT: HUMAN) THROW (INSTRUMENT: BOMB) AT (TARGET: HUMAN) ]
BOMBING: [ (AGENT: HUMAN) USE (INSTRUMENT: BOMB) ]
BOMBING: [ (AGENT: HUMAN) USE (INSTRUMENT: EXPLOSIVE) IN (ATTACK) ]
BOMBING: [ (EXPLOSION) BE HEAR ]
BOMBING: [ (EXPLOSION) BREAK (OBJECT) OF (TARGET: BUILDING) ]
BOMBING: [ (EXPLOSION) CAUSE (EFFECT: DAMAGE) TO (TARGET: BUILDING) ]
BOMBING: [ (EXPLOSION) CAUSE (EFFECT: DAMAGE) ]
BOMBING: [ (EXPLOSION) CAUSE (SITUATION) ]
BOMBING: [ (EXPLOSION) CAUSE (TARGET: HUMAN) (DAMAGE) ]
BOMBING: [ (EXPLOSION) DAMAGE (OBJECT) OF (TARGET: OBJECT) ]
BOMBING: [ (EXPLOSION) DAMAGE (TARGET: OBJECT) ]
BOMBING: [ (EXPLOSION) DESTROY (OBJECT) OF (TARGET: BUILDING) ]
BOMBING: [ (EXPLOSION) OCCUR ]
BOMBING: [ (GEO-REGION) BE BOMB ]
BOMBING: [ (HUMAN) ALLOW (BOMBING) BY (AGENT: HUMAN) ]
BOMBING: [ (INSTRUMENT: BOMB) (EXPLOSION) DAMAGE (OBJECT) OF (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) (EXPLOSION) INJURE (TARGET: HUMAN) ]
BOMBING: [ (INSTRUMENT: BOMB) (EXPLOSION) OCCUR ]
BOMBING: [ (INSTRUMENT: BOMB) BE ACTIVATE ]
BOMBING: [ (INSTRUMENT: BOMB) BE PLACE AT (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) BE PLACE BY (AGENT: HUMAN) ]
BOMBING: [ (INSTRUMENT: BOMB) BE PLACE IN (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) BE PLACE OUTSIDE (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) BE PLANT BY (AGENT: HUMAN) ]
BOMBING: [ (INSTRUMENT: BOMB) BE PLANT ]
BOMBING: [ (INSTRUMENT: BOMB) BE USE ]
BOMBING: [ (INSTRUMENT: BOMB) CAUSE (EFFECT: DAMAGE) ]
BOMBING: [ (INSTRUMENT: BOMB) DAMAGE (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) DESTROY (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) EXPLODE (TARGET: OBJECT) ]
BOMBING: [ (INSTRUMENT: BOMB) EXPLODE AT (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) EXPLODE IN (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) EXPLODE NEAR (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) EXPLODE OUTSIDE (OBJECT) OF (TARGET: BUILDING) ]
BOMBING: [ (INSTRUMENT: BOMB) GO OFF ]
BOMBING: [ (INSTRUMENT: EXPLOSIVE) BE HURL ]
BOMBING: [ (INSTRUMENT: EXPLOSIVE) EXPLODE ]
BOMBING: [ (INSTRUMENT: EXPLOSIVE) KILL (TARGET: HUMAN) ]
BOMBING: [ (OBJECT) BE BLOW BY (EXPLOSION) ]
BOMBING: [ (OBJECT) FROM (EXPLOSION) HIT (TARGET: VEHICLE) ]
BOMBING: [ (OBJECT) OF (TARGET: BUILDING) BE BREAK AS_A_RESULT_OF (EXPLOSION) ]
BOMBING: [ (SITUATION) BE CAUSE BY (EXPLOSION) INSIDE (TARGET: BUILDING) ]
BOMBING: [ (SITUATION) FROM (EXPLOSION) DAMAGE (TARGET: BUILDING) ]
BOMBING: [ (TARGET: BUILDING) BE SHAKE BY (EXPLOSION) ]
BOMBING: [ (TARGET: HUMAN) BE KILL BY (EXPLOSION) ]
BOMBING: [ (TARGET: HUMAN) BE KILL BY (INSTRUMENT: BOMB) BY (AGENT: HUMAN) ]
BOMBING: [ (TARGET: HUMAN) BE KILL IN (INSTRUMENT: BOMB) (EXPLOSION) ]
BOMBING: [ (TARGET: HUMAN) BE WOUND IN (EXPLOSION) ]
BOMBING: [ (TARGET: HUMAN) DIE AS_A_RESULT_OF (INSTRUMENT: BOMB) (EXPLOSION) ]
BOMBING: [ (TARGET: HUMAN) DIE IN (EXPLOSION) ]
BOMBING: [ (TARGET: OBJECT) BE DAMAGE AS_A_RESULT_OF (EXPLOSION) ]
BOMBING: [ (TARGET: OBJECT) BE DAMAGE BY (INSTRUMENT: BOMB) (EXPLOSION) ]
BOMBING: [ (TARGET: OBJECT) BE DESTROY BY (INSTRUMENT: BOMB) (EXPLOSION) ]
BOMBING: [ (TARGET: OBJECT) BE DESTROY BY (INSTRUMENT: BOMB) ]
BOMBING: [ (TARGET: THING) EXPLODE ]
BOMBING: [ (THING) CAUSE (EFFECT: DAMAGE) WITH (INSTRUMENT: EXPLOSIVE) ]
BOMBING: [ (THING) DENOUNCE (EFFECT: DEATH) OF (TARGET: HUMAN) DUE_TO (BOMBING) OF (ORGANIZATION) ]
BOMBING: [ (THING) HAPPEN IN (BOMBING) ]
BOMBING: [ (THING) OF (INSTRUMENT: EXPLOSIVE) EXPLODE ]

C.2 FP-structures for KILLING

KILLING: [ (AGENT: ANIMATE) PERPETRATE (MURDER) ]
KILLING: [ (AGENT: HUMAN) BE (PERPETRATOR) OF (MURDER) OF (TARGET: HUMAN) ]
KILLING: [ (AGENT: HUMAN) BE INVOLVE IN (MURDER) OF (TARGET: HUMAN) ]
KILLING: [ (AGENT: HUMAN) BE INVOLVE IN (MURDER) ]
KILLING: [ (AGENT: HUMAN) CARRY OUT (MURDER) ]
KILLING: [ (AGENT: HUMAN) KILL (TARGET: HUMAN) ]
KILLING: [ (AGENT: HUMAN) MURDER (TARGET: HUMAN) ]
KILLING: [ (AGENT: HUMAN) PARTICIPATE IN (MURDER) OF (TARGET: HUMAN) ]
KILLING: [ (AGENT: HUMAN) SHOOT (TARGET: HUMAN) TO (EFFECT: DEATH) ]
KILLING: [ (AGENT: HUMAN) SHOOT (TARGET: HUMAN) WITH (INSTRUMENT: WEAPON) ]
KILLING: [ (AGENT: HUMAN) SHOOT (TARGET: HUMAN) ]
KILLING: [ (AGENT: HUMAN) STAB (TARGET: HUMAN) TO (EFFECT: DEATH) ]
KILLING: [ (EFFECT: DEATH) OF (TARGET: HUMAN) BE CAUSE BY (OBJECT) ]
KILLING: [ (HUMAN) BLAME (AGENT: ORGANIZATION) FOR (MURDER) OF (TARGET: HUMAN) ]
KILLING: [ (HUMAN) REPORT (EFFECT: DEATH) OF (TARGET: HUMAN) ]
KILLING: [ (INSTRUMENT: BOMB) (ATTACK) KILL (TARGET: HUMAN) ]
KILLING: [ (INSTRUMENT: EXPLOSIVE) KILL (TARGET: HUMAN) ]
KILLING: [ (SITUATION) RESULT IN (MURDER) OF (TARGET: HUMAN) ]
KILLING: [ (TARGET: HUMAN) BE BURN TO (EFFECT: DEATH) ]
KILLING: [ (TARGET: HUMAN) BE KILL BY (AGENT: HUMAN) ]
KILLING: [ (TARGET: HUMAN) BE KILL BY (INSTRUMENT: BOMB) ]
KILLING: [ (TARGET: HUMAN) BE KILL IN (INSTRUMENT: EXPLOSIVE) (ATTACK) ]
KILLING: [ (TARGET: HUMAN) BE KILL ]
KILLING: [ (TARGET: HUMAN) BE MURDER BY (AGENT: ANIMATE) ]
KILLING: [ (TARGET: HUMAN) BE MURDER IN (AGENT: ORGANIZATION) (ATTACK) ]
KILLING: [ (TARGET: HUMAN) BE MURDER ]
KILLING: [ (TARGET: HUMAN) BE SHOOT BY (AGENT: HUMAN) ]
KILLING: [ (TARGET: HUMAN) BE SHOOT TO (EFFECT: DEATH) BY (AGENT: HUMAN) ]
KILLING: [ (TARGET: HUMAN) BE SHOOT TO (EFFECT: DEATH) ]
KILLING: [ (TARGET: HUMAN) BE SHOOT ]
KILLING: [ (TARGET: HUMAN) DIE ]
KILLING: [ (THING) ACCUSE (AGENT: ORGANIZATION) OF (DEATH) ]
KILLING: [ (THING) BE RESPONSIBLE FOR (EFFECT: DEATH) OF (TARGET: HUMAN) ]
KILLING: [ (THING) CAUSE (EFFECT: DEATH) OF (TARGET: HUMAN) ]
KILLING: [ (THING) CONDEMN (MURDER) OF (TARGET: HUMAN) ]
KILLING: [ (THING) DENOUNCE (EFFECT: DEATH) OF (TARGET: HUMAN) ]
KILLING: [ (THING) KILL (TARGET: HUMAN) ]
KILLING: [ (THING) REGRET (EFFECT: DEATH) OF (TARGET: HUMAN) ]

C.3 FP-structures for TIE-UP

TIE-UP: [ (ENTITY0: COMPANY) , (ENTITY1: COMPANY) pos (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) (company) form by (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) of (ENTITY1: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) create by (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) establish by (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) of (ENTITY1: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) of (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) operate with (ENTITY1: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) set up between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) set up by (ENTITY1: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) set up by (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) , (joint_venture) with (ENTITY1: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) be (ENTITY1: COMPANY) pos (joint_venture) ]
TIE-UP: [ (ENTITY0: COMPANY) be (joint_venture) between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) be (joint_venture) of (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY0: COMPANY) be form as (joint_venture) involve (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) , (joint_venture) between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) agree (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) agree to establish (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) agree to form (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) agree to launch (joint_venture) with (ENTITY2: COMPANY) and (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) agree to start (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) agree with (ENTITY2: COMPANY) to establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) agree with (ENTITY2: COMPANY) to form (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) agree with (ENTITY2: COMPANY) to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) agree to form (alliance) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) agree to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) announce (joint_venture) (deal) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) be to establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) complete (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) finalize (joint_venture) with (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) form (joint_venture) with (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) hold (stake) in (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) launch (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) open (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) own (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) plan to establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) reach (agreement) to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) sign (accord) to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) sign (agreement) to form (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) and (ENTITY2: COMPANY) sign (joint_venture) (agreement) ]
TIE-UP: [ (ENTITY1: COMPANY) announce (agreement) to (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) announce (establishment) of (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) announce (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) announce (plan) to form (joint_venture) with (ENTITY2: COMPANY) and (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) announce (plan) to set up (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) approve (joint_venture) involve (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) be to establish (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) be to form (ENTITY0: COMPANY) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) begin (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) build (factory) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) complete (joint_venture) (agreement) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) conclude (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) enter (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) enter into (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) establish (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) expand (joint_venture) with (ENTITY2: COMPANY) and (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) form (ENTITY0: COMPANY) , (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) form (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) form (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) form (joint_venture) with (ENTITY2: COMPANY) and (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) have (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) have (plan) for (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) involve in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) launch (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) make (thing) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) make (thing) through (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) make (thing) under (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) manufacture (car) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) open (convenience_store) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) open (joint_venture) (bank) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) plan (production) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) plan to establish (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) plan to set up (joint_venture) , (ENTITY0: COMPANY) , with (ENTITY2: COMPANY) and (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) plan to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) plan to set up (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) pos (joint_venture) (partner) be (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) pos (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) pos (tie-up) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) promote (trade) through (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) reach (agreement) with (ENTITY2: COMPANY) on (establishment) of (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) reach (agreement) with (ENTITY2: COMPANY) to establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) reach (agreement) with (ENTITY2: COMPANY) to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) set up (ENTITY0: COMPANY) as (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) set up (ENTITY0: COMPANY) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) set up (bank) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) set up (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) set up (joint_venture) with (ENTITY2: COMPANY) and (ENTITY3: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) sign (agreement) with (ENTITY2: COMPANY) to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) sign (contract) with (ENTITY2: COMPANY) to establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) sign (contract) with (ENTITY2: COMPANY) to form (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) sign (joint_venture) (agreement) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) start (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) take (stake) in (joint_venture) with (ENTITY2: COMPANY) ]
TIE-UP: [ (ENTITY1: COMPANY) team up with (ENTITY2: COMPANY) to establish (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) team up with (ENTITY2: COMPANY) to form (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) team up with (ENTITY2: COMPANY) to set up (joint_venture) ]
TIE-UP: [ (ENTITY1: COMPANY) tie up with (ENTITY2: COMPANY) ]
TIE-UP: [ (agreement) on (joint_venture) be sign by (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (contract) for (joint_venture) be sign between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) (agreement) between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) (company) develop by (ENTITY1: COMPANY) with (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) (company) involve (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) (partnership) comprise (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) (team) of (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) , (ENTITY0: COMPANY) ]
TIE-UP: [ (joint_venture) announce between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) be form between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) be own by (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) call (ENTITY0: COMPANY) comprise (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) name (ENTITY0: COMPANY) be set up ]
TIE-UP: [ (joint_venture) of (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (joint_venture) with (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (partner) in (joint_venture) include (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (partner) of (joint_venture) be (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (shareholder) of (joint_venture) be (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
TIE-UP: [ (tie) between (ENTITY1: COMPANY) and (ENTITY2: COMPANY) ]
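As a closing illustration, the fragment below applies the same assumed representation as the sketch before Section C.1 to one of the acquired patterns above, TIE-UP: [ (ENTITY1: COMPANY) form (joint_venture) with (ENTITY2: COMPANY) ]. The input clause is a simplified, constructed rendering of the Appendix B sample ("General Electric Co. formed a joint venture with Union Carbide Corp."); the semantic tags and the single-element convention for bare concepts such as (joint_venture) are assumptions made for the example, not the dissertation's implementation.

    # Illustrative sketch only, under the same assumptions as the earlier
    # example. Bare concepts such as (joint_venture) must appear in the
    # clause but bind no slot; (ENTITYn: COMPANY) slots bind entity strings.
    TIE_UP_FP = ("TIE-UP",
                 [("ENTITY1", "COMPANY"), "form", ("joint_venture",),
                  "with", ("ENTITY2", "COMPANY")])

    def match(fp, clause):
        frame, pattern = fp
        if len(pattern) != len(clause):
            return None
        bindings = {}
        for p, c in zip(pattern, clause):
            if isinstance(p, str):            # literal (lemmatized) word
                if c != p:
                    return None
            elif len(p) == 1:                 # bare concept: match, no binding
                if not isinstance(c, tuple) or c[0] != p[0]:
                    return None
            else:                             # entity slot with semantic class
                if not isinstance(c, tuple):
                    return None
                role, sem_class = p
                concept, text = c
                if concept != sem_class:
                    return None
                bindings[role] = text
        return (frame, bindings)

    # Simplified clause built from the Appendix B text, semantically tagged:
    clause = [("COMPANY", "General Electric Co."), "form", ("joint_venture",),
              "with", ("COMPANY", "Union Carbide Corp.")]
    print(match(TIE_UP_FP, clause))
    # -> ('TIE-UP', {'ENTITY1': 'General Electric Co.',
    #                'ENTITY2': 'Union Carbide Corp.'})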