Figure 3.6: Example of sense confusion. The POS tag for the word "beyond" is predicted as "RB" instead of "IN", resulting in a missing prepositional phrase. NMT sentence vectors encode a lot of syntax, but they still cannot grasp these subtle details.
Object Description
Title | Neural sequence models: Interpretation and augmentation |
Author | Shi, Xing |
Author email | xingshi@usc.edu;shixing19910105@gmail.com |
Degree | Doctor of Philosophy |
Document type | Dissertation |
Degree program | Computer Science |
School | Viterbi School of Engineering |
Date defended/completed | 2018-05-08 |
Date submitted | 2018-08-01 |
Date approved | 2018-08-01 |
Restricted until | 2018-08-01 |
Date published | 2018-08-01 |
Advisor (committee chair) | Knight, Kevin |
Advisor (committee member) | May, Jonathan; Narayanan, Shri |
Abstract | Recurrent neural networks (RNNs) have been successfully applied to various natural language processing tasks, including language modeling, machine translation, and text generation. However, several obstacles still stand in the way. First, due to the RNN's distributed nature, little insight into its internal mechanism is available, and it remains a black box. Second, because of the large vocabularies involved, text generation is very time-consuming. Third, there is no flexible way to constrain a sequence model's generation with external knowledge. Last, large amounts of training data must be collected to guarantee the performance of these neural models, whereas annotated data, such as the parallel text used in machine translation, are expensive to obtain. This work addresses these four challenges. ❧ To better understand the internal mechanism of the RNN, we choose neural machine translation (NMT) systems as a testbed. We first investigate how NMT outputs target strings of appropriate lengths, locating a collection of hidden units that learns to explicitly implement this functionality. Then we investigate whether NMT systems learn source-language syntax as a by-product of training on string pairs. We find that both local and global syntactic information about source sentences is captured by the encoder, and that different types of syntax are stored in different layers, at different degrees of concentration. ❧ To speed up text generation, we propose two novel GPU-based algorithms: 1) use source/target word alignment information to shrink the target-side run-time vocabulary; 2) apply locality-sensitive hashing to find nearest word embeddings (a toy sketch of this lookup follows this record). Both methods yield a 2-3x speedup on four translation tasks without hurting translation accuracy as measured by BLEU. Furthermore, we integrate a finite-state acceptor into the neural sequence model during generation, providing a flexible way to constrain the output; we successfully apply this to poem generation to control meter and rhyme. ❧ To improve NMT performance on low-resource language pairs, we re-examine multiple techniques used in high-resource NMT and other NLP tasks, explore their variations, and combine them into a strong NMT system for low-resource languages. Experiments on Uyghur-English show a 10.4 BLEU improvement over the vanilla NMT system and results comparable to syntax-based machine translation. |
Keyword | interpretation; neural networks; sequence-to-sequence models; GPU; poem generation; language generation; neural machine translation; speedup; locality sensitive hashing; word alignment |
Language | English |
Format (imt) | application/pdf |
Part of collection | University of Southern California dissertations and theses |
Publisher (of the original version) | University of Southern California |
Place of publication (of the original version) | Los Angeles, California |
Publisher (of the digital version) | University of Southern California. Libraries |
Provenance | Electronically uploaded by the author |
Type | texts |
Legacy record ID | usctheses-m |
Contributing entity | University of Southern California |
Rights | Shi, Xing |
Physical access | The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given. |
Repository name | University of Southern California Digital Library |
Repository address | USC Digital Library, University of Southern California, University Park Campus MC 7002, 106 University Village, Los Angeles, California 90089-7002, USA |
Repository email | cisadmin@lib.usc.edu |
Filename | etd-ShiXing-6594.pdf |
Archival file | Volume3/etd-ShiXing-6594.pdf |
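The abstract's speedup contribution uses locality-sensitive hashing (LSH) to replace the full softmax argmax with a search over a small candidate set of word embeddings. Below is a minimal sketch of that idea using random-hyperplane hashing in NumPy. It is not the dissertation's actual GPU implementation; every name, size, and design choice in it is an illustrative assumption.

import numpy as np

# Toy sketch (not the dissertation's GPU code): random-hyperplane LSH for
# approximate nearest-word-embedding lookup at decode time.
rng = np.random.default_rng(0)
VOCAB, DIM, BITS = 50_000, 512, 8

# Stand-in for the target-side output embedding matrix (one row per word).
embeddings = rng.standard_normal((VOCAB, DIM)).astype(np.float32)

# BITS random hyperplanes; a vector's hash is the sign pattern of its
# projections, packed into a single integer code.
hyperplanes = rng.standard_normal((DIM, BITS)).astype(np.float32)

def signature(vectors):
    bits = (vectors @ hyperplanes) > 0
    return bits @ (1 << np.arange(BITS))  # pack sign bits into int codes

# Build the hash table once: bucket word ids by signature.
buckets = {}
for word_id, code in enumerate(signature(embeddings)):
    buckets.setdefault(int(code), []).append(word_id)

def nearest_word(hidden):
    """Approximate argmax over the vocabulary: score only words whose
    signature matches the decoder state's, not all VOCAB rows."""
    code = int(signature(hidden[None, :])[0])
    candidates = buckets.get(code) or range(VOCAB)  # fall back to full search
    cand = np.fromiter(candidates, dtype=np.int64)
    return int(cand[np.argmax(embeddings[cand] @ hidden)])

# Usage with a fake decoder hidden state; with 8 bits, each bucket holds
# roughly VOCAB / 256 ~ 200 candidates instead of 50,000.
state = rng.standard_normal(DIM).astype(np.float32)
print(nearest_word(state))

A single hash table like this trades recall for speed; practical LSH schemes typically use several independent tables (or multi-probe lookups) and take the union of the matching buckets before scoring.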