Jan W. Amtrup

Dissertation
The original German title of my dissertation is "Maschinelles Dolmetschen mit Mehr-Ebenen-Charts". It is available as a PDF file from the Library of the University of Hamburg, Germany.

A slightly revised English translation has been published by Springer:

Incremental Speech Translation.
Lecture Notes in Computer Science 1735,
Springer Verlag, Berlin, Heidelberg

Overview

Human language understanding works incrementally: people process parts of the acoustic input before it is complete. Engaged in a dialog, they are able to anticipate their partner's contribution before she or he has stopped talking. Simultaneous interpreters provide an example relevant to the research presented here: they start to produce target language utterances with little delay, thereby showing incremental behavior.

Consequently, introducing incremental strategies into artificial systems for understanding natural language seems a reasonable way to implement processing that exhibits at least part of the human performance. If machines are expected to perform somewhat similarly to humans, incremental algorithms have to be incorporated.

There is still some hesitation, however, about investigating incremental mechanisms. This is mainly because incorporating these strategies initially increases the processing burden, since there is less certainty about the right context. The increase in processing time can be partly countered by the parallelization that becomes possible in incremental systems. A gain in the quality of individual processing steps can only be reached by constructing alternative information paths used for mutual influence between two components.

The predominant area in which incremental systems are investigated nowadays seems to be speech-to-speech translation, where promising first results have already been achieved.

The system MILC (Machine Interpreting with Layered Charts), which is developed and described in this work, is an attempt to design and implement an application for the translation of spontaneous speech which works incrementally throughout. Each component of the system starts operating before preceding components have finished their processing. Parts of the input are worked on in strict order of speaking time. This scheme results in an incremental behavior of the application as a whole, which is able to generate target language text before the source language speaker has finished his or her turn.
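The idea that each component starts working before its predecessor has finished can be illustrated with a small sketch. This is not MILC's implementation: the two components, the word-by-word "translation", and the names `recognize`, `translate`, and `TOY_LEXICON` are hypothetical and chosen only to make the interleaving visible. Python generators stand in for the components, and an event log records the order in which work happens.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Shared log that records the order in which the components do their work.
events: List[str] = []

# Hypothetical toy lexicon; a real system would not translate word by word.
TOY_LEXICON: Dict[str, str] = {"guten": "good", "Morgen": "morning"}


def recognize(timed_words: Iterable[Tuple[int, str]]) -> Iterator[Tuple[int, str]]:
    """Stand-in for a recognizer: emits words in strict order of speaking time."""
    for start_time, word in sorted(timed_words):
        events.append(f"recognized {word}")
        yield start_time, word


def translate(words: Iterable[Tuple[int, str]]) -> Iterator[str]:
    """Stand-in for a translator: handles each word as soon as it arrives."""
    for _start_time, word in words:
        target = TOY_LEXICON.get(word, word)
        events.append(f"translated {word}")
        yield target


# The translator consumes the recognizer lazily, so both run interleaved.
output = list(translate(recognize([(0, "guten"), (1, "Morgen")])))
```

After this runs, `events` shows that the first word is translated before the second word has been recognized, which is the incremental behavior described above in miniature.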

In order to realize such a system in an integrated and uniform way, we develop a new data structure, called Layered Chart, which is used to store temporary results throughout the system. This structure extends the way in which charts have been used so far, namely for parsing and generation. Additionally, we implement a typed feature formalism which aids in maintaining the knowledge sources for the system. This formalism is specifically designed to simplify the means of communicating linguistic objects between different modules of a system. Both measures linked to each other guarantee a homogeneous view of the global state of the translation system and enable the investigation of non-trivial interaction patterns between components.
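To give a rough feeling for a chart that is shared by several components, the following sketch stores edges over time points and tags each edge with the layer (component) that produced it. It is a minimal illustration under stated assumptions, not the Layered Chart defined in the dissertation: the class `LayeredChartSketch`, the `Edge` fields, and the layer names are hypothetical, and a plain dictionary stands in for a typed feature structure.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Edge:
    start: int                      # time point where the edge begins
    end: int                        # time point where the edge ends
    layer: str                      # hypothetical layer name, e.g. "words"
    features: Dict[str, Any] = field(default_factory=dict)  # stand-in for a feature structure


class LayeredChartSketch:
    """Toy chart: one edge list per layer, all over the same time axis."""

    def __init__(self) -> None:
        self._by_layer: Dict[str, List[Edge]] = defaultdict(list)

    def add(self, edge: Edge) -> None:
        if edge.end <= edge.start:
            raise ValueError("an edge must span a positive stretch of time")
        self._by_layer[edge.layer].append(edge)

    def edges(self, layer: str) -> List[Edge]:
        return list(self._by_layer.get(layer, []))

    def successors(self, edge: Edge, layer: str) -> List[Edge]:
        """Edges in `layer` that start exactly where `edge` ends."""
        return [e for e in self._by_layer.get(layer, []) if e.start == edge.end]


chart = LayeredChartSketch()
first = Edge(0, 1, "words", {"form": "guten"})
second = Edge(1, 2, "words", {"form": "Morgen"})
phrase = Edge(0, 2, "syntax", {"cat": "NP"})
for e in (first, second, phrase):
    chart.add(e)
```

Because every layer refers to the same time points, a component can look up what other components have produced for a given stretch of the input, for example `chart.successors(first, "words")` or `chart.edges("syntax")`.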

The successful implementation proves that it is possible to realize an integrated system for interpreting spontaneous speech for restricted corpora in an architecturally strict way. Moreover, MILC provides the basis for more far-reaching investigations into the architecture of incremental speech-processing systems. This is exemplified by the incorporation of the incremental recognition and translation of idioms.

This monograph is organized as follows:

  • Chapter 1 provides a detailed introduction to the central matters of this work. We present the incremental nature of human speech comprehension and show how incrementality is used within natural language processing.
  • Chapter 2 contains an overview of the relevant parts of graph theory which have an impact on speech processing. We present some results from the evaluation of speech recognition. Additionally, we introduce hypergraphs, which are a basic data structure for the representation of highly redundant recognition results.
  • Chapter 3 gives an introduction to unification-based formalisms and describes their use in NLP systems with an emphasis on machine translation. We present the typed-feature formalism which was implemented for the system described here.
  • Chapter 4 describes the architecture and implementation of the MILC system. We focus on the global architecture, the communication subsystem, and some properties of individual components.
  • Chapter 5 describes the data and experiments we used to evaluate MILC. The different knowledge sources relevant for the evaluation are presented.
  • Chapter 6 enumerates the central results presented throughout this book and tries to give an outlook on future research in the field.