Minimum Bayes-risk techniques in automatic speech recognition and statistical machine translation

  • Authors:
  • William Byrne;Shankar Kumar

  • Affiliations:
  • The Johns Hopkins University;The Johns Hopkins University

  • Venue:
  • Minimum Bayes-risk techniques in automatic speech recognition and statistical machine translation
  • Year:
  • 2005

Abstract

Automatic Speech Recognition (ASR) and Machine Translation (MT) are fundamental language technologies that are emerging as core components of information processing systems. Each of these problems can be evaluated using a variety of metrics that measure different aspects of recognition or translation performance. In contrast, the training and decoding architectures of most current ASR and statistical MT systems are optimized with respect to Sentence Error Rate, which is rarely used in evaluating these systems. The goal of this thesis is to overcome this mismatch by building automatic systems specialized for each individual evaluation metric. We employ the Minimum Bayes-Risk (MBR) classification framework to construct systems sensitive to specific error criteria. We present the formulation of MBR decoders in speech recognition and in two sub-problems in machine translation: bitext word alignment and translation. MBR decoding is performed by rescoring a set of likely hypotheses represented as lattices or N-best lists. Statistical ASR systems for generating word lattices have become widely available in recent years. In contrast, Statistical MT (SMT) has become popular only within the last decade, and we did not have access to SMT systems for generating word alignment and translation hypotheses. We therefore formulate and implement a generative, source-channel Translation Template Model (TTM) for SMT. The approach we describe allows us to implement each stochastic transformation in this model using a weighted finite state transducer (WFST). This allows translation and bitext word alignment to be realized immediately by standard WFST operations on the component transducers. The TTM is the first phrase-based translation model to be used for bitext word alignment.
We describe the construction of a TTM Chinese-to-English translation system that ranked among the top-performing systems in the NIST 2004 international MT evaluation.
MBR decoders face computational challenges when applied to large vocabulary speech recognition tasks. We introduce the segmental MBR recognition framework that decomposes a large MBR search problem into a sequence of smaller MBR problems. To achieve this, we develop a risk-driven lattice segmentation procedure to segment large recognition word lattices into smaller sub-lattices over which MBR decoding is performed. (Abstract shortened by UMI.)
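
As an illustration of the N-best rescoring idea mentioned in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes an N-best list of (hypothesis, log-score) pairs, normalizes the scores into posteriors over the list, and uses word-level edit distance as a stand-in loss function. The names `word_edit_distance` and `mbr_decode` and the toy scores are hypothetical and chosen only for this example.

```python
import math


def word_edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two space-separated strings."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        curr = [i]
        for j, hw in enumerate(h, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (rw != hw)))  # substitution or match
        prev = curr
    return prev[-1]


def mbr_decode(nbest, loss=word_edit_distance):
    """Return the hypothesis with the lowest expected loss, and that loss.

    nbest: list of (hypothesis, log_score) pairs (assumed input format).
    The posterior is approximated by normalizing scores over the list.
    """
    max_log = max(score for _, score in nbest)
    weights = [math.exp(score - max_log) for _, score in nbest]
    total = sum(weights)
    posteriors = [w / total for w in weights]

    best, best_risk = None, float("inf")
    for candidate, _ in nbest:
        # Expected loss of choosing `candidate`, treating each list entry
        # in turn as the possible true word string.
        risk = sum(p * loss(evidence, candidate)
                   for (evidence, _), p in zip(nbest, posteriors))
        if risk < best_risk:
            best, best_risk = candidate, risk
    return best, best_risk


# Toy N-best list with made-up posteriors 0.4, 0.3, 0.3.
nbest = [
    ("the cat sat", math.log(0.4)),
    ("a cat sat down", math.log(0.3)),
    ("a cat sat", math.log(0.3)),
]
best, risk = mbr_decode(nbest)
print(best, round(risk, 3))
```

In this toy list the highest-scoring hypothesis is "the cat sat", yet the minimum-risk choice under edit distance is "a cat sat", because it lies closer on average to the other likely hypotheses. Swapping in a different `loss` function is how the same procedure would be specialized to another evaluation metric.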