Training speech translation from audio recordings of interpreter-mediated communication

Authors:
Matthias Paulik; Alex Waibel
Affiliations:
Cisco Systems, 170 W Tasman Dr, San Jose, CA 95134, USA; Carnegie Mellon University, USA and Karlsruhe Institute of Technology, Germany
Venue:
Computer Speech and Language
Year:
2013

Abstract

Globalization, international crises, and disasters spur the need for cross-lingual verbal communication in a myriad of languages. This is reflected in ongoing, intense research activity in the field of speech translation. However, deployable speech translation systems are still developed for only a handful of languages. One of the main reasons is the prohibitively high cost of acquiring sufficient amounts of suitable speech translation training data. A new language pair or domain is typically considered for speech translation development only after a major need for cross-lingual verbal communication has arisen, which justifies the high development costs. In such situations, communication has to rely on the help of interpreters, while massive data collections for system development are conducted in parallel. We propose an alternative to this time-consuming and costly parallel effort. By training speech translation directly on audio recordings of interpreter-mediated communication, we omit most of the manual transcription effort and all of the manual translation effort that characterize traditional speech translation development.