BBN TransTalk: Robust multilingual two-way speech-to-speech translation for mobile platforms

  • Authors:
  • Rohit Prasad; Prem Natarajan; David Stallard; Shirin Saleem; Shankar Ananthakrishnan; Stavros Tsakalidis; Chia-Lin Kao; Fred Choi; Ralf Meermeier; Mark Rawls; Jacob Devlin; Kriste Krstovski; Aaron Challenner

  • Affiliations:
  • Raytheon BBN Technologies, Speech, Language, and Multimedia Department, 10 Moulton Street, Cambridge, MA, USA (all authors)

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2013

Abstract

In this paper we present BBN TransTalk, a speech-to-speech (S2S) translation system that enables two-way communication between speakers of English and speakers who do not understand or speak English. BBN TransTalk has been configured for several languages, including Iraqi Arabic, Pashto, Dari, Farsi, Malay, Indonesian, and Levantine Arabic. We describe the key components of our system: automatic speech recognition (ASR), machine translation (MT), text-to-speech (TTS), the dialog manager, and the user interface (UI). In addition, we present novel techniques for overcoming specific challenges in developing high-performing S2S systems. For ASR, we present techniques for dealing with the lack of pronunciation and linguistic resources and for effectively modeling ambiguity in the pronunciations of words in these languages. For MT, we describe techniques for dealing with data sparsity as well as for modeling context. We also present and compare different user confirmation techniques for detecting errors that can cause the dialog to drift or stall.