Real-time incremental speech-to-speech translation of dialogs

  • Authors:
  • Srinivas Bangalore; Vivek Kumar Rangarajan Sridhar; Prakash Kolan; Ladan Golipour; Aura Jimenez

  • Affiliations:
  • AT&T Labs - Research, NJ; AT&T Labs - Research, NJ; AT&T Labs - Research, NJ; AT&T Labs - Research, NJ; AT&T Labs - Research, NJ

  • Venue:
  • NAACL HLT '12: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
  • Year:
  • 2012

Abstract

In a conventional telephone conversation between two speakers of the same language, the interaction happens in real time and the speakers process the information stream incrementally. In this work, we address the problem of incremental speech-to-speech (S2S) translation that enables cross-lingual communication between two remote participants over a telephone. We investigate the problem in a novel real-time S2S framework based on the Session Initiation Protocol (SIP). Speech translation is performed incrementally, based on partial hypotheses generated by speech recognition. We describe the statistical models comprising the S2S system and the SIP architecture that enables real-time two-way cross-lingual dialog. We present dialog experiments performed in this framework and study the tradeoff between accuracy and latency in incremental speech translation. Experimental results demonstrate that the incremental approach can generate high-quality translations with approximately half the latency of the non-incremental approach.
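
The idea of translating from partial recognition hypotheses can be illustrated with a minimal Python sketch. It is not the system described in the paper: the stability rule (treat words shared by two consecutive partial hypotheses as stable), the fixed `chunk_size`, the function names, and the word-for-word `TOY_LEXICON` are all assumptions made for illustration. A real system would use a statistical translation model and a more careful segmentation strategy.

```python
from typing import Callable, Iterable, Iterator, List


def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Number of leading words on which two hypotheses agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def incremental_translate(
    partials: Iterable[List[str]],
    translate: Callable[[List[str]], str],
    chunk_size: int = 3,
) -> Iterator[str]:
    """Yield translations of stable word chunks as partial hypotheses arrive.

    Assumption: the last element of `partials` is the final hypothesis.
    Chunks that were already translated are never retracted, even if a
    later hypothesis revises them.
    """
    committed = 0  # number of words already sent to translation
    previous: List[str] = []
    for hyp in partials:
        # Words shared with the previous partial hypothesis count as stable.
        stable = common_prefix_len(previous, hyp)
        while stable - committed >= chunk_size:
            yield translate(hyp[committed:committed + chunk_size])
            committed += chunk_size
        previous = hyp
    # Flush whatever part of the final hypothesis is still untranslated.
    if len(previous) > committed:
        yield translate(previous[committed:])


# Hypothetical word-for-word lexicon standing in for a real translation model.
TOY_LEXICON = {
    "i": "yo", "want": "quiero", "a": "un",
    "ticket": "boleto", "to": "a", "boston": "Boston",
}


def toy_translate(words: List[str]) -> str:
    return " ".join(TOY_LEXICON.get(w, w) for w in words)


# Simulated stream of growing partial hypotheses from a recognizer.
partials = [
    ["i"],
    ["i", "want"],
    ["i", "want", "a"],
    ["i", "want", "a", "ticket"],
    ["i", "want", "a", "ticket", "to", "boston"],
]

outputs = list(incremental_translate(partials, toy_translate))
print(outputs)  # ['yo quiero un', 'boleto a Boston']
```

In this sketch the first chunk is emitted as soon as three words are stable, before the speaker has finished. That is the source of the latency gain. A smaller `chunk_size` lowers latency further but gives the translator less context per chunk, which is the accuracy-versus-latency tradeoff the abstract refers to.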