Real-time incremental speech-to-speech translation of dialogs

Authors:
Srinivas Bangalore;Vivek Kumar Rangarajan Sridhar;Prakash Kolan;Ladan Golipour;Aura Jimenez
Affiliations:
AT&T Labs - Research, NJ;AT&T Labs - Research, NJ;AT&T Labs - Research, NJ;AT&T Labs - Research, NJ;AT&T Labs - Research, NJ
Venue:
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Year:
2012

Abstract

In a conventional telephone conversation between two speakers of the same language, the interaction is real-time and the speakers process the information stream incrementally. In this work, we address the problem of incremental speech-to-speech (S2S) translation that enables cross-lingual communication between two remote participants over a telephone. We investigate the problem in a novel real-time S2S framework based on the Session Initiation Protocol (SIP). Speech translation is performed incrementally, using partial hypotheses generated by the speech recognizer. We describe the statistical models comprising the S2S system and the SIP architecture that enables real-time two-way cross-lingual dialog. We present dialog experiments performed in this framework and study the tradeoff between accuracy and latency in incremental speech translation. Experimental results demonstrate that the incremental approach can generate high-quality translations with approximately half the latency of the non-incremental approach.