The ATR multilingual speech-to-speech translation system

  • Authors:
  • S. Nakamura, K. Markov, H. Nakaiwa, G. Kikui, H. Kawai, T. Jitsuhiro, Jin-Song Zhang, H. Yamamoto, E. Sumita, S. Yamamoto

  • Affiliations:
  • ATR Spoken Language Translation Research Laboratories, Kyoto, Japan

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2006


Abstract

In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which mainly targets translation between English and Asian languages (Japanese and Chinese). Our S2ST system consists of three main modules: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. All of them are multilingual and are designed using state-of-the-art technologies developed at ATR. A corpus-based statistical machine learning framework forms the basis of our system design. We use a parallel multilingual database consisting of over 600,000 sentences that cover a broad range of travel-related conversations. A recent evaluation of the overall system showed that speech-to-speech translation quality is high, at the level of a person with a Test of English for International Communication (TOEIC) score of 750 out of a perfect score of 990.
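The abstract describes a cascaded pipeline of three modules (speech recognition, text-to-text translation, speech synthesis). A minimal sketch of that cascade is shown below; all class and function names are hypothetical illustrations, not the actual ATR APIs.

```python
# Hypothetical sketch of the three-module S2ST cascade from the abstract:
# speech recognition -> text-to-text translation -> speech synthesis.
from dataclasses import dataclass
from typing import Callable

@dataclass
class SpeechToSpeechTranslator:
    """Chains the three modules; each would be instantiated per
    language pair (e.g., English-Japanese, English-Chinese)."""
    recognize: Callable[[str], str]   # source-language speech -> text (LVCSR)
    translate: Callable[[str], str]   # source text -> target text (corpus-based MT)
    synthesize: Callable[[str], str]  # target text -> target-language speech (TTS)

    def __call__(self, audio: str) -> str:
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)

# Usage with stand-in functions (placeholders for the real modules):
s2st = SpeechToSpeechTranslator(
    recognize=lambda audio: "hello",
    translate=lambda text: {"hello": "konnichiwa"}[text],
    synthesize=lambda text: f"<audio:{text}>",
)
print(s2st("<audio:hello>"))  # -> <audio:konnichiwa>
```

The cascade design means each module can be developed and evaluated independently, which matches the paper's per-module description.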