Distributed speech translation technologies for multiparty multilingual communication

  • Authors:
  • Sakriani Sakti; Michael Paul; Andrew Finch; Xinhui Hu; Jinfu Ni; Noriyuki Kimura; Shigeki Matsuda; Chiori Hori; Yutaka Ashikari; Hisashi Kawai; Hideki Kashioka; Eiichiro Sumita; Satoshi Nakamura

  • Affiliations:
  • Nara Institute of Science and Technology, Japan (S. Sakti; S. Nakamura, jointly affiliated); National Institute of Information and Communications Technology, Japan (all other authors)

  • Venue:
  • ACM Transactions on Speech and Language Processing (TSLP)
  • Year:
  • 2012

Abstract

Developing a multilingual speech translation system requires efforts in constructing automatic speech recognition (ASR), machine translation (MT), and text-to-speech synthesis (TTS) components for all possible source and target languages. If the numerous ASR, MT, and TTS systems for different language pairs developed independently in different parts of the world could be connected, multilingual speech translation systems for a multitude of language pairs could be achieved. Yet, there is currently no common, flexible framework that can provide an entire speech translation process by bringing together heterogeneous speech translation components. In this article we therefore propose a distributed architecture framework for multilingual speech translation in which all speech translation components are provided on distributed servers and cooperate over a network. This framework can facilitate the connection of different components and functions. To show the overall mechanism, we first present our state-of-the-art technologies for multilingual ASR, MT, and TTS components, and then describe how to combine those systems into the proposed network-based framework. The client applications are implemented on a handheld mobile terminal device, and all data exchanges among client users and spoken language technology servers are managed through a Web protocol. To support multiparty communication, an additional communication server is provided for simultaneously distributing the speech translation results from one user to multiple users. Field testing shows that the system is capable of realizing multiparty multilingual speech translation for real-time and location-independent communication.
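The abstract describes a pipeline in which ASR, MT, and TTS components developed independently are registered on distributed servers and chained over a network. A minimal sketch of that idea follows; the registry layout, component keys, and stub functions standing in for remote servers are illustrative assumptions, not the paper's actual Web protocol or message formats.

```python
class ServiceRegistry:
    """Maps (component, source language, target language) to a server callable.

    In the proposed framework each callable would be a remote spoken
    language technology server reached over a Web protocol; here plain
    functions stand in so the chaining logic is visible.
    """

    def __init__(self):
        self._services = {}

    def register(self, component, src, tgt, fn):
        self._services[(component, src, tgt)] = fn

    def lookup(self, component, src, tgt):
        return self._services[(component, src, tgt)]


def speech_translate(registry, utterance, src, tgt):
    """Chain ASR -> MT -> TTS, each component possibly on a different server."""
    asr = registry.lookup("ASR", src, src)   # speech -> text, source language
    mt = registry.lookup("MT", src, tgt)     # text -> text, cross-language
    tts = registry.lookup("TTS", tgt, tgt)   # text -> speech, target language
    return tts(mt(asr(utterance)))


# Stub components standing in for heterogeneous remote servers.
registry = ServiceRegistry()
registry.register("ASR", "ja", "ja", lambda audio: f"text({audio})")
registry.register("MT", "ja", "en", lambda text: f"en({text})")
registry.register("TTS", "en", "en", lambda text: f"speech({text})")

print(speech_translate(registry, "konnichiwa.wav", "ja", "en"))
# -> speech(en(text(konnichiwa.wav)))
```

Because components are looked up by language pair rather than hard-wired, adding a new language direction only requires registering its ASR, MT, and TTS servers, which mirrors the flexibility the framework is designed to provide. Multiparty distribution, as in the paper's communication server, would amount to fanning the final result out to several subscribed clients.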