Corpus-centered computation

  • Authors: Eiichiro Sumita

  • Affiliations: ATR Spoken Language Translation Research Laboratories, Kyoto, Japan

  • Venue: S2S '02 Proceedings of the ACL-02 workshop on Speech-to-speech translation: algorithms and systems - Volume 7
  • Year: 2002

Abstract

To achieve translation technology that is adequate for speech-to-speech translation (S2S), this paper introduces a new approach named Corpus-Centered Computation (abbreviated to C3 and pronounced "c-cube"). In contrast to the conventional approaches adopted by machine translation systems for written language, C3 places corpora at the center of the technology. For example, translation knowledge is extracted from corpora, translation quality is gauged by referring to corpora, and the corpora themselves are normalized by paraphrasing or filtering. High-quality translation has been demonstrated in the domain of travel conversation, and the approach is promising because of the synergy among these corpus-based processes.