Acquisition of English-Chinese transliterated word pairs from parallel-aligned texts using a statistical machine transliteration model

  • Authors:
  • Chun-Jen Lee; Jason S. Chang

  • Affiliations:
  • Chunghwa Telecom Co., Ltd., Chungli, Taiwan, R.O.C.; National Tsing Hua University, Hsinchu, Taiwan, R.O.C.

  • Venue:
  • HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
  • Year:
  • 2003

Abstract

This paper presents a framework for extracting English-Chinese transliterated word pairs from parallel texts. The approach uses a statistical machine transliteration model to exploit the phonetic similarity between English words and their Chinese transliterations. Given a proper noun in English, the method extracts the corresponding transliterated word from the aligned Chinese text. The model parameters are learned automatically from a bilingual proper name list. Experimental results show average word and character precision of 86.0% and 94.4%, respectively. Both rates can be improved further by adding simple linguistic processing.
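
The abstract does not give the model itself, so the following is only a minimal sketch of the general idea: estimate transliteration probabilities from a bilingual name list, then pick the span of the aligned Chinese sentence that best matches a given English proper noun. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `NAME_LIST`, the hand-made segmentation of English names into units, the use of pinyin syllables in place of Chinese characters, the one-unit-to-one-syllable alignment, the add-alpha smoothing, and the function names `train`, `log_prob`, and `extract`.

```python
import math
from collections import Counter, defaultdict

# Toy, hand-segmented name list (illustrative only, not the paper's data).
# Each entry pairs English transliteration units with pinyin syllables, one-to-one.
NAME_LIST = [
    (["s", "mi", "th"], ["shi", "mi", "si"]),     # Smith
    (["ma", "ry"], ["ma", "li"]),                 # Mary
    (["to", "m"], ["tang", "mu"]),                # Tom
    (["ja", "ck", "son"], ["jie", "ke", "xun"]),  # Jackson
    (["mar", "tin"], ["ma", "ding"]),             # Martin
    (["tho", "ma", "s"], ["tuo", "ma", "si"]),    # Thomas
]

def train(name_list):
    """Count how often each English unit is paired with each pinyin syllable."""
    counts = defaultdict(Counter)
    for en_units, zh_syllables in name_list:
        for e, c in zip(en_units, zh_syllables):
            counts[e][c] += 1
    vocab = {c for _, zh_syllables in name_list for c in zh_syllables}
    return counts, len(vocab)

def log_prob(counts, vocab_size, e, c, alpha=0.5):
    """Add-alpha smoothed log P(syllable c | English unit e)."""
    row = counts.get(e, Counter())
    total = sum(row.values())
    return math.log((row[c] + alpha) / (total + alpha * vocab_size))

def extract(counts, vocab_size, en_units, zh_sentence):
    """Return the best-scoring window of the sentence and its log score.

    Assumes the transliteration has exactly one syllable per English unit,
    so only windows of length len(en_units) are considered.
    """
    n = len(en_units)
    best_window, best_score = None, -math.inf
    for start in range(len(zh_sentence) - n + 1):
        window = zh_sentence[start:start + n]
        score = sum(log_prob(counts, vocab_size, e, c)
                    for e, c in zip(en_units, window))
        if score > best_score:
            best_window, best_score = window, score
    return best_window, best_score

counts, vocab_size = train(NAME_LIST)
# Romanized stand-in for an aligned Chinese sentence containing the name.
sentence = ["zuo", "tian", "shi", "mi", "si", "di", "da", "tai", "bei"]
best, score = extract(counts, vocab_size, ["s", "mi", "th"], sentence)
print(best, round(score, 3))
```

On this toy data the best-scoring window for the units of "Smith" is `shi mi si`. A realistic system would have to learn the segmentation and alignment rather than assume them, and would allow candidate spans of varying length.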