Transliteration alignment

  • Authors:
  • Vladimir Pervouchine;Haizhou Li;Bo Lin

  • Affiliations:
  • Institute for Infocomm Research, A*STAR, Singapore;Institute for Infocomm Research, A*STAR, Singapore;School of Computer Engineering, NTU, Singapore

  • Venue:
  • ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper studies transliteration alignment, its evaluation metrics and applications. We propose a new evaluation metric, alignment entropy, grounded on the information theory, to evaluate the alignment quality without the need for the gold standard reference and compare the metric with F-score. We study the use of phonological features and affinity statistics for transliteration alignment at phoneme and grapheme levels. The experiments show that better alignment consistently leads to more accurate transliteration. In transliteration modeling application, we achieve a mean reciprocal rate (MRR) of 0.773 on Xinhua personal name corpus, a significant improvement over other reported results on the same corpus. In transliteration validation application, we achieve 4.48% equal error rate on a large LDC corpus.