Transliteration alignment

Authors:
Vladimir Pervouchine;Haizhou Li;Bo Lin
Affiliations:
Institute for Infocomm Research, A*STAR, Singapore;Institute for Infocomm Research, A*STAR, Singapore;School of Computer Engineering, NTU, Singapore
Venue:
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Year:
2009

Abstract

This paper studies transliteration alignment, its evaluation metrics, and its applications. We propose a new evaluation metric, alignment entropy, grounded in information theory, which evaluates alignment quality without the need for a gold standard reference, and we compare this metric with F-score. We study the use of phonological features and affinity statistics for transliteration alignment at the phoneme and grapheme levels. The experiments show that better alignment consistently leads to more accurate transliteration. In the transliteration modeling application, we achieve a mean reciprocal rank (MRR) of 0.773 on the Xinhua personal name corpus, a significant improvement over other reported results on the same corpus. In the transliteration validation application, we achieve a 4.48% equal error rate on a large LDC corpus.
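
For readers unfamiliar with the MRR figure quoted above, the following is a minimal sketch of how the metric is commonly computed: the average, over test names, of the reciprocal of the rank at which the reference transliteration appears in the system's ranked candidate list. The function name `mean_reciprocal_rank` and the toy data are assumptions made for illustration; they are not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def mean_reciprocal_rank(ranked_candidates: Sequence[Sequence[str]],
                         references: Sequence[str]) -> float:
    """Average of 1/rank of the reference within each ranked candidate list.

    A test item whose reference does not appear among its candidates
    contributes 0 to the sum (a common convention, assumed here).
    """
    if len(ranked_candidates) != len(references):
        raise ValueError("need exactly one reference per candidate list")
    if not references:
        raise ValueError("need at least one test item")

    total = 0.0
    for candidates, reference in zip(ranked_candidates, references):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == reference:
                total += 1.0 / rank
                break  # only the first (best-ranked) match counts
    return total / len(references)


# Hypothetical toy data: placeholder strings stand in for transliterations.
outputs = [
    ["cand_a", "cand_b"],            # reference found at rank 1
    ["cand_c", "cand_d", "cand_e"],  # reference found at rank 2
    ["cand_f"],                      # reference not in the list
]
gold = ["cand_a", "cand_d", "cand_z"]

score = mean_reciprocal_rank(outputs, gold)
print(f"MRR = {score:.3f}")
```

On this toy input the three items contribute 1, 1/2, and 0, so the score is 0.5. An MRR of 1.0 would mean that the reference is always the top-ranked candidate.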