Phoneme-Based transliteration of foreign names for OOV problem

Authors:
Wei Gao;Kam-Fai Wong;Wai Lam
Affiliations:
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T, Hong Kong;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T, Hong Kong;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T, Hong Kong
Venue:
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Year:
2004

Citing 6
Cited 20

A maximum entropy approach to natural language processing

Computational Linguistics
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Machine transliteration

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Fast decoding and optimal decoding for machine translation

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Discriminative training and maximum entropy models for statistical machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Transliteration of proper names in cross-lingual information retrieval

MultiNER '03 Proceedings of the ACL 2003 workshop on Multilingual and mixed-language named entity recognition - Volume 15

Named entity transliteration with comparable corpora

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Direct orthographical mapping for machine transliteration

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
A phonetic similarity model for automatic extraction of transliteration pairs

ACM Transactions on Asian Language Information Processing (TALIP)
A Structure-Based Model for Chinese Organization Name Translation

ACM Transactions on Asian Language Information Processing (TALIP)
Unsupervised named entity transliteration using temporal and phonetic correlation

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Discriminative methods for transliteration

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Learning to match names across languages

MMIES '08 Proceedings of the Workshop on Multi-source Multilingual Information Extraction and Summarization
Transliteration alignment

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Can Chinese phonemes improve machine transliteration?: a comparative study of English-to-Chinese transliteration models

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Name matching between Chinese and Roman scripts: machine complements human

NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration
Transliteration for Resource-Scarce Languages

ACM Transactions on Asian Language Information Processing (TALIP)
Comparative analysis of transliteration techniques based on statistical machine translation and joint-sequence model

Proceedings of the 2010 Symposium on Information and Communication Technology
Machine transliteration survey

ACM Computing Surveys (CSUR)
Comparison of ensemble classifiers in extracting synonymous Chinese transliteration pairs from web

ICSI'11 Proceedings of the Second international conference on Advances in swarm intelligence - Volume Part II
Generating term transliterations using contextual information and validating generated results using web corpora

AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Improving transliteration with precise alignment of phoneme chunks and using contextual features

AIRS'04 Proceedings of the 2004 international conference on Asian Information Retrieval Technology
Translation techniques in cross-language information retrieval

ACM Computing Surveys (CSUR)
Regularized interlingual projections: evaluation on multilingual transliteration

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Report of NEWS 2012 machine transliteration shared task

NEWS '12 Proceedings of the 4th Named Entity Workshop
A joint model to identify and align bilingual named entities

Computational Linguistics

Quantified Score

Hi-index	0.00

Visualization

Abstract

A proper noun dictionary is never complete rendering name translation from English to Chinese ineffective. One way to solve this problem is not to rely on a dictionary alone but to adopt automatic translation according to pronunciation similarities, i.e. to map phonemes comprising an English name to the phonetic representations of the corresponding Chinese name. This process is called transliteration. We present a statistical transliteration method. An efficient algorithm for aligning phoneme chunks is described. Unlike rule-based approaches, our method is data-driven. Compared to source-channel based statistical approaches, we adopt a direct transliteration model, i.e. the direction of probabilistic estimation conforms to the transliteration direction. We demonstrate comparable performance to source-channel based system.