Improving back-transliteration by combining information sources

Authors:
Slaven Bilac; Hozumi Tanaka
Affiliations:
Department of Computer Science, Tokyo Institute of Technology, Tokyo; Department of Computer Science, Tokyo Institute of Technology, Tokyo
Venue:
IJCNLP'04: Proceedings of the First International Joint Conference on Natural Language Processing
Year:
2004

Abstract

Transliterating words and names from one language to another is a frequent and highly productive phenomenon. Transliteration loses information, since important distinctions are not preserved in the process. Hence, automatically converting transliterated words back into their original form is a real challenge. Nevertheless, because of its wide applicability in machine translation (MT) and cross-language information retrieval (CLIR), it is a computationally interesting problem. Previously proposed back-transliteration methods are based on either phoneme modeling or grapheme modeling across languages. In this paper, we propose a new method that combines the two models to improve the back-transliteration of words transliterated into Japanese. Our experiments show that the resulting system outperforms single-model systems.
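
The abstract does not say how the two models are combined. As a rough illustration only, the sketch below assumes one common option, a weighted linear interpolation of per-candidate scores from a grapheme-based and a phoneme-based model. The function name `combine_scores`, the `weight` and `floor` parameters, and all scores are hypothetical and are not taken from the paper.

```python
def combine_scores(grapheme_scores, phoneme_scores, weight=0.5, floor=1e-9):
    """Rank back-transliteration candidates by interpolating two model scores.

    Assumption: each argument maps a candidate source-language word to a
    probability-like score from one model. `weight` is the share given to
    the grapheme model; `floor` stands in for a candidate one model did
    not propose. None of this is the paper's actual formulation.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    combined = {}
    for cand in set(grapheme_scores) | set(phoneme_scores):
        g = grapheme_scores.get(cand, floor)
        p = phoneme_scores.get(cand, floor)
        combined[cand] = weight * g + (1.0 - weight) * p
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up scores for candidates of a single katakana input.
grapheme = {"club": 0.40, "crab": 0.45, "curve": 0.05}
phoneme = {"club": 0.50, "crab": 0.20, "clove": 0.10}

ranked = combine_scores(grapheme, phoneme, weight=0.5)
best_candidate = ranked[0][0]
```

In this toy data the grapheme scores alone favor "crab" and the phoneme scores alone favor "club", so the combined ranking depends on how much each source is trusted. In practice the weight would be tuned on held-out data.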