An English to Korean transliteration model of extended Markov window

Authors:
Sung Young Jung;SungLim Hong;Eunok Paek
Affiliations:
Information Technology Lab., LG Electronics Institute of Technology, Seoul, Korea;Information Technology Lab., LG Electronics Institute of Technology, Seoul, Korea;Information Technology Lab., LG Electronics Institute of Technology, Seoul, Korea
Venue:
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Year:
2000

Abstract

The automatic transliteration problem is to transcribe foreign words in one's own alphabet. Machine-generated transliteration can be useful in various applications, such as indexing in an information retrieval system and pronunciation synthesis in a text-to-speech system. In this paper we present a model for statistical English-to-Korean transliteration that generates transliteration candidates, each with an associated probability. The model is designed to utilize various information sources by extending a conventional Markov window. We also describe an efficient and accurate method for alignment and syllabification of pronunciation units. The experimental results show a recall of 0.939 for trained words and 0.875 for untrained words when the best 10 candidates are considered.
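
The recall figures above are reported over the best 10 candidates per word. As a rough illustration of how such a figure can be computed, the sketch below assumes that a word counts as recalled when its reference transliteration appears anywhere among the top k ranked candidates. The abstract does not spell out the evaluation protocol, so this definition is an assumption; the function name `recall_at_k` and the toy word lists are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def recall_at_k(candidates: Dict[str, List[str]],
                references: Dict[str, str],
                k: int = 10) -> float:
    """Fraction of source words whose reference is among the top-k candidates.

    `candidates` maps each source word to its candidates, best first.
    `references` maps each source word to one reference transliteration.
    """
    if not references:
        raise ValueError("references must not be empty")
    hits = 0
    for word, gold in references.items():
        ranked = candidates.get(word, [])  # a word with no output is a miss
        if gold in ranked[:k]:
            hits += 1
    return hits / len(references)


# Toy data, invented for illustration only (not from the paper).
toy_candidates = {
    "data": ["데이타", "데이터", "다타"],
    "film": ["필름", "피름"],
    "bus": ["부스", "바스"],
}
toy_references = {"data": "데이터", "film": "필름", "bus": "버스"}

top1 = recall_at_k(toy_candidates, toy_references, k=1)
top10 = recall_at_k(toy_candidates, toy_references, k=10)
print(f"recall@1 = {top1:.3f}, recall@10 = {top10:.3f}")
```

Widening the candidate list can only keep or raise this measure, which is why a figure over the best 10 candidates is more forgiving than one over the single top candidate.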