Comparative analysis of transliteration techniques based on statistical machine translation and joint-sequence model

Authors:
Nam X. Cao;Nhut M. Pham;Quan H. Vu
Affiliations:
University of Science, Vietnam;University of Science, Vietnam;University of Science, Vietnam
Venue:
Proceedings of the 2010 Symposium on Information and Communication Technology
Year:
2010

Abstract

The inability to handle foreign-language words poses difficulties for both Vietnamese speech recognition and text-to-speech systems. A common solution is dictionary lookup, but a dictionary has a finite number of entries and is therefore inflexible, whereas speech recognition and text-to-speech systems are expected to handle arbitrary words. Alternatively, data-driven approaches can transliterate a foreign word into its Vietnamese pronunciation by learning from samples and then predicting the pronunciation of unseen words. This paper presents a comparative analysis of two data-driven approaches, one based on statistical machine translation and the other on the joint-sequence model. Two systems based on these approaches are developed and tested using the same experimental protocol and a dataset of 8050 English words. Results show that the joint-sequence model outperforms statistical machine translation in English-to-Vietnamese transliteration.