Introduction to the Theory of Computation
Expert Systems
A context-sensitive homograph disambiguation in Thai text-to-speech synthesis
NAACL-Short '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume of the Proceedings of HLT-NAACL 2003--short papers - Volume 2
Transliteration or transcription of names is necessary for communication between different language communities, e.g. from the English to the Thai writing system. Names themselves show a certain intrinsic degree of variation, and this is even more true of their transliterated or transcribed forms. Correct transcription and transliteration of names is therefore one of the major problems in inter-cultural communication. Available standard "manual" transcription systems are often simply not used, or are used inconsistently. Many computer-assisted systems are based on orthographic forms or pronunciation, and on rule-based or statistics-based approaches. In this paper we discuss the problems of Romanization, e.g. ambiguities of pronunciation as well as of syllable and word segmentation. These problems provide important guidelines for implementing backward transcription from English to Thai. To standardise this process, the author proposes an automated English-to-Thai transcription system called RESETT (Rule-based Expert System for English to Thai Transcription). This tool uses the rule-based Royal Thai General System of Transcription, syllable pronunciation and segmentation, and a hybrid name matching algorithm called LIG3 (Levenshtein, Index of similarity, and Guth). An advantage of the name matching process is an optimised transliteration of the rather complex Thai writing system. The LIG3 algorithm helps to produce highly accurate matches for transcribed forms.
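The abstract names Levenshtein distance as one ingredient of LIG3 but does not specify how the three measures are combined. As a rough illustration of that single ingredient only, the following sketch computes the standard Levenshtein edit distance and uses it to rank candidate name spellings. The function names, the length-normalised similarity score, and the sample names are assumptions made for this example; they are not taken from RESETT or LIG3.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard edit distance: minimum insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalisation to [0, 1]; not the LIG3 'index of similarity'."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(query: str, candidates: list[str]) -> str:
    """Return the candidate spelling closest to the query (case-insensitive)."""
    return max(candidates, key=lambda c: similarity(query.lower(), c.lower()))


if __name__ == "__main__":
    # Illustrative romanized name variants, chosen for this example only.
    print(best_match("Somchay", ["Somchai", "Somsak", "Suchart"]))
```

A hybrid matcher such as the one described would presumably combine a score like this with the other two measures; how LIG3 does so is not stated in the abstract.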