A context-sensitive homograph disambiguation in Thai text-to-speech synthesis

Authors:
Virongrong Tesprasit;Paisarn Charoenpornsawat;Virach Sornlertlamvanich
Affiliations:
National Electronics and Computer Technology Center, Klong Luang, Pathumthani, Thailand;National Electronics and Computer Technology Center, Klong Luang, Pathumthani, Thailand;National Electronics and Computer Technology Center, Klong Luang, Pathumthani, Thailand
Venue:
NAACL-Short '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume of the Proceedings of HLT-NAACL 2003--short papers - Volume 2
Year:
2003

Abstract

Homograph ambiguity is a long-standing problem in text-to-speech (TTS) synthesis. Several efficient approaches to homograph disambiguation have been proposed, including part-of-speech (POS) n-gram, Bayesian classifier, decision tree, and Bayesian-hybrid approaches. These methods rely on the words and/or POS tags surrounding the homograph in question. Some languages, such as Thai, Chinese, and Japanese, have no word-boundary delimiter, so word boundaries must be identified before homograph ambiguity can be resolved. In this paper, we propose a single framework that solves the word segmentation and homograph ambiguity problems together. Our model employs both local and long-distance contexts, which are automatically extracted by a machine learning technique called Winnow.
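
For readers unfamiliar with Winnow, the following is a minimal sketch of the generic algorithm (the Winnow2 variant with multiplicative promotion and demotion), not the authors' implementation. The feature names, the English toy homograph "bass", the threshold, and the update factor are all assumptions chosen for illustration; the abstract does not specify how contexts are encoded.

```python
from collections import defaultdict


class Winnow:
    """Generic Winnow2 learner over sparse binary features (illustrative only)."""

    def __init__(self, threshold=3.0, alpha=2.0):
        self.threshold = threshold  # decision threshold (assumed value)
        self.alpha = alpha          # promotion/demotion factor (assumed value)
        # Every feature starts with weight 1.0 the first time it is seen.
        self.weights = defaultdict(lambda: 1.0)

    def score(self, active):
        # Sum of the weights of the features present in this context.
        return sum(self.weights[f] for f in active)

    def predict(self, active):
        return self.score(active) >= self.threshold

    def update(self, active, label):
        # Mistake-driven: weights change only when the prediction is wrong.
        if self.predict(active) == label:
            return
        # Promote active features on a false negative, demote on a false positive.
        factor = self.alpha if label else 1.0 / self.alpha
        for f in active:
            self.weights[f] *= factor


def train(model, examples, epochs=2):
    for _ in range(epochs):
        for active, label in examples:
            model.update(active, label)
    return model


# Hypothetical toy data: context words around the homograph "bass".
# True = musical sense, False = fish sense.
examples = [
    ({"play", "guitar"}, True),
    ({"catch", "river"}, False),
    ({"play", "loud"}, True),
    ({"fresh", "river"}, False),
]

model = train(Winnow(), examples)
print(model.predict({"guitar", "loud"}))   # context suggesting the musical sense
print(model.predict({"catch", "fresh"}))   # context suggesting the fish sense
```

Because updates are multiplicative, the weights of features that never help a correct prediction stay small relative to the informative ones, which is why Winnow is often chosen when most candidate context features are irrelevant.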