In this paper we describe the part of the text preprocessing module of our text-to-speech synthesis system that converts numerals written as figures into a readable full-length form that can be processed by a phonetic transcription module. Numeral conversion is a significant issue in inflectional languages such as Czech, Russian, or Slovak, because morphological and semantic information is needed to make the conversion unambiguous. We compare three part-of-speech tagging methods and present a method that reduces the tagset to increase numeral conversion accuracy.
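To illustrate why morphological information matters here, the following is a minimal sketch, not the system described in the paper. It assumes a 15-position Czech morphological tag in which position 3 encodes gender and position 5 encodes case, and it assumes that the tag of the neighbouring noun is already known from a tagger. The names `reduce_tag` and `expand_two`, and the choice of which tag positions to keep, are hypothetical.

```python
# Illustrative sketch only: the tag layout, the reduction, and the function
# names are assumptions, not the method used in the paper.

# Forms of the Czech numeral "2" when the form depends on gender
# (nominative, accusative, vocative).
GENDERED = {"M": "dva", "I": "dva", "F": "dvě", "N": "dvě"}

# Forms of "2" by case number (1 = nominative ... 7 = instrumental).
FORMS_BY_CASE = {
    1: GENDERED,
    2: "dvou",
    3: "dvěma",
    4: GENDERED,
    5: GENDERED,
    6: "dvou",
    7: "dvěma",
}


def reduce_tag(tag: str) -> str:
    """Reduce a positional tag to gender (position 3) and case (position 5)."""
    return tag[2] + tag[4]


def expand_two(noun_tag: str) -> str:
    """Pick the written-out form of the figure "2" that agrees with the noun."""
    gender, case = reduce_tag(noun_tag)
    form = FORMS_BY_CASE[int(case)]
    # Gender only matters where the entry is a per-gender table.
    return form if isinstance(form, str) else form[gender]


if __name__ == "__main__":
    # "2 ženami" (feminine, instrumental) should be read as "dvěma ženami".
    print(expand_two("NNFP7-----A----"), "ženami")
```

In this sketch the reduced tag keeps only the two categories that decide the form of the numeral, so tagging errors in other positions would not affect the conversion. Whether this matches the reduction used in the paper is not stated in the abstract.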