Automatic transcription of numerals in inflectional languages

  • Authors:
  • Jan Zelinka;Jakub Kanis;Luděk Müller

  • Affiliations:
  • Department of Cybernetics, University of West Bohemia, Plzeň, Czech Republic;Department of Cybernetics, University of West Bohemia, Plzeň, Czech Republic;Department of Cybernetics, University of West Bohemia, Plzeň, Czech Republic

  • Venue:
  • TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue
  • Year:
  • 2005

Abstract

This paper describes the part of the text preprocessing module in our text-to-speech synthesis system that converts numerals written as digits into a full-length written-out form that a phonetic transcription module can process. Numeral conversion is a significant issue in inflectional languages such as Czech, Russian, or Slovak, because morphological and semantic information is needed to make the conversion unambiguous. We compare three part-of-speech tagging methods and present a method that reduces the tagset to increase numeral conversion accuracy.
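
To illustrate why morphological information matters, the following minimal sketch expands the digit "2" into its Czech word form depending on the case and gender of the noun it modifies. It is not the authors' method: the tag layout, the `reduce_tag` projection, and the function names are hypothetical, and the abstract does not say how the paper's tagset reduction actually works. Only the declension of "dva" is standard Czech grammar.

```python
# Illustrative sketch only; the tag format and helper names are assumptions,
# not the tagset or the algorithm used in the paper.

def reduce_tag(full_tag: dict) -> tuple:
    """Project a (hypothetical) full morphological tag onto the two
    attributes that decide the form of the numeral: case and gender."""
    return (full_tag["case"], full_tag["gender"])


def expand_two(reduced_tag: tuple) -> str:
    """Return the Czech word form of the digit '2' for a (case, gender) pair.

    Cases: nom, gen, dat, acc, loc, ins. Genders: m, f, n.
    """
    case, gender = reduced_tag
    if case in ("gen", "loc"):
        return "dvou"      # genitive and locative share one form
    if case in ("dat", "ins"):
        return "dvěma"     # dative and instrumental share one form
    # nominative and accusative depend on gender
    return "dva" if gender == "m" else "dvě"


# A tagger would supply the tag of the noun next to the digit.
noun_tag = {"pos": "noun", "case": "gen", "gender": "m", "number": "pl"}
word = expand_two(reduce_tag(noun_tag))  # "bez 2 mužů" reads "bez dvou mužů"
```

The same digit therefore has several readings, and a wrong case or gender from the tagger yields a wrong word. Dropping tag attributes that never affect the numeral (here `pos` and `number`) gives the tagger fewer classes to confuse, which is the general idea behind reducing a tagset.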