Robust rule-based method for automatic break assignment in russian texts

  • Authors:
  • Ilya Oparin

  • Affiliations:
  • ,Dept. of Phonetics, St.Petersburg State University, St.Petersburg, Russia

  • Venue:
  • TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper a new rule-based approach to break assignment for the Russian language is discussed. It is a flexible and robust method of segmentation of texts in Russian in prosodic units. We implemented it in the recent “Orator” text-to-speech (TTS) system. The model was developed to use for the inflective languages as an alternative both for statistic and for strict rule-based algorithms. It is designed in such a way that all potentially tunable context dependencies are brought up to the interface grammar and can be easily modified by linguists. The algorithm we developed performs well on different kinds of texts due to this simple and intuitive grammar built upon an elaborate mechanism of morpho-grammatical analysis. Juncture correct rate varies between more than 98% for simple literary texts and 85% for raw transcripts of spontaneous speech.