Special speech synthesis for social network websites

  • Authors:
  • Csaba Zainkó;Tamás Gábor Csapó;Géza Németh

  • Affiliations:
  • Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary;Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary;Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary

  • Venue:
  • TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue
  • Year:
  • 2010

Abstract

This paper gives an overview of the design concepts and implementation of a Hungarian microblog reading system. Synthesizing speech from such text requires several dedicated components. First, an efficient diacritic reconstruction algorithm was applied; the accuracy of an earlier dictionary-based method was improved with machine learning so that ambiguous cases are handled properly. Second, an unlimited-domain text-to-speech synthesizer was applied, with extensions for emotional and spontaneous styles. Chat and blog texts often contain "emoticons" that mark the emotional state of the user, so an expressive speech synthesis method was adapted to a corpus-based synthesizer. Four emotions were generated and evaluated in a listening test: neutral, happy, angry and sad. The results showed that happy and sad emotions can be generated with this algorithm, with the best accuracy for the female voice.
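
The two text-processing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Diacritics are restored by dictionary lookup on the accent-stripped word form, and ambiguous forms are resolved by simple corpus frequency, a stand-in for the machine-learning step, whose details the abstract does not give. The emoticon table, the function names and the sample words are all hypothetical.

```python
import re
import unicodedata
from collections import Counter, defaultdict

# Hypothetical emoticon inventory; the abstract does not list the one used.
EMOTICON_EMOTION = {
    ":)": "happy", ":-)": "happy", ":D": "happy",
    ":(": "sad", ":-(": "sad",
    ">:(": "angry",
}


def strip_diacritics(word):
    """Remove combining marks, e.g. 'köszönöm' -> 'koszonom'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_lexicon(corpus_words):
    """Map each accent-stripped form to counts of its accented variants."""
    lexicon = defaultdict(Counter)
    for word in corpus_words:
        lowered = word.lower()
        lexicon[strip_diacritics(lowered)][lowered] += 1
    return lexicon


def restore_diacritics(text, lexicon):
    """Replace each known word with its most frequent accented variant.

    Assumption: frequency stands in for the paper's learned disambiguation.
    """
    def fix(match):
        word = match.group(0)
        candidates = lexicon.get(word.lower())
        if not candidates:
            return word  # unknown word: leave unchanged
        best = candidates.most_common(1)[0][0]
        return best.capitalize() if word[0].isupper() else best

    # [^\W\d_]+ matches runs of Unicode letters only.
    return re.sub(r"[^\W\d_]+", fix, text)


def detect_emotion(text):
    """Return the emotion of the first matching emoticon, else 'neutral'."""
    # Longest emoticons first, so '>:(' is not mistaken for ':('.
    for emoticon in sorted(EMOTICON_EMOTION, key=len, reverse=True):
        if emoticon in text:
            return EMOTICON_EMOTION[emoticon]
    return "neutral"


if __name__ == "__main__":
    lexicon = build_lexicon(["köszönöm", "szép", "köszönöm", "őrült"])
    post = "Koszonom szep :)"
    print(restore_diacritics(post, lexicon), "|", detect_emotion(post))
```

In a full system, the detected label would select the expressive style of the synthesizer. That stage is outside the scope of this sketch.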