High quality emotional HMM-based synthesis in Spanish

Authors:
Xavi Gonzalvo;Paul Taylor;Carlos Monzo;Ignasi Iriondo;Joan Claudi Socoró
Affiliations:
Phonetic-Arts Ltd. St. John's Innovation Center, Cambridge, UK;Phonetic-Arts Ltd. St. John's Innovation Center, Cambridge, UK;Enginyeria i Arquitectura La Salle, Universitat Ramon Llull Grup de Recerca en Processament;Enginyeria i Arquitectura La Salle, Universitat Ramon Llull Grup de Recerca en Processament;Enginyeria i Arquitectura La Salle, Universitat Ramon Llull Grup de Recerca en Processament
Venue:
NOLISP'09 Proceedings of the 2009 international conference on Advances in Nonlinear Speech Processing
Year:
2009

Citing 4
Cited 0

Speech Representation and Transformation Using Adaptive Interpolation of Weighted Spectrum: VOCODER Revisited

ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), Volume 2
Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005

IEICE - Transactions on Information and Systems
A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis

IEICE - Transactions on Information and Systems
Objective and subjective evaluation of an expressive speech corpus

NOLISP'07 Proceedings of the 2007 international conference on Advances in nonlinear speech processing

Abstract

This paper describes a high-quality Spanish HMM-based speech synthesis system for emotional speaking styles. The quality of the HMM-based synthesis is enhanced by using the most recent features presented for the Blizzard system (i.e. STRAIGHT spectrum extraction and mixed excitation). Two techniques are evaluated: first, a method that simultaneously models all emotions within a single acoustic model; second, an adaptation technique that converts a neutral speaking style to a target emotion. We consider three kinds of emotional expression: neutral, happy and sad. A subjective evaluation shows the quality of the system and the intensity of the produced emotion, while an objective evaluation based on voice quality parameters evaluates the effectiveness of the approaches.