Joint prosody prediction and unit selection for concatenative speech synthesis

Authors:
I. Bulyko; M. Ostendorf
Affiliations:
Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
Venue:
ICASSP '01: Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 02
Year:
2001

Citing 0
Cited 4

Time and space-efficient architecture for a corpus-based text-to-speech synthesis system
Speech Communication

Multisyn: Open-domain unit selection for the Festival speech synthesis system
Speech Communication

Individual and domain adaptation in sentence planning for dialogue
Journal of Artificial Intelligence Research

Embodied conversational agents in Wizard-of-Oz and multimodal interaction applications
COST 2102'07: Proceedings of the 2007 COST Action 2102 International Conference on Verbal and Nonverbal Communication Behaviours

Abstract

We describe how prosody prediction can be efficiently integrated with the unit selection process in a concatenative speech synthesizer under a weighted finite-state transducer (WFST) architecture. WFSTs representing prosody prediction and unit selection can be composed during synthesis, thus effectively expanding the space of possible prosodic targets. We implemented a symbolic prosody prediction module and a unit selection database as the synthesis components of a travel planning system. Results of perceptual experiments show that by combining the steps of prosody prediction and unit selection we are able to achieve improved naturalness of synthetic speech compared to the sequential implementation.
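
The difference between the sequential and the joint approach can be illustrated with a toy cost model. The sketch below is not the authors' implementation and uses no WFST library: the cost tables, label names, and function names are all hypothetical, and exhaustive enumeration stands in for the best-path search that a composed transducer would perform. Costs are additive and lower is better, as in a tropical-semiring WFST.

```python
from itertools import product

# Hypothetical prosody prediction costs: one dict per word, mapping a
# symbolic prosodic label to its cost (lower means more likely).
PROSODY_COSTS = [
    {"accent": 4, "none": 9},   # word 0
    {"accent": 12, "none": 3},  # word 1
]

# Hypothetical unit selection costs: cost of the best database unit
# available for each word under each prosodic label.
UNIT_COSTS = [
    {"accent": 20, "none": 5},  # word 0: accented units are a poor match
    {"accent": 6, "none": 7},   # word 1
]


def total_cost(labels, prosody, units):
    """Sum prosody and unit costs along one label sequence."""
    return sum(p[l] + u[l] for p, u, l in zip(prosody, units, labels))


def sequential(prosody, units):
    """Commit to the cheapest prosodic label per word, then pick units."""
    labels = [min(p, key=p.get) for p in prosody]
    return labels, total_cost(labels, prosody, units)


def joint(prosody, units):
    """Search all label sequences, scoring prosody and units together."""
    best_labels, best_cost = None, None
    for labels in product(*(p.keys() for p in prosody)):
        cost = total_cost(labels, prosody, units)
        if best_cost is None or cost < best_cost:
            best_labels, best_cost = list(labels), cost
    return best_labels, best_cost


if __name__ == "__main__":
    print("sequential:", sequential(PROSODY_COSTS, UNIT_COSTS))
    print("joint:     ", joint(PROSODY_COSTS, UNIT_COSTS))
```

With these made-up numbers the sequential pipeline fixes word 0 as accented before it sees that the database has no good accented unit, while the joint search accepts a slightly less likely prosodic label in exchange for a much better unit. A real system would also include concatenation costs between adjacent units and would replace the exponential enumeration with a shortest-path search over the composed transducer.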