Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis

Authors:
Thomas Drugman;Alexis Moinet;Thierry Dutoit;Geoffrey Wilfart
Affiliations:
Faculté Polytechnique de Mons, TCTS Lab, 31, Boulevard Dolez, 7000, Belgium;Faculté Polytechnique de Mons, TCTS Lab, 31, Boulevard Dolez, 7000, Belgium;Faculté Polytechnique de Mons, TCTS Lab, 31, Boulevard Dolez, 7000, Belgium;Acapela Group, Research and Development, 33, Boulevard Dolez, 7000 Mons, Belgium
Venue:
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Year:
2009

Abstract

This paper proposes a method to improve the quality delivered by statistical parametric speech synthesizers. To this end, we use a codebook of pitch-synchronous residual frames to construct a more realistic source signal. First, a limited codebook of typical excitations is built from a training database. During synthesis, HMMs are used to generate filter and source coefficients. The source coefficients contain both the pitch and a compact representation of the target residual frames. The source signal is obtained by concatenating excitation frames selected from the codebook according to a selection criterion that takes the target residual coefficients as input. Subjective results show a clear improvement over the basic technique.
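
The synthesis-time step described above (pick one codebook frame per pitch period, then concatenate) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the selection criterion is taken to be a squared Euclidean distance between coefficient vectors, each selected frame is length-matched to the target pitch period by linear interpolation, and the names `nearest_entry`, `resample`, and `build_excitation` as well as the toy codebook are hypothetical.

```python
def nearest_entry(codebook, target):
    """Index of the entry whose coefficients are closest to `target`.

    Assumption: squared Euclidean distance stands in for the selection
    criterion, which the abstract does not specify.
    """
    def dist(i):
        return sum((c - t) ** 2 for c, t in zip(codebook[i]["coeffs"], target))
    return min(range(len(codebook)), key=dist)


def resample(frame, length):
    """Linearly interpolate `frame` to `length` samples (assumed length matching)."""
    if length == 1:
        return [frame[0]]
    n = len(frame)
    out = []
    for k in range(length):
        pos = k * (n - 1) / (length - 1)   # fractional index into the stored frame
        i = int(pos)
        frac = pos - i
        nxt = frame[min(i + 1, n - 1)]     # clamp at the last sample
        out.append(frame[i] * (1 - frac) + nxt * frac)
    return out


def build_excitation(codebook, targets):
    """Concatenate one selected frame per target.

    `targets` is a list of (pitch_period_in_samples, target_coeffs) pairs,
    standing in for the source parameters generated by the HMMs.
    """
    excitation = []
    for period, coeffs in targets:
        idx = nearest_entry(codebook, coeffs)
        excitation.extend(resample(codebook[idx]["frame"], period))
    return excitation


# Toy data (hypothetical): two stored residual frames with 2-D compact coefficients.
codebook = [
    {"coeffs": [0.0, 1.0], "frame": [1.0, 0.0, -1.0, 0.0]},
    {"coeffs": [1.0, 0.0], "frame": [0.5, 0.5, -0.5, -0.5]},
]
targets = [(4, [0.1, 0.9]), (3, [0.8, 0.2])]
signal = build_excitation(codebook, targets)
```

In a full synthesizer the resulting excitation would then be passed through the synthesis filter built from the HMM-generated filter coefficients; that stage is omitted here.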