Stochastic modeling of spectral adjustment for high quality pitch modification

Authors:
A. Kain;Y. Stylianou
Affiliations:
Center for Spoken Language Understanding, Oregon Graduate Inst. of Sci. & Technol., Beaverton, OR, USA;-
Venue:
ICASSP '00: Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
Year:
2000

Citing 0
Cited 2

Hierarchical prosody conversion using regression-based clustering for emotional speech synthesis
IEEE Transactions on Audio, Speech, and Language Processing

Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour
ACII'05 Proceedings of the First international conference on Affective Computing and Intelligent Interaction

Abstract

We present a new algorithm for adjusting the magnitude spectrum when the fundamental frequency (F0) of a speech signal is altered. The algorithm exploits the correlation between F0 and the magnitude spectrum of speech as represented by line spectral frequencies (LSFs). This correlation is class-dependent, so the input is first assigned to a broad speech class by a Gaussian mixture model (GMM). The within-class dependencies of LSFs on F0 values are captured by constructing their joint probability densities using a series of GMMs, one for each speech class. The proposed system is used for post-processing the pitch-modified signal. Perceptual tests showed that adding this post-processing system improves the naturalness of the pitch-modified signal for large pitch modification factors.
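
The abstract does not spell out how LSFs are predicted from the joint densities. As a minimal sketch of one common way to use a joint GMM over [F0, LSFs], the Python below computes the conditional mean E[LSF | F0] for a single speech class. The function name, the parameter layout (F0 stored at index 0 of each joint vector), and the toy numbers are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def expected_lsf_given_f0(f0, weights, means, covs):
    """Conditional mean E[lsf | f0] under a joint GMM over z = [f0, lsf_1..lsf_p].

    Assumed (hypothetical) layout: weights (M,), means (M, 1+p),
    covs (M, 1+p, 1+p), with f0 at index 0 of every joint vector.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    mu_f = means[:, 0]        # per-component F0 mean, shape (M,)
    mu_l = means[:, 1:]       # per-component LSF mean, shape (M, p)
    var_f = covs[:, 0, 0]     # per-component F0 variance, shape (M,)
    cov_lf = covs[:, 1:, 0]   # LSF-F0 cross-covariance, shape (M, p)

    # Posterior probability of each component given only the observed F0,
    # computed in the log domain for numerical stability.
    log_post = np.log(weights) - 0.5 * (
        np.log(2.0 * np.pi * var_f) + (f0 - mu_f) ** 2 / var_f
    )
    log_post -= log_post.max()
    post = np.exp(log_post)
    post /= post.sum()

    # Per-component linear regression of the LSFs on F0.
    cond_means = mu_l + cov_lf * ((f0 - mu_f) / var_f)[:, None]

    # Posterior-weighted average over components.
    return post @ cond_means

# Toy, made-up parameters: one component, two LSFs (not from the paper).
toy_weights = [1.0]
toy_means = [[100.0, 0.3, 0.6]]
toy_covs = [[[400.0, 2.0, -1.0],
             [2.0, 0.02, 0.0],
             [-1.0, 0.0, 0.01]]]

predicted = expected_lsf_given_f0(120.0, toy_weights, toy_means, toy_covs)
print(predicted)
```

In a class-dependent setup like the one described, a function of this kind would be called with the parameters of whichever class the classifier GMM selects for the current frame. How the predicted LSFs are then combined with the original spectrum is not stated in the abstract and is left open here.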