This paper proposes a technique for prosodic (pitch and duration) manipulation using instants of significant excitation. These instants correspond to the instants of glottal closure (epochs) in voiced speech and to random excitations, such as burst onsets, in nonvoiced speech. They are computed from the average group delay of minimum-phase signals. Pitch and duration are manipulated by modifying the linear prediction (LP) residual, using the instants of significant excitation as pitch markers. The modified residual then excites a time-varying filter whose parameters are derived from the original speech signal. The synthesized speech is perceived as natural and free of perceptible distortion. The original and corresponding synthesized speech signals from the proposed approach are available for listening at http://speech.cs.iitm.ernet.in/Main/Results/Prosody.html.
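To make the role of the pitch markers concrete, the following is a minimal sketch of one way a new marker sequence could be derived from a list of epoch locations. It is an illustration under stated assumptions, not the procedure from the paper: the function name `modify_epochs`, the parameters `pitch_factor` and `duration_factor`, and the nearest-epoch mapping rule are all hypothetical. Epoch detection from the average group delay and the LP analysis and synthesis steps are assumed to happen elsewhere.

```python
from bisect import bisect_right


def modify_epochs(epochs, pitch_factor=1.0, duration_factor=1.0):
    """Derive new pitch-marker positions from original epoch locations.

    Hypothetical helper, not taken from the paper.
    epochs          -- strictly increasing sample indices of the original epochs
    pitch_factor    -- >1 raises pitch (markers move closer together)
    duration_factor -- >1 lengthens the utterance
    Returns (new_positions, sources): each new marker position in output
    samples, and the index of the original epoch whose surrounding LP
    residual segment would be copied to that position.
    """
    if len(epochs) < 2:
        raise ValueError("need at least two epochs")
    if pitch_factor <= 0 or duration_factor <= 0:
        raise ValueError("factors must be positive")

    start = epochs[0]
    out_len = (epochs[-1] - start) * duration_factor

    new_positions, sources = [], []
    t = 0.0  # current position on the output time axis
    while t <= out_len:
        # Map the output time back onto the original time axis.
        src_time = start + t / duration_factor
        # Locate the original epoch interval that contains src_time.
        i = min(bisect_right(epochs, src_time) - 1, len(epochs) - 2)
        period = epochs[i + 1] - epochs[i]
        # Assumption: reuse the residual around the nearest original epoch.
        if src_time - epochs[i] <= epochs[i + 1] - src_time:
            nearest = i
        else:
            nearest = i + 1
        new_positions.append(round(t))
        sources.append(nearest)
        # Advance by the local pitch period, scaled by the pitch factor.
        t += period / pitch_factor
    return new_positions, sources


if __name__ == "__main__":
    # Five epochs, 100 samples apart; double the pitch, keep the duration.
    print(modify_epochs([0, 100, 200, 300, 400], pitch_factor=2.0))
```

In this sketch, doubling the pitch halves the spacing between markers while the overall span is unchanged, so each original epoch is reused about twice. Doubling the duration keeps the spacing and again reuses each epoch about twice over a span that is twice as long. A full system would then place the LP residual segment around each source epoch at its new position and pass the result through the time-varying LP filter.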