Prosodic manipulation using instants of significant excitation

  • Authors:
  • K. S. Rao; B. Yegnanarayana

  • Affiliations:
  • Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Madras, India; Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Madras, India

  • Venue:
  • ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
  • Year:
  • 2003

Abstract

This paper proposes a technique for prosodic (pitch and duration) manipulation using instants of significant excitation. In voiced speech these instants correspond to the instants of glottal closure (epochs); in nonvoiced speech they correspond to random excitations such as burst onsets. The instants are computed from the average group delay of minimum-phase signals. Pitch and duration are manipulated by modifying the linear prediction (LP) residual, with the instants of significant excitation serving as pitch markers. The modified residual then excites a time-varying filter whose parameters are derived from the original speech signal. The synthesized speech is perceived as natural and free of distortion. The original speech signals and the corresponding signals synthesized by the proposed approach are available for listening at http://speech.cs.iitm.ernet.in/Main/Results/Prosody.html.
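
The processing chain described in the abstract (LP analysis, residual modification guided by epochs, resynthesis through the LP filter) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The epoch locations are assumed to be given, since the group-delay-based epoch extraction is not reproduced here. The function names, the single-frame (non-time-varying) filter, and the rule for placing new epochs are all assumptions made for illustration.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis (illustrative helper).
    Returns a[1..order] so that s[n] is predicted by sum_k a[k] * s[n-k]."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]  # lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small ridge for numerical safety
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(signal, a):
    """Inverse filtering: e[n] = s[n] - sum_k a[k] * s[n-k], zero history."""
    s = np.asarray(signal, dtype=float)
    e = s.copy()
    for k, ak in enumerate(a, start=1):
        e[k:] -= ak * s[:-k]
    return e

def lp_synthesize(residual, a):
    """All-pole filtering: s[n] = e[n] + sum_k a[k] * s[n-k], zero history."""
    out = np.zeros(len(residual))
    for n in range(len(residual)):
        acc = residual[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * out[n - k]
        out[n] = acc
    return out

def modified_epochs(epochs, pitch_factor=1.0, duration_factor=1.0):
    """Hypothetical epoch-remapping rule (an assumption, not taken from the paper).
    Returns the new epoch locations and, for each one, the index of the
    original epoch whose residual segment would be copied there."""
    epochs = np.asarray(epochs, dtype=float)
    intervals = np.diff(epochs)
    end = epochs[0] + (epochs[-1] - epochs[0]) * duration_factor
    new, src = [epochs[0]], [0]
    while True:
        # Map the latest new epoch back onto the original time axis.
        t_orig = epochs[0] + (new[-1] - epochs[0]) / duration_factor
        i = int(np.clip(np.searchsorted(epochs, t_orig, side="right") - 1,
                        0, len(intervals) - 1))
        # Local pitch period, shortened when the pitch is raised.
        nxt = new[-1] + intervals[i] / pitch_factor
        if nxt > end + 1e-9:
            break
        new.append(nxt)
        t_next = epochs[0] + (nxt - epochs[0]) / duration_factor
        src.append(int(np.argmin(np.abs(epochs - t_next))))
    return np.array(new), np.array(src)
```

In this sketch, an unmodified residual passed through lp_synthesize reproduces the input frame, which is a useful sanity check. A complete system would copy the residual samples around each source epoch to the corresponding new epoch location and update the filter coefficients frame by frame; those steps are omitted here.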