Voice transformation by mapping the features at syllable level

  • Authors:
  • K. Sreenivasa Rao; R. H. Laskar; Shashidhar G. Koolagudi

  • Affiliations:
  • School of Information Technology, IIT Kharagpur, Kharagpur, West Bengal, India; Department of Electrical Engineering, NIT Silchar, Silchar, Assam, India; School of Information Technology, IIT Kharagpur, Kharagpur, West Bengal, India

  • Venue:
  • PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence
  • Year:
  • 2007

Abstract

Voice transformation involves modifying the voice of a source speaker so that it sounds like the voice of a target speaker. The voice characteristics of a speaker depend on the shape of the glottal pulse (source characteristics), the shape of the vocal tract system (system characteristics), and the long-term features (prosody, or supra-segmental features) of the speech signal produced by the speaker. In this paper we propose mapping functions to transform the vocal tract characteristics and the intonation characteristics from the source speaker to the target speaker. The mapping functions are developed from features extracted at the syllable level. The shape of the vocal tract system is characterized by linear prediction coefficients, and its mapping function is realized by a five-layer feedforward neural network. The mapping of the intonation characteristics (pitch contour) is provided by associating the codebooks derived from the pitch contours of the source and target speakers. The proposed mapping functions are used in a voice transformation task. The target speaker's speech is synthesized and evaluated using listening tests. The results of the listening tests indicate that the proposed voice transformation provides a better mapping of the voice characteristics than the earlier method proposed by the author. The original speech signals and the synthesized speech signals obtained using the mapping functions are available for listening at http://shilloi.iitg.ernet.in/~ksrao/result.html
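
The abstract does not say how the source and target pitch-contour codebooks are built or associated. As a rough illustration only, the idea of codebook association could be sketched as follows. Everything here is an assumption rather than the authors' method: the fixed-length contour per syllable, the k-means style clustering, the pairing of source and target syllables, the toy data, and the names `train_codebooks` and `map_contour`.

```python
# Hypothetical sketch of codebook association for pitch-contour mapping.
# Assumptions (not from the paper): each syllable's pitch contour is
# resampled to a fixed length, source and target syllables are paired
# one-to-one, and the source codebook comes from a simple k-means loop.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length contours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length contours."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(codebook, contour):
    """Index of the codeword closest to the given contour."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], contour))

def train_codebooks(src_contours, tgt_contours, k, iters=10):
    """Cluster source contours, then associate each source codeword with
    the mean of the target contours paired with that cluster's members."""
    # Deterministic initialisation: the first k source contours.
    src_cb = [list(c) for c in src_contours[:k]]
    for _ in range(iters):
        labels = [nearest(src_cb, c) for c in src_contours]
        for j in range(k):
            members = [c for c, lab in zip(src_contours, labels) if lab == j]
            if members:  # keep the old codeword if the cluster is empty
                src_cb[j] = mean_vec(members)
    labels = [nearest(src_cb, c) for c in src_contours]
    tgt_cb = []
    for j in range(k):
        members = [t for t, lab in zip(tgt_contours, labels) if lab == j]
        # Fallback for an empty cluster (an arbitrary choice for this sketch):
        # reuse the source codeword unchanged.
        tgt_cb.append(mean_vec(members) if members else list(src_cb[j]))
    return src_cb, tgt_cb

def map_contour(src_cb, tgt_cb, contour):
    """Return the target codeword associated with the nearest source codeword."""
    return tgt_cb[nearest(src_cb, contour)]

# Toy pitch contours in Hz, three samples per syllable (made-up values).
src = [[100, 110, 120], [200, 190, 180], [102, 112, 122], [198, 188, 178]]
tgt = [[150, 165, 180], [300, 285, 270], [154, 169, 184], [296, 281, 266]]

src_cb, tgt_cb = train_codebooks(src, tgt, k=2)
print(map_contour(src_cb, tgt_cb, [105, 115, 125]))
```

In this sketch a new source contour is replaced by a stored target contour instead of being scaled, so the output can only take as many shapes as there are codewords. A larger codebook would give finer intonation mapping but needs more paired training syllables.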