Voice transformation involves modifying the source speaker's voice so that it sounds like the target speaker's voice. The voice characteristics of a speaker depend on the shape of the glottal pulse (source characteristics), the shape of the vocal tract system (system characteristics), and the long-term features (prosody, or suprasegmental features) of the speech signal produced by the speaker. In this paper we propose mapping functions to transform the vocal tract characteristics and the intonation characteristics from the source speaker to the target speaker. The mapping functions are developed from features extracted at the syllable level. The shape of the vocal tract system is characterized by linear prediction coefficients, and the corresponding mapping function is realized by a five-layer feedforward neural network. Mapping of the intonation characteristics (pitch contour) is provided by associating the codebooks derived from the pitch contours of the source and target speakers. The proposed mapping functions are used in a voice transformation task. The target speaker's speech is synthesized and evaluated using listening tests. The results of the listening tests indicate that the proposed voice transformation provides better mapping of the voice characteristics than the earlier method proposed by the author. The original and the synthesized speech signals obtained using the mapping functions are available for listening at http://shilloi.iitg.ernet.in/~ksrao/result.html
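The abstract does not say how the source and target pitch-contour codebooks are built or associated, so the following is only an illustrative sketch of one way such an association could work, not the author's method. It assumes that syllable-level pitch contours have already been extracted, time-aligned between the two speakers, and resampled to a fixed number of points. The source codebook is obtained with plain k-means, and each target codeword is the mean of the target contours paired with that cluster. The function names `build_paired_codebooks` and `map_contour`, and the choice of k-means, are hypothetical.

```python
import numpy as np


def build_paired_codebooks(src, tgt, k, iters=20, seed=0):
    """Build associated source/target pitch-contour codebooks.

    src, tgt: arrays of shape (n_syllables, n_points) holding paired,
    fixed-length pitch contours in Hz (an assumption of this sketch).
    Returns (src_codebook, tgt_codebook), each of shape (k, n_points).
    """
    rng = np.random.default_rng(seed)
    # Initialise source codewords from k distinct training contours.
    src_cb = src[rng.choice(len(src), size=k, replace=False)].copy()

    def assign(codebook):
        # Squared Euclidean distance from every contour to every codeword.
        dist = ((src[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        return dist.argmin(axis=1)

    # Plain k-means on the source contours.
    for _ in range(iters):
        labels = assign(src_cb)
        for j in range(k):
            members = src[labels == j]
            if len(members):
                src_cb[j] = members.mean(axis=0)

    # Associate: average the target contours paired with each source cluster.
    labels = assign(src_cb)
    tgt_cb = np.empty_like(src_cb)
    for j in range(k):
        members = tgt[labels == j]
        # Assumption: an empty cluster falls back to the unconverted codeword.
        tgt_cb[j] = members.mean(axis=0) if len(members) else src_cb[j]
    return src_cb, tgt_cb


def map_contour(contour, src_cb, tgt_cb):
    """Replace a source contour by the target codeword of its nearest source codeword."""
    idx = ((src_cb - contour) ** 2).sum(axis=1).argmin()
    return tgt_cb[idx]
```

In use, `build_paired_codebooks` would be called once on the training contours, and `map_contour` would then be applied to each syllable of the source utterance before synthesis. A codebook lookup of this kind returns one of only `k` target contours, so the codebook size controls how finely the target speaker's intonation patterns are represented.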