The purpose of a voice conversion (VC) system is to change the perceived speaker identity of a speech signal. We propose an algorithm that converts the LPC spectrum and predicts the residual as a function of the target envelope parameters. To measure how accurately the converted voices match the desired target voices, we conduct listening tests in which listeners discriminate between same/different speaker pairs. To establish human performance as a baseline, we first measure listeners' ability to discriminate between original speech utterances under three conditions: normal, fundamental-frequency- and duration-normalized, and LPC-coded. Additionally, the spectral parameter conversion function is tested in isolation by having listeners compare source, target, and converted speakers as LPC-coded speech. The results show that the speaker identity of speech whose LPC spectrum has been converted can be recognized as the target speaker at the same level of performance as that achieved when discriminating between speakers in LPC-coded speech. However, discrimination performance for converted utterances produced by the full VC system is significantly below that for natural speech.
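As a rough illustration of the decomposition the abstract relies on, the sketch below splits a frame into an LPC envelope and a residual, then resynthesizes the frame from the two. It uses the textbook autocorrelation method with the Levinson-Durbin recursion and is not the authors' conversion or residual-prediction function, which the abstract does not specify. The function names, the synthetic test frame, and the prediction order of 4 are all assumptions made for the example.

```python
import math
import random


def autocorr(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(max_lag + 1)]


def levinson(r, order):
    """Levinson-Durbin recursion.

    Returns (a, err), where a = [1, a1, ..., ap] defines A(z) = 1 + sum a_j z^-j
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def inverse_filter(x, a):
    """Residual e[n] = sum_j a[j] * x[n - j] (zero initial state)."""
    return [
        sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
        for n in range(len(x))
    ]


def synthesis_filter(e, a):
    """All-pole filter 1/A(z) applied to e (zero initial state)."""
    y = []
    for n in range(len(e)):
        fb = sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
        y.append(e[n] - fb)
    return y


# Hypothetical test frame: two sinusoids plus a little noise (not real speech).
rng = random.Random(0)
frame = [
    math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) + 0.05 * rng.gauss(0.0, 1.0)
    for n in range(200)
]

ORDER = 4  # assumed prediction order, chosen only for this example
lpc, pred_err = levinson(autocorr(frame, ORDER), ORDER)
residual = inverse_filter(frame, lpc)
rebuilt = synthesis_filter(residual, lpc)
```

In a system of the kind the abstract describes, the envelope coefficients would be mapped toward the target speaker and the residual predicted from the converted envelope before the synthesis step; here both are passed through unchanged, so `rebuilt` should match `frame` up to rounding error.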