We propose new supervisory data alignment methods for text-independent voice conversion that do not require parallel training corpora. Phonetic information constrains the alignment, which maps data from the source speaker onto the parameter space of the target speaker. We derive both a linear and a nonlinear method, taking alignment accuracy and topology preservation into account. The linear method treats the phoneme clusters common to the source and target spaces as reference points and adapts each source data vector to the target space while maintaining its position relative to neighboring phoneme clusters. To preserve the topological structure of the source parameter space and to improve both the stability of conversion and the accuracy of the phonetic mapping, the nonlinear method uses a supervised self-organizing learning algorithm with phonetic constraints that iteratively refines the alignment outcome of the previous step. Both methods also apply to the cross-lingual case. Evaluation results show that the proposed methods improve alignment accuracy and stability for text-independent voice conversion in both intra-lingual and cross-lingual settings.
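The linear alignment idea can be illustrated with a minimal sketch. It is not the authors' formulation, which the abstract does not give. It assumes that each phoneme cluster is summarized by its centroid and that a source vector is shifted by an inverse-distance-weighted blend of the source-to-target centroid offsets of its nearest shared phonemes. The function names, the parameter `k`, and the weighting scheme are all hypothetical.

```python
import numpy as np


def phoneme_centroids(frames, labels):
    """Return the mean feature vector of each phoneme label."""
    frames = np.asarray(frames, dtype=float)
    labels = np.asarray(labels)
    return {p: frames[labels == p].mean(axis=0) for p in np.unique(labels)}


def align_linear(x, src_centroids, tgt_centroids, k=2, eps=1e-9):
    """Map one source vector into the target space (illustrative only).

    Assumption: the vector moves by a weighted average of the centroid
    offsets (target minus source) of its k nearest shared phoneme
    clusters, so it keeps roughly the same position relative to them.
    """
    x = np.asarray(x, dtype=float)
    shared = sorted(set(src_centroids) & set(tgt_centroids))
    if not shared:
        raise ValueError("source and target share no phoneme clusters")

    src = np.stack([src_centroids[p] for p in shared])
    tgt = np.stack([tgt_centroids[p] for p in shared])

    # Distance from x to every shared source centroid.
    dist = np.linalg.norm(src - x, axis=1)
    nearest = np.argsort(dist)[:k]

    # Inverse-distance weights, normalized to sum to one.
    weights = 1.0 / (dist[nearest] + eps)
    weights /= weights.sum()

    # Shift x by the blended centroid offset.
    return x + weights @ (tgt[nearest] - src[nearest])
```

Because only phoneme labels link the two speakers, the sketch needs no parallel utterances. When every shared cluster has the same offset, the mapping reduces to a pure translation of the source vector.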