Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts

Authors:
A. Kumar;A. Verma
Affiliations:
Centre for Appl. Res. in Electron., Indian Inst. of Technol., New Delhi, India;Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Madras, India
Venue:
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
Year:
2003

Abstract

Voice conversion techniques attempt to modify a speech signal so that it is perceived as if spoken by a speaker other than the original one. In this paper, we present a novel approach to voice conversion that uses acoustic models based on units of speech, such as phones and diphones. These models can be computed and used independently for a given speaker, without regard to who the source or target speaker is. The approach therefore avoids the need for a parallel speech corpus in the voices of the source and target speakers. We show that the proposed approach makes it possible to create and store voice fonts, which represent the individual characteristics of a particular speaker and can be used to customize synthetic speech. We also show, through objective and subjective tests, that the voice conversion quality is comparable to that of other approaches that require a parallel speech corpus.
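The idea of speaker-independent unit models can be illustrated with a small sketch. The abstract does not specify the form of the acoustic models or the conversion rule, so everything below is an assumption made for illustration: each "voice font" is taken to be a table of per-unit mean feature vectors, and conversion is a simple per-unit mean shift. The function names `build_voice_font` and `convert` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_voice_font(frames, labels):
    """Build a hypothetical 'voice font' for ONE speaker.

    frames: list of feature vectors (lists of floats), one per frame.
    labels: the speech unit (e.g. a phone label) of each frame.
    Returns a dict mapping each unit to its mean feature vector.
    Assumption: a per-unit mean stands in for the paper's acoustic models.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, unit in zip(frames, labels):
        if unit not in sums:
            sums[unit] = [0.0] * len(vec)
        sums[unit] = [s + v for s, v in zip(sums[unit], vec)]
        counts[unit] += 1
    return {u: [s / counts[u] for s in sums[u]] for u in sums}


def convert(frames, labels, source_font, target_font):
    """Shift each frame by (target mean - source mean) for its unit.

    Assumption: a mean shift is used here only as a simple stand-in for
    whatever mapping the paper derives from its unit models.
    Frames whose unit is missing from either font are left unchanged.
    """
    converted = []
    for vec, unit in zip(frames, labels):
        if unit in source_font and unit in target_font:
            shift = [t - s for s, t in zip(source_font[unit], target_font[unit])]
            converted.append([v + d for v, d in zip(vec, shift)])
        else:
            converted.append(list(vec))
    return converted


# The two fonts are built from unrelated (non-parallel) recordings.
source_font = build_voice_font([[1.0, 2.0], [3.0, 4.0]], ["a", "a"])
target_font = build_voice_font([[5.0, 5.0]], ["a"])
result = convert([[1.0, 2.0], [0.0, 0.0]], ["a", "k"], source_font, target_font)
```

In this sketch each font is computed from one speaker's data alone, which mirrors the abstract's point that no parallel corpus is needed: the two fonts only have to share unit labels, not utterances.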