An On-the-Fly Mandarin Singing Voice Synthesis System

Authors:
Cheng-Yuan Lin;Jyh-Shing Roger Jang;Shaw-Hwa Hwang
Venue:
PCM '02 Proceedings of the Third IEEE Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
Year:
2002

Abstract

This paper proposes an on-the-fly Mandarin singing voice synthesis system called SINVOIS (singing voice synthesis). SINVOIS receives continuous speech of a song's lyrics and immediately generates the singing voice from the music score information of the song, which is embedded in a MIDI file. Two sub-systems are designed and embedded into the system: a synthesis unit generator and a pitch-shifting module. The synthesis unit generator applies the Viterbi decoding algorithm to the continuous speech to produce the synthesis units for the singing voice. The pitch-shifting module uses the PSOLA method to shift pitch, and it also modifies the energy, duration, and spectrum of the synthesis units. The synthesized singing voice sounds reasonably good: in a subjective listening test, the synthesized singing voices obtained a MOS (mean opinion score) of 3.1.
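The abstract does not say how the MIDI score is turned into pitch targets, so the following is only a minimal sketch of one plausible step, not the authors' implementation. It assumes standard equal temperament with A4 = 440 Hz (MIDI note 69), a 16 kHz sample rate, and evenly spaced synthesis pitch marks within a note. The function names `midi_to_hz`, `pitch_scale_factor`, and `synthesis_pitch_marks` are hypothetical and introduced here for illustration.

```python
def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency in Hz of a MIDI note number (69 = A4)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def pitch_scale_factor(source_f0_hz, note):
    """Ratio by which the spoken pitch must be scaled to reach the note.

    A factor above 1 raises the pitch, below 1 lowers it.
    """
    if source_f0_hz <= 0:
        raise ValueError("source_f0_hz must be positive (voiced speech only)")
    return midi_to_hz(note) / source_f0_hz


def synthesis_pitch_marks(note, duration_s, sample_rate=16000):
    """Sample positions of evenly spaced pitch marks for one sung note.

    Assumption: the note is held at a constant pitch, so consecutive marks
    are one target pitch period apart. A PSOLA-style synthesizer would
    overlap-add windowed speech segments centered on these positions.
    """
    period = sample_rate / midi_to_hz(note)  # target period in samples
    n_samples = duration_s * sample_rate
    marks = []
    t = 0.0
    while t < n_samples:
        marks.append(round(t))
        t += period
    return marks


if __name__ == "__main__":
    # A syllable spoken at 220 Hz that must be sung at A4 (MIDI note 69).
    print(midi_to_hz(69))
    print(pitch_scale_factor(220.0, 69))
    print(synthesis_pitch_marks(69, 0.01))
```

In this sketch the scale factor summarizes how far the pitch-shifting module has to move the spoken pitch, while the pitch marks give the time grid on which a PSOLA-style overlap-add would place the segments. A real system would also need pitch marks extracted from the input speech, which are not modeled here.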