Rhythm Speech Lyrics Input for MIDI-Based Singing Voice Synthesis

Authors:
Hong-Ru Lee;Chih-Fang Huang;Chih-Hao Hsu;Wen-Nan Wang
Affiliations:
Department of Mechanical Engineering, National Chiao-Tung University;Department of Information Communication, Yuan Ze University;Innovative DigiTech-Enabled Applications & Services Institute, Taipei City, Taiwan 105;Innovative DigiTech-Enabled Applications & Services Institute, Taipei City, Taiwan 105
Venue:
PCM '09 Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
Year:
2009

Abstract

This paper presents techniques and considerations for implementing a Mandarin singing voice synthesis system built around the rhythm speech lyrics input (RSLI) unit. The system takes continuous speech of a song's lyrics as input and synthesizes the intended song using a MIDI-based music database. The system consists of three units. The first is the input unit, which allows the user to specify a musical score and phonetically spelled lyrics. The second is the modified unit, which implements pitch shifting using the PSOLA method. The third is the mixed unit, which deals with some undesirable artificial-sounding, buzzy effects and includes echo and vibrato effects. Energy, duration, and spectrum modifications are also implemented in the mixed unit. The synthesized singing voice sounds reasonably good. In a subjective listening test, mean opinion scores (MOS) of 3.3 and 3.2 were obtained for the synthesized singing voices and for the similarity to the singer's voice, respectively.
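To make the pitch-related steps concrete, the following is a minimal sketch of how a MIDI note could be turned into a target pitch and a pitch-shift factor for a PSOLA-style stage, and how vibrato could be modeled as a slow modulation of that pitch. It is not the authors' implementation. The abstract gives no parameters, so the 440 Hz tuning reference, the equal-tempered scale, the vibrato rate and depth, and all function names are assumptions made for illustration.

```python
import math

A4_MIDI = 69    # MIDI note number of A4
A4_HZ = 440.0   # assumed tuning reference; not stated in the abstract


def midi_to_hz(note: int) -> float:
    """Equal-tempered frequency of a MIDI note number (assumed tuning)."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def pitch_shift_ratio(source_f0_hz: float, note: int) -> float:
    """Factor by which a spoken syllable's F0 would be scaled to reach the note.

    A PSOLA-style stage would respace pitch marks by the inverse of this factor.
    """
    if source_f0_hz <= 0:
        raise ValueError("source F0 must be positive")
    return midi_to_hz(note) / source_f0_hz


def vibrato_f0(base_hz: float, t: float,
               rate_hz: float = 5.5, depth_cents: float = 50.0) -> float:
    """Instantaneous F0 at time t (seconds) with sinusoidal vibrato.

    rate_hz and depth_cents are hypothetical defaults, not values from the paper.
    """
    cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * t)
    return base_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    # A syllable spoken at 220 Hz, to be sung on A4 (MIDI 69)
    print(pitch_shift_ratio(220.0, 69))
    # Target pitch 0.05 s into the note, with vibrato applied
    print(vibrato_f0(midi_to_hz(69), 0.05))
```

Expressing vibrato depth in cents keeps the modulation perceptually uniform across notes, which is one common design choice; a real system would also need pitch-mark detection and overlap-add resynthesis, which this sketch omits.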