Automatic Phonetic Segmentation by Score Predictive Model for the Corpora of Mandarin Singing Voices

Authors:
Cheng-Yuan Lin; Jyh-Shing Jang
Affiliations:
Nat. Tsing Hua Univ., Hsinchu
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007


Abstract

This paper proposes a score predictive model (SPM) that refines the phoneme boundaries obtained by a hidden Markov model (HMM) and dynamic time warping (DTW) for a Mandarin singing voice corpus. The SPM is constructed using support vector regression and predicts the score of a phoneme boundary from the boundary's 58-dimensional feature vector. The correctly identified boundaries of a singing corpus can then be used for corpus-based singing voice synthesis. Several experiments with different settings, including different initial estimates, different acoustic features, and various regression approaches, were designed to verify the feasibility of the proposed approach. Experimental results demonstrate that the proposed SPM effectively refines the results of the HMM and DTW.
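
The refinement idea can be illustrated with a minimal sketch. The abstract does not describe the search procedure, so the following is an assumption: each initial boundary (for example, from HMM or DTW alignment) is moved to the highest-scoring candidate within a small window around it. The names `refine_boundary`, `refine_all`, `toy_score`, and the `window` and `step` values are hypothetical, and the scorer is a toy stand-in for the support vector regression model over 58-dimensional features.

```python
from typing import Callable, List, Sequence

# Hypothetical type: maps a candidate boundary time (seconds) to a score.
# In the paper this role is played by an SVR over a 58-dimensional feature
# vector; here it is any callable, so the sketch stays dependency-free.
ScoreFn = Callable[[float], float]


def refine_boundary(initial: float, score: ScoreFn,
                    window: float = 0.05, step: float = 0.005) -> float:
    """Return the candidate within +/- window of `initial` that scores highest.

    The windowed search is an assumption made for illustration, not a
    procedure stated in the abstract.
    """
    n = int(round(window / step))
    candidates = [initial + k * step for k in range(-n, n + 1)]
    return max(candidates, key=score)


def refine_all(initials: Sequence[float], score: ScoreFn,
               **kwargs: float) -> List[float]:
    """Refine every initial boundary independently."""
    return [refine_boundary(t, score, **kwargs) for t in initials]


# Toy stand-in scorer (assumption): the score peaks at made-up reference
# boundaries. A real scorer would extract features around time t and call
# a trained regressor.
REFERENCE = [0.52, 1.31]


def toy_score(t: float) -> float:
    return -min(abs(t - b) for b in REFERENCE)


# Initial estimates that are 20 ms and 40 ms away from the toy references.
refined = refine_all([0.50, 1.35], toy_score)
```

Keeping the scorer as a plain callable separates the search from the model, so a trained regressor could be swapped in without changing the refinement loop.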