Singing voice detection using perceptually-motivated features

Authors: Tin Lay Nwe; Haizhou Li
Affiliations: Institute for Infocomm Research, Singapore (both authors)
Venue: Proceedings of the 15th International Conference on Multimedia
Year: 2007


Abstract

Perceptual features are motivated by how humans perceive sound. This paper studies several perceptually motivated features, such as harmonic content, vibrato, and timbre, for detecting singing voice segments in a song. The singing formant and the attack-decay envelope of the sound are also studied for acoustic feature formulation. Cepstral coefficients, which reflect timbre characteristics, are formulated by combining information from harmonic content, vibrato, the singing formant, and the attack-decay envelope. Bandpass filters spaced according to the octave frequency scale are used to extract vibrato and harmonic information. Experiments are conducted on a database of 84 popular songs taken from commercially available CD recordings. The experiments show that the proposed feature formulation methods are effective.
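
As a rough illustration of the idea of octave-spaced bandpass filters feeding a cepstral representation, the following Python sketch sums spectral power in bands whose edges double every octave, takes the logarithm, and applies a DCT-II to obtain cepstral coefficients. It is a minimal sketch under stated assumptions and not the authors' formulation: the abstract does not give the filter shapes, band edges, frame length, or number of coefficients, so the rectangular bands, the 110 Hz starting frequency, the six octaves, and the function names `octave_band_edges` and `octave_cepstrum` are all hypothetical. The vibrato, singing formant, and attack-decay components are not modelled here.

```python
import numpy as np


def octave_band_edges(f_low=110.0, n_octaves=6):
    """Band edges in Hz that double every octave (assumed values, not from the paper)."""
    return f_low * 2.0 ** np.arange(n_octaves + 1)


def octave_cepstrum(frame, sample_rate, edges, n_ceps=4):
    """Return (cepstral coefficients, band energies) for one audio frame.

    Assumption: rectangular bands on the power spectrum stand in for the
    bandpass filters; the paper's actual filter design is not given here.
    """
    # Window the frame and compute its one-sided power spectrum.
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Sum the power falling inside each octave band [lo, hi).
    energies = np.array([
        power[(freqs >= lo) & (freqs < hi)].sum()
        for lo, hi in zip(edges[:-1], edges[1:])
    ])

    # Log-compress, then decorrelate with an (unnormalised) DCT-II.
    log_e = np.log(energies + 1e-10)
    n_bands = len(log_e)
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_bands)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n_bands)
    return basis @ log_e, energies


# Usage: a synthetic 600 Hz tone sampled at 16 kHz (illustrative input only).
sample_rate = 16000
t = np.arange(2048) / sample_rate
frame = np.sin(2.0 * np.pi * 600.0 * t)
edges = octave_band_edges()
ceps, energies = octave_cepstrum(frame, sample_rate, edges)
```

In a detection system of the kind the abstract describes, coefficients like these would be computed frame by frame and passed to a classifier that labels each segment as vocal or non-vocal; the classifier itself is outside the scope of this sketch.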