Singing voice detection in popular music

Authors:
Tin Lay Nwe;Arun Shenoy;Ye Wang
Affiliations:
National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore
Venue:
Proceedings of the 12th annual ACM international conference on Multimedia
Year:
2004

Abstract

We propose a novel technique for the automatic classification of vocal and non-vocal regions in an acoustic musical signal. To obtain features from the musical audio signal, our technique first attenuates harmonic content using higher-level musical knowledge of the key, and then applies sub-band energy processing. For vocal and non-vocal classification we employ a Multi-Model Hidden Markov Model (MM-HMM) classifier that uses song structure information to create multiple models, in contrast to conventional HMM training methods, which employ only one model for each class. A statistical hypothesis testing approach, followed by an automatic bootstrapping process, further improves classification accuracy. An experimental evaluation on a database of 20 popular songs supports the validity of the proposed approach, with an average classification accuracy of 86.7%.
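
As a rough illustration of the sub-band energy step mentioned in the abstract, the Python sketch below computes frame-wise log energies in a small number of frequency bands. The function name, the equal-width band layout, the frame and hop sizes, and the number of bands are all assumptions made for illustration; the abstract does not specify the filterbank actually used, and the key-based harmonic attenuation and the MM-HMM classifier are not shown.

```python
import numpy as np

def subband_log_energies(signal, frame_len=1024, hop=512, n_bands=8):
    """Frame-wise log energy per frequency band (illustrative sketch only)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Band edges as FFT-bin indices. Equal-width bands are an assumption,
    # not the layout described by the paper.
    n_bins = frame_len // 2 + 1
    edges = np.linspace(0, n_bins, n_bands + 1).astype(int)

    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for b in range(n_bands):
            band_energy = power[edges[b] : edges[b + 1]].sum()
            # Small offset avoids log(0) on silent bands.
            feats[i, b] = np.log(band_energy + 1e-10)
    return feats
```

Each row of the returned array is one feature vector per frame; in a pipeline like the one the abstract outlines, such vectors would be the input to the vocal/non-vocal classifier.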