The effectiveness of audio content analysis for music retrieval can be enhanced by the use of available metadata. In the present work, observed differences in singing style and instrumentation across genres are used to adapt acoustic features for the singing voice detection task. Timbral descriptors traditionally used to discriminate singing voice from accompanying instruments are complemented by new features representing the temporal dynamics of source pitch and timbre. A method to isolate the dominant source spectrum increases the robustness of the extracted features in the context of polyphonic audio. While the experiments demonstrate the effectiveness of combining static and dynamic features, results on a culturally diverse music database clearly indicate the value of adapting feature sets to genre-specific acoustic characteristics. Thus commonly available metadata, such as genre, can be useful in the front-end of a music information retrieval (MIR) system.
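As a hypothetical illustration of the static-versus-dynamic distinction drawn above (not the paper's actual feature set), the sketch below computes a per-frame spectral centroid, a common static timbral descriptor, and its frame-to-frame difference as a simple dynamic feature, using only NumPy:

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    # Slice the signal into overlapping analysis frames.
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def spectral_centroid(frames, sr):
    # Static timbral descriptor: magnitude-weighted mean frequency per frame.
    win = np.hanning(frames.shape[1])           # Hann window to limit spectral leakage
    mag = np.abs(np.fft.rfft(frames * win, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag * freqs).sum(axis=1) / (mag.sum(axis=1) + 1e-12)

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)               # one second of a steady 440 Hz tone

frames = frame_signal(x)
centroid = spectral_centroid(frames, sr)        # static feature, one value per frame
delta = np.diff(centroid)                       # dynamic feature: frame-to-frame change
```

For this steady synthetic tone the centroid stays near 440 Hz and its delta stays near zero; a sung note with vibrato would instead show an oscillating delta trajectory, which is the kind of temporal cue that dynamic pitch and timbre features are designed to capture.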