The effectiveness of audio content analysis for music retrieval can be enhanced by the use of available metadata. In the present work, observed differences in singing style and instrumentation across genres are used to adapt acoustic features for the singing voice detection task. Timbral descriptors traditionally used to discriminate singing voice from accompanying instruments are complemented by new features representing the temporal dynamics of source pitch and timbre. A method to isolate the dominant source spectrum increases the robustness of the extracted features in the context of polyphonic audio. While the experiments demonstrate the effectiveness of combining static and dynamic features, results on a culturally diverse music database clearly indicate the value of adapting feature sets to genre-specific acoustic characteristics. Thus commonly available metadata, such as genre, can be useful in the front-end of a music information retrieval (MIR) system.
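As a hypothetical illustration of the static-versus-dynamic distinction drawn above (not the paper's actual feature set), the sketch below computes a per-frame spectral centroid, a common static timbral descriptor, and its frame-to-frame difference as a simple dynamic feature, using only NumPy:

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    # Slice the signal into overlapping analysis frames.
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def spectral_centroid(frames, sr):
    # Static timbral descriptor: magnitude-weighted mean frequency per frame.
    win = np.hanning(frames.shape[1])           # Hann window to limit spectral leakage
    mag = np.abs(np.fft.rfft(frames * win, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag * freqs).sum(axis=1) / (mag.sum(axis=1) + 1e-12)

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)               # one second of a steady 440 Hz tone

frames = frame_signal(x)
centroid = spectral_centroid(frames, sr)        # static feature, one value per frame
delta = np.diff(centroid)                       # dynamic feature: frame-to-frame change
```

For this steady synthetic tone the centroid stays near 440 Hz and its delta stays near zero; a sung note with vibrato would instead show an oscillating delta trajectory, which is the kind of temporal cue that dynamic pitch and timbre features are designed to capture.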