Graph Embedding and Extensions: A General Framework for Dimensionality Reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Computational Auditory Scene Analysis: Principles, Algorithms, and Applications.
Singing voice detection in music tracks using direct voice vibrato detection. Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '09).
On the improvement of singing voice separation for monaural recordings using the MIR-1K dataset. IEEE Transactions on Audio, Speech, and Language Processing.
Proceedings of the international conference on Multimedia.
A Singular Value Thresholding Algorithm for Matrix Completion. SIAM Journal on Optimization.
Robust principal component analysis? Journal of the ACM (JACM).
Music Emotion Recognition.
Music retagging using label propagation and robust principal component analysis. Proceedings of the 21st international conference companion on World Wide Web.
Unsupervised Single-Channel Music Source Separation by Average Harmonic Structure Modeling. IEEE Transactions on Audio, Speech, and Language Processing.
In recent years there has been growing interest in transforming signals/matrices into sparse or low-rank representations, i.e., representations that are sparse in support or of low redundancy. Such decompositions have proven particularly powerful for a variety of signal processing and compression problems. In this paper, we investigate the application of this technique to the challenging task of singing voice/accompaniment separation for popular music. The vocal part is modeled as a sparse signal, whereas the instrumental part is considered to be low-rank. In addition, to better account for the particular properties of music, two new algorithms are proposed to improve the decomposition: the incorporation of harmonicity priors and a back-end drum removal procedure. Evaluations on the MIR-1K benchmark dataset show that the proposed algorithms outperform the state of the art by 0.01-2.41 dB.
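The sparse-plus-low-rank decomposition described above is commonly computed by Robust PCA via Principal Component Pursuit: minimize ||L||_* + λ||S||_1 subject to M = L + S, where M would be the magnitude spectrogram, L the (low-rank) accompaniment, and S the (sparse) vocal part. A minimal sketch of one standard solver, the inexact augmented Lagrange multiplier method with singular value thresholding, is shown below; this is an illustrative implementation, not the authors' code, and the parameter defaults (λ = 1/√max(m, n), μ growth factor ρ = 1.5) follow common practice rather than the paper.

```python
import numpy as np


def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into a low-rank part L and a sparse part S (M ≈ L + S)
    via Principal Component Pursuit, solved with the inexact augmented
    Lagrange multiplier (ALM) method. Illustrative sketch only."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # standard PCP weight
    spec_norm = np.linalg.norm(M, ord=2)
    mu = 1.25 / spec_norm                        # initial penalty
    mu_bar = mu * 1e7                            # penalty ceiling
    rho = 1.5                                    # penalty growth factor
    Y = M / max(spec_norm, np.abs(M).max() / lam)  # dual variable init
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding of M - S + Y/mu
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        sig = np.maximum(sig - 1.0 / mu, 0.0)
        L = (U * sig) @ Vt
        # Sparse update: elementwise soft thresholding
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Dual ascent on the constraint residual M - L - S
        Z = M - L - S
        Y = Y + mu * Z
        mu = min(mu * rho, mu_bar)
        if np.linalg.norm(Z, "fro") <= tol * np.linalg.norm(M, "fro"):
            break
    return L, S
```

In the separation setting, `M` is the magnitude of the mixture's short-time Fourier transform; the vocal estimate is then resynthesized from `S` (typically after time-frequency masking) using the mixture phase.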