This research addresses the removal of the singing voice from polyphonic audio recordings under real-time constraints. The approach relies on time-frequency binary masks obtained by combining two sources of information: a classification of spectral bins by azimuth, phase difference and absolute frequency, and harmonic-derived masks. For the harmonic-derived masks, a pitch likelihood estimation technique based on Tikhonov regularization is proposed, and supervised timbre models are used to track the pitch of the target instrument. The approach runs in real time on off-the-shelf computers with a latency below 250 ms. The method was compared with a state-of-the-art offline technique based on Non-negative Matrix Factorization (NMF) and with ideal binary mask separation. The evaluation used a dataset of multi-track versions of professional audio recordings.
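To make the idea of classifying spectral bins concrete, the following is a minimal sketch of a per-frame binary mask driven by azimuth (pan), inter-channel phase difference and absolute frequency. It is not the authors' implementation: the function names, the pan measure, the thresholds `max_pan` and `max_phase`, and the frequency band are all hypothetical choices for illustration, and the harmonic-derived masks and pitch tracking are left out. It assumes the singing voice is panned near the centre of a stereo mix.

```python
import cmath


def is_target_bin(left, right, freq_hz,
                  max_pan=0.1, max_phase=0.2, band=(100.0, 8000.0)):
    """Return True if a stereo spectral bin looks centre-panned and in band.

    left, right: complex STFT coefficients of the two channels.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    mag_l, mag_r = abs(left), abs(right)
    total = mag_l + mag_r
    if total == 0.0:
        return False  # silent bin: nothing to classify
    # Pan index: -1 = hard left, 0 = centre, +1 = hard right.
    pan = (mag_r - mag_l) / total
    # Absolute inter-channel phase difference, in [0, pi].
    phase_diff = abs(cmath.phase(left * right.conjugate()))
    in_band = band[0] <= freq_hz <= band[1]
    return abs(pan) <= max_pan and phase_diff <= max_phase and in_band


def removal_mask(left_frame, right_frame, sample_rate, fft_size, **kwargs):
    """Binary mask for one frame: 0 where the target is detected, 1 elsewhere."""
    mask = []
    for k, (l, r) in enumerate(zip(left_frame, right_frame)):
        freq_hz = k * sample_rate / fft_size  # centre frequency of bin k
        mask.append(0 if is_target_bin(l, r, freq_hz, **kwargs) else 1)
    return mask


# Four bins at 0, 1000, 2000 and 3000 Hz (sample rate 8000 Hz, FFT size 8).
left = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
right = [1 + 0j, 1 + 0j, 0.2 + 0j, 0 + 1j]
print(removal_mask(left, right, 8000, 8))  # [1, 0, 1, 1]
```

Only the second bin is muted: the first lies below the assumed frequency band, the third is panned to the left, and the fourth has a 90-degree phase difference between channels. Multiplying each channel's spectrum by such a mask before the inverse transform would suppress the bins attributed to the centre-panned source.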