Audio source separation is a challenging problem for which many approaches have been proposed. We consider the problem of separating sources from two-channel instantaneous audio mixtures. One approach is to transform the mixtures into the time-frequency domain, where the source representations are approximately disjoint, and then separate the sources by time-frequency masking. We focus on demixing by binary masking and assume that the mixing parameters are known. In this paper, we investigate cosine packet (CP) trees as the foundation for the transform. We determine an appropriate transform by applying a computationally efficient best basis algorithm to a set of candidate local cosine bases organised in a tree structure. We develop a heuristically motivated cost function which maximises the energy of the transform coefficients associated with a particular source. Finally, we objectively evaluate the proposed transform method against fixed-basis transforms such as the short-time Fourier transform (STFT) and the modified discrete cosine transform (MDCT). The results indicate that the proposed method outperforms the MDCT and is competitive with the STFT, and informal listening tests suggest that it exhibits less objectionable noise than the STFT.
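The binary masking step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `binary_mask_demix`, the rule of assigning each coefficient to the source whose mixing direction gives the largest projection, and the toy data are all assumptions made for illustration. The input `X` stands for the coefficients of the two mixture channels under any linear transform applied identically to both channels (for example the STFT, the MDCT, or an adapted cosine packet basis).

```python
import numpy as np

def binary_mask_demix(X, A):
    """Demix two-channel transform coefficients by binary masking.

    X : (2, K) array of mixture coefficients, one row per channel.
    A : (2, N) known mixing matrix, one column per source.
    Returns an (N, K) array of estimated source coefficients.

    Assumption (illustrative, not taken from the paper): each coefficient
    is assigned to the single source whose normalised mixing direction
    yields the largest-magnitude projection.
    """
    # Normalise columns so that projections are comparable across sources.
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    proj = A.T @ X                              # (N, K) projections
    winner = np.argmax(np.abs(proj), axis=0)    # winning source per coefficient
    mask = np.zeros(proj.shape, dtype=bool)
    mask[winner, np.arange(X.shape[1])] = True  # exactly one source per coefficient
    return np.where(mask, proj, 0.0)

# Toy example: three sources with exactly disjoint coefficient supports.
angles = np.array([0.2, 0.8, 1.4])                 # hypothetical mixing angles (radians)
A = np.vstack([np.cos(angles), np.sin(angles)])    # unit-norm columns
K = 12
S = np.zeros((3, K))
for k in range(K):
    S[k % 3, k] = (k + 1) * (-1) ** k              # one active source per coefficient
X = A @ S                                          # instantaneous two-channel mixture
S_hat = binary_mask_demix(X, A)
```

When the source supports are exactly disjoint, as in this toy example, the mask recovers every coefficient; with real audio the representations are only approximately disjoint, so the choice of transform affects how much interference remains after masking.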