Informed source separation through spectrogram coding and data embedding

Authors:
Antoine Liutkus;Jonathan Pinel;Roland Badeau;Laurent Girin;Gaël Richard
Affiliations:
Institut Telecom, Telecom ParisTech, CNRS LTCI, 37/39 rue Dareau, 75014 Paris, France;Grenoble Institute of Technology, 38402 Grenoble Cedex, France;Institut Telecom, Telecom ParisTech, CNRS LTCI, 37/39 rue Dareau, 75014 Paris, France;Grenoble Institute of Technology, 38402 Grenoble Cedex, France;Institut Telecom, Telecom ParisTech, CNRS LTCI, 37/39 rue Dareau, 75014 Paris, France
Venue:
Signal Processing
Year:
2012

Abstract

We address underdetermined source separation in a particular informed configuration where both the sources and the mixtures are known during a so-called encoding stage. This knowledge enables the computation of side information that is small enough to be inaudibly embedded into the mixtures. At the decoding stage, the sources are no longer assumed to be known: only the mixtures and the extracted side information are processed for source separation. The proposed system models the sources as independent and locally stationary Gaussian processes (GP) and the mixing process as linear filtering. This model allows reliable estimation of the sources through generalized Wiener filtering, provided their spectrograms are known. As these spectrograms are too large to be embedded in the mixtures, we show how they can be efficiently approximated using either Nonnegative Tensor Factorization (NTF) or image compression. The system uses a high-capacity method to inaudibly embed the separation side information into the mixtures. This method applies the Quantization Index Modulation (QIM) technique to the time-frequency coefficients of the mixtures and reaches embedding rates of about 250 kbps. Finally, a study of the performance of the full system is presented.
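
To make the two core operations named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the simplest special case: a single-channel mixture that is the plain sum of the sources (no mixing filters), source power spectrograms that are known exactly rather than approximated by NTF or image compression, and binary scalar Quantization Index Modulation applied to real-valued coefficients. The function names (wiener_separate, qim_embed, qim_extract) and the step size delta are hypothetical choices made for illustration.

```python
import numpy as np


def wiener_separate(mix_stft, source_psds, eps=1e-12):
    """Single-channel Wiener filtering (assumed special case, no mixing filters).

    mix_stft:    complex array (F, T), time-frequency transform of the mixture.
    source_psds: nonnegative array (J, F, T), power spectrogram of each source.
    Returns a complex array (J, F, T) with one estimate per source.
    """
    total = source_psds.sum(axis=0) + eps   # mixture power under independence
    gains = source_psds / total             # per-source gain in [0, 1]
    return gains * mix_stft                 # broadcasts over the source axis


def qim_embed(coeffs, bits, delta):
    """Binary scalar QIM: snap each coefficient to the lattice selected by its bit.

    Bit 0 uses the lattice {k * delta}; bit 1 uses {k * delta + delta / 2}.
    """
    offsets = np.asarray(bits) * (delta / 2.0)
    return delta * np.round((coeffs - offsets) / delta) + offsets


def qim_extract(coeffs, delta):
    """Recover bits by picking, for each coefficient, the closer of the two lattices."""
    zeros = np.zeros(coeffs.shape, dtype=int)
    ones = np.ones(coeffs.shape, dtype=int)
    dist0 = np.abs(coeffs - qim_embed(coeffs, zeros, delta))
    dist1 = np.abs(coeffs - qim_embed(coeffs, ones, delta))
    return (dist1 < dist0).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Toy data: 3 sources on an 8 x 5 time-frequency grid (sizes are arbitrary).
    psds = rng.random((3, 8, 5)) + 0.1
    noise = rng.normal(size=psds.shape) + 1j * rng.normal(size=psds.shape)
    sources = np.sqrt(psds / 2.0) * noise
    mixture = sources.sum(axis=0)
    estimates = wiener_separate(mixture, psds)
    print("estimates shape:", estimates.shape)

    # Embed random bits into random real coefficients and read them back.
    coeffs = rng.normal(size=64)
    bits = rng.integers(0, 2, size=64)
    marked = qim_embed(coeffs, bits, delta=0.1)
    print("bits recovered:", np.array_equal(qim_extract(marked, 0.1), bits))
```

In this sketch the Wiener gains sum to one at every time-frequency point, so the source estimates add back up to the mixture. The QIM step delta trades embedding distortion (at most delta / 2 per coefficient) against robustness to perturbations of the marked coefficients. How the step is chosen to keep the embedding inaudible is not specified in the abstract and is left out here.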