Principles and typical computational limitations of sparse speaker separation based on deterministic speech features

Authors:
Albert Kern;Ruedi Stoop
Venue:
Neural Computation
Year:
2011

Abstract

The separation of mixed auditory signals into their sources is a prominent challenge in neuroscience and engineering. We reveal the principles underlying a deterministic, neural network-like solution to this problem. This approach is orthogonal to ICA/PCA, which view the signal constituents as independent realizations of random processes. We demonstrate by example that, in the absence of salient frequency modulations, decomposing speech signals into local cosine packets allows for sparse, noise-robust speaker separation. As the main result, we present analytical limitations inherent in the approach and propose strategies for dealing with them. Our results offer new perspectives on efficient noise cleaning and auditory signal separation, and on how the brain might achieve these tasks.
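
To make the idea of a sparse local cosine representation concrete, the following is a minimal sketch and not the authors' method. It assumes non-overlapping rectangular frames and a fixed orthonormal DCT-IV basis per frame, whereas local cosine packets normally use smooth overlapping windows with adaptively chosen sizes. The function names and the parameters `frame_len` and `keep` are hypothetical, introduced only for illustration.

```python
import numpy as np


def dct4_matrix(n):
    """Orthonormal DCT-IV matrix (symmetric and its own inverse)."""
    idx = np.arange(n) + 0.5
    return np.sqrt(2.0 / n) * np.cos(np.pi / n * np.outer(idx, idx))


def sparse_local_cosine(signal, frame_len=64, keep=4):
    """Keep only the `keep` largest cosine coefficients in each frame.

    Assumption: rectangular, non-overlapping frames (a simplification of
    true local cosine packets). Trailing samples that do not fill a whole
    frame are discarded.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    basis = dct4_matrix(frame_len)

    # Analysis: project every frame onto the cosine atoms.
    coeffs = frames @ basis

    # Sparsify: zero all but the `keep` largest-magnitude coefficients.
    drop = np.argsort(np.abs(coeffs), axis=1)[:, :-keep]
    sparse = coeffs.copy()
    np.put_along_axis(sparse, drop, 0.0, axis=1)

    # Synthesis: the basis is involutory, so the same matrix inverts it.
    recon = (sparse @ basis).reshape(-1)
    return sparse, recon


# Toy input: each frame is a mix of two cosine atoms, so two coefficients
# per frame are enough to represent it.
B = dct4_matrix(64)
toy = np.tile(3.0 * B[5] + 0.5 * B[20], 4)
sparse, recon = sparse_local_cosine(toy, frame_len=64, keep=2)
```

In this toy case the signal is sparse in the chosen basis by construction. Real speech would need a criterion for selecting window sizes and for assigning the retained coefficients to speakers, neither of which this sketch attempts.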