Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays

  • Authors:
  • Mehrez Souden; Keisuke Kinoshita; Marc Delcroix; Tomohiro Nakatani

  • Affiliations:
  • School of Electrical & Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA (Souden); NTT Communication Science Laboratories, Kyoto, Japan (Kinoshita, Delcroix, Nakatani)

  • Venue:
  • IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
  • Year:
  • 2014

Abstract

In distributed microphone arrays (DMAs), source location information can be defined at the intra-node and inter-node levels. While the first type of information results from the diversity of the acoustic channels recorded by microphones embedded in the same node, the second is attributed to the differences between the acoustic channels observed by spatially distributed nodes. Both cues are very useful in DMA processing, and the aim of this paper is to utilize both of them to cluster and separate multiple competing speech signals. To capture the intra-node information, we employ the normalized recording vector; at the inter-node level, we consider several features, including the energy level differences between nodes, with and without the inter-node phase differences. We model the intra-node information using the Watson mixture model (WMM), and propose the Gamma mixture model (GaMM), the Dirichlet mixture model (DMM), and the WMM to model the different inter-node location features. Furthermore, we propose several ways of integrating the intra-node and inter-node feature contributions to cluster the speech recordings using the expectation-maximization (EM) algorithm. Finally, simulation results demonstrate the performance of all the resulting methods.
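To illustrate the kind of clustering the abstract describes, the sketch below fits a complex Watson mixture to unit-norm observation vectors with a plain EM loop. This is not the paper's algorithm: the fixed concentration `kappa`, the farthest-point initialization, and the function name `watson_mm_em` are all illustrative assumptions, and only the intra-node (WMM) part is modeled, without the inter-node features or their integration.

```python
import numpy as np

def watson_mm_em(X, K, kappa=10.0, n_iter=50, seed=0):
    """EM clustering of unit-norm complex vectors with a Watson mixture.

    Watson density up to normalization: p(x | mu_k) ∝ exp(kappa * |mu_k^H x|^2),
    so cluster membership depends only on alignment with the mode mu_k
    (invariant to the phase of x, as desired for normalized recording vectors).
    X: (N, D) complex array with unit-norm rows.
    Returns (posteriors (N, K), modes (K, D), mixture weights (K,)).
    """
    rng = np.random.default_rng(seed)
    N, _ = X.shape
    # Farthest-point initialization of the K modes (illustrative choice):
    # repeatedly pick the observation least aligned with the modes chosen so far.
    idx = [int(rng.integers(N))]
    while len(idx) < K:
        sim = np.abs(X @ X[idx].conj().T) ** 2
        idx.append(int(np.argmin(sim.max(axis=1))))
    mu = X[idx].copy()
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: log-posterior up to an additive constant, normalized stably.
        logp = kappa * np.abs(X @ mu.conj().T) ** 2 + np.log(pi)
        logp -= logp.max(axis=1, keepdims=True)
        gamma = np.exp(logp)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: each mode is the principal eigenvector of the weighted
        # scatter matrix sum_n gamma[n, k] * x_n x_n^H; weights are
        # the posterior means.
        S = np.einsum('nk,nd,ne->kde', gamma, X, X.conj())
        for k in range(K):
            _, V = np.linalg.eigh(S[k])
            mu[k] = V[:, -1]  # eigenvector of the largest eigenvalue
        pi = gamma.mean(axis=0)
    return gamma, mu, pi
```

In the paper's setting the observations would be normalized short-time Fourier transform vectors per time-frequency bin; the sketch accepts any unit-norm complex vectors.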