This paper proposes a two-microphone source localization technique for multiple speech sources that exploits speech-specific properties and novel clustering algorithms. Voiced speech is sparse in the frequency domain and can be represented by sinusoidal tracks via sinusoidal modeling, which provides high local signal-to-noise ratio (SNR). By using the inter-channel phase differences (IPDs) between the two channels on the sinusoidal tracks, localization of multiple mixed speech sources becomes a clustering problem on the IPD-versus-frequency plot. The generalized mixture decomposition algorithm (GMDA) is used to cluster the groups of points corresponding to the individual sources and thereby estimate the direction of arrival (DOA) of each source. Experiments show that the proposed GMDA with the Laplacian noise model estimates the number of sources accurately and yields smaller DOA estimation error than the baseline histogram-based DOA estimation algorithm in various scenarios, including reverberant and additive white noise environments. Experiments also suggest that appropriate power thresholding is a simple and good approximation to sinusoidal modeling for selecting time-frequency points with high local SNR, at a slight loss in performance.
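To make the idea of clustering on the IPD-versus-frequency plot concrete, the following is a minimal sketch. It is not the paper's GMDA. It assumes a far-field model in which a source with inter-microphone delay tau contributes points near the wrapped line IPD(f) = 2*pi*f*tau, and it substitutes a simple hard-assignment clustering with an absolute-error cost (loosely motivated by the Laplacian noise model) over a grid of candidate delays. It also assumes the number of sources is known, whereas the abstract states that the proposed method estimates it. All function names, the speed of sound, and the grid size are illustrative assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def wrap_phase(phi):
    """Wrap phase values into the interval (-pi, pi]."""
    return np.angle(np.exp(1j * np.asarray(phi)))


def cluster_delays(freqs_hz, ipds, n_sources, mic_distance_m,
                   n_grid=361, n_iter=20):
    """Hard-assignment clustering of (frequency, IPD) points into delay lines.

    Simplified stand-in for mixture decomposition: each cluster is a candidate
    delay from a grid, and each point is assigned to the delay whose wrapped
    phase line it is closest to in absolute error.
    """
    max_delay = mic_distance_m / SPEED_OF_SOUND
    grid = np.linspace(-max_delay, max_delay, n_grid)
    # residuals[g, p]: wrapped absolute IPD error of point p under delay grid[g]
    predicted = 2.0 * np.pi * freqs_hz[None, :] * grid[:, None]
    residuals = np.abs(wrap_phase(ipds[None, :] - predicted))
    # Initialise the clusters at evenly spaced interior grid positions.
    idx = np.linspace(0, n_grid - 1, n_sources + 2)[1:-1].astype(int)
    for _ in range(n_iter):
        # Assignment step: nearest delay line for every point.
        labels = np.argmin(residuals[idx], axis=0)
        new_idx = idx.copy()
        for k in range(n_sources):
            members = labels == k
            if members.any():
                # Update step: delay minimising the summed absolute error.
                new_idx[k] = np.argmin(residuals[:, members].sum(axis=1))
        if np.array_equal(new_idx, idx):
            break
        idx = new_idx
    return np.sort(grid[idx])


def delay_to_doa_deg(delay_s, mic_distance_m):
    """Convert delays to broadside angles in degrees (far-field assumption)."""
    s = np.clip(np.asarray(delay_s) * SPEED_OF_SOUND / mic_distance_m, -1.0, 1.0)
    return np.degrees(np.arcsin(s))


if __name__ == "__main__":
    # Synthetic demo: two sources, 5 cm microphone spacing, Laplacian phase noise.
    rng = np.random.default_rng(1)
    spacing = 0.05
    angles = np.array([-20.0, 35.0])
    f = rng.uniform(200.0, 3000.0, size=300)
    who = rng.integers(0, 2, size=300)
    tau = spacing * np.sin(np.radians(angles[who])) / SPEED_OF_SOUND
    phase = wrap_phase(2.0 * np.pi * f * tau + rng.laplace(0.0, 0.05, size=300))
    print(delay_to_doa_deg(cluster_delays(f, phase, 2, spacing), spacing))
```

The grid search keeps the sketch free of any optimizer, at the cost of angular resolution limited by `n_grid`. In practice the input points would come from time-frequency bins selected for high local SNR, for example by sinusoidal modeling or power thresholding as described above.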