An Adaptive Non Reference Anchor Array Framework for Audio Retrieval in Teleconferencing Environment

Authors:
Karan Nathwani;Arpit Shukla;Shubham Khunteta;Rajesh M. Hegde
Affiliations:
Department of Electrical Engineering, Indian Institute of Technology, Kanpur, India 16;Department of Electrical Engineering, Indian Institute of Technology, Kanpur, India 16;Department of Electrical Engineering, Indian Institute of Technology, Kanpur, India 16;Department of Electrical Engineering, Indian Institute of Technology, Kanpur, India 16
Venue:
Journal of Signal Processing Systems
Year:
2014

Abstract

In this paper, an adaptive framework for audio retrieval in live teleconferencing environments with multiple participants is proposed. The framework uses a non reference anchor array (NRA) to capture the interfering speech sources, in addition to the primary array that captures the speech source of interest (SOI). A linearly constrained minimum variance (LCMV) beamformer is used so that the signal arriving from the look direction is preserved while interference arriving from non-look directions is nulled. Additionally, the reverberant component of the speech acquired by this framework is removed by a novel method that uses the linear prediction (LP) residual cepstrum. This method does not require computing the acoustic impulse response (AIR) of the teleconferencing room and is therefore computationally efficient. The NRA framework is thus able both to remove correlated noise coming from the direction of the SOI and to dereverberate the noise-free signal. The performance of the proposed framework is evaluated through experiments on clean speech acquisition from distant microphone arrays. Experiments on distant speech recognition are also conducted using the TIMIT and MONC databases. Experimental results obtained from the proposed framework indicate a reasonable improvement over correlation, subspace, and standard minimum variance beamforming methods. The application of the framework to audio retrieval in a live teleconferencing environment with multiple participants is also discussed.
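
As an illustration of the beamforming step mentioned in the abstract, the following is a minimal sketch of the textbook narrowband linearly constrained minimum variance solution, w = R^{-1} C (C^H R^{-1} C)^{-1} f, which keeps unit gain in the look direction and places a null on one interferer. It is not the paper's implementation: the uniform linear array geometry, the 8 microphones, the 4 cm spacing, the 1 kHz frequency, the 0 and 40 degree directions, the synthetic covariance matrix, and the helper names steering_vector and lcmv_weights are all assumptions chosen for the example.

```python
import numpy as np

def steering_vector(n_mics, spacing_m, angle_deg, freq_hz, c=343.0):
    """Far-field narrowband steering vector of a uniform linear array (assumed geometry)."""
    k = 2.0 * np.pi * freq_hz / c                  # wavenumber
    positions = np.arange(n_mics) * spacing_m      # microphone positions along the array axis
    return np.exp(-1j * k * positions * np.sin(np.deg2rad(angle_deg)))

def lcmv_weights(R, C, f):
    """Textbook LCMV weights: minimise w^H R w subject to C^H w = f."""
    Rinv_C = np.linalg.solve(R, C)                 # R^{-1} C without forming an explicit inverse
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)

# Hypothetical scenario: source of interest at 0 degrees, one interferer at 40 degrees.
n_mics = 8
a_look = steering_vector(n_mics, 0.04, 0.0, 1000.0)
a_intf = steering_vector(n_mics, 0.04, 40.0, 1000.0)

# Synthetic spatial covariance: sensor noise plus the two directional sources.
R = (0.01 * np.eye(n_mics)
     + np.outer(a_look, a_look.conj())
     + 4.0 * np.outer(a_intf, a_intf.conj()))

C = np.column_stack([a_look, a_intf])              # constraint matrix
f = np.array([1.0, 0.0], dtype=complex)            # unit gain on look direction, null on interferer

w = lcmv_weights(R, C, f)
gain_look = np.vdot(w, a_look)                     # w^H a_look
gain_intf = np.vdot(w, a_intf)                     # w^H a_intf
print(abs(gain_look), abs(gain_intf))
```

In a setting like the one the abstract describes, R would be estimated from the array recordings rather than synthesised, and the constraint for each interferer would presumably be informed by the non reference anchor array; how the paper does this is not stated in the abstract.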