Improved likelihood ratio test based voice activity detector applied to speech recognition

Authors:
J. M. Górriz;J. Ramírez;E. W. Lang;C. G. Puntonet;I. Turias
Affiliations:
Dpt. Signal Theory, Networking and Communications, University of Granada, 18071 Granada, Spain;Dpt. Signal Theory, Networking and Communications, University of Granada, 18071 Granada, Spain;Institute of Biophysics, University of Regensburg, 93040 Regensburg, Germany;Dpt. Computer Architecture and Technology, University of Granada, 18071 Granada, Spain;Dpt. Lenguajes y Sistemas Informáticos, University of Cádiz, 11202 Algeciras, Spain
Venue:
Speech Communication
Year:
2010

Abstract

The accuracy of speech processing systems is strongly affected by acoustic noise, which is a serious obstacle to meeting the demands of modern applications. These systems therefore often need a noise reduction algorithm working in combination with a precise voice activity detector (VAD). The computation needed for denoising and speech detection must not exceed the limits imposed by real-time speech processing systems. This paper presents a novel VAD that improves speech detection robustness in noisy environments and the performance of speech recognition systems in real-time applications. The algorithm is based on a Multivariate Complex Gaussian (MCG) observation model and defines an optimal likelihood ratio test (LRT) involving multiple and correlated observations (MCO), based on a jointly Gaussian probability distribution (jGpdf) and a symmetric covariance matrix. The complete derivation of the jGpdf-LRT for the general case of a symmetric covariance matrix is given in terms of the Cholesky decomposition, which allows the VAD decision rule to be computed efficiently. An extensive analysis of the proposed methodology for a low-dimensional observation model demonstrates: (i) the improved robustness of the proposed approach, shown by a clear reduction of the classification error as the number of observations increases, and (ii) the trade-off between the number of observations and the detection performance. The proposed strategy is also compared with other VAD methods, including the G.729, AMR and AFE standards as well as other recently reported algorithms, and shows a sustained advantage in speech/non-speech detection accuracy and speech recognition performance on the AURORA databases.
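To illustrate the general idea of evaluating a Gaussian likelihood ratio through a Cholesky factorization, here is a minimal sketch. It is not the paper's jGpdf-LRT derivation. It assumes real-valued, zero-mean observation vectors (the paper uses a complex Gaussian model), and it assumes the covariance matrices under the noise-only and speech-plus-noise hypotheses are already known. All function names, parameter names, and the threshold are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def gaussian_loglik(x, cov):
    """Log-density of a zero-mean real Gaussian N(0, cov) at x (illustrative assumption)."""
    n = len(x)
    L = cholesky(cov)
    # Forward substitution solves L z = x, so that x^T cov^{-1} x = z^T z.
    z = []
    for i in range(n):
        s = sum(L[i][k] * z[k] for k in range(i))
        z.append((x[i] - s) / L[i][i])
    quad = sum(v * v for v in z)
    # log det(cov) = 2 * sum_i log(L_ii)
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)


def log_likelihood_ratio(x, cov_noise, cov_speech_plus_noise):
    """Log LRT statistic: speech-plus-noise hypothesis versus noise-only hypothesis."""
    return gaussian_loglik(x, cov_speech_plus_noise) - gaussian_loglik(x, cov_noise)


def vad_decision(x, cov_noise, cov_speech_plus_noise, threshold=0.0):
    """Declare speech when the log-likelihood ratio exceeds a hypothetical threshold."""
    return log_likelihood_ratio(x, cov_noise, cov_speech_plus_noise) > threshold
```

With a unit-variance noise covariance and a larger, correlated speech-plus-noise covariance, a high-energy observation such as `[3.0, 3.0]` yields a positive log-ratio, while a low-energy one such as `[0.1, -0.1]` yields a negative one. The Cholesky factor supplies both the quadratic form and the log-determinant without forming an explicit matrix inverse.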