Improved Likelihood Ratio Test Detector Using a Jointly Gaussian Probability Distribution Function

Authors:
O. Pernía;J. M. Górriz;J. Ramírez;C. G. Puntonet;I. Turias
Affiliations:
E.T.S.I.I., Universidad de Granada, C/ Periodista Daniel Saucedo, 18071 Granada, Spain;E.T.S.I.I., Universidad de Granada, C/ Periodista Daniel Saucedo, 18071 Granada, Spain;E.T.S.I.I., Universidad de Granada, C/ Periodista Daniel Saucedo, 18071 Granada, Spain;E.T.S.I.I., Universidad de Granada, C/ Periodista Daniel Saucedo, 18071 Granada, Spain;E.T.S.I.I., Universidad de Granada, C/ Periodista Daniel Saucedo, 18071 Granada, Spain
Venue:
IWINAC '07 Proceedings of the 2nd international work-conference on Nature Inspired Problem-Solving Methods in Knowledge Engineering: Interplay Between Natural and Artificial Computation, Part II
Year:
2007

Abstract

Currently, the accuracy of speech processing systems is strongly affected by acoustic noise. This is a serious obstacle to meeting the demands of modern applications, and therefore these systems often need a noise reduction algorithm working in combination with a precise voice activity detector (VAD). This paper presents a new VAD for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm defines an optimum likelihood ratio test (LRT) involving multiple and correlated observations (MO). The resulting decision rule yields significant improvements in speech/non-speech discrimination accuracy over existing VAD methods, whose performance is optimal when just a single observation is processed. The algorithm has an inherent delay in the MO scenario that, for several applications including robust speech recognition, does not represent a serious implementation obstacle. An analysis of the methodology for pair-wise observation dependence shows the improved robustness of the proposed approach through a clear reduction of the classification error as the number of observations increases. The proposed strategy is also compared to different VAD methods, including the G.729, AMR and AFE standards as well as recently reported algorithms, showing a sustained advantage in speech/non-speech detection accuracy and speech recognition performance.
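
To make the idea of a likelihood ratio test over a pair of correlated observations more concrete, the following is a minimal sketch, not the authors' implementation. It assumes two real-valued, zero-mean observations that are jointly Gaussian under both the noise-only and the speech-plus-noise hypotheses; the paper itself may work with a different signal representation. The function names, the (variance, variance, correlation) parameter tuples, and the zero threshold are all hypothetical choices made for illustration.

```python
import math


def bivariate_log_pdf(x1, x2, var1, var2, rho):
    """Log density of a zero-mean bivariate Gaussian.

    var1, var2 are the variances of the two observations and rho is
    their correlation coefficient (illustrative parameterisation).
    """
    if var1 <= 0 or var2 <= 0 or not (-1.0 < rho < 1.0):
        raise ValueError("variances must be positive and |rho| < 1")
    one_minus_rho2 = 1.0 - rho * rho
    # Quadratic form x^T C^{-1} x written out for the 2x2 case.
    quad = (
        x1 * x1 / var1
        - 2.0 * rho * x1 * x2 / math.sqrt(var1 * var2)
        + x2 * x2 / var2
    ) / one_minus_rho2
    # log of the normalising constant 2*pi*sqrt(det C).
    log_norm = math.log(2.0 * math.pi) + 0.5 * math.log(var1 * var2 * one_minus_rho2)
    return -0.5 * quad - log_norm


def pairwise_log_lrt(x1, x2, noise, speech):
    """Log-likelihood ratio log p(x | speech) - log p(x | noise).

    noise and speech are (var1, var2, rho) tuples; in practice these
    would have to be estimated, which this sketch does not cover.
    """
    return bivariate_log_pdf(x1, x2, *speech) - bivariate_log_pdf(x1, x2, *noise)


def is_speech(x1, x2, noise, speech, threshold=0.0):
    """Decide 'speech' when the log-likelihood ratio exceeds the threshold."""
    return pairwise_log_lrt(x1, x2, noise, speech) > threshold


if __name__ == "__main__":
    # Hypothetical model parameters: uncorrelated unit-variance noise,
    # higher-variance correlated observations when speech is present.
    noise_model = (1.0, 1.0, 0.0)
    speech_model = (10.0, 10.0, 0.5)
    for pair in [(0.1, -0.2), (4.0, 3.5)]:
        print(pair, is_speech(pair[0], pair[1], noise_model, speech_model))
```

Under these assumptions, setting the correlation coefficient to zero in both models reduces the rule to a test that treats the two observations as independent, which is one way to see what the joint model adds. The threshold trades off missed speech against false alarms and would normally be tuned on data.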