Voice activity detection based on multiple statistical models

Authors:
Joon-Hyuk Chang; Nam Kim; S.K. Mitra
Affiliations:
Dept. of Electron. Eng., Inha Univ., Incheon, South Korea; -; -
Venue:
IEEE Transactions on Signal Processing - Part I
Year:
2006

Citing 0
Cited 11

Modifying voice activity detection in low SNR by correction factors
CISST'09 Proceedings of the 3rd WSEAS international conference on Circuits, systems, signal and telecommunications

Efficient voice activity detection in reverberant enclosures using far field microphones
DSP'09 Proceedings of the 16th international conference on Digital Signal Processing

Low-delay noise estimation based on spectrum ripples and minimum statistics in adverse environments
DSP'09 Proceedings of the 16th international conference on Digital Signal Processing

Low complexity DFT-domain noise PSD tracking using high-resolution periodograms
EURASIP Journal on Advances in Signal Processing

Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude
IEEE Transactions on Audio, Speech, and Language Processing

Voice activity detection under the highly fluctuant recording conditions of call centres
ECS'10/ECCTD'10/ECCOM'10/ECCS'10 Proceedings of the European conference of systems, and European conference of circuits technology and devices, and European conference of communications, and European conference on Computer science

An improved noise-robust voice activity detector based on hidden semi-Markov models
Pattern Recognition Letters

Optimization and evaluation of sigmoid function with a priori SNR estimate for real-time speech enhancement
Speech Communication

Noisy speech enhancement based on improved minimum statistics incorporating acoustic environment-awareness
Digital Signal Processing

A study of voice activity detection techniques for NIST speaker recognition evaluations
Computer Speech and Language

Speech enhancement using generalized weighted β-order spectral amplitude estimator
Speech Communication

Abstract

One of the key issues in practical speech processing is achieving voice activity detection (VAD) that is robust to background noise. Most statistical model-based approaches assume a Gaussian distribution in the discrete Fourier transform (DFT) domain, an assumption that deviates from real observations. In this paper, we propose a class of VAD algorithms based on several statistical models. In addition to the Gaussian model, we incorporate the complex Laplacian and Gamma probability density functions into our analysis of statistical properties. Using a goodness-of-fit test, we analyze the statistical properties of the DFT spectra of noisy speech under various noise conditions. Based on this analysis, a likelihood ratio test is established under each of the given statistical models for the purpose of VAD. Because noise type and level affect the statistical characteristics of the speech signal differently, our approach copes with time-varying environments by adaptively selecting an appropriate statistical model in an online fashion. The performance of the proposed VAD approaches is evaluated in both stationary and nonstationary noise environments with the aid of an objective measure.
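To make the likelihood ratio test concrete, the following is a minimal sketch, not the authors' implementation. It assumes the textbook per-bin formulation in which noise and noisy speech are zero-mean with variances `noise_var` and `noise_var * (1 + xi)`, where `xi` is an a priori SNR supplied by the caller. The Laplacian variant further assumes independent, identically distributed real and imaginary parts. All function names, the geometric-mean decision rule, and the threshold are hypothetical choices for illustration; the Gamma model and the online model selection described in the abstract are not shown.

```python
import math


def gaussian_log_lr(x, noise_var, xi):
    """Per-bin log likelihood ratio under a complex Gaussian model (assumed form).

    x: complex DFT coefficient, noise_var: noise variance, xi: a priori SNR.
    """
    gamma = abs(x) ** 2 / noise_var  # a posteriori SNR
    return gamma * xi / (1.0 + xi) - math.log1p(xi)


def laplacian_log_lr(x, noise_var, xi):
    """Per-bin log likelihood ratio under a complex Laplacian model (assumed form).

    Real and imaginary parts are treated as i.i.d. Laplacian, each carrying
    half of the total variance.
    """
    sigma_n = math.sqrt(noise_var)
    l1 = abs(x.real) + abs(x.imag)
    return 2.0 * l1 / sigma_n * (1.0 - 1.0 / math.sqrt(1.0 + xi)) - math.log1p(xi)


def vad_decision(frame_bins, noise_vars, xis, log_lr=gaussian_log_lr, threshold=0.0):
    """Average the per-bin log likelihood ratios and compare with a threshold.

    Returns (is_speech, mean_log_lr). The threshold value is an illustrative
    assumption and would normally be tuned on data.
    """
    scores = [log_lr(x, nv, xi) for x, nv, xi in zip(frame_bins, noise_vars, xis)]
    mean_score = sum(scores) / len(scores)
    return mean_score > threshold, mean_score


if __name__ == "__main__":
    bins = [3 + 0j, 0 + 3j]
    print(vad_decision(bins, [1.0, 1.0], [4.0, 4.0]))
    print(vad_decision(bins, [1.0, 1.0], [4.0, 4.0], log_lr=laplacian_log_lr))
```

Passing the model as the `log_lr` argument keeps the decision rule unchanged while the underlying density is swapped, which is one simple way to compare models on the same frame.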