Energy-based VAD with grey magnitude spectral subtraction

Authors:
Cheng-Hsiung Hsieh;Ting-Yu Feng;Po-Chin Huang
Affiliations:
Department of Computer Science and Information Engineering, Chaoyang University of Technology, Wufong 413, Taiwan, ROC;Department of Computer Science and Information Engineering, Chaoyang University of Technology, Wufong 413, Taiwan, ROC;Department of Computer Science and Information Engineering, Chaoyang University of Technology, Wufong 413, Taiwan, ROC
Venue:
Speech Communication
Year:
2009

Abstract

In this paper, we propose a novel voice activity detection (VAD) scheme for low-SNR conditions with additive white noise. The proposed approach consists of two parts. First, grey magnitude spectral subtraction (GMSS) is applied to remove additive noise from a given noisy speech signal, yielding an estimate of the clean speech. Second, the speech enhanced by the GMSS is segmented and passed to an energy-based VAD (EVAD), which classifies each segment as speech or non-speech. The combined approach is called GMSS/EVAD. Simulation results indicate that the proposed GMSS/EVAD outperforms the VADs in G.729 and GSM AMR for the given low-SNR examples. To investigate the performance of GMSS/EVAD under real-life background noise, the babble and volvo noises from the NOISEX-92 database are also considered. The simulation results for the given examples indicate that GMSS/EVAD performs adequately for babble noise at SNRs above 10 dB and for volvo noise at SNRs of 15 dB and above.
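The two-stage structure described above (noise removal, then a per-segment energy decision) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the grey-model noise estimator, so the sketch substitutes plain magnitude spectral subtraction with the noise magnitude averaged over the first few frames, which are assumed to be speech-free. The function names, the frame length of 256 samples, the non-overlapping framing, and the threshold set at a fraction of the peak frame energy are all illustrative assumptions.

```python
import numpy as np


def spectral_subtract(noisy, frame_len=256, noise_frames=5):
    """Plain magnitude spectral subtraction on non-overlapping frames.

    Assumption: the first `noise_frames` frames contain noise only, and their
    average magnitude spectrum stands in for the paper's grey-model estimate.
    """
    n_frames = len(noisy) // frame_len
    frames = np.reshape(noisy[: n_frames * frame_len], (n_frames, frame_len))
    spectra = np.fft.rfft(frames, axis=1)
    magnitude, phase = np.abs(spectra), np.angle(spectra)
    noise_mag = magnitude[:noise_frames].mean(axis=0)
    # Subtract the noise magnitude and clip negative results to zero.
    clean_mag = np.maximum(magnitude - noise_mag, 0.0)
    # Resynthesize each frame using the phase of the noisy signal.
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)


def energy_vad(frames, threshold_ratio=0.1):
    """Mark a frame as speech when its energy exceeds a fraction of the peak."""
    energy = np.sum(frames ** 2, axis=1)
    return energy > threshold_ratio * energy.max()


# Synthetic demo: white noise with a 440 Hz tone standing in for speech.
rng = np.random.default_rng(0)
fs, frame_len, n_frames = 8000, 256, 40
signal = rng.normal(0.0, 0.1, n_frames * frame_len)
t = np.arange(10 * frame_len) / fs
signal[15 * frame_len : 25 * frame_len] += np.sin(2 * np.pi * 440 * t)

decisions = energy_vad(spectral_subtract(signal, frame_len))
print(decisions.astype(int))
```

In the demo, the tone occupies frames 15 through 24, so those are the frames the detector should flag. A peak-relative threshold is convenient for a toy signal but would need replacing with a noise-floor-based rule for recordings that may contain no speech at all.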