Voice activity detection algorithm using nonlinear spectral weights, hangover and hangbefore criteria

Authors:
Damjan Vlaj;Zdravko Kačič;Marko Kos
Affiliations:
University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova ulica 17, SI-2000 Maribor, Slovenia;University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova ulica 17, SI-2000 Maribor, Slovenia;University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova ulica 17, SI-2000 Maribor, Slovenia
Venue:
Computers and Electrical Engineering
Year:
2012

Abstract

This paper introduces a nonlinear function applied to the frequency spectrum, which improves the detection of vowels, diphthongs, and semivowels in the speech signal. The lower detection efficiency for consonants is addressed by applying hangover and hangbefore criteria, and the paper presents a procedure for determining the optimal constants for these criteria more quickly. The nonlinearly modified frequency spectrum is used in the proposed GMM (Gaussian Mixture Model) based VAD (Voice Activity Detection) algorithm. The proposed VAD algorithm was compared with seven other VAD algorithms on the Aurora 2 database. The experiments evaluated frame-level detection errors and speech recognition performance for two acoustic training modes (multi-condition and clean-only). The proposed VAD algorithm obtained the lowest average percentage of frame errors and also improved speech recognition performance in both training modes.
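
The abstract does not give the exact form of the hangover and hangbefore criteria, so the following is only a generic sketch of the idea, not the authors' procedure. It assumes per-frame binary decisions (1 = speech, 0 = non-speech) and extends every detected speech frame forward by a fixed number of frames (hangover) and backward by a fixed number of frames (hangbefore), which is one common way to recover weak consonants at word boundaries. The function names, the parameter values, and the frame error measure (share of frames whose label differs from a reference labelling) are illustrative assumptions.

```python
from typing import List


def apply_hang_criteria(decisions: List[int], hangover: int, hangbefore: int) -> List[int]:
    """Extend each speech frame forward (hangover) and backward (hangbefore).

    Hypothetical helper: the constants are plain frame counts chosen by the caller.
    """
    n = len(decisions)
    smoothed = list(decisions)
    for i, is_speech in enumerate(decisions):
        if is_speech == 1:
            start = max(0, i - hangbefore)    # frames before the detection
            end = min(n, i + hangover + 1)    # frames after the detection
            for j in range(start, end):
                smoothed[j] = 1
    return smoothed


def frame_error_percentage(reference: List[int], hypothesis: List[int]) -> float:
    """Percentage of frames whose label differs from the reference labelling."""
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    if not reference:
        return 0.0
    errors = sum(r != h for r, h in zip(reference, hypothesis))
    return 100.0 * errors / len(reference)


# Toy data (invented for illustration): the raw detector finds only the
# vowel-like core of a word and misses the frames around it.
reference = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
raw = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
smoothed = apply_hang_criteria(raw, hangover=2, hangbefore=1)

print(frame_error_percentage(reference, raw))
print(frame_error_percentage(reference, smoothed))
```

Because the backward extension relabels frames that precede a detection, a sketch like this needs the later decisions to be available, so it is written for offline processing of a whole utterance; a streaming variant would have to buffer at least `hangbefore` frames.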