Discontinuous transmission based on speech/pause detection is a valid way to improve the spectral efficiency of new-generation wireless communication systems. In this context, robust voice activity detection (VAD) algorithms are required, because traditional solutions have a high misclassification rate in the presence of the background noise typical of mobile environments. This paper presents a voice detection algorithm that is robust to noisy environments thanks to a new methodology adopted for the matching process. More specifically, the proposed VAD is based on a pattern recognition approach in which the matching phase is performed by a set of six fuzzy rules, trained by means of a new hybrid learning tool. A series of objective tests was performed on a large speech database, varying the signal-to-noise ratio (SNR), the type of background noise, and the input signal level. Compared with the VAD standardized by ITU-T in Recommendation G.729 Annex B, the fuzzy VAD reduces, on average, the activity factor by about 25% and the clipping introduced by about 43%. Informal listening tests also confirm an improvement in perceived speech quality.
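The two objective figures quoted above can be illustrated with a minimal sketch. It assumes that the activity factor is the fraction of frames a VAD flags as speech, and that clipping is the fraction of true speech frames flagged as pause; the abstract does not give the exact definitions used in the tests, so both are assumptions. The function names and the toy frame labels are hypothetical and are not taken from the paper's database.

```python
def activity_factor(decisions):
    """Fraction of frames the VAD marks as speech (1 = speech, 0 = pause)."""
    return sum(decisions) / len(decisions)


def clipping_rate(decisions, reference):
    """Fraction of true speech frames that the VAD marks as pause.

    Assumed definition: the abstract does not state how clipping was measured.
    """
    on_speech = [d for d, r in zip(decisions, reference) if r == 1]
    if not on_speech:
        return 0.0
    return on_speech.count(0) / len(on_speech)


def relative_reduction(baseline, candidate):
    """Relative improvement of candidate over baseline (0.25 means 25% lower)."""
    return (baseline - candidate) / baseline


# Toy per-frame labels, invented for illustration only.
reference = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]  # hand-labelled ground truth
baseline = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]   # stand-in for a reference VAD
candidate = [0, 1, 1, 1, 1, 1, 0, 0, 1, 0]  # stand-in for a more selective VAD

af_gain = relative_reduction(activity_factor(baseline),
                             activity_factor(candidate))
clip_gain = relative_reduction(clipping_rate(baseline, reference),
                               clipping_rate(candidate, reference))
print(f"activity factor reduction: {af_gain:.0%}")
print(f"clipping reduction: {clip_gain:.0%}")
```

A lower activity factor means fewer frames are transmitted, while lower clipping means fewer speech frames are lost, so a VAD that improves both figures at once saves bandwidth without cutting into speech.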