This article describes Annex B to ITU-T Recommendation G.729. Annex B defines a low-bit-rate silence compression scheme designed and optimized to work with both the full version of G.729 and its low-complexity Annex A. Good-quality, low-bit-rate silence compression requires a robust frame-based voice activity detector to identify inactive voice frames, also called silence or background noise frames. For each detected inactive frame, a discontinuous transmission module measures how the characteristics of the inactive voice signal change over time and decides whether a new silence information descriptor frame should be sent to maintain the reproduction quality of the background noise at the receiving end. If such a frame is needed, the spectrum and energy parameters describing the perceptual characteristics of the background noise are coded and transmitted using 15 bits per frame. At the receiving end, the comfort noise generation module regenerates the background noise from the transmitted updates or from previously available parameters. The synthesized background noise is obtained by linear predictive filtering of a locally generated pseudo-white excitation signal of controlled level. Coding the background noise this way yields average bit rates for coded speech as low as 4 kb/s during normal speech conversation while maintaining reproduction quality.
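
The comfort noise step described above (linear predictive filtering of a locally generated, level-controlled pseudo-white excitation) can be illustrated with a minimal Python sketch. It is not the Annex B reference algorithm: the function name `synthesize_comfort_noise`, the coefficient convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, the Gaussian excitation, the mean-square scaling rule, and the example coefficients are all assumptions chosen for illustration.

```python
import math
import random


def synthesize_comfort_noise(lpc, target_energy, n_samples, seed=0):
    """Illustrative comfort noise: scaled white noise through 1/A(z).

    lpc           -- a_1..a_p of A(z) = 1 + sum_k a_k z^-k (assumed convention)
    target_energy -- desired mean-square level of the excitation (assumed)
    n_samples     -- number of output samples to generate
    seed          -- seed for the local pseudo-random generator
    """
    rng = random.Random(seed)

    # Locally generated pseudo-white excitation (Gaussian is an assumption).
    excitation = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    # Control the excitation level: rescale to the requested mean-square energy.
    energy = sum(x * x for x in excitation) / n_samples
    gain = math.sqrt(target_energy / energy)
    excitation = [gain * x for x in excitation]

    # All-pole synthesis filter 1/A(z): y[n] = e[n] - sum_k a_k * y[n - k].
    output = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * output[n - k]
        output.append(acc)
    return output


if __name__ == "__main__":
    # Hypothetical first-order spectral envelope; 80 samples is one 10 ms
    # frame at an assumed 8 kHz sampling rate.
    frame = synthesize_comfort_noise([-0.9], target_energy=0.01, n_samples=80)
    print(len(frame), "samples of synthetic background noise")
```

In this sketch the spectral envelope and the level are the only inputs, which mirrors the idea that a receiver can regenerate plausible background noise from a small set of spectrum and energy parameters rather than from a coded waveform.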