This paper proposes an alternative to variable bit rate (VBR) speech coding for voice over Internet protocol (VoIP) during network congestion on the Internet. The proposed scheme, called "adaptive bit rate switching", ensures that the available bandwidth is used as efficiently as possible. When congestion is signaled, a time scale modification algorithm called WSOLA (waveform similarity overlap and add) is applied with a time-dependent compression rate, determined by the severity of the network congestion, in order to adaptively reduce the bit rate required to transmit speech. This approach differs from VBR speech coding and is novel in that the coder can operate at any desired bit rate for any desired duration. This is particularly useful in network environments because the load may differ in each direction. WSOLA was selected as the time scale modification algorithm because it is computationally efficient and produces high quality output. In addition, the proposed scheme integrates WSOLA, or any other time scale modification algorithm, into any commercial or military constant bit rate (CBR) or VBR codec without modifying the vocoder structure. The proposed method is evaluated statistically using diagnostic rhyme tests (DRT) and mean opinion score (MOS) tests. The DRT results obtained from simulation of the proposed system show, at a 90% confidence level, that the perceptual success of the adaptively compressed and G.729 coded speech is 98.92±0.03 percent. The MOS test results show that the system provides better perceptual quality than standard time scale modification, indicating that the proposed system provides graceful degradation in voice quality even in channels modeled by additive increase multiplicative decrease, provided that the dynamic network conditions grant bandwidth.
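The relation between the compression rate and the transmitted bit rate can be illustrated with a small sketch. It is not the paper's implementation. It assumes that a time-scale ratio alpha (output duration divided by input duration) scales the bit rate of a CBR codec linearly, that the codec is G.729 at its nominal 8 kbit/s, and that compression is floored at a hypothetical minimum ratio of 0.5. The function names, the floor, and the additive increase multiplicative decrease (AIMD) step sizes are illustrative choices that do not come from the source.

```python
# Illustrative sketch only: names, the 0.5 floor, and the AIMD step sizes
# are assumptions, not values taken from the paper.

G729_KBPS = 8.0  # nominal bit rate of the G.729 codec


def compression_ratio(available_kbps, codec_kbps=G729_KBPS, min_ratio=0.5):
    """Pick a time-scale ratio alpha = output_duration / input_duration.

    alpha = 1.0 means no compression. Smaller values shorten the speech
    before it is coded, so fewer coded frames are sent per second of
    original speech.
    """
    if available_kbps <= 0:
        raise ValueError("available_kbps must be positive")
    alpha = available_kbps / codec_kbps
    # Clamp: never stretch the speech, never compress below the assumed floor.
    return max(min_ratio, min(1.0, alpha))


def effective_kbps(alpha, codec_kbps=G729_KBPS):
    """Bit rate per second of original speech after compression by alpha."""
    return alpha * codec_kbps


def aimd_trace(start_kbps, steps, loss_steps,
               increase=0.5, decrease=0.5, cap=G729_KBPS):
    """Toy AIMD bandwidth model: halve on loss, otherwise add a fixed step."""
    bandwidth = start_kbps
    trace = []
    for t in range(steps):
        if t in loss_steps:
            bandwidth *= decrease                      # multiplicative decrease
        else:
            bandwidth = min(cap, bandwidth + increase)  # additive increase
        trace.append(bandwidth)
    return trace


if __name__ == "__main__":
    for bw in aimd_trace(8.0, 4, loss_steps={1}):
        alpha = compression_ratio(bw)
        print(f"available={bw:.1f} kbit/s  alpha={alpha:.4f}  "
              f"sent={effective_kbps(alpha):.1f} kbit/s")
```

In this model the codec itself is untouched. Only the amount of speech handed to it changes, which mirrors the claim that the scheme requires no modification of the vocoder structure.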