A low complexity time-scaling expansion algorithm of speech signals suitable for real time implementation

Authors:
Gonzalo Duchen-Sanchez;Jose Juan Garcia-Hernandez;Mariko Nakano-Miyatake;Hector Perez-Meana
Affiliations:
SEPI ESIME Culhuacan, The National Polytechnic Institute of Mexico, Av. Santa Ana 1000, Col. San Francisco Culhuacan, 04430 Mexico, D.F., Mexico;SEPI ESIME Culhuacan, The National Polytechnic Institute of Mexico, Av. Santa Ana 1000, Col. San Francisco Culhuacan, 04430 Mexico, D.F., Mexico;SEPI ESIME Culhuacan, The National Polytechnic Institute of Mexico, Av. Santa Ana 1000, Col. San Francisco Culhuacan, 04430 Mexico, D.F., Mexico;SEPI ESIME Culhuacan, The National Polytechnic Institute of Mexico, Av. Santa Ana 1000, Col. San Francisco Culhuacan, 04430 Mexico, D.F., Mexico
Venue:
Digital Signal Processing
Year:
2009

Abstract

This paper presents the development and implementation of a variable-rate time-scale expansion system for speech signals. The system is based on pitch information: only the voiced segments are expanded, while the unvoiced and silence segments are left unchanged. The proposed system was first evaluated by computer simulation and then implemented on a digital signal processor (DSP). Time-domain, frequency-domain, mean opinion score (MOS), and diagnostic rhyme test (DRT) evaluations were carried out to assess the actual performance of the developed algorithm. The results show that the proposed system can improve the learning level of foreign-language students as well as the comprehension of elderly people. Objective tests were also carried out to verify the similarity between the original and the expanded signals. Iterative refinement of the C source code made a real-time implementation possible. The current implementation requires 11 kwords of program memory and about 9 million floating-point operations per second (MFLOPS).
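
The idea of expanding only the voiced segments using pitch information can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how frames are classified or how the pitch is estimated, so the energy and zero-crossing-rate test, the autocorrelation pitch estimate, the plain repetition of one pitch period, and every name and threshold below (`is_voiced`, `estimate_pitch_period`, `expand_speech`, 320-sample frames, lags of 20 to 160 samples, which assume 8 kHz sampling) are illustrative assumptions.

```python
import math

def frame_energy(frame):
    # Mean squared amplitude of the frame.
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    # Assumed rule: voiced speech has high energy and few zero crossings.
    # Both thresholds are hypothetical and would need tuning on real data.
    return frame_energy(frame) > energy_thresh and zero_crossing_rate(frame) < zcr_thresh

def estimate_pitch_period(frame, min_lag, max_lag):
    # Assumed estimator: lag (in samples) that maximizes the autocorrelation.
    best_lag, best_corr = min_lag, float("-inf")
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

def expand_speech(samples, frame_len=320, min_lag=20, max_lag=160):
    # Copy every frame to the output; after each full voiced frame,
    # repeat its last pitch period so that only voiced speech is lengthened.
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        out.extend(frame)
        if len(frame) == frame_len and is_voiced(frame):
            period = estimate_pitch_period(frame, min_lag, max_lag)
            out.extend(frame[-period:])
    return out

# Usage: one silent frame followed by one synthetic "voiced" frame
# (a sinusoid with a 40-sample period standing in for voiced speech).
silence = [0.0] * 320
voiced = [0.5 * math.sin(2 * math.pi * n / 40) for n in range(320)]
expanded = expand_speech(silence + voiced)
print(len(silence + voiced), "->", len(expanded))
```

In this sketch the silent frame passes through unchanged and only the voiced frame grows, by one pitch period. A practical system would typically smooth the junction between the original and the repeated period (for example with overlap-add) and vary the number of repeated periods to reach the desired expansion rate.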