Speech analysis and coding using a multi-resolution sinusoidal transform

Authors:
D. V. Anderson
Affiliations:
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Venue:
ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 2
Year:
1996

Citing 0
Cited 2

Analysis/Synthesis of speech signals based on AbS/OLA sinusoidal modeling using elliptic filter
IDEAL'05 Proceedings of the 6th international conference on Intelligent Data Engineering and Automated Learning

Sinusoidal modeling using wavelet packet transform applied to the analysis and synthesis of speech signals
TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue

Abstract

The sinusoidal transform (ST), as developed by Quatieri and McAulay (1986), provides a sparse representation for speech signals by taking advantage of psychoacoustic masking. The work reported here takes the sinusoidal transform one step further by considering the frequency resolution of the human auditory system in more detail. The new transform is based on the wavelet principle of variable resolution in time/frequency analysis. Specifically, a sinusoidal transform is developed that uses quadrature mirror filter (QMF) banks to obtain better time resolution at high frequencies and better frequency resolution at low frequencies. This naturally provides a perceptually improved allocation of the sinusoids. The new transform matches the human auditory system better than its predecessor, and it also matches speech signals well, for both fricative sounds and voiced speech. The QMF-based ST is then shown to be equivalent to a more efficient FFT-based implementation.
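
The variable-resolution idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the QMF filters, the tree depth, the frame length, or the peak-picking rule, so everything below is an assumption chosen for brevity. The sketch uses the 2-tap Haar pair as the simplest QMF pair (a practical design would likely use longer filters with better stopband rejection), an octave-band tree that keeps splitting the low band, a fixed frame length per band, and a naive DFT with simple local-maximum peak picking. The function names (haar_qmf_split, octave_bands, frame_peaks, band_resolution) are hypothetical.

```python
import cmath
import math


def haar_qmf_split(x):
    """Split x into low and high bands, each decimated by 2 (Haar QMF pair)."""
    s = 1.0 / math.sqrt(2.0)
    n = len(x) - len(x) % 2  # drop a trailing odd sample
    low = [s * (x[i] + x[i + 1]) for i in range(0, n, 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, n, 2)]
    return low, high


def octave_bands(x, levels):
    """Octave-band tree: repeatedly split the low band.

    Returns a list of (samples, decimation_factor), highest band first
    and the final low band last.
    """
    bands = []
    low = list(x)
    factor = 1
    for _ in range(levels):
        low, high = haar_qmf_split(low)
        factor *= 2
        bands.append((high, factor))
    bands.append((low, factor))
    return bands


def frame_peaks(frame, max_peaks):
    """Hann-window one frame, take a naive DFT, return the largest spectral peaks.

    Each peak is (bin_index, magnitude); peaks are local maxima of the
    magnitude spectrum, sorted by decreasing magnitude.
    """
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]
    mags = []
    for b in range(n // 2 + 1):
        acc = sum(
            window[k] * frame[k] * cmath.exp(-2j * math.pi * b * k / n)
            for k in range(n)
        )
        mags.append(abs(acc))
    peaks = [
        (b, mags[b])
        for b in range(1, n // 2)
        if mags[b] > mags[b - 1] and mags[b] >= mags[b + 1]
    ]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks[:max_peaks]


def band_resolution(fs, factor, frame_len):
    """Return (frame duration in seconds, DFT bin spacing in Hz) for one band."""
    return frame_len * factor / fs, fs / (factor * frame_len)
```

With the same frame length in every band, a band decimated by a larger factor covers a longer stretch of time per frame and has proportionally narrower DFT bins. In this sketch that is what gives finer frequency resolution at low frequencies and finer time resolution at high frequencies. For example, calling band_resolution for each factor returned by octave_bands shows the frame duration doubling and the bin spacing halving with each additional split.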