Analysis/Synthesis of speech signals based on AbS/OLA sinusoidal modeling using elliptic filter
IDEAL'05 Proceedings of the 6th international conference on Intelligent Data Engineering and Automated Learning
TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue
Hi-index | 0.00 |
The sinusoidal transform, as developed by Quatieri and McAulay (1986), provides a sparse representation for speech signals by taking advantage of psychoacoustic masking. The currently reported work takes the sinusoidal transform one step further by considering the frequency resolution abilities of the human auditory system in more detail. The new transform is based on the wavelet principle of variable resolution in time/frequency analysis. Specifically, a sinusoidal transform is developed which uses quadrature mirror filter (QMF) banks to obtain better time resolution at high frequencies and better frequency resolution at low frequencies. This naturally provides a perceptually improved allocation of the sinusoids. The new transform matches the human auditory system better than its predecessor and it also matches speech signals well, both fricative sounds and voiced speech. The QMF based ST is then shown to be equivalent to a more efficient FFT based implementation.