On methods for perfect reconstruction WI Speech coding with preprocessing

Authors:
Mikko Tammi;Ari Heikkinen;Jukka Saarinen
Affiliations:
Digital and Computer Systems Laboratory, Tampere University of Technology, P.O. Box 553, FIN-33101 Tampere, Finland;Speech and Audio Systems Laboratory, Nokia Research Center, P.O. Box 100, FIN-33721 Tampere, Finland;Digital and Computer Systems Laboratory, Tampere University of Technology, P.O. Box 553, FIN-33101 Tampere, Finland
Venue:
Speech Communication
Year:
2002

Abstract

The waveform interpolation (WI) speech coding algorithm has been shown to be an efficient method for describing the evolution of the periodic voiced components of a speech signal. However, conventional WI coding does not have the perfect reconstruction property, i.e. the decoded signal does not converge to the original signal as the quantization error decreases. Errors in the coding model therefore cannot be corrected by finer quantization. In this paper we discuss the characteristics of the WI coding model and modifications to the model that enable perfect reconstruction. The new requirements and features are examined in detail. While the perfect reconstruction property brings many benefits, it also places new demands on the operation of the coder. Particularly strict requirements are set on the accuracy of the pitch estimate; inaccuracies rapidly reduce the ability to quantize the parameters efficiently. To overcome this, we introduce a preprocessing method that slightly modifies the pitch structure of the residual signal before waveform extraction. The modifications to the signal are minor, so the quality of the preprocessed signal is very close to that of the input speech. In the proposed method, the perfect reconstruction property is maintained with respect to the preprocessed signal.
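The perfect reconstruction property described in the abstract can be illustrated with a toy example. The sketch below is not the WI coder from the paper: it is a minimal illustration, assuming a uniform scalar quantizer and a hypothetical non-invertible "model" (`lossy_model`, which replaces each sample pair by its mean) that stands in for a coding model with built-in error. All function names and the test signal are assumptions made for illustration only.

```python
import math


def quantize(values, step):
    """Uniform scalar quantizer: round each value to the nearest multiple of step."""
    return [step * round(v / step) for v in values]


def lossy_model(signal):
    """Hypothetical non-invertible model: replace each sample pair by its mean.

    This stands in for any coding model that discards information before
    quantization. It is NOT the WI model from the paper.
    """
    out = []
    for i in range(0, len(signal) - 1, 2):
        mean = 0.5 * (signal[i] + signal[i + 1])
        out.extend([mean, mean])
    return out


def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Toy test signal (an assumption for illustration): two sinusoids, 64 samples.
signal = [
    math.sin(2 * math.pi * 5 * n / 64) + 0.3 * math.sin(2 * math.pi * 13 * n / 64)
    for n in range(64)
]

steps = [0.1, 0.01, 0.001]  # progressively finer quantization

# Perfect reconstruction case: the signal itself is quantized, so the error
# is bounded by half the step size and shrinks as the step shrinks.
pr_errors = [rms_error(signal, quantize(signal, s)) for s in steps]

# Non-perfect-reconstruction case: the model error remains even when the
# quantizer becomes very fine.
model_errors = [rms_error(signal, quantize(lossy_model(signal), s)) for s in steps]

for s, e_pr, e_model in zip(steps, pr_errors, model_errors):
    print(f"step={s:<6} PR error={e_pr:.6f}  model error={e_model:.6f}")
```

In this sketch the first error column keeps decreasing with the step size, while the second levels off at the error introduced by the model itself, which is the behaviour the abstract attributes to conventional WI coding.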