Non-linear speech representation based on local predictability exponents

Authors:
V. Khanagha;K. Daoudi;O. Pont;H. Yahia;A. Turiel
Venue:
Neurocomputing
Year:
2014

Abstract

Seeking new perspectives for analyzing the non-linear dynamics of speech, this paper presents a novel approach based on a microcanonical multiscale formulation, which allows a geometric and statistical description of the multiscale properties of complex dynamics. Speech is a complex system whose dynamics can, to some extent, be accessed geometrically and statistically by computing Local Predictability Exponents (LPEs). The LPEs make it possible to determine the most informative subset of the signal (the Most Singular Manifold, or MSM), which in turn yields a compact representation from which the signal can be reconstructed. However, the complex intertwining of different dynamics in speech (beyond purely turbulent descriptions) suggests defining appropriate multiscale functionals that might influence the evaluation of the LPEs and hence lead to a more compact MSM. Using the classical and generic Sauer/Allebach algorithm for signal reconstruction from irregularly spaced samples, we show that good-quality speech reconstruction can be achieved from an MSM of low cardinality. Moreover, to further show the potential of the new methodology, we develop a simple and efficient waveform coder that achieves almost the same level of perceptual quality as a standard coder at a lower bit-rate.
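
To make the idea of reconstruction from irregularly spaced samples concrete, the following is a minimal sketch of an iterative interpolate-and-band-limit scheme, written in the spirit of such algorithms. It is not the paper's implementation and is not claimed to match the Sauer/Allebach algorithm in detail. The function names (`band_limit`, `reconstruct`), the toy three-tone signal, the hand-picked sample positions (standing in for an MSM, which would require the LPE computation not described here), and the assumption that the signal is periodic and strictly band-limited are all illustrative assumptions.

```python
import numpy as np

def band_limit(x, keep):
    """Zero every real-DFT bin at or above index `keep` (low-pass projection)."""
    spectrum = np.fft.rfft(x)
    spectrum[keep:] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def reconstruct(idx, values, n, keep, iters=100):
    """Iteratively estimate a length-n band-limited signal from samples at `idx`.

    Assumption: the signal is periodic and occupies only the first `keep` bins.
    """
    grid = np.arange(n)
    x = np.zeros(n)
    for _ in range(iters):
        # Error of the current estimate at the known sample positions.
        residual = values - x[idx]
        # Spread that error over the full grid by periodic linear interpolation.
        spread = np.interp(grid, idx, residual, period=n)
        # Keep only the in-band part of the correction and apply it.
        x = x + band_limit(spread, keep)
    return x

# Toy demonstration (hypothetical data, not speech).
n, keep = 256, 8
t = np.arange(n) / n
x_true = (np.cos(2 * np.pi * t)
          + 0.5 * np.sin(2 * np.pi * 3 * t)
          + 0.25 * np.cos(2 * np.pi * 6 * t))

# Irregular sample positions with gaps of 5, 5, 2, ...: 64 of 256 points kept.
k = np.arange(64)
idx = 4 * k + (k % 3)

x_rec = reconstruct(idx, x_true[idx], n, keep)
print("max abs reconstruction error:", np.max(np.abs(x_rec - x_true)))
```

In this sketch, a quarter of the samples is enough because the gaps between retained samples are small relative to the shortest period in the band. Real speech is neither periodic nor strictly band-limited, so the sketch illustrates the mechanism only, not the reconstruction quality reported in the abstract.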