Analysis and classification of speech signals by generalized fractal dimension features

Authors:
Vassilis Pitsikalis;Petros Maragos
Affiliations:
School of Electrical and Computer Engineering, National Technical University of Athens, Iroon Polytexneiou Str., Athens 15773, Greece;School of Electrical and Computer Engineering, National Technical University of Athens, Iroon Polytexneiou Str., Athens 15773, Greece
Venue:
Speech Communication
Year:
2009

Abstract

We explore nonlinear signal processing methods inspired by dynamical systems and fractal theory in order to analyze and characterize speech sounds. A speech signal is first embedded in a multidimensional phase space, and the embedded signal is then used to estimate measurements related to its fractal dimensions. Our goals are to compute these raw measurements for practical speech signals, to use them to extract simple descriptive features, and to assess how well the proposed features characterize speech sounds. We observe that distinct feature vector elements take values or show statistical trends that, on average, depend on general characteristics of broad phoneme classes, such as voicing and the manner and place of articulation. Moreover, the way the statistical parameters of the features change as phonetic characteristics vary seems to follow some roughly formed patterns. We also discuss qualitative aspects of the linear phoneme-wise correlation between the fractal features and the commonly employed mel-frequency cepstral coefficients (MFCCs), demonstrating phonetic cases of maximal and minimal correlation. In the same context, we investigate the spectral content of the fractal features in terms of the components most and least correlated with the MFCCs. Further, the proposed methods are examined in light of indicative phoneme classification experiments, which quantify how well the features characterize broad classes of speech sounds. For some classification scenarios, the results are comparable to those obtained with MFCC features.
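
The abstract does not spell out how the embedding or the dimension measurements are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a standard time-delay embedding and uses the Grassberger-Procaccia correlation sum as a stand-in for one fractal dimension measurement. The function names, the max-norm distance, and the parameters `dim`, `lag`, and `radii` are all illustrative assumptions.

```python
import math


def delay_embed(signal, dim, lag):
    """Map a 1-D signal to delay vectors (x[n], x[n+lag], ..., x[n+(dim-1)*lag])."""
    count = len(signal) - (dim - 1) * lag
    if count <= 0:
        raise ValueError("signal too short for this dim and lag")
    return [tuple(signal[n + k * lag] for k in range(dim)) for n in range(count)]


def correlation_sum(points, radius):
    """Fraction of distinct point pairs whose max-norm distance is below radius."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            dist = max(abs(a - b) for a, b in zip(points[i], points[j]))
            if dist < radius:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r over the given radii."""
    xs, ys = [], []
    for r in radii:
        c = correlation_sum(points, r)
        if c > 0:  # log is undefined for an empty neighbourhood count
            xs.append(math.log(r))
            ys.append(math.log(c))
    if len(xs) < 2:
        raise ValueError("need at least two radii with a nonzero correlation sum")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy input (an assumption, not speech data): a sinusoid embedded in 2-D
# traces a closed curve, so the estimated dimension should be close to 1.
toy_signal = [math.sin(0.125 * n) for n in range(600)]
toy_points = delay_embed(toy_signal, dim=2, lag=13)
toy_dimension = correlation_dimension(toy_points, [0.05, 0.1, 0.2, 0.4])
```

For real speech one would apply such an estimate frame by frame and choose `dim`, `lag`, and the radius range with care. The pairwise loop is quadratic in the number of points, so it is suitable only for short analysis frames.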