Pitch- and formant-based order adaptation of the fractional Fourier transform and its application to speech recognition

Authors:
Hui Yin;Climent Nadeu;Volker Hohmann
Affiliations:
TALP Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain and Department of Electronic Engineering, Beijing Institute of Technology, Beijing, China;TALP Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain;TALP Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain and Medizinische Physik, Universität Oldenburg, Oldenburg, Germany
Venue:
EURASIP Journal on Audio, Speech, and Music Processing
Year:
2009

Abstract

The fractional Fourier transform (FrFT) has been proposed to improve time-frequency resolution in signal analysis and processing. However, how to select the FrFT transform order for the proper analysis of multicomponent signals such as speech is still an open question. In this work, we investigate several order adaptation methods. First, FFT- and FrFT-based spectrograms of an artificially generated vowel are compared to demonstrate the methods. Second, an acoustic feature set combining MFCC and FrFT is proposed, with the FrFT transform orders set adaptively by various methods based on pitch and formants. A tonal vowel discrimination test is designed to compare the performance of these methods using the feature set. The results show that FrFT-MFCC yields better discriminability of tones and also of vowels, especially with multitransform-order methods. Third, speech recognition experiments are conducted on the clean intervocalic English consonants provided by the Consonant Challenge. The experimental results show that the proposed features, with different order adaptation methods, obtain slightly higher recognition rates than the reference MFCC-based recognizer.
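
Why a single transform order may not suit a multicomponent signal can be illustrated with a small numerical sketch. It is not the authors' method and does not compute an FrFT. As a stand-in, it projects a frame onto a linear chirp atom, which, like an FrFT at a matched order, concentrates a linear chirp only when the chirp rate is matched. The sketch assumes an artificial voiced frame whose pitch changes linearly, so that the k-th harmonic sweeps at k times the pitch rate. The function names, sampling rate, frame length, pitch, and pitch rate are all hypothetical choices made for illustration.

```python
import numpy as np

def harmonic_chirp_frame(fs, n, f0, pitch_rate, n_harm):
    """Complex harmonic frame: pitch is f0 (Hz) at the frame centre and
    changes linearly at pitch_rate (Hz/s). Illustrative signal only."""
    t = (np.arange(n) - n / 2) / fs  # time axis centred on the frame
    pitch_phase = 2 * np.pi * (f0 * t + 0.5 * pitch_rate * t**2)
    # The k-th harmonic has k times the pitch phase, hence k times the chirp rate.
    x = sum(np.exp(1j * k * pitch_phase) for k in range(1, n_harm + 1))
    return t, x

def chirp_projection(x, t, freq, chirp_rate):
    """Magnitude of the Hann-windowed projection of x onto a linear chirp
    atom with centre frequency freq (Hz) and chirp rate chirp_rate (Hz/s).
    This is a stand-in for an order-matched FrFT, not an FrFT itself."""
    atom_phase = 2 * np.pi * (freq * t + 0.5 * chirp_rate * t**2)
    return np.abs(np.sum(np.hanning(len(x)) * x * np.exp(-1j * atom_phase)))

# Hypothetical analysis settings (assumptions, not taken from the paper).
fs, n = 8000, 256
f0, pitch_rate = 150.0, 2000.0
t, x = harmonic_chirp_frame(fs, n, f0, pitch_rate, n_harm=5)

k = 5  # look at the fifth harmonic
no_adaptation = chirp_projection(x, t, k * f0, 0.0)             # plain Fourier analysis
single_rate = chirp_projection(x, t, k * f0, pitch_rate)        # one rate, matched to the pitch
per_harmonic = chirp_projection(x, t, k * f0, k * pitch_rate)   # rate matched to harmonic k

print("no adaptation:", no_adaptation)
print("single rate  :", single_rate)
print("per harmonic :", per_harmonic)
```

In this toy setting the projection is largest when the atom's rate matches the harmonic's own sweep rate, which is one intuition for why methods using several transform orders could help on speech-like signals.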