A joint time-frequency and matrix decomposition feature extraction methodology for pathological voice classification

Authors:
Behnaz Ghoraani;Sridhar Krishnan
Affiliations:
Signal Analysis Research Lab, Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON, Canada;Signal Analysis Research Lab, Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON, Canada
Venue:
EURASIP Journal on Advances in Signal Processing - Special issue on analysis and signal processing of oesophageal and pathological voices
Year:
2009

Abstract

The number of people affected by speech problems is growing as the modern world places increasing demands on the human voice through mobile telephones, voice recognition software, and interpersonal verbal communication. In this paper, we propose a novel methodology for automatic pattern classification of pathological voices. The main contribution is the extraction of meaningful and unique features using an adaptive time-frequency distribution (TFD) and nonnegative matrix factorization (NMF). We construct the adaptive TFD as an effective signal analysis domain to dynamically track the nonstationarity in the speech, and we use NMF as a matrix decomposition (MD) technique to quantify the constructed TFD. The proposed method extracts these features from the joint TFD of the speech and automatically identifies and measures the abnormality of the signal. Depending on its abnormality measure, each signal is classified as normal or pathological. The proposed method was applied to the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database, which consists of 161 pathological and 51 normal speakers, and achieved an overall classification accuracy of 98.6%.
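
The idea of quantifying a time-frequency representation with NMF can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several assumptions: a plain STFT spectrogram stands in for the adaptive TFD, the factorization uses the standard Lee-Seung multiplicative updates, the input is a synthetic two-tone signal rather than MEEI recordings, and the names `spectrogram` and `nmf` are hypothetical helpers. The abstract does not specify how features or the abnormality measure are derived from the factors, so that step is left out.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram via a Hann-windowed STFT.

    Assumption: this is only a stand-in for the adaptive TFD named in the abstract.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Transpose so rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V into W @ H with multiplicative updates.

    Assumption: Lee-Seung updates for the Frobenius-norm objective; the paper
    may use a different NMF variant.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps   # spectral basis vectors
    H = rng.random((rank, V.shape[1])) + eps   # time activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonstationary signal: 440 Hz for the first half, 1200 Hz afterwards.
fs = 8000
t = np.arange(fs) / fs
signal = np.where(t < 0.5, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 1200 * t))

V = spectrogram(signal)
W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("TFD shape:", V.shape, "relative reconstruction error:", rel_error)
```

In this sketch the columns of `W` act as spectral bases and the rows of `H` describe when each basis is active, which is the kind of compact description of a TFD from which classification features could then be computed.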