An Acoustic Framework for Detecting Fatigue in Speech Based Human-Computer-Interaction

Authors:
Jarek Krajewski;Rainer Wieland;Anton Batliner
Affiliations:
University of Wuppertal, Work and Organizational Psychology, Wuppertal, Germany 42097;University of Wuppertal, Work and Organizational Psychology, Wuppertal, Germany 42097;University of Erlangen-Nuremberg, Lehrstuhl fuer Mustererkennung, Erlangen, Germany 91058
Venue:
ICCHP '08 Proceedings of the 11th international conference on Computers Helping People with Special Needs
Year:
2008

Abstract

This article describes a general framework for detecting accident-prone fatigue states from speech characteristics related to prosody, articulation, and speech quality. The advantages of this real-time measurement approach are that obtaining speech data is unobtrusive and requires no sensor application or calibration. The core of the feature computation is the combination of frame-level speech features with high-level contour descriptors, which yields over 8,500 features per speech sample. In general, the measurement process follows the speech-adapted steps of pattern recognition: (a) recording speech, (b) preprocessing (segmenting the speech units of interest), (c) feature computation (using perceptual and signal-processing features, e.g. fundamental frequency, intensity, pause patterns, formants, cepstral coefficients), (d) dimensionality reduction (filter- and wrapper-based feature subset selection, (un-)supervised feature transformation), (e) classification (e.g. SVM, k-NN classifier), and (f) evaluation (e.g. 10-fold cross-validation). The validity of this approach is briefly discussed by summarizing the empirical results of a sleep deprivation study.
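
The steps (c), (e), and (f) named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two frame-level features (log energy, zero-crossing rate), the six contour descriptors, the frame and hop sizes, and all function names are assumptions chosen for brevity, and the input is a set of synthetic noisy tones standing in for labeled speech segments. Recording, segmentation, and dimensionality reduction (steps a, b, d) are omitted. The sketch only shows how descriptors computed over frame-level contours multiply into a per-sample feature vector, which is then fed to a k-NN classifier evaluated with 10-fold cross-validation.

```python
import math
import random

def frame_signal(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def frame_features(frame):
    """Two simple frame-level features (assumed): log energy and zero-crossing rate."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return {"log_energy": math.log(energy + 1e-12),
            "zcr": crossings / (len(frame) - 1)}

def contour_descriptors(contour):
    """Summarize one frame-level contour with a few high-level descriptors."""
    n = len(contour)
    mean = sum(contour) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in contour) / n)
    # Least-squares slope of the contour against the frame index.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = (sum((t - t_mean) * (x - mean) for t, x in enumerate(contour)) / denom
             if denom else 0.0)
    return {"mean": mean, "std": std, "min": min(contour), "max": max(contour),
            "range": max(contour) - min(contour), "slope": slope}

def extract_features(signal, frame_len=400, hop=160):
    """Step (c): frame-level features, then contour descriptors per feature."""
    per_frame = [frame_features(f) for f in frame_signal(signal, frame_len, hop)]
    vector = {}
    for name in per_frame[0]:
        contour = [f[name] for f in per_frame]
        for desc, value in contour_descriptors(contour).items():
            vector[f"{name}_{desc}"] = value
    return vector

def knn_predict(train_x, train_y, x, k=3):
    """Step (e): majority vote among the k nearest training vectors."""
    nearest = sorted(zip((math.dist(tx, x) for tx in train_x), train_y))[:k]
    votes = [label for _, label in nearest]
    return max(sorted(set(votes)), key=votes.count)

def cross_validate(xs, ys, folds=10, k=3, seed=0):
    """Step (f): accuracy under k-fold cross-validation on shuffled indices."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(folds):
        test = set(order[f::folds])
        train = [i for i in order if i not in test]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        correct += sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test)
    return correct / len(xs)

def toy_signal(freq, amp, rng, rate=8000, n=4000):
    """Synthetic noisy tone; a hypothetical stand-in for a speech segment."""
    return [amp * math.sin(2 * math.pi * freq * t / rate) + rng.gauss(0, 0.01)
            for t in range(n)]

rng = random.Random(0)
labels = [i % 2 for i in range(40)]  # two arbitrary classes, not real fatigue labels
signals = [toy_signal(100, 1.0, rng) if y == 0 else toy_signal(300, 0.3, rng)
           for y in labels]
feature_dicts = [extract_features(s) for s in signals]
names = sorted(feature_dicts[0])
vectors = [[d[name] for name in names] for d in feature_dicts]
accuracy = cross_validate(vectors, labels)
print(len(names), "features per sample; 10-fold accuracy:", accuracy)
```

Here two frame-level contours times six descriptors give twelve features per sample; a feature count in the thousands, as reported in the abstract, would follow from the same multiplication with many more contours and descriptors. With that many features, the dimensionality reduction step (d) becomes necessary before a distance-based classifier such as k-NN is applied, and in practice the features would also be normalized first.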