Aiming at robust spoken dialogue interaction in the motorcycle environment, we investigate various configurations of a speech front-end consisting of speech pre-processing, speech enhancement, and speech recognition components. These components are implemented as agents in the Olympus/RavenClaw framework, which forms the core of a multimodal dialogue interface for a wearable solution supporting motorcycle police officers on the move. In the present effort, aiming to optimize speech recognition performance, different experimental setups are considered for the speech front-end. The practical value of various speech enhancement techniques is assessed and, after analysis of their performance, a collaborative scheme is proposed. In this scheme, independent speech enhancement channels operate in parallel on a common input, and their outputs are fed to a multi-threaded speech recognition component. The outcome of the speech recognition process is post-processed by an appropriate fusion technique, which contributes to a more accurate interpretation of the input. Investigating various fusion algorithms, we identified AdaBoost.M1 as the best-performing one. Utilizing the collaborative fusion scheme based on AdaBoost.M1, a significant improvement of overall speech recognition performance was achieved: gains of 8.0% in word recognition rate and 5.48% in correctly recognized words, when compared to the performance of the best speech enhancement channel alone. The advance offered in the present work reaches beyond the specifics of this application and can benefit spoken interfaces operating in non-stationary noise environments.
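The fusion step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the example hypotheses, and the channel weights are hypothetical. It only shows the weighted-vote decision rule that AdaBoost.M1-style fusion reduces to at recognition time, where each enhancement channel's recognizer acts as a weak classifier whose weight (in AdaBoost.M1, ln((1 - err) / err)) is learned offline.

```python
from collections import defaultdict

def fuse_hypotheses(channel_hypotheses, channel_weights):
    """Weighted-vote fusion of per-channel word hypotheses.

    channel_hypotheses: one recognized word per enhancement channel.
    channel_weights: non-negative per-channel weights, e.g. the
    AdaBoost.M1 classifier weights learned on held-out data
    (hypothetical values are used in the example below).
    """
    scores = defaultdict(float)
    for word, weight in zip(channel_hypotheses, channel_weights):
        scores[word] += weight
    # The fused output is the word with the largest accumulated weight.
    return max(scores, key=scores.get)

# Three hypothetical channels disagree; the two lighter channels
# together (0.9 + 0.8) outweigh the single heaviest one (1.2).
print(fuse_hypotheses(["stop", "stop", "shop"], [0.9, 0.8, 1.2]))  # -> stop
```

In the paper's setting the "words" would be the parallel recognizer outputs for the same utterance; the sketch omits confidence scores and lattice-level combination, which a full system would likely use.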