We investigate several keyword spotting techniques that rely exclusively on acoustic modeling and do not presume the existence of an in-domain language model. Since adequate context modeling is nevertheless necessary for word spotting, we show how the principle of Long Short-Term Memory (LSTM) can be incorporated into the decoding process. We propose a novel technique that combines LSTM with Connectionist Temporal Classification (CTC) to improve performance by using a self-learned amount of contextual information. All considered approaches are evaluated on read speech from the TIMIT corpus and on the SEMAINE database, which consists of spontaneous and emotionally colored speech. As further evidence for the effectiveness of LSTM modeling for keyword spotting, we report results on the CHiME task.
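
As a rough illustration of the kind of output a network trained with Connectionist Temporal Classification produces, the sketch below performs greedy (best-path) decoding of per-frame label posteriors and then checks whether a keyword's label sequence occurs in the result. It is not the technique proposed in the abstract, whose decoding details are not given here; it is a minimal sketch under stated assumptions. The blank index, the toy label inventory, the posterior values, and the function names `ctc_best_path` and `contains_keyword` are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (hypothetical convention)


def ctc_best_path(frame_posteriors, blank=BLANK):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Most probable label index for every frame.
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_posteriors]
    # Collapse runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best)]
    # Remove the blank symbol, which only separates label emissions.
    return [label for label in collapsed if label != blank]


def contains_keyword(decoded, keyword):
    """Return True if `keyword` occurs as a contiguous run in `decoded`."""
    n = len(keyword)
    return any(decoded[i:i + n] == keyword
               for i in range(len(decoded) - n + 1))


# Toy posteriors over an assumed inventory {0: blank, 1: "k", 2: "ae", 3: "t"}.
posteriors = [
    [0.10, 0.80, 0.05, 0.05],  # "k"
    [0.10, 0.70, 0.10, 0.10],  # "k" again (merged with the previous frame)
    [0.90, 0.03, 0.03, 0.04],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "ae"
    [0.20, 0.10, 0.10, 0.60],  # "t"
]

decoded = ctc_best_path(posteriors)
keyword_found = contains_keyword(decoded, [1, 2, 3])
```

Exact matching on a single best path is the simplest possible detector. A practical keyword spotter would more likely score keyword hypotheses against the full posterior sequence so that a confidence threshold can trade detections against false alarms.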