This paper proposes a new approach to keyword spotting based on large-margin and kernel methods rather than on HMMs. Unlike previous approaches, the proposed method employs a discriminative learning procedure whose training phase aims at a high area under the ROC curve (AUC), the most common measure for evaluating keyword spotters. The keyword spotter maps the input acoustic representation of the speech utterance, together with the target keyword, into a vector space. Building on large-margin and kernel techniques for predicting whole sequences, the keyword spotter reduces to a classifier in this vector space that separates utterances in which the keyword is uttered from utterances in which it is not. We describe a simple iterative algorithm for training the keyword spotter and discuss its formal properties, showing theoretically that it attains a high area under the ROC curve. Experiments on read speech from the TIMIT corpus show that the resulting discriminative system outperforms the conventional context-independent HMM-based system. Further experiments with the TIMIT-trained model, tested on both read speech (HTIMIT, WSJ) and spontaneous speech (OGI Stories), show that the discriminative system outperforms the conventional context-independent HMM-based system without further training or adaptation to the new corpora.
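
To make the training objective concrete, the following is a minimal sketch, not the paper's algorithm. It relies on the standard fact that the area under the ROC curve (AUC) equals the fraction of (positive, negative) utterance pairs that the scorer ranks correctly (the Wilcoxon-Mann-Whitney statistic). The function names `auc` and `train_pairwise`, the fixed-length feature vectors, and the perceptron-style update are all assumptions made for illustration; in particular, the feature vectors are hypothetical stand-ins for the joint utterance-keyword mapping described above.

```python
from itertools import product


def auc(pos_scores, neg_scores):
    """Estimate the AUC as the fraction of correctly ranked pairs.

    pos_scores: scores of utterances that contain the keyword.
    neg_scores: scores of utterances that do not contain it.
    """
    pairs = list(product(pos_scores, neg_scores))
    if not pairs:
        raise ValueError("need at least one positive and one negative score")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)


def dot(w, x):
    """Inner product of two equal-length lists of floats."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pos_feats, neg_feats, epochs=10, margin=1.0, lr=1.0):
    """Illustrative pairwise ranking update (an assumption, not the
    paper's update rule): whenever a positive utterance does not
    outscore a negative one by `margin`, move w toward their difference.
    """
    w = [0.0] * len(pos_feats[0])
    for _ in range(epochs):
        for xp, xn in product(pos_feats, neg_feats):
            if dot(w, xp) - dot(w, xn) < margin:
                w = [wi + lr * (p - n) for wi, p, n in zip(w, xp, xn)]
    return w


# Hypothetical 2-dimensional features for four utterances.
pos_feats = [[1.0, 0.2], [0.9, 0.1]]   # keyword present
neg_feats = [[0.1, 0.8], [0.2, 0.9]]   # keyword absent

w = train_pairwise(pos_feats, neg_feats)
pos_scores = [dot(w, x) for x in pos_feats]
neg_scores = [dot(w, x) for x in neg_feats]
print(auc(pos_scores, neg_scores))
```

Each violated pair lowers the AUC estimate by one pair's worth, so reducing the number of margin violations over pairs is a natural surrogate for raising the AUC. This is the design idea the sketch is meant to convey.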