Artificial intelligence: a modern approach
Artificial intelligence: a modern approach
Statistical methods for speech recognition
Statistical methods for speech recognition
Connectionist Speech Recognition: A Hybrid Approach
Connectionist Speech Recognition: A Hybrid Approach
An application of recurrent neural networks to discriminative keyword spotting
ICANN'07 Proceedings of the 17th international conference on Artificial neural networks
Combining compound recognition and PCFG-LA parsing with word lattices and conditional random fields
ACM Transactions on Speech and Language Processing (TSLP) - Special issue on multiword expressions: From theory to practice and use, part 2
Hi-index | 0.00 |
This paper addresses the problem of detecting keywords in unconstrained speech. The proposed algorithms search for the speech segment maximizing the average observation probability along the most likely path in the hypothesized keyword model. As known, this approach (sometimes referred to as sliding model method) requires a relaxation of the begin/endpoints of the Viterbi matching, as well as a time normalization of the resulting score. This makes solutions complex (i.e., LN2/2 basic operations for keyword HMM models with L states and utterances with N frames). We present here two altemative (quite simple and efficient) solutions to this problem. a) First we provide a method that finds the optimal segmentation according to the criteria of maximizing the average observation probability. It uses Dynamic Programming as a step, but does not require scoring for all possible begin/endpoints. While the worst case remains O(LN2), this technique converged in at most 3(L+2)N basic operations in each experiment for two very different applications. b) The second proposed algorithm does not provide a segmentation but can be used for the decision problem of whether the utterance should be classified as containing the keyword or not (provided a predefined threshold on the acceptable average observation probability). This allows the algorithm to be even faster, with fix cost of (L+2)N.