Spoken Dialogues with Computers
Spoken Dialogues with Computers
Neural Networks for Pattern Recognition
Neural Networks for Pattern Recognition
Connectionist Speech Recognition: A Hybrid Approach
Connectionist Speech Recognition: A Hybrid Approach
Networks with trainable amplitude of activation functions
Neural Networks
Training of HMM with filtered speech material for hands-free recognition
ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 01
Hi-index | 0.00 |
Neural network learning theory draws a relationship between "learning with noise" and applying a regularization term in the cost function that is minimized during the training process on clean (non-noisy) data. Application of regularizers and other robust training techniques are aimed at improving the generalization capabilities of connectionist models, reducing overfitting. In spite of that, the generalization problem is usually overlooked by automatic speech recognition (ASR) practioners who use hidden Markov models (HMM) or other standard ASR paradigms. Nonetheless, it is reasonable to expect that an adequate neural network model (due to its universal approximation property and generalization capability) along with a suitable regularizer can exhibit good recognition performance whenever noise is added to the test data, although training is accomplished on clean data. This paper presents applications of a variant of the so called segmental neural network (SNN), introduced at BBN by Zavaliagkos et al. for rescoring the N-best hypothesis yielded by a standard continuous density HMM (CDHMM). An enhanced connectionist model, called SNN with trainable amplitude of activation functions (SNN-TA) is first used in this paper instead of the CDHMM to perform the recognition of isolated words. Viterbi-based segmentation is then introduced, relying on the level-building algorithm, that can be combined with the SNN-TA to obtain a hybrid framework for continuous speech recognition. The proposed paradigm is applied to the recognition of isolated and connected Italian digits under several noisy conditions, outperforming the CDHMMs.