Links Between Markov Models and Multilayer Perceptrons
IEEE Transactions on Pattern Analysis and Machine Intelligence
Speech recognition using segmental neural nets
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
This paper proposes a Time-Frequency Segmental Neural Network (TFSNN), which classifies phonemes from the two-dimensional time-frequency distribution of the whole phonetic segment. Its network architecture is similar to those used for optical character recognition (OCR) [2] and provides local shift invariance along both the time and frequency axes. The TFSNN can replace a segmental neural network (SNN) [1] in a hybrid HMM/ANN system for automatic speech recognition, where it performs significantly better than the SNN. Training time is also shorter, because the TFSNN uses far fewer connection weights than the SNN.
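The abstract does not give the TFSNN layer layout, so the following is only a minimal sketch of the general idea behind OCR-style networks: one small kernel shared across the whole time-frequency plane, followed by pooling. All sizes (12 frames, 16 frequency bands, a 3x3 kernel, 2x2 pooling), the function names, and the fully connected layer used as a point of comparison are assumptions made for illustration, not values from the paper.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide one shared kernel over a (time, frequency) segment."""
    kt, kf = kernel.shape
    out_t = x.shape[0] - kt + 1
    out_f = x.shape[1] - kf + 1
    out = np.empty((out_t, out_f))
    for t in range(out_t):
        for f in range(out_f):
            out[t, f] = np.sum(x[t:t + kt, f:f + kf] * kernel)
    return out

def max_pool(x, size):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    t = (x.shape[0] // size) * size
    f = (x.shape[1] // size) * size
    blocks = x[:t, :f].reshape(t // size, size, f // size, size)
    return blocks.max(axis=(1, 3))

# Hypothetical segment size and kernel (not taken from the paper).
n_frames, n_bands = 12, 16
kernel = np.ones((3, 3))

# A toy segment with a single energy peak, and a copy shifted by one
# frame in time and one band in frequency.
segment = np.zeros((n_frames, n_bands))
segment[4, 6] = 1.0
shifted = np.roll(segment, shift=(1, 1), axis=(0, 1))

pooled_a = max_pool(conv2d_valid(segment, kernel), 2)
pooled_b = max_pool(conv2d_valid(shifted, kernel), 2)
same_after_pooling = np.array_equal(pooled_a, pooled_b)

# Weight counts for one feature map of the same output size (biases ignored).
n_hidden = (n_frames - 3 + 1) * (n_bands - 3 + 1)
shared_weights = kernel.size                      # one kernel reused everywhere
dense_weights = n_frames * n_bands * n_hidden     # every input to every unit

print(same_after_pooling, shared_weights, dense_weights)
```

For this particular one-step shift the pooled maps coincide, which is the sense in which the invariance is local: larger shifts, or shifts that cross a pooling boundary differently, will generally change the output. The weight comparison shows why sharing a kernel keeps the parameter count small relative to a layer that connects every input to every unit.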