RecNorm: Simultaneous normalisation and classification applied to speech recognition
NIPS-3 Proceedings of the 1990 conference on Advances in neural information processing systems 3
Connected Letter Recognition with a Multi-State Time Delay Neural Network
Advances in Neural Information Processing Systems 5 (NIPS Conference)
Integrating time alignment and neural networks for high performance continuous speech recognition
ICASSP '91 Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing
Subphonetic modeling with Markov states: senone
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
An LVQ based reference model for speaker-adaptive speech recognition
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
In this paper we present an improved Multi-State Time Delay Neural Network (MS-TDNN) for speaker-independent, connected letter recognition that outperforms both an HMM-based system (SPHINX) and previous MS-TDNNs [2], and we explore new network architectures with "internal speaker models". We introduce four architectures, characterized by an increasing number of speaker-specific parameters. These parameters can be adjusted either by "automatic speaker identification" or by speaker adaptation, which allows the network to "tune in" to a new speaker. Both methods lead to significant improvements over the straightforward speaker-independent architecture. As in [1], even unsupervised "tuning-in" (on unlabeled speech) works astonishingly well.