In this paper we show how recognition performance in automated speech perception can be significantly improved by adding lipreading, so-called "speechreading". We demonstrate this with an extension of an existing state-of-the-art speech recognition system, a modular MS-TDNN. The acoustic and visual speech data are preclassified in two separate front-end phoneme TDNNs, and their outputs are combined into acoustic-visual hypotheses for the Dynamic Time Warping algorithm. We evaluate the approach on a connected word recognition problem, the notoriously difficult letter spelling task. With speechreading, we reduced the error rate by up to half relative to purely acoustic recognition.
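The combination step described above can be illustrated with a minimal sketch. It is not the system's actual implementation. It assumes that each front-end yields one score per phoneme per frame, that the two streams are time-aligned, and that they are merged by a fixed weighted sum; the abstract does not specify the combination rule. The names `combine`, `dtw_word_score`, `recognize`, and `weight_acoustic` are hypothetical, and the alignment handles a single word rather than connected words.

```python
def combine(acoustic, visual, weight_acoustic=0.5):
    """Merge per-frame phoneme scores from two streams by a weighted sum.

    Assumption: both inputs are lists of frames, each frame a list with one
    score per phoneme, and both streams have the same number of frames.
    """
    if len(acoustic) != len(visual):
        raise ValueError("streams must have the same number of frames")
    w = weight_acoustic
    return [
        [w * a + (1.0 - w) * v for a, v in zip(frame_a, frame_v)]
        for frame_a, frame_v in zip(acoustic, visual)
    ]


def dtw_word_score(frames, word_states):
    """Best accumulated score for aligning frames to a phoneme sequence.

    word_states is a left-to-right list of phoneme indices. At each frame the
    path either stays in the current state or advances to the next one.
    Returns -inf if there are fewer frames than states.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * len(word_states)
    best[0] = frames[0][word_states[0]]
    for frame in frames[1:]:
        updated = [neg_inf] * len(word_states)
        for s, phoneme in enumerate(word_states):
            prev = best[s]                      # stay in the same state
            if s > 0 and best[s - 1] > prev:    # or advance from the previous one
                prev = best[s - 1]
            if prev != neg_inf:
                updated[s] = prev + frame[phoneme]
        best = updated
    return best[-1]  # the path must end in the final state


def recognize(frames, lexicon):
    """Return the lexicon entry whose phoneme sequence aligns best."""
    return max(lexicon, key=lambda word: dtw_word_score(frames, lexicon[word]))
```

In this sketch, two words whose phoneme sequences score identically on the acoustic stream can be separated once the visual scores are added, which is the intuition behind the bimodal hypotheses. In practice the stream weight would need tuning, for instance according to the acoustic noise level.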