Learning the architecture of neural networks for speech recognition

  • Authors:
  • U. Bodenhausen; A. Waibel

  • Affiliations:
  • Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA (both authors)

  • Venue:
  • ICASSP '91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-91)
  • Year:
  • 1991

Abstract

Results are presented that suggest it is possible to learn the architecture of neural networks for speech recognition systems. The Tempo 2 algorithm is proposed: a training algorithm for neural networks that trains the temporal parameters of the network (the delays and widths of the input windows) as well as the weights. A comparison of performance with only one adaptive parameter set (weights, delays, or widths) shows that the weights are the most important parameters. Delays and widths appear to be of lesser importance, but in combination with the weights these temporal parameters can improve performance, especially generalization. A Tempo 2 network with trained delays and widths but random weights can classify 70% of the phonemes correctly. The application to phoneme classification shows that this adaptive architecture can approach the performance of a carefully hand-tuned TDNN (time-delay neural network) and leads to more compact networks.
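To illustrate the kind of temporal parameters the abstract refers to, the sketch below implements an input window over a sequence of feature frames whose center (delay) and spread (width) are continuous, trainable parameters. The Gaussian window shape, function names, and dimensions here are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def gaussian_window(num_frames, delay, width):
    """Soft input window: Gaussian weights over frame indices,
    centered at `delay` with standard deviation `width`.
    (Assumed window shape for illustration only.)"""
    t = np.arange(num_frames)
    w = np.exp(-0.5 * ((t - delay) / width) ** 2)
    return w / w.sum()  # normalize so the weights sum to 1

def windowed_input(frames, delay, width):
    """Collapse a (T, F) sequence of feature frames into a single
    F-dimensional vector by a Gaussian-weighted average over time.
    Because `delay` and `width` enter smoothly, both could be
    adjusted by gradient descent alongside the network weights."""
    w = gaussian_window(len(frames), delay, width)
    return w @ frames  # (T,) @ (T, F) -> (F,)

# Example: 10 frames of 3 spectral features each (random stand-in data)
rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 3))
x = windowed_input(frames, delay=4.0, width=1.5)
print(x.shape)  # (3,)
```

Because the window is a smooth function of delay and width, gradients with respect to both exist everywhere, which is what makes joint training with the weights possible in this style of architecture.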