Word recognition with a hierarchical neural network

  • Authors:
  • Xavier Domont;Martin Heckmann;Heiko Wersing;Frank Joublin;Stefan Menzel;Bernhard Sendhoff;Christian Goerick

  • Affiliations:
  • Honda Research Institute Europe GmbH, Offenbach am Main, Germany and Technische Universität Darmstadt, Control Theory and Robotics Lab, Darmstadt, Germany;Honda Research Institute Europe GmbH, Offenbach am Main, Germany;Honda Research Institute Europe GmbH, Offenbach am Main, Germany;Honda Research Institute Europe GmbH, Offenbach am Main, Germany;Honda Research Institute Europe GmbH, Offenbach am Main, Germany;Honda Research Institute Europe GmbH, Offenbach am Main, Germany;Honda Research Institute Europe GmbH, Offenbach am Main, Germany

  • Venue:
  • NOLISP'07 Proceedings of the 2007 international conference on Advances in nonlinear speech processing
  • Year:
  • 2007

Abstract

In this paper we propose a feedforward neural network for syllable recognition. The core of the recognition system is based on a hierarchical architecture initially developed for visual object recognition. We show that, given the similarities between the primary auditory and visual cortexes, such a system can successfully be used for speech recognition. Syllables are used as the basic units for recognition. Their spectrograms, computed using a Gammatone filterbank, are interpreted as images and subsequently fed into the neural network after a preprocessing step that enhances the formant frequencies and normalizes the length of the syllables. The performance of our system has been analyzed on the recognition of 25 different monosyllabic words. The parameters of the architecture have been optimized using an evolutionary strategy. Compared to the Sphinx-4 speech recognition system, our system achieves better robustness and generalization capabilities in noisy conditions.
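The length normalization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the syllable spectrograms are brought to a common length, so the linear interpolation along the time axis used here is an assumption, and the names `resample_row` and `normalize_length` are hypothetical. The spectrogram is taken to be a list of frequency channels, each a list of per-frame energies.

```python
def resample_row(row, target_frames):
    """Linearly interpolate one frequency channel onto target_frames samples.

    Assumption: linear interpolation is used purely for illustration; the
    paper's actual normalization method is not stated in the abstract.
    """
    n = len(row)
    if n == 0:
        raise ValueError("empty channel")
    if target_frames == 1 or n == 1:
        # Degenerate cases: nothing to interpolate between.
        return [row[0]] * target_frames
    out = []
    for i in range(target_frames):
        # Position of output frame i on the original time axis [0, n - 1].
        pos = i * (n - 1) / (target_frames - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1.0 - frac) * row[lo] + frac * row[hi])
    return out


def normalize_length(spectrogram, target_frames):
    """Resample every channel so all syllables share the same frame count."""
    return [resample_row(channel, target_frames) for channel in spectrogram]


# Toy spectrogram: 2 channels x 3 frames, stretched to 5 frames.
spec = [[0.0, 1.0, 2.0],
        [2.0, 0.0, 2.0]]
fixed = normalize_length(spec, 5)
```

After this step every syllable yields an array of the same shape, which is what allows it to be treated as a fixed-size image by a feedforward network.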