Towards neurocomputational speech and sound processing

  • Authors:
  • Jean Rouat, Stéphane Loiselle, Ramin Pichevar

  • Affiliations:
  • Université de Sherbrooke; Université de Sherbrooke; Université de Sherbrooke and Communications Research Centre, Ottawa

  • Venue:
  • Progress in Nonlinear Speech Processing
  • Year:
  • 2007

Abstract

From physiology we learn that the auditory system extracts simultaneous features from the underlying signal, giving rise to simultaneous representations of audible signals. We also learn that pattern analysis and recognition are not separate processes, in contrast to the conventional engineering approach, in which analysis and recognition are usually carried out as separate stages. Furthermore, it has been observed in the visual system that the order in which neurons fire is crucial to fast visual recognition tasks (Rank Order Coding), and Rank Order Coding has recently been hypothesized to operate in the mammalian auditory system as well. In a first application, we compare a deliberately simple speech recognition prototype that uses Rank Order Coding with a conventional Hidden Markov Model speech recognizer. We also show that the type of neuron used should be adapted to the type of phoneme to be recognized (consonants/transients versus vowels/stable sounds). In a second application, we combine a simultaneous auditory image representation with a network of oscillatory spiking neurons to segregate and bind auditory objects for acoustic source separation. We show that the spiking neural network performs unsupervised segmentation of the auditory images (finding 'auditory' objects) and binds the objects that belong to the same auditory source, yielding automatic sound source separation.
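The abstract names Rank Order Coding without implementation detail, so here is a minimal sketch of the idea in Python/NumPy. It assumes a hypothetical front end that supplies one first-spike latency per cochlear channel and uses a Thorpe-style rank-weighted match; the function names, the decay parameter, and the scoring rule are illustrative assumptions, not the paper's prototype.

```python
import numpy as np

def firing_ranks(latencies):
    """Rank each channel by first-spike latency (0 = earliest).

    `latencies`: 1-D array with one first-spike time per cochlear
    channel, produced by some auditory front end (hypothetical here).
    """
    return np.argsort(np.argsort(latencies))

def rank_order_score(stimulus, template, decay=0.9):
    """Match a stimulus against a template using only firing order.

    Each channel contributes decay**rank, so earlier-firing channels
    dominate; exact latencies and amplitudes are discarded, which is
    the essence of Rank Order Coding.
    """
    w_template = decay ** firing_ranks(template)
    w_stimulus = decay ** firing_ranks(stimulus)
    # By the rearrangement inequality the dot product is maximal when
    # the stimulus fires its channels in the same order as the template.
    return float(np.dot(w_template, w_stimulus) / np.dot(w_template, w_template))

template = np.array([1.0, 3.0, 2.0, 5.0])  # channels fire in order 0, 2, 1, 3
print(rank_order_score(np.array([10.0, 30.0, 20.0, 50.0]), template))  # 1.0 (same order)
print(rank_order_score(np.array([50.0, 20.0, 30.0, 10.0]), template))  # < 1.0 (order broken)
```

Because only the ordering matters, such a score is unaffected by uniform shifts or scalings of the latencies, which is part of what makes this kind of code attractive for fast recognition of transient phonemes.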
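Similarly, the binding-by-synchrony step is only summarized. The toy below substitutes Kuramoto-style phase oscillators for the paper's oscillatory spiking neurons (a simplification, not the authors' model): units with similar feature values are coupled positively and dissimilar ones negatively, so oscillators belonging to the same 'auditory object' settle into a common phase while distinct objects drift apart. All names and parameters are illustrative.

```python
import numpy as np

def bind_by_synchrony(features, steps=400, dt=0.05, k=2.0):
    """Toy segmentation-by-synchrony on a 1-D feature map.

    Coupling is near +k between units with similar features and near -k
    for dissimilar ones (an assumed rule), so each group of similar
    units phase-locks internally and repels the other groups.
    Returns the final phase of each oscillator.
    """
    rng = np.random.default_rng(0)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(features))
    similarity = np.exp(-np.subtract.outer(features, features) ** 2)
    coupling = k * (2.0 * similarity - 1.0)
    for _ in range(steps):
        # Kuramoto update: each unit is pulled toward positively coupled
        # neighbours and pushed away from negatively coupled ones.
        delta = np.subtract.outer(phases, phases)        # delta[i, j] = phi_i - phi_j
        dphi = (coupling * np.sin(-delta)).mean(axis=1)  # averages sin(phi_j - phi_i)
        phases = (phases + dt * dphi) % (2.0 * np.pi)
    return phases

# Two clusters of similar features: oscillators 0-2 converge to one phase
# and 3-5 to another, i.e. two bound 'auditory objects'.
print(bind_by_synchrony(np.array([0.0, 0.1, 0.05, 3.0, 3.1, 2.9])))
```

Reading out the groups then reduces to clustering the final phases, which is the sense in which synchrony performs segmentation and binding at once.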