A time-frequency segmental neural network for phoneme recognition

  • Authors:
  • Anjan Basu;Torbjørn Svendsen

  • Affiliations:
  • Department of Telecommunications, Norwegian Institute of Technology, Trondheim, Norway;Department of Telecommunications, Norwegian Institute of Technology, Trondheim, Norway

  • Venue:
  • ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: plenary, special, audio, underwater acoustics, VLSI, neural networks - Volume I
  • Year:
  • 1993

Abstract

This paper proposes a Time-Frequency Segmental Neural Network (TFSNN), which classifies phonemes according to the two-dimensional time-frequency distribution of the whole phonetic segment. It uses a network architecture similar to those used for optical character recognition (OCR) [2] in order to provide local shift invariance along both the time and frequency axes. Because it performs significantly better than a segmental neural network (SNN) [1], the TFSNN can replace the SNN in a hybrid HMM/ANN system for automatic speech recognition. The TFSNN also trains faster, as it employs far fewer connection weights than the SNN.
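
The abstract does not give the TFSNN layer sizes, so the following is only a minimal sketch of the general idea behind OCR-style networks: a single shared kernel is slid over a time-frequency patch and the responses are max-pooled, so a small shift of the input pattern in time or frequency leaves the strongest pooled response in the same cell. The grid size (6 x 6), the 2 x 2 kernel, the pooling size, and all function names are assumptions chosen for illustration, not details from the paper.

```python
# Illustrative sketch only: sizes, kernel values and names are assumptions,
# not the architecture described in the paper.

def conv2d_valid(x, k):
    """Slide one shared kernel k over x (rows = frequency, cols = time)."""
    kh, kw = len(k), len(k[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(x, size):
    """Non-overlapping max pooling; rows/cols that do not fill a window are dropped."""
    out_h = len(x) // size
    out_w = len(x[0]) // size
    return [
        [
            max(x[i * size + a][j * size + b] for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def place(pattern, rows, cols, top, left):
    """Return a rows x cols grid of zeros with pattern copied in at (top, left)."""
    grid = [[0.0] * cols for _ in range(rows)]
    for a, row in enumerate(pattern):
        for b, value in enumerate(row):
            grid[top + a][left + b] = value
    return grid

def peak(x):
    """Return (value, row, col) of the first maximum in a 2-D list."""
    best = (x[0][0], 0, 0)
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            if value > best[0]:
                best = (value, i, j)
    return best

def pooled_peak(top, left):
    """Peak pooled response for a 2x2 energy blob placed at (top, left)."""
    blob = [[1.0, 1.0], [1.0, 1.0]]      # toy time-frequency pattern
    kernel = [[1.0, 1.0], [1.0, 1.0]]    # one shared 2x2 kernel (4 weights)
    patch = place(blob, 6, 6, top, left)
    return peak(max_pool(conv2d_valid(patch, kernel), 2))

def weight_counts(in_h, in_w, kh, kw):
    """Weights (biases ignored) for one shared kernel versus a fully
    connected layer producing the same number of outputs."""
    outputs = (in_h - kh + 1) * (in_w - kw + 1)
    return kh * kw, in_h * in_w * outputs

if __name__ == "__main__":
    for top, left in [(0, 0), (0, 1), (1, 1), (2, 2)]:
        print((top, left), pooled_peak(top, left))
    print(weight_counts(6, 6, 2, 2))
```

In this toy setting the blob placed at (0, 0), (0, 1) or (1, 1) gives the same peak in the same pooled cell, while a larger shift to (2, 2) moves the peak to a different cell, which is what "local" invariance means here. The weight count also hints at why weight sharing can reduce the number of connection weights: the shared kernel uses 4 weights, whereas a fully connected layer with the same 6 x 6 input and 25 outputs would use 900.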