A neural fuzzy training approach for continuous speech recognition improvement

  • Authors: Yasuhiro Komori

  • Affiliations: ATR Interpreting Telephony Research Laboratories, Kyoto, Japan

  • Venue: ICASSP'92 Proceedings of the 1992 IEEE International Conference on Acoustics, Speech and Signal Processing - Volume 1

  • Year: 1992

Abstract

This paper proposes a new training method for phoneme identification neural networks, called "Neural Fuzzy Training". The proposed method differs from the conventional method in that the target values of each training sample are given as fuzzy phoneme class information instead of discrete phoneme class information. In the conventional training method, each target value is either 0 or 1. In the proposed method, the target values are likelihoods of membership in each phoneme class, lying between 0 and 1. Each likelihood is computed by a likelihood transformation function from the distance between the input sample and its nearest sample belonging to that phoneme class in the training set. The effectiveness of the proposed method is shown by an 18-consonant identification experiment and a continuous speech recognition experiment using the ATR isolated word and phrase database. Improvements are observed in every experiment, particularly in the continuous speech recognition results.
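
To make the contrast between discrete and fuzzy targets concrete, the following is a minimal sketch in Python, not the paper's implementation. The abstract does not specify the likelihood transformation function or the distance measure, so the exponential decay `exp(-alpha * d)`, the Euclidean distance, the parameter `alpha`, and the toy feature vectors are all assumptions chosen for illustration; the function names are hypothetical.

```python
import math


def hard_targets(label, classes):
    """Conventional targets: 1 for the labeled class, 0 for all others."""
    return [1.0 if c == label else 0.0 for c in classes]


def fuzzy_targets(sample, train_samples, train_labels, classes, alpha=1.0):
    """Fuzzy targets: one likelihood in [0, 1] per phoneme class.

    For each class, find the distance from `sample` to its nearest training
    sample of that class, then map the distance to a likelihood.
    ASSUMPTION: the mapping exp(-alpha * d) and the Euclidean distance are
    illustrative stand-ins; the abstract does not give the actual function.
    """
    targets = []
    for c in classes:
        dists = [
            math.dist(sample, x)
            for x, y in zip(train_samples, train_labels)
            if y == c
        ]
        if not dists:
            # No training sample of this class: assign zero likelihood.
            targets.append(0.0)
            continue
        targets.append(math.exp(-alpha * min(dists)))
    return targets


# Toy 2-D "feature vectors" for three consonant classes (made-up data).
classes = ["b", "d", "g"]
train_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
train_labels = ["b", "d", "g"]

sample = (0.0, 0.0)  # a training sample labeled "b"
hard = hard_targets("b", classes)
fuzzy = fuzzy_targets(sample, train_samples, train_labels, classes)
print(hard)
print(fuzzy)
```

In this sketch a training sample is its own nearest neighbor within its own class, so its own-class target stays at 1, while the targets for other classes grow as the sample lies closer to those classes. A network trained on such targets would typically use a loss that accepts non-binary targets, such as mean squared error.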