A neural fuzzy training approach for continuous speech recognition improvement

Authors:
Yasuhiro Komori
Affiliations:
ATR Interpreting Telephony Research Laboratories, Kyoto, Japan
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Abstract

In this paper, a new training method for phoneme identification neural networks, called the "Neural Fuzzy Training" method, is proposed. It differs from the conventional method in that the target values of each training sample are given as fuzzy phoneme class information instead of discrete phoneme class information. In the conventional training method, the target values are defined as 0s or 1s. In the proposed method, the target values are defined as likelihoods of the phoneme classes, with values between 0 and 1. The likelihood for each phoneme class is computed by a likelihood transformation function from the distance between the input sample and its nearest sample belonging to that class in the training set. The effectiveness of the proposed method is shown by an 18-consonant identification experiment and a continuous speech recognition experiment using the ATR isolated word and phrase database. Improvements are observed in every experiment, particularly in the continuous speech recognition results.
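
The target construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the likelihood transformation function or the distance measure, so the exponential decay `exp(-alpha * d)`, the Euclidean distance, and the names `likelihood`, `fuzzy_targets`, and `alpha` are assumptions made for illustration only, not the paper's actual definitions.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two feature vectors (assumed distance measure).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def likelihood(distance, alpha=1.0):
    # Hypothetical likelihood transformation: maps distance 0 to 1.0 and
    # decays toward 0 as the distance grows. The paper's function may differ.
    return math.exp(-alpha * distance)

def fuzzy_targets(samples, labels, n_classes, alpha=1.0):
    # For each training sample, build one target per phoneme class from the
    # distance to the nearest training sample belonging to that class.
    targets = []
    for x in samples:
        row = []
        for c in range(n_classes):
            dists = [euclidean(x, s) for s, l in zip(samples, labels) if l == c]
            # A class with no training samples gets target 0.0.
            row.append(likelihood(min(dists), alpha) if dists else 0.0)
        targets.append(row)
    return targets

# Toy data: three 2-D samples, two classes.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
labels = [0, 1, 1]
targets = fuzzy_targets(samples, labels, n_classes=2)
for x, t in zip(samples, targets):
    print(x, [round(v, 4) for v in t])
```

Under these assumptions, each sample's own class receives a target of 1 (its nearest same-class sample is itself, at distance 0), while other classes receive values between 0 and 1 that shrink as the nearest sample of that class gets farther away. Conventional training would use the targets `[1, 0]`, `[0, 1]`, `[0, 1]` for the same toy data.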