Speaker-independent features extracted by a neural network
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: plenary, special, audio, underwater acoustics, VLSI, neural networks - Volume I
ATREUS: a comparative study of continuous speech recognition systems at ATR
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: speech processing - Volume II
This paper proposes a new training method for phoneme identification neural networks, called "Neural Fuzzy Training". The proposed method differs from the conventional method in that the target values of each training sample are given as fuzzy phoneme class information instead of discrete phoneme class information. In the conventional training method, the target values are defined as 0s or 1s. In the proposed method, the target values are defined as likelihoods of the phoneme classes, lying between 0 and 1. Each likelihood is computed by a likelihood transformation function according to the distance between the input sample and its nearest sample belonging to each phoneme class in the training set. The effectiveness of the proposed method is shown by an 18-consonant identification experiment and a continuous speech recognition experiment using the ATR isolated word and phrase database. Improvements are observed in every experiment, particularly in the continuous speech recognition results.
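The target computation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the likelihood transformation function or the distance measure, so the exponential decay `exp(-d / scale)`, the Euclidean distance, the `scale` parameter, and the name `fuzzy_targets` are all assumptions made here for illustration and are not taken from the paper.

```python
import math

def euclidean(a, b):
    # Assumed distance measure; the abstract does not specify one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuzzy_targets(samples, labels, classes, scale=1.0):
    """Return one target vector per sample, with one entry per class.

    Each entry is a likelihood in (0, 1] derived from the distance between
    the sample and its nearest training sample of that class. The mapping
    exp(-d / scale) is a hypothetical likelihood transformation function.
    Every class must have at least one sample in the training set.
    """
    targets = []
    for x in samples:
        row = []
        for c in classes:
            # Distance to the nearest training sample belonging to class c.
            d = min(euclidean(x, s) for s, l in zip(samples, labels) if l == c)
            row.append(math.exp(-d / scale))
        targets.append(row)
    return targets

# Toy 2-D feature vectors standing in for phoneme samples.
samples = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
labels = ["b", "b", "d"]
classes = ["b", "d"]

targets = fuzzy_targets(samples, labels, classes, scale=2.0)
for label, row in zip(labels, targets):
    print(label, [round(v, 3) for v in row])
```

Under these assumptions a sample's own class always receives a target of 1, because its nearest same-class sample is itself at distance 0. The other classes receive values that shrink with distance, in contrast to the conventional 0/1 targets.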