Speaker-independent features extracted by a neural network
ICASSP'93 Proceedings of the 1993 IEEE international conference on Acoustics, speech, and signal processing: plenary, special, audio, underwater acoustics, VLSI, neural networks - Volume I
A new feature parameter space for speech recognition, called PRPG (Probability Ratios between Phoneme Group pairs), is proposed, and speaker-adaptive phoneme recognition is performed in it. In the proposed coordinate system, a region that carries the same information for speech recognition is compressed into a single point. The mapping from the spectral coordinate system to the proposed one is realized by a neural network. Code vectors designed in this coordinate system are assured to be information-theoretically more efficient than those designed in the spectral coordinate system. Moreover, by the definition of the coordinate system, the meanings of the axes are equivalent across speakers, so speaker adaptation can be performed easily without trajectory mapping. Experimental results show that the coordinate conversion reduces errors by 40% in the speaker-dependent tasks. The scores of the speaker-adaptive tasks in the proposed feature domain are always superior to those of the speaker-dependent tasks in the spectral domain.
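To make the idea of a coordinate built from probability ratios between phoneme group pairs more concrete, the following is a minimal sketch rather than the paper's method. The abstract does not give the exact ratio formula, so the form p_i / (p_i + p_j) is an assumption, as are the function name `prpg_features`, the `eps` guard, and the example probabilities. In the paper the group probabilities would come from the neural network; here they are supplied by hand.

```python
from itertools import combinations


def prpg_features(group_probs, eps=1e-12):
    """Return one ratio per unordered pair of phoneme groups.

    ASSUMPTION: the ratio is taken as p_i / (p_i + p_j). The abstract
    only names "probability ratios between phoneme group pairs" and
    does not specify the formula.
    """
    features = []
    for i, j in combinations(range(len(group_probs)), 2):
        p_i, p_j = group_probs[i], group_probs[j]
        # eps (hypothetical) avoids division by zero when both are 0.
        features.append(p_i / (p_i + p_j + eps))
    return features


# Hypothetical probabilities for three phoneme groups, standing in
# for the outputs of a trained network.
probs = [0.6, 0.3, 0.1]
features = prpg_features(probs)
```

Under this assumed definition, each axis refers to a fixed pair of phoneme groups rather than to a spectral quantity, which is one way to read the abstract's claim that the axes mean the same thing for every speaker.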