A segment-based speaker adaptation neural network applied to continuous speech recognition

Authors:
Keiji Fukuzawa;Yasuhiro Komori;Hidefumi Sawai;Masahide Sugiyama
Affiliations:
ATR Interpreting Telephony Research Laboratories, Kyoto, Japan;ATR Interpreting Telephony Research Laboratories, Kyoto, Japan;Ricoh Co., Ltd., Yokohama, Japan;ATR Interpreting Telephony Research Laboratories, Kyoto, Japan
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Abstract

This paper describes a speaker adaptation technique that uses segment-based neural mapping, applied to continuous speech recognition. The adaptation neural network has a time-shifted sub-connection architecture, which maintains the temporal structure within the acoustic segment and decreases the amount of speech data needed for training. The effectiveness of this network has previously been reported for phoneme recognition. In this paper, the speaker adaptation network is combined with a TDNN-LR continuous speech recognizer and is evaluated in word and phrase recognition experiments with several speakers. The results of 500-word recognition experiments show that the recognition rate with segment-based adaptation is 92.2%, 28.8% higher than the rate without adaptation. The results of 278-phrase recognition experiments show that the recognition rate with segment-based adaptation is 57.4%, 27.7% higher than the rate without adaptation.
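
The abstract does not give the network's layer sizes, window width, or activation functions, so the following is only a minimal sketch of what a time-shifted sub-connection mapping could look like, not the authors' implementation. It assumes a single linear layer in which each output frame is connected to a small window of neighbouring input frames and the same weights are reused at every time shift. All names (`make_shared_weights`, `map_segment`, `parameter_counts`) and all dimensions are hypothetical. The parameter count illustrates why sharing weights across shifts might reduce the amount of training data required.

```python
import random


def make_shared_weights(window, in_dim, out_dim, seed=0):
    """Hypothetical initialiser: weights[k][o][i] for window offset k."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
             for _ in range(out_dim)]
            for _ in range(window)]


def map_segment(segment, weights, bias):
    """Map an input segment (list of frames) to an output segment.

    Output frame t is connected only to input frames t-half .. t+half,
    and the same weights are reused at every t (assumed reading of
    "time-shifted sub-connection"). Linear for brevity; a real network
    would likely add nonlinear hidden units.
    """
    window = len(weights)
    half = window // 2
    n_frames = len(segment)
    out_dim = len(bias)
    output = []
    for t in range(n_frames):
        frame = list(bias)
        for k in range(window):
            src = t + k - half
            if 0 <= src < n_frames:  # frames outside the segment are skipped
                for o in range(out_dim):
                    frame[o] += sum(w * x
                                    for w, x in zip(weights[k][o], segment[src]))
        output.append(frame)
    return output


def parameter_counts(n_frames, window, in_dim, out_dim):
    """Compare shared sub-connections with a fully connected segment map."""
    shared = window * in_dim * out_dim + out_dim
    full = (n_frames * in_dim) * (n_frames * out_dim) + n_frames * out_dim
    return shared, full


if __name__ == "__main__":
    # Illustrative sizes only: 7-frame segment, 16 coefficients per frame.
    weights = make_shared_weights(window=3, in_dim=16, out_dim=16)
    segment = [[0.0] * 16 for _ in range(7)]
    mapped = map_segment(segment, weights, bias=[0.0] * 16)
    print(len(mapped), len(mapped[0]))
    print(parameter_counts(n_frames=7, window=3, in_dim=16, out_dim=16))
```

Because the weights are tied across time shifts, the mapped segment keeps the same number of frames and the same frame order as the input, which is one way the temporal structure of the segment could be preserved.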