Speaker normalization for speech recognition

Authors:
Xuedong Huang
Affiliations:
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Abstract

A successful speaker normalization mechanism will be useful not only for speaker adaptation but also for speaker-independent speech recognition. In this paper, a codeword-dependent neural network (CDNN) is presented for the study of speaker adaptation. The CDNN is used as a nonlinear mapping function that transforms speech data between two speakers. The mapping function has several important properties. First, the assembly of mapping functions enhances overall mapping quality. Second, multiple input vectors are used simultaneously in the transformation, which both makes full use of dynamic information and alleviates possible errors in the supervision data. Finally, the mapping function is derived from training data, so its quality depends on the amount of training data available. In a performance evaluation based on speaker-dependent models, speaker normalization significantly reduced the error rate from 41.9% to 5.0%.
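The three properties listed in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The abstract gives no architecture, features, codebook size, or training procedure, so everything here is an assumption made for illustration: the tiny one-hidden-layer network (`TinyMLP`), the four-entry codebook, the context width of one frame on each side, the synthetic two-dimensional "speech" frames, and all class and function names. The sketch shows one small network per codeword (the assembly of mapping functions), each fed a window of several stacked input frames, and each fitted from paired source and target frames.

```python
import numpy as np

def stack_context(frames, width=1):
    """Stack each frame with `width` neighbours on each side (edges repeated)."""
    padded = np.pad(frames, ((width, width), (0, 0)), mode="edge")
    n = len(frames)
    return np.hstack([padded[i:i + n] for i in range(2 * width + 1)])

def nearest_codeword(frames, codebook):
    """Index of the closest codeword (Euclidean distance) for every frame."""
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

class TinyMLP:
    """Hypothetical one-hidden-layer tanh network, full-batch gradient descent."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W1 = rng.normal(0.0, 0.3, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.3, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        self.h = np.tanh(x @ self.W1 + self.b1)
        return self.h @ self.W2 + self.b2

    def fit(self, x, y, lr=0.02, epochs=1000):
        for _ in range(epochs):
            g_out = 2.0 * (self.forward(x) - y) / len(x)   # d(MSE)/d(output)
            g_h = (g_out @ self.W2.T) * (1.0 - self.h ** 2)  # backprop through tanh
            self.W2 -= lr * self.h.T @ g_out
            self.b2 -= lr * g_out.sum(axis=0)
            self.W1 -= lr * x.T @ g_h
            self.b1 -= lr * g_h.sum(axis=0)
        return self

class CodewordDependentMapper:
    """One mapping network per codeword, selected by the centre frame's codeword."""
    def __init__(self, codebook, width=1, n_hidden=8, seed=0):
        self.codebook, self.width, self.n_hidden = codebook, width, n_hidden
        self.rng = np.random.default_rng(seed)
        self.nets = {}

    def fit(self, src, tgt):
        x = stack_context(src, self.width)
        codes = nearest_codeword(src, self.codebook)
        for k in np.unique(codes):
            m = codes == k
            net = TinyMLP(x.shape[1], self.n_hidden, tgt.shape[1], self.rng)
            self.nets[int(k)] = net.fit(x[m], tgt[m])
        return self

    def transform(self, src):
        x = stack_context(src, self.width)
        codes = nearest_codeword(src, self.codebook)
        out = np.array(src, dtype=float)  # identity fallback for unseen codewords
        for k, net in self.nets.items():
            m = codes == k
            if m.any():
                out[m] = net.forward(x[m])
        return out

# Synthetic stand-in for paired frames from two speakers (assumption, not real data).
rng = np.random.default_rng(0)
src = rng.normal(size=(2000, 2))
tgt = 0.8 * src + 0.3 * np.tanh(src) + 1.0
codebook = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])

mapper = CodewordDependentMapper(codebook).fit(src[:1500], tgt[:1500])
mapped = mapper.transform(src[1500:])
mse_before = float(np.mean((src[1500:] - tgt[1500:]) ** 2))
mse_after = float(np.mean((mapped - tgt[1500:]) ** 2))
print("held-out MSE before mapping:", mse_before)
print("held-out MSE after mapping: ", mse_after)
```

In a real system the codebook would presumably come from vector quantization of the acoustic features, and the paired frames from some alignment of the two speakers' utterances. The abstract specifies neither step, so both are left out of the sketch.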