Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems

  • Authors:
  • M. Padmanabhan;L. R. Bahl;D. Nahamoo;M. A. Picheny

  • Affiliations:
  • IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA

  • Venue:
  • ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 2
  • Year:
  • 1996

Abstract

A speaker adaptation strategy is described that is based on finding a subset of speakers, from the training set, who are acoustically close to the test speaker, and using only the data from these speakers (rather than the complete training corpus) to re-estimate the system parameters. Further, a linear transformation is computed for every one of the selected training speakers to better map the training speaker's data to the test speaker's acoustic space. Finally, the system parameters (Gaussian means) are re-estimated specifically for the test speaker using the transformed data from the selected training speakers. Experiments showed that this scheme is capable of reducing the error rate by 10-15% with the use of as little as 3 sentences of adaptation data.
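
The three steps in the abstract (select acoustically close training speakers, map each one toward the test speaker, re-estimate the means from the mapped data) can be illustrated with a toy sketch. The abstract does not say how closeness is measured or how the linear transformation is estimated, so everything specific below is an assumption made for illustration: features are scalars, each class label stands in for one Gaussian, closeness is the mean squared difference between per-class means, and the transformation is a scalar map y = a*x + b fitted by least squares on the class means. The function names are hypothetical and are not taken from the paper.

```python
from statistics import fmean

# Toy data model (assumption): a speaker is a dict mapping a class label,
# standing in for one Gaussian, to a list of scalar feature values.

def class_means(data):
    """Per-class mean of one speaker's frames."""
    return {c: fmean(v) for c, v in data.items() if v}

def distance(means_a, means_b):
    """Assumed closeness measure: mean squared difference over shared classes."""
    shared = means_a.keys() & means_b.keys()
    if not shared:
        return float("inf")
    return fmean((means_a[c] - means_b[c]) ** 2 for c in shared)

def select_speakers(train, test_means, k):
    """Step 1: keep the k training speakers closest to the test speaker."""
    ranked = sorted(train, key=lambda s: distance(class_means(train[s]), test_means))
    return ranked[:k]

def fit_affine(src_means, dst_means):
    """Step 2 (assumed form): least-squares scalar map y = a*x + b
    from a training speaker's class means to the test speaker's class means."""
    shared = sorted(src_means.keys() & dst_means.keys())
    if not shared:
        return 1.0, 0.0  # nothing to fit on: identity map
    xs = [src_means[c] for c in shared]
    ys = [dst_means[c] for c in shared]
    mx, my = fmean(xs), fmean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0.0:
        return 1.0, my - mx  # slope not identifiable: pure shift
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx

def adapt_means(train, test_data, k):
    """Step 3: re-estimate class means from the transformed data
    of the selected training speakers."""
    test_means = class_means(test_data)
    pooled = {}
    for s in select_speakers(train, test_means, k):
        a, b = fit_affine(class_means(train[s]), test_means)
        for c, values in train[s].items():
            pooled.setdefault(c, []).extend(a * x + b for x in values)
    return {c: fmean(v) for c, v in pooled.items()}
```

One property the sketch makes visible: a class the test speaker never produced in the adaptation data can still receive an adapted mean, because the mean is re-estimated from the transformed data of the selected training speakers rather than from the adaptation data itself. A real system would work with multivariate features and HMM state alignments, which this sketch leaves out.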