A basis representation of constrained MLLR transforms for robust adaptation

  • Authors:
  • Daniel Povey; Kaisheng Yao

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2012

Abstract

Constrained Maximum Likelihood Linear Regression (CMLLR) is a speaker adaptation method for speech recognition that can be realized as a feature-space transformation. In its original form it does not work well when less than about 5 s of speech is available for adaptation, because the parameters of the transformation matrix are difficult to estimate robustly. In this paper we describe a basis representation of the CMLLR transformation matrix, in which the variation between speakers is concentrated in the leading coefficients. When adapting to a speaker, we can select a variable number of coefficients to estimate depending on the amount of adaptation data available, and assign a zero value to the remaining coefficients. We obtain improved performance when the amount of adaptation data is limited, while retaining the same asymptotic performance as conventional CMLLR. We demonstrate that our method performs better than popular existing approaches, and is more efficient than conventional CMLLR estimation.
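
The idea of estimating only the leading basis coefficients and zeroing the rest can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear rule tying the coefficient count to the number of frames, and the toy basis matrices are all assumptions made for illustration. The sketch assumes the transform is an affine D x (D+1) matrix built as a default transform plus a weighted sum of basis matrices.

```python
from typing import List

Matrix = List[List[float]]


def num_coefficients(num_frames: float, coeffs_per_frame: float, max_coeffs: int) -> int:
    """Hypothetical rule: the coefficient count grows linearly with the
    amount of adaptation data and is capped at the basis size."""
    return max(0, min(int(coeffs_per_frame * num_frames), max_coeffs))


def build_transform(default: Matrix, basis: List[Matrix],
                    coeffs: List[float], num_used: int) -> Matrix:
    """Return default + sum over b < num_used of coeffs[b] * basis[b].
    Coefficients at index num_used and beyond are treated as zero."""
    result = [row[:] for row in default]
    for b in range(min(num_used, len(basis), len(coeffs))):
        for i, row in enumerate(basis[b]):
            for j, value in enumerate(row):
                result[i][j] += coeffs[b] * value
    return result


def apply_transform(transform: Matrix, feature: List[float]) -> List[float]:
    """Apply the affine transform x' = A x + b, where the last column of
    the D x (D+1) matrix holds the offset b."""
    extended = feature + [1.0]
    return [sum(w * x for w, x in zip(row, extended)) for row in transform]


# Toy example with D = 2 (all values are made up for illustration).
default = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # identity, zero offset
basis = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # scales the first dimension
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],  # shifts the first dimension
]
coeffs = [0.5, 2.0]

few = num_coefficients(num_frames=2, coeffs_per_frame=0.5, max_coeffs=len(basis))
many = num_coefficients(num_frames=100, coeffs_per_frame=0.5, max_coeffs=len(basis))

short_utt = apply_transform(build_transform(default, basis, coeffs, few), [2.0, 4.0])
long_utt = apply_transform(build_transform(default, basis, coeffs, many), [2.0, 4.0])
```

With little data only the first basis element contributes, so the transform stays close to the default; with more data the full basis is used. In practice the coefficients would be estimated by maximizing likelihood on the adaptation data, which this sketch leaves out.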