A piecewise linear spectral mapping for supervised speaker adaptation

  • Authors:
  • Hiroshi Matsukoto;Hirowo Inoue

  • Affiliations:
  • Dept. of Electrical and Electronic Eng., Faculty of Eng., Shinshu University, Nagano-shi, Nagano, Japan;Dept. of Electrical and Electronic Eng., Faculty of Eng., Shinshu University, Nagano-shi, Nagano, Japan

  • Venue:
  • ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
  • Year:
  • 1992

Abstract

This paper proposes a supervised spectral mapping method for speaker adaptation based on a piecewise linear transformation of cepstral vectors. In this method, an input vector is mapped onto the target spectral space as a weighted sum of linearly transformed vectors, using a set of mapping matrices associated with fuzzy partitioned spaces. The matrices are estimated so as to minimize the total mean square error between the mapped and target spectra. The method was compared with the difference interpolation mapping (D-Map) method, which is an extension of the codebook mapping methods. In 16 phoneme recognition tests using a single Gaussian distribution hidden Markov model (HMM), the proposed method with 16 fuzzy partitioned spaces improved recognition performance by 4% over the usual linear mapping method when using 100 training words, and also achieved a 3% higher rate on average than the D-Map method.
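The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the fuzzy memberships are computed, so the fuzzy c-means style weighting, the `fuzziness` parameter, the partition centroids, and all function names below are assumptions made for illustration. The matrices are taken as given here; estimating them by minimizing the mean square error is not shown.

```python
import math


def fuzzy_memberships(x, centroids, fuzziness=2.0):
    """Membership of x in each partition (assumed fuzzy c-means form, sums to 1)."""
    dists = [math.dist(x, c) for c in centroids]
    # If x coincides with a centroid, give that partition full membership.
    for k, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == k else 0.0 for j in range(len(centroids))]
    exponent = 2.0 / (fuzziness - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def mat_vec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def piecewise_linear_map(x, centroids, matrices, fuzziness=2.0):
    """Map x as the membership-weighted sum of the per-partition linear maps."""
    weights = fuzzy_memberships(x, centroids, fuzziness)
    mapped = [0.0] * len(matrices[0])
    for w, matrix in zip(weights, matrices):
        transformed = mat_vec(matrix, x)
        mapped = [m + w * t for m, t in zip(mapped, transformed)]
    return mapped


# Toy example: two partitions in a 2-dimensional space (hypothetical values).
centroids = [[0.0, 0.0], [2.0, 0.0]]
matrices = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity for the first partition
    [[2.0, 0.0], [0.0, 2.0]],  # scaling by 2 for the second partition
]
print(piecewise_linear_map([1.0, 0.0], centroids, matrices))  # [1.5, 0.0]
```

The input `[1.0, 0.0]` is equidistant from both centroids, so each partition contributes with weight 0.5 and the output lies halfway between the two linear transforms. With a single partition the sketch reduces to an ordinary linear mapping, which is the baseline the abstract compares against.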