Real-time speaker adaptation for speech recognition on mobile devices

  • Authors:
  • Gil Ho Lee

  • Affiliations:
  • Computer Science Lab., Samsung Advanced Institute of Technology, Samsung Electronics, Yongin, Korea

  • Venue:
  • CCNC'10: Proceedings of the 7th IEEE Consumer Communications and Networking Conference
  • Year:
  • 2010


Abstract

This paper introduces a real-time speaker adaptation method for speech recognition on mobile devices. To adapt the speech recognition system to arbitrary speakers, we employ vocal tract length normalization (VTLN). In conventional VTLN, warping factors are computed by maximum likelihood estimation: every candidate warping factor is applied to the recognizer, and the factor that best fits the speaker or utterance is selected. Although this approach performs well, its computational cost makes it impractical for mobile devices. To reduce the computational effort, we employ pitch-based VTLN and simplify the pitch estimation. The proposed method achieves a relative word error rate reduction of 21.5% on Korean while running only 10.5% slower than the baseline.
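The idea of pitch-based VTLN can be sketched as follows: instead of running the recognizer once per candidate warping factor (the maximum-likelihood grid search of conventional VTLN), a single cheap pitch estimate is mapped directly to a warping factor. The sketch below is illustrative only; the autocorrelation pitch estimator, the reference pitch, and the pitch-to-warping-factor mapping constants are assumptions, not taken from the paper.

```python
import numpy as np

def estimate_pitch(frame, sample_rate=16000, fmin=60.0, fmax=400.0):
    """Crude autocorrelation pitch estimate (a stand-in for the
    simplified pitch estimator the paper describes)."""
    frame = frame - frame.mean()
    # one-sided autocorrelation of the frame
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sample_rate / fmax)          # shortest lag to consider
    hi = int(sample_rate / fmin)          # longest lag to consider
    lag = lo + int(np.argmax(ac[lo:hi]))  # pick the strongest periodicity
    return sample_rate / lag

def pitch_to_alpha(pitch_hz, ref_pitch=150.0, lo=0.88, hi=1.12):
    """Map pitch to a VTLN warping factor. Higher pitch tends to mean a
    shorter vocal tract, hence a warping factor below 1. The reference
    pitch and clipping range here are hypothetical, not from the paper."""
    alpha = ref_pitch / pitch_hz
    return float(np.clip(alpha, lo, hi))  # stay in a typical VTLN range

def warp_frequency(freqs, alpha, f_max=8000.0):
    """Simple linear frequency warping applied to filterbank center
    frequencies; real systems often use a piecewise-linear warp."""
    return np.clip(np.asarray(freqs) * alpha, 0.0, f_max)

# Example: a 200 Hz voiced frame yields one warping factor per speaker,
# avoiding a per-factor recognition pass.
t = np.arange(1024) / 16000.0
frame = np.sin(2 * np.pi * 200.0 * t)
alpha = pitch_to_alpha(estimate_pitch(frame))
warped = warp_frequency([300.0, 1000.0, 3000.0], alpha)
```

This replaces the O(number of warping factors) recognition passes of conventional ML-based VTLN with one pitch estimate per speaker or utterance, which is the source of the computational savings the abstract claims.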