Speech Recognition Based on Student's t-Distribution Derived from Total Bayesian Framework

  • Authors:
  • Shinji Watanabe; Atsushi Nakamura

  • Affiliations:
  • The authors are with NTT Communication Science Laboratories, NTT Corporation, Kyoto-fu, 619-0237 Japan. E-mail: watanabe@cslab.kecl.ntt.co.jp, ats@cslab.kecl.ntt.co.jp

  • Venue:
  • IEICE - Transactions on Information and Systems
  • Year:
  • 2006

Abstract

We introduce a robust classification method based on the Bayesian predictive distribution (Bayesian Predictive Classification, referred to as BPC) for speech recognition. We and others have recently proposed a total Bayesian framework named Variational Bayesian Estimation and Clustering for speech recognition (VBEC). VBEC includes the practical computation of approximate posterior distributions that are essential for BPC, based on variational Bayes (VB). BPC using VB posterior distributions (VB-BPC) provides an analytical solution for the predictive distribution as the Student's t-distribution, which can mitigate the over-training effects by marginalizing the model parameters of an output distribution. We address the sparse data problem in speech recognition, and show experimentally that VB-BPC is robust against data sparseness.
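
To illustrate why marginalizing the parameters of an output distribution yields a Student's t-distribution, here is a minimal sketch for a single one-dimensional Gaussian with a conjugate Normal-Gamma prior. It is not the authors' VB-BPC implementation: the function names, the prior hyperparameters (mu0, kappa0, alpha0, beta0), and the toy data are all assumptions chosen for illustration, and the exact conjugate posterior stands in for the VB posterior that VBEC would provide. The sketch compares the heavy-tailed predictive density with a plug-in Gaussian fitted by maximum likelihood on very little data.

```python
import math


def normal_gamma_posterior(data, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Posterior hyperparameters of a Normal-Gamma prior over (mean, precision).

    The hyperparameter names and default values are illustrative assumptions.
    """
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * mean) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (mean - mu0) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n


def student_t_logpdf(x, df, loc, scale):
    """Log density of a location-scale Student's t-distribution."""
    z = (x - loc) / scale
    return (math.lgamma((df + 1.0) / 2.0) - math.lgamma(df / 2.0)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1.0) / 2.0 * math.log1p(z * z / df))


def predictive_logpdf(x, data, **prior):
    """Log predictive density after integrating out mean and precision."""
    mu_n, kappa_n, alpha_n, beta_n = normal_gamma_posterior(data, **prior)
    df = 2.0 * alpha_n
    scale = math.sqrt(beta_n * (kappa_n + 1.0) / (alpha_n * kappa_n))
    return student_t_logpdf(x, df, mu_n, scale)


def plugin_gaussian_logpdf(x, data):
    """Log density of a Gaussian using maximum-likelihood point estimates."""
    n = len(data)
    mean = sum(data) / n
    var = sum((v - mean) ** 2 for v in data) / n
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


# Toy sparse-data case (hypothetical values): three tightly clustered samples.
sparse_data = [0.9, 1.0, 1.1]
far_point = 3.0
log_t = predictive_logpdf(far_point, sparse_data)
log_gauss = plugin_gaussian_logpdf(far_point, sparse_data)
```

With only three samples, the plug-in Gaussian has a very small variance estimate and assigns an extremely low log-density to a point away from the training data, whereas the Student's t predictive density keeps heavier tails. This is the sense in which marginalization can soften the effect of over-trained parameter estimates when data are sparse.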