Using boosting to improve a hybrid HMM/neural network speech recognizer

  • Authors:
  • H. Schwenk

  • Affiliations:
  • Int. Comput. Sci. Inst., Berkeley, CA, USA

  • Venue:
  • ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
  • Year:
  • 1999

Abstract

"Boosting" is a general method for improving the performance of almost any learning algorithm. A previously proposed and very promising boosting algorithm is AdaBoost. In this paper we investigate whether AdaBoost can be used to improve a hybrid HMM/neural network continuous speech recognizer. Boosting significantly improves the word error rate from 6.3% to 5.3% on a test set of the OGI Numbers 95 corpus, a medium-sized continuous numbers recognition task. These results compare favorably with other combining techniques that use several different feature representations or additional information from longer time spans.

The reasons for the impressive success of AdaBoost are still not completely understood, and to the best of our knowledge no application of AdaBoost to a real-world problem has yet been reported in the literature. In this paper we investigate whether AdaBoost can be applied to boost the performance of a continuous speech recognition system, a domain in which we have to deal with large amounts of data (often more than 1 million training examples) and inherently noisy phoneme labels. The paper is organized as follows. We summarize the AdaBoost algorithm and our baseline speech recognizer. We then show how AdaBoost can be applied to this task, report results on the Numbers 95 corpus, and compare them with other classifier combination techniques. The paper closes with conclusions and perspectives for future work.
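The abstract refers to the AdaBoost algorithm without spelling it out. As an illustration only, here is a minimal sketch of the classic binary AdaBoost idea with axis-aligned decision stumps as weak learners (an assumption for compactness; the paper itself boosts neural-network phoneme classifiers, not stumps):

```python
import numpy as np

def train_adaboost(X, y, n_rounds=10):
    """Binary AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)            # example weights, initially uniform
    stumps = []                        # list of (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best, best_err = None, np.inf
        # Exhaustively pick the stump with lowest weighted error.
        for f in range(X.shape[1]):
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, f] <= thr, 1, -1)
                    err = w[pred != y].sum()
                    if err < best_err:
                        best_err, best = err, (f, thr, pol)
        eps = max(best_err, 1e-10)                 # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1.0 - eps) / eps)    # weight of this weak learner
        f, thr, pol = best
        pred = pol * np.where(X[:, f] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)             # upweight misclassified examples
        w /= w.sum()
        stumps.append((f, thr, pol, alpha))
    return stumps

def predict(stumps, X):
    """Weighted vote of all weak learners."""
    score = np.zeros(len(X))
    for f, thr, pol, alpha in stumps:
        score += alpha * pol * np.where(X[:, f] <= thr, 1, -1)
    return np.sign(score)

# Toy usage: a 1-D threshold problem a single stump can already solve.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1, 1, -1, -1])
stumps = train_adaboost(X, y, n_rounds=3)
```

The reweighting step is the core of the method: examples the current ensemble gets wrong receive more weight, so the next weak learner concentrates on them.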