Boosting Speech/Non-speech Classification Using Averaged Mel-Frequency Cepstrum Coefficients Features

  • Authors:
  • Ziyou Xiong;Thomas S. Huang

  • Venue:
  • PCM '02 Proceedings of the Third IEEE Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
  • Year:
  • 2002

Abstract

AdaBoost is used to boost and select the best sequence of weak classifiers for speech/non-speech classification. The weak classifiers are chosen to be simple threshold functions. The statistical mean and variance of the Mel-frequency Cepstrum Coefficients (MFCC) over all overlapping frames of an audio file are used as audio features. Training and testing on a database of 410 audio files show asymptotic classification improvement with AdaBoost. A classification accuracy of 99.51% is achieved on the test data. A comparison of AdaBoost with Nearest Neighbor and Nearest Center classifiers is also given.
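
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-frame MFCC vectors have already been computed elsewhere, uses the textbook discrete AdaBoost update, and picks candidate thresholds as midpoints between sorted feature values. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from statistics import fmean, pvariance


def mean_variance_features(frames):
    """Collapse per-frame MFCC vectors into one feature vector per file:
    the mean of each coefficient followed by its (population) variance."""
    columns = list(zip(*frames))  # one tuple per coefficient
    return [fmean(c) for c in columns] + [pvariance(c) for c in columns]


def stump_predict(row, j, t, polarity):
    """Threshold function on feature j; returns +1 or -1."""
    return polarity if row[j] > t else -polarity


def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Assumption: candidate thresholds are midpoints between distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best


def adaboost(X, y, rounds):
    """Discrete AdaBoost over threshold stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, t, p = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, t, p))
        # Raise the weight of misclassified files, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, p))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy stand-in for real features: +1 = speech, -1 = non-speech (made-up values).
TOY_X = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
TOY_Y = [-1, -1, 1, 1]

if __name__ == "__main__":
    model = adaboost(TOY_X, TOY_Y, rounds=3)
    print([predict(model, row) for row in TOY_X])
```

In this sketch each boosting round selects one feature and one threshold, so the sequence of selected stumps also acts as a feature selection over the mean and variance values. On real data, `TOY_X` would be replaced by the output of `mean_variance_features` for each audio file.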