Speech/music discrimination for multimedia applications

  • Authors:
  • K. El-Maleh; M. Klein; G. Petrucci; P. Kabal

  • Affiliations:
  • Dept. of Electrical & Computer Engineering, McGill University, Montreal, Quebec, Canada (listed for the first author only)

  • Venue:
  • ICASSP '00: Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 4
  • Year:
  • 2000

Abstract

Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on long-term features such as differential parameters, variances, and time averages of spectral parameters. Classifiers built on these features estimate them over windows of 0.5 to 5 seconds and are relatively complex. We present results from combining line spectral frequencies (LSFs) with zero-crossing-based features for frame-level narrowband speech/music discrimination. Classification results for different types of music and speech show the good discriminating power of these features. Our classification algorithms operate with a delay of only one 20 ms frame, making them suitable for real-time multimedia applications.
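To make the frame-level features concrete, here is a minimal Python sketch of how LSFs and a zero-crossing rate might be computed for each 20 ms frame. It is an illustration under stated assumptions, not the authors' implementation: the 8 kHz sampling rate, the LPC order of 10, the Hamming window, and the function names are all assumptions, and the abstract does not say which zero-crossing-based features or which classifier the paper uses.

```python
import numpy as np

SAMPLE_RATE = 8000   # assumed narrowband sampling rate (Hz)
FRAME_LEN = 160      # 20 ms at 8 kHz
LPC_ORDER = 10       # assumed; a common choice for narrowband speech


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    return float(np.mean(signs[1:] != signs[:-1]))


def lpc_coefficients(frame, order=LPC_ORDER):
    """Autocorrelation-method LPC; returns [1, a_1, ..., a_order]."""
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    n = len(windowed)
    r = full[n - 1 : n + order]           # autocorrelation lags 0..order
    if r[0] <= 0.0:                       # silent frame: fall back to A(z) = 1
        return np.concatenate(([1.0], np.zeros(order)))
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[lags] + 1e-9 * r[0] * np.eye(order)   # lightly regularized Toeplitz
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def lsf_from_lpc(a):
    """Line spectral frequencies (radians, ascending) from [1, a_1, ..., a_p]."""
    a_ext = np.concatenate((a, [0.0]))
    p_poly = a_ext + a_ext[::-1]          # symmetric polynomial P(z)
    q_poly = a_ext - a_ext[::-1]          # antisymmetric polynomial Q(z)
    angles = np.angle(np.concatenate((np.roots(p_poly), np.roots(q_poly))))
    # Keep one angle per conjugate pair; drop the trivial roots at z = 1, -1.
    keep = (angles > 1e-6) & (angles < np.pi - 1e-6)
    return np.sort(angles[keep])


def frame_features(signal):
    """One row per 20 ms frame: LPC_ORDER LSFs followed by the ZCR."""
    n_frames = len(signal) // FRAME_LEN
    rows = []
    for i in range(n_frames):
        frame = signal[i * FRAME_LEN : (i + 1) * FRAME_LEN]
        lsf = lsf_from_lpc(lpc_coefficients(frame))
        rows.append(np.concatenate((lsf, [zero_crossing_rate(frame)])))
    return np.array(rows)
```

Calling `frame_features(signal)` on a one-dimensional array of 8 kHz samples yields an 11-column feature matrix, one row per frame, which could then be passed to any frame-level classifier.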