Noise robust features for speech/music discrimination in real-time telecommunication

  • Authors:
  • Zhong-Hua Fu; Jhing-Fa Wang; Lei Xie

  • Affiliations:
  • School of Computer Science, Northwestern Polytechnical University, Xi'an, China and Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan; Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan; School of Computer Science, Northwestern Polytechnical University, Xi'an, China

  • Venue:
  • ICME'09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
  • Year:
  • 2009

Abstract

Although much work has been done on audio signal classification, the problem of noise interference has seldom been addressed, especially in telecommunication applications, where a real-time, noise-robust approach is needed. This paper addresses the problem by proposing two novel robust features: Average Pitch Density (APD) and Relative Tonal Power Density (RTPD). APD captures the differences in tonal characteristics between music and speech signals, while RTPD focuses on the distinctive properties of percussion instruments. Comparative experiments are conducted on two databases. The first is reorganized from the corpus collected by Scheirer et al. [3]. The second consists of data collected in various recording situations. The novel features are compared with several state-of-the-art features and are found to be significantly robust to noise.