Singing voice detection using perceptually-motivated features

  • Authors:
  • Tin Lay Nwe;Haizhou Li

  • Affiliations:
  • Institute for Infocomm Research, Singapore, Singapore;Institute for Infocomm Research, Singapore, Singapore

  • Venue:
  • Proceedings of the 15th international conference on Multimedia
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Perceptual features are motivated by human perception of sounds. In this paper, several perceptually-motivated features such as harmonic, vibrato and timbre are studied to detect singing voice segments in a song. In addition, singing formant and attack-decay envelope of the sound are also studied for acoustic feature formulation. The cepstral coefficients which reflect the timbre characteristics are formulated by combining information from harmonic content, vibrato, singing formant and attack-decay envelope of the sound. Bandpass filters that spread according to the octave frequency scale are used to extract vibrato and harmonic information. Several experiments are conducted using a database that includes 84 popular songs from commercially available CD recordings. The experiments show that the proposed feature formulation methods are effective.