Identifying the classical music composition of an unknown performance with wavelet dispersion vector and neural nets

  • Authors:
  • Stephan Rein; Martin Reisslein

  • Affiliations:
  • Communications Systems Group, Institut für Telekommunikationssysteme, Technical University Berlin, Sekr. EN 1, Einsteinufer 17, D-10587 Berlin, Germany; Department of Electrical Engineering, Arizona State University, Goldwater Center, MC 5706, Tempe, AZ 85287-5706, United States

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2006

Abstract

As internet search evolves toward multimedia content-based search and information retrieval, audio content identification and retrieval will likely become one of the key components of next-generation internet search engines. In this paper we consider the specific problem of identifying the classical music composition underlying an unknown performance of that composition. We develop and evaluate a wavelet-based methodology for this problem. Our methodology combines a novel music information (audio content) descriptor, the wavelet dispersion vector, with neural net assessment of the similarity between unknown query vectors and known (example set) vectors. We define the wavelet dispersion vector as the histogram of the rank orders obtained by the wavelet coefficients of a given wavelet scale among all the coefficients (of all scales at a given time instant). We demonstrate that the wavelet dispersion vector precisely characterizes the audio content of a performance of a classical music composition while generalizing well across different performances of the same composition. We examine the identification performance of combinations of 39 different wavelets and three different types of neural nets. We find that the wavelet dispersion vector, calculated with a biorthogonal wavelet and used in conjunction with a probabilistic radial basis neural net trained on only three independent example performances, correctly identifies approximately 78% of the unknown performances.
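
To make the definition of the wavelet dispersion vector more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes the wavelet coefficients are already available as a matrix with one row per scale and one column per time instant. The function name dispersion_vector, the ranking by absolute magnitude (rank 0 for the largest), and the normalization by the number of time instants are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def dispersion_vector(coeffs):
    """Rank-order histogram per scale (hypothetical helper, not from the paper).

    coeffs: array-like of shape (n_scales, n_times), one wavelet
    coefficient per scale and time instant (assumed input layout).
    Returns a flat vector of length n_scales * n_scales.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    n_scales, n_times = coeffs.shape

    # For each time instant, sort the scales by coefficient magnitude
    # (assumption: largest magnitude gets rank 0).
    order = np.argsort(-np.abs(coeffs), axis=0, kind="stable")

    # Invert the sort: ranks[s, t] is the rank of scale s at time t.
    ranks = np.empty_like(order)
    cols = np.arange(n_times)
    for r in range(n_scales):
        ranks[order[r], cols] = r

    # For each scale, count how often it obtained each rank over time.
    hist = np.zeros((n_scales, n_scales))
    for s in range(n_scales):
        hist[s] = np.bincount(ranks[s], minlength=n_scales)

    # Normalize by the number of time instants (assumption) and flatten.
    return (hist / n_times).ravel()

# Toy example: 3 scales, 2 time instants.
v = dispersion_vector([[3.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
print(v.reshape(3, 3))
```

In this sketch each row of the reshaped output is the rank histogram of one scale. Vectors of this kind, computed for known example performances and for an unknown query, would then be compared by the neural net described in the abstract.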