Ensemble Discriminant Sparse Projections Applied to Music Genre Classification

  • Authors:
  • Constantine Kotropoulos;Gonzalo R. Arce;Yannis Panagakis

  • Venue:
  • ICPR '10 Proceedings of the 2010 20th International Conference on Pattern Recognition
  • Year:
  • 2010

Abstract

Drawing on the rich, psycho-physiologically grounded properties of the slow temporal modulations of music recordings, a novel classifier ensemble is built that applies discriminant sparse projections. More specifically, overcomplete dictionaries are learned and sparse coefficient vectors are extracted to optimally approximate the slow temporal modulations of the training music recordings. The sparse coefficient vectors are then projected onto the principal subspaces of their within-class and between-class covariance matrices. Decisions are taken with respect to the minimum Euclidean distance from the class mean sparse coefficient vectors, which undergo the same projections. Applying majority voting to the decisions taken by 10 individual classifiers, which are trained on the 10 training folds defined by stratified 10-fold cross-validation on the GTZAN dataset, yields a music genre classification accuracy of 84.96% on average. This exceeds by 2.46% the highest accuracy previously reported without employing any sparse representations.
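
The last two steps of the abstract, the minimum Euclidean distance decision against class means and the majority vote over 10 classifiers, can be illustrated with a minimal sketch. It is not the authors' implementation: dictionary learning, sparse coding, and the discriminant projections are omitted, the 2-D toy data stand in for projected sparse coefficient vectors, and all function names (`fit_class_means`, `predict_nearest_mean`, `majority_vote`) are hypothetical. The tie-breaking rule in the vote is likewise an assumption, since the abstract does not specify one.

```python
import numpy as np

def fit_class_means(X, y):
    """Return the sorted class labels and the mean feature vector of each class."""
    labels = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in labels])
    return labels, means

def predict_nearest_mean(X, labels, means):
    """Assign each row of X to the class whose mean is closest in Euclidean distance."""
    # Pairwise distances between samples and class means: (n_samples, n_classes).
    dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return labels[np.argmin(dists, axis=1)]

def majority_vote(votes):
    """Combine decisions of shape (n_classifiers, n_samples) by majority vote.

    Ties go to the smallest label (an arbitrary, assumed rule).
    """
    votes = np.asarray(votes)
    winners = []
    for column in votes.T:  # one column per test sample
        values, counts = np.unique(column, return_counts=True)
        winners.append(values[np.argmax(counts)])
    return np.array(winners)

# Toy stand-in for projected features: 3 classes, 30 training samples each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
y_train = np.repeat(np.arange(3), 30)
X_train = centers[y_train] + 0.3 * rng.standard_normal((90, 2))
X_test = centers + 0.1  # one test point close to each class centre

# Train 10 classifiers, each on a different stratified nine-tenths of the data
# (each subset leaves out 3 samples per class).
votes = []
for k in range(10):
    keep = np.arange(len(y_train)) % 10 != k
    labels, means = fit_class_means(X_train[keep], y_train[keep])
    votes.append(predict_nearest_mean(X_test, labels, means))

y_pred = majority_vote(votes)
print(y_pred)
```

In this sketch each of the 10 classifiers sees a different training subset, mirroring the 10 training folds mentioned in the abstract, and the ensemble decision for a test sample is the label chosen by most classifiers.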