Supervised dictionary learning for music genre classification

  • Authors:
  • Chin-Chia Michael Yeh; Yi-Hsuan Yang

  • Affiliations:
  • Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan; Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan

  • Venue:
  • Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
  • Year:
  • 2012


Abstract

This paper concerns the development of a music codebook for summarizing local feature descriptors computed over time. Compared with a holistic representation, this text-like representation better captures the rich and time-varying information of music. We systematically compare a number of existing codebook generation techniques and also propose a new one that incorporates labeled data into the dictionary learning process. Several aspects of the encoding system, such as local feature extraction and codeword encoding, are also analyzed. Our results demonstrate the superiority of sparsity-enforced dictionary learning over conventional VQ-based or exemplar-based methods. With the new supervised dictionary learning algorithm and the optimal settings inferred from the performance study, we achieve state-of-the-art accuracy in music genre classification using just the log-power spectrogram as the local feature descriptor. The classification accuracies on the benchmark datasets GTZAN and ISMIR2004Genre are 84.7% and 90.8%, respectively.
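To make the pipeline described above concrete, the sketch below shows an unsupervised sparse-coding baseline of the same flavor: log-power spectrogram frames are encoded against a learned dictionary and the codes are pooled over time into a song-level vector for a linear classifier. This is not the paper's supervised dictionary learning algorithm or its actual settings; librosa, scikit-learn's MiniBatchDictionaryLearning / sparse_encode / LinearSVC, mean pooling, and all hyperparameter values are illustrative assumptions.

```python
import numpy as np
import librosa
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode
from sklearn.svm import LinearSVC

def log_power_frames(path, n_fft=1024, hop=512):
    """Log-power spectrogram of one audio file, one row per frame."""
    y, sr = librosa.load(path, sr=22050)
    S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop)) ** 2
    return librosa.power_to_db(S).T                    # (n_frames, n_fft//2 + 1)

def song_feature(frames, dictionary, alpha=1.0):
    """Sparse-code each frame against the codebook, then mean-pool over time."""
    codes = sparse_encode(frames, dictionary, algorithm="lasso_lars", alpha=alpha)
    return codes.mean(axis=0)                          # song-level summary vector

def train_genre_classifier(train_paths, train_labels, n_codewords=256):
    """Learn a sparse dictionary from all training frames, then fit a linear SVM
    on the pooled codes. Returns (dictionary, classifier)."""
    all_frames = np.vstack([log_power_frames(p) for p in train_paths])
    dico = MiniBatchDictionaryLearning(n_components=n_codewords, alpha=1.0,
                                       batch_size=64)
    D = dico.fit(all_frames).components_               # codewords stored as rows
    X = np.array([song_feature(log_power_frames(p), D) for p in train_paths])
    clf = LinearSVC().fit(X, train_labels)
    return D, clf
```

The supervised variant studied in the paper additionally uses the genre labels while learning the dictionary, whereas the sketch learns the codebook purely from reconstruction; only the encode-then-pool structure is shared.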