An analysis of the GTZAN music genre dataset

  • Authors: Bob L. Sturm
  • Affiliations: Aalborg University Copenhagen, Copenhagen, Denmark
  • Venue: Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies
  • Year: 2012

Abstract

A significant amount of work in automatic music genre recognition has used a dataset whose composition and integrity have never been formally analyzed. For the first time, we provide an analysis of its composition, and create a machine-readable index of artist and song titles. We also catalog numerous problems with its integrity, such as replications, mislabelings, and distortions.