A comparison of human, automatic and collaborative music genre classification and user centric evaluation of genre classification systems

  • Authors:
  • Klaus Seyerlehner, Gerhard Widmer, Peter Knees

  • Affiliations:
  • Dept. of Computational Perception, Johannes Kepler University, Linz, Austria (all authors)

  • Venue:
  • AMR'10 Proceedings of the 8th International Conference on Adaptive Multimedia Retrieval: Context, Exploration, and Fusion
  • Year:
  • 2010

Abstract

In this paper, two sets of evaluation experiments are conducted. First, we compare state-of-the-art automatic music genre classification algorithms to human performance on the same dataset via a listening experiment. The results show that improvements in content-based systems over recent years have narrowed the gap between automatic and human classification performance, but have not yet closed it. As an important extension to previous work in this context, we also compare automatic and human classification performance to a collaborative approach. Second, we propose two evaluation metrics, called user scores, that are based on the votes of the participants in the listening experiment. This user-centric evaluation approach makes it possible to dispense with predefined ground-truth annotations and to account for the inherently ambiguous human perception of musical genre. Accounting for genre ambiguity is an important advantage when evaluating content-based systems, especially since the dataset compiled in this work (both the audio files and the collected votes) is publicly available.
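The abstract does not spell out how the two user scores are computed, so the following is only a minimal sketch of one plausible vote-based score, under the assumption that a predicted genre is credited with the share of participant votes that agree with it. The function name `user_score` and the data layout are hypothetical, not the paper's definitions.

```python
# Hypothetical vote-based "user score": a prediction earns the fraction of
# listener votes that chose the same genre, so a track whose genre is
# ambiguous to humans can reward several different answers instead of a
# single predefined ground-truth label.
from collections import Counter

def user_score(predictions: dict[str, str],
               votes: dict[str, list[str]]) -> float:
    """Average, over tracks, of the vote share won by the predicted genre.

    predictions: track_id -> predicted genre label
    votes:       track_id -> list of genre labels voted by participants
    """
    scores = []
    for track_id, predicted in predictions.items():
        track_votes = votes.get(track_id, [])
        if not track_votes:
            continue  # no listener votes collected for this track
        counts = Counter(track_votes)
        scores.append(counts[predicted] / len(track_votes))
    return sum(scores) / len(scores) if scores else 0.0

# Example: a "Rock" prediction scores 0.6 on a track where 3 of 5
# participants voted Rock and 2 voted Pop.
print(user_score({"t1": "Rock"},
                 {"t1": ["Rock", "Rock", "Rock", "Pop", "Pop"]}))
```

Under such a score, a classifier is no longer penalized for choosing any genre that a substantial share of human listeners also chose, which is the evaluation behavior the abstract argues for.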