Incorporating machine-learning into music similarity estimation

  • Authors:
  • Kris West; Stephen Cox; Paul Lamere

  • Affiliations:
  • University of East Anglia, Norwich, United Kingdom; University of East Anglia, Norwich, United Kingdom; Sun Microsystems Laboratories, Burlington, MA

  • Venue:
  • Proceedings of the 1st ACM workshop on Audio and music computing multimedia
  • Year:
  • 2006

Abstract

Music is a complex form of communication in which both artists and cultures express their ideas and identity. When we listen to music we do not simply perceive the acoustics of the sound in a temporal pattern, but also its relationship to other sounds, songs, artists, cultures and emotions. Owing to the complex, culturally defined distribution of acoustic and temporal patterns amongst these relationships, it is unlikely that a general audio similarity metric will be suitable as a music similarity metric. Hence, we are unlikely to be able to emulate human perception of the similarity of songs without making reference to some historical or cultural context.

The success of music classification systems demonstrates that this difficulty can be overcome by learning the complex relationships between audio features and the metadata classes to be predicted. We present two approaches to the construction of music similarity metrics based on the use of a classification model to extract high-level descriptions of the music. These approaches achieve a very high level of performance and do not produce the occasional spurious results or 'hubs' that conventional music similarity techniques produce.
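The general idea of comparing songs through classifier output rather than raw audio features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier, the classes, or the distance measure. The sketch assumes that some classifier has already produced a vector of class probabilities for each song, and it assumes cosine distance as the comparison. The song names, class order, and probability values are hypothetical.

```python
import math

# Hypothetical classifier outputs: one probability per metadata class
# (e.g. genre) for each song. The values are invented for illustration.
PROFILES = {
    "song_a": [0.80, 0.15, 0.05],
    "song_b": [0.70, 0.20, 0.10],
    "song_c": [0.05, 0.10, 0.85],
}


def cosine_distance(p, q):
    """Return 1 - cosine similarity of two equal-length profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        raise ValueError("profiles must be non-zero")
    return 1.0 - dot / (norm_p * norm_q)


def rank_by_similarity(query, profiles):
    """List all other songs, nearest first, by distance between profiles."""
    return sorted(
        (name for name in profiles if name != query),
        key=lambda name: cosine_distance(profiles[query], profiles[name]),
    )


if __name__ == "__main__":
    print(rank_by_similarity("song_a", PROFILES))
```

Because the comparison is made in the space of predicted classes, two songs count as similar when the classifier describes them in similar terms, whatever their low-level acoustic differences.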