Melody, bass line, and harmony representations for music version identification

Authors:
Justin Salamon; Joan Serrà; Emilia Gómez
Affiliations:
Universitat Pompeu Fabra, Barcelona, Spain; Artificial Intelligence Institute (IIIA-CSIC), Bellaterra, Spain; Universitat Pompeu Fabra, Barcelona, Spain
Venue:
Proceedings of the 21st international conference companion on World Wide Web
Year:
2012

Abstract

In this paper we compare the use of different musical representations for the task of version identification (i.e., retrieving alternative performances of the same musical piece). We automatically compute descriptors representing the melody and bass line using a state-of-the-art melody extraction algorithm, and compare them to a harmony-based descriptor. The similarity of descriptor sequences is computed using a dynamic programming algorithm based on nonlinear time series analysis, which has been successfully used for version identification with harmony descriptors. After evaluating the accuracy of individual descriptors, we assess whether performance can be improved by descriptor fusion, for which we apply a classification approach, comparing different classification algorithms. We show that both melody and bass line descriptors carry useful information for version identification, and that combining them increases version detection accuracy. Whilst harmony remains the most reliable musical representation for version identification, we demonstrate that in some cases performance can be improved by combining it with melody and bass line descriptions. Finally, we identify some of the limitations of the proposed descriptor fusion approach, and discuss directions for future research.
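
The abstract does not specify the similarity algorithm, so the following is only a rough illustration of the general idea of comparing two descriptor sequences with dynamic programming. It is a deliberately simplified stand-in, not the authors' method: frames are compared by Euclidean distance against a hypothetical threshold `eps`, and the score is the length of the longest unbroken diagonal run of matching frames. The function names and the toy data are assumptions made for this sketch.

```python
import math


def cross_recurrence(x, y, eps):
    """Binary matrix: r[i][j] is True when frames x[i] and y[j] lie within eps."""
    return [[math.dist(a, b) <= eps for b in y] for a in x]


def longest_diagonal_run(r):
    """Length of the longest unbroken diagonal of True cells, found by dynamic programming."""
    best = 0
    prev = [0] * (len(r[0]) + 1) if r else []
    for row in r:
        cur = [0] * (len(row) + 1)
        for j, hit in enumerate(row, start=1):
            if hit:
                # Extend the run that ended one step up and to the left.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Toy one-dimensional "descriptor" sequences (illustrative values only).
a = [(0.0,), (1.0,), (2.0,), (3.0,)]
b = [(9.0,), (1.0,), (2.0,), (3.0,), (9.0,)]
score = longest_diagonal_run(cross_recurrence(a, b, eps=0.1))
```

A longer run suggests that the two sequences share a longer passage. A practical system would also need to tolerate tempo changes and transposition, which this sketch ignores.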