Incorporating cultural representations of features into audio music similarity estimation

Authors:
Kris West;Stephen Cox
Affiliations:
School of Computing Sciences, University of East Anglia, Norwich, UK;School of Computing Sciences, University of East Anglia, Norwich, UK
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

We address the problem of automatically estimating the similarity between two pieces of music from their audio signals, a technology with many applications in the online digital music industry. Conventional methods of audio music search use distance measures between features derived from the audio for this task. We describe three techniques that use music classifiers to derive representations of audio features based on culturally motivated information learned by the classifier. When used for similarity estimation, these representations substantially reduce computational complexity relative to existing techniques (such as those based on the KL divergence) and produce metric similarity spaces, which facilitate the use of technologies for sub-linear scaling of search times. We evaluated each system using both pseudo-objective techniques and human listeners, and we demonstrate that this efficiency gain is obtained while maintaining a level of performance comparable to that of existing techniques.
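
The contrast the abstract draws between KL-divergence-based measures and metric similarity spaces can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the three techniques, so the example assumes, purely for illustration, that each track is summarised by a vector of classifier output probabilities and compared with Euclidean distance. The profile values and all function names are hypothetical. The sketch shows that a symmetrised KL divergence between Gaussians can violate the triangle inequality, whereas Euclidean distance between profile vectors cannot, which is the property that metric index structures rely on.

```python
import math

def symmetric_kl_gaussian(m1, s1, m2, s2):
    """Symmetrised KL divergence between two 1-D Gaussians given as (mean, std)."""
    kl_pq = math.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5
    kl_qp = math.log(s1 / s2) + (s2**2 + (m1 - m2)**2) / (2 * s1**2) - 0.5
    return kl_pq + kl_qp

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def satisfies_triangle(d_ab, d_bc, d_ac):
    """True if going directly from a to c is no longer than going via b."""
    return d_ac <= d_ab + d_bc + 1e-12

# Three hypothetical tracks modelled as unit-variance Gaussians over one feature.
kl_ab = symmetric_kl_gaussian(0.0, 1.0, 1.0, 1.0)
kl_bc = symmetric_kl_gaussian(1.0, 1.0, 2.0, 1.0)
kl_ac = symmetric_kl_gaussian(0.0, 1.0, 2.0, 1.0)
kl_is_metric_here = satisfies_triangle(kl_ab, kl_bc, kl_ac)

# The same three tracks as hypothetical classifier output profiles
# (assumed probabilities over three made-up classes).
profile_a = [0.7, 0.2, 0.1]
profile_b = [0.5, 0.3, 0.2]
profile_c = [0.1, 0.3, 0.6]
eu_ab = euclidean(profile_a, profile_b)
eu_bc = euclidean(profile_b, profile_c)
eu_ac = euclidean(profile_a, profile_c)
euclid_is_metric_here = satisfies_triangle(eu_ab, eu_bc, eu_ac)
```

In this example the symmetrised KL divergence fails the triangle inequality while the Euclidean distances satisfy it. A measure that satisfies it can be indexed with metric access methods, so a query need not be compared against every track in the collection.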