Analysis of Minimum Distances in High-Dimensional Musical Spaces

Authors:
M. Casey; C. Rhodes; M. Slaney
Affiliations:
Goldsmiths Coll., Dept. of Comput., Univ. of London, London;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2008

Cited by (9):

mHashup: fast visual music discovery via locality sensitive hashing
ACM SIGGRAPH 2008 new tech demos

Efficient and robust music identification with weighted finite-state transducers
IEEE Transactions on Audio, Speech, and Language Processing

Combining multi-probe histogram and order-statistics based LSH for scalable audio content retrieval
Proceedings of the international conference on Multimedia

On similarity search in audio signals using adaptive sparse approximations
AMR'09 Proceedings of the 7th international conference on Adaptive multimedia retrieval: understanding media and adapting to the user

A Probabilistic Model to Combine Tags and Acoustic Similarity for Music Retrieval
ACM Transactions on Information Systems (TOIS)

Fast intra-collection audio matching
Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies

Towards cover group thumbnailing
Proceedings of the 21st ACM international conference on Multimedia

Local and global scaling reduce hubs in space
The Journal of Machine Learning Research

Scalable multimedia content analysis on parallel platforms using python
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)

Abstract

We propose an automatic method for measuring content-based music similarity, enhancing the current generation of music search engines and recommender systems. Many previous approaches to track similarity require brute-force, pairwise processing between all audio features in a database and are therefore not practical for large collections. However, in an Internet-connected world, where users have access to millions of musical tracks, efficiency is crucial. Our approach uses features extracted from unlabeled audio data and near-neighbor retrieval using a distance threshold, determined by analysis, to solve a range of retrieval tasks. The tasks require temporal features, analogous to the technique of shingling used for text retrieval. To measure similarity, we count pairs of audio shingles, between a query and target track, that are below a distance threshold. The distribution of between-shingle distances is different for each database; therefore, we present an analysis of the distribution of minimum distances between shingles and a method for estimating a distance threshold for optimal retrieval performance. The method is compatible with locality-sensitive hashing (LSH), allowing implementation with retrieval times several orders of magnitude faster than those using exhaustive distance computations. We evaluate the performance of our proposed method on three contrasting music similarity tasks: retrieval of mis-attributed recordings (fingerprint), retrieval of the same work performed by different artists (cover songs), and retrieval of edited and sampled versions of a query track by remix artists (remixes). Our method achieves near-perfect performance in the first two tasks and 75% precision at 70% recall in the third task. Each task was performed on a test database comprising 4.5 million audio shingles.
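
The similarity measure described in the abstract (counting query/target shingle pairs that fall below a distance threshold) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the helper names `make_shingles` and `count_close_pairs`, the use of Euclidean distance, the toy two-dimensional features, the shingle width of 2, and the threshold of 0.3. The sketch also uses the exhaustive pairwise comparison that the abstract says LSH is meant to replace, and it does not implement the paper's threshold estimation.

```python
import math


def make_shingles(frames, width):
    """Concatenate `width` consecutive feature frames into one flat vector
    per start position (hypothetical helper, not the authors' code)."""
    return [
        [x for frame in frames[i:i + width] for x in frame]
        for i in range(len(frames) - width + 1)
    ]


def count_close_pairs(query_shingles, target_shingles, threshold):
    """Count (query, target) shingle pairs whose Euclidean distance is
    below `threshold`. Exhaustive O(n*m) comparison, for illustration only."""
    count = 0
    for q in query_shingles:
        for t in target_shingles:
            if math.dist(q, t) < threshold:
                count += 1
    return count


# Toy per-frame features; real audio features would be higher-dimensional.
query = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.0, 0.0]]
near_copy = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5], [0.1, 0.0]]
unrelated = [[5.0, 5.0], [6.0, 4.0], [7.0, 3.0], [8.0, 2.0]]

WIDTH = 2        # assumed shingle width (frames per shingle)
THRESHOLD = 0.3  # arbitrary value chosen for this toy data

q = make_shingles(query, WIDTH)
score_copy = count_close_pairs(q, make_shingles(near_copy, WIDTH), THRESHOLD)
score_unrelated = count_close_pairs(q, make_shingles(unrelated, WIDTH), THRESHOLD)

print(score_copy, score_unrelated)
```

In this toy setup the slightly perturbed track shares every shingle with the query within the threshold, while the unrelated track shares none, so ranking targets by the pair count separates the two.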