Efficient Index-Based Audio Matching

Authors:
F. Kurth;M. Muller
Affiliations:
Res. Establ. for Appl. Scenice, Res. Inst. for Commun., Inf. Process. & Ergonomics, Wachtberg, Germany;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2008

Citing 0
Cited 9

A Framework for Managing Multimodal Digitized Music Collections

ECDL '08 Proceedings of the 12th European conference on Research and Advanced Technology for Digital Libraries
Similarity search in animal sound databases

IEEE Transactions on Multimedia
Automatic identification of audio recordings based on statistical modeling

Signal Processing
Towards timbre-invariant audio features for harmony-based music

IEEE Transactions on Audio, Speech, and Language Processing
A concept for using combined multimodal queries in digital music libraries

ECDL'09 Proceedings of the 13th European conference on Research and advanced technology for digital libraries
The PROBADO project: approach and lessons learned in building a digital library system for heterogeneous non-textual documents

ECDL'10 Proceedings of the 14th European conference on Research and advanced technology for digital libraries
The need for music information retrieval with user-centered and multimodal strategies

MIRUM '11 Proceedings of the 1st international ACM workshop on Music information retrieval with user-centered and multimodal strategies
Content-based music access: an approach and its applications

FDIA'09 Proceedings of the Third BCS-IRSG conference on Future Directions in Information Access
Fast intra-collection audio matching

Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies

Quantified Score

Hi-index	0.00

Visualization

Abstract

Given a large audio database of music recordings, the goal of classical audio identification is to identify a particular audio recording by means of a short audio fragment. Even though recent identification algorithms show a significant degree of robustness towards noise, MP3 compression artifacts, and uniform temporal distortions, the notion of similarity is rather close to the identity. In this paper, we address a higher level retrieval problem, which we refer to as audio matching: given a short query audio clip, the goal is to automatically retrieve all excerpts from all recordings within the database that musically correspond to the query. In our matching scenario, opposed to classical audio identification, we allow semantically motivated variations as they typically occur in different interpretations of a piece of music. To this end, this paper presents an efficient and robust audio matching procedure that works even in the presence of significant variations, such as nonlinear temporal, dynamical, and spectral deviations, where existing algorithms for audio identification would fail. Furthermore, the combination of various deformation- and fault-tolerance mechanisms allows us to employ standard indexing techniques to obtain an efficient, index-based matching procedure, thus providing an important step towards semantically searching large-scale real-world music collections.