Using earth mover's distance for audio clip retrieval

  • Authors:
  • Yuxin Peng; Cuihua Fang; Xiaoou Chen

  • Affiliations:
  • Institute of Computer Science and Technology, Peking University, Beijing, China;Institute of Computer Science and Technology, Peking University, Beijing, China;Institute of Computer Science and Technology, Peking University, Beijing, China

  • Venue:
  • PCM'06 Proceedings of the 7th Pacific Rim conference on Advances in Multimedia Information Processing
  • Year:
  • 2006

Abstract

This paper presents a new approach to audio clip retrieval based on the Earth Mover’s Distance (EMD). Instead of the frame-based or salient-based features used in most existing methods, our approach proposes a segment-based representation and allows many-to-many matching among audio segments in the clip similarity measure, which makes it tolerant of errors caused by audio segmentation and various audio effects. We formulate audio clip retrieval as a graph matching problem in two stages. In the first stage, segment-based features are employed to represent the audio clips; they not only capture how an audio clip changes over time, but also preserve the change relations and temporal order of the audio features. In the second stage, based on the results of the segment similarity measure, a weighted graph is constructed to model the similarity between two clips. EMD is then used to compute the minimum cost of the weighted graph, which serves as the similarity value between the two audio clips. Experimental results show that the proposed approach outperforms some existing methods in terms of retrieval and ranking capabilities.
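
The second stage can be illustrated with a small sketch. The abstract does not specify the segment features, the segment weights, or the ground distance, so the code below assumes integer segment weights (for example, segment lengths in frames) and a precomputed matrix of segment-to-segment distances. The function name `emd` and its parameters are hypothetical. The transportation problem is solved with a plain successive-shortest-path min-cost flow and normalized by the total flow, following the standard EMD definition; this is not necessarily the solver the authors used.

```python
from math import inf


def emd(weights_a, weights_b, cost):
    """Earth Mover's Distance between two weighted sets of segments.

    weights_a[i], weights_b[j]: non-negative integer weights (assumed here
        to be something like segment length in frames).
    cost[i][j]: ground distance between segment i of clip A and segment j
        of clip B (assumed to be precomputed by a segment similarity measure).
    Returns the minimum transport cost divided by the total flow moved.
    """
    n, m = len(weights_a), len(weights_b)
    source, sink = n + m, n + m + 1
    size = n + m + 2
    edges = []                      # each edge: [to, residual capacity, cost]
    adj = [[] for _ in range(size)]

    def add_edge(u, v, cap, c):
        # Forward edge at an even index, its reverse edge right after it,
        # so edge k and edge k ^ 1 always form a pair.
        adj[u].append(len(edges))
        edges.append([v, cap, c])
        adj[v].append(len(edges))
        edges.append([u, 0, -c])

    total = min(sum(weights_a), sum(weights_b))  # amount of flow to move
    for i, w in enumerate(weights_a):
        add_edge(source, i, w, 0.0)
    for j, w in enumerate(weights_b):
        add_edge(n + j, sink, w, 0.0)
    for i in range(n):
        for j in range(m):
            # Many-to-many matching: every segment pair may carry flow.
            add_edge(i, n + j, total, float(cost[i][j]))

    flow, total_cost = 0, 0.0
    while flow < total:
        # Bellman-Ford shortest path in the residual graph.
        dist = [inf] * size
        dist[source] = 0.0
        prev_edge = [-1] * size
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == inf:
                    continue
                for k in adj[u]:
                    v, cap, c = edges[k]
                    if cap > 0 and dist[u] + c < dist[v] - 1e-12:
                        dist[v] = dist[u] + c
                        prev_edge[v] = k
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            break
        # Find the bottleneck capacity along the path, then push flow.
        push = total - flow
        v = sink
        while v != source:
            k = prev_edge[v]
            push = min(push, edges[k][1])
            v = edges[k ^ 1][0]
        v = sink
        while v != source:
            k = prev_edge[v]
            edges[k][1] -= push
            edges[k ^ 1][1] += push
            v = edges[k ^ 1][0]
        flow += push
        total_cost += push * dist[sink]
    return total_cost / flow if flow else 0.0


# Two clips with two segments each; one unit of weight must cross at cost 1.
print(emd([2, 1], [1, 2], [[0.0, 1.0], [1.0, 0.0]]))  # about 0.333
```

A smaller value indicates more similar clips. Because flow may be split across several segment pairs, a segment that was cut differently in the two clips can still be matched piecewise, which is the tolerance to segmentation errors that the abstract describes.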