A divide-and-conquer approach to latent perceptual indexing of audio for large web 2.0 applications

  • Authors: Shiva Sundaram; Shrikanth Narayanan
  • Affiliations: Deutsche Telekom Laboratories, Quality and Usability Lab, TU-Berlin, Berlin, Germany; Signal Analysis and Interpretation Lab, Dept. of Electrical Engineering-Systems, Univ. of Southern California, Los Angeles
  • Venue: ICME'09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
  • Year: 2009

Abstract

In the recently proposed latent perceptual indexing of audio, a collection of clips is indexed using unit-document frequency measures, with a set of reference clusters serving as the units and the clips as the documents. The reference units are derived by applying an unsupervised clustering technique to the bag of feature vectors extracted from the whole audio library. Indexing is achieved through a reduced-rank approximation (using singular-value decomposition) of the unit-document co-occurrence matrix obtained for the given set of reference clusters and the collection of audio clips. In our initial investigation, the k-means algorithm was used to derive the reference units. In this paper, we attempt to reduce the computational load of the k-means algorithm and the singular-value decomposition by randomly splitting the training data into smaller parts instead of processing it as a whole. We present results of classification experiments on the BBC sound effects library; they indicate that this approach can significantly reduce computation time without significant loss in classification performance.
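
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the per-part centroids are simply pooled into one set of reference units, the unit-document measure is a raw count of frames assigned to each unit, and the singular-value decomposition is run once on the full matrix rather than on the split data. The function names (`kmeans`, `split_kmeans`, `unit_document_matrix`, `latent_index`) and the random toy features are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=20, rng=None):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng() if rng is None else rng
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every frame to every centroid: shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids


def split_kmeans(X, n_parts, k_per_part, rng):
    """Randomly split the frames into parts and cluster each part separately.

    Assumption: the per-part centroids are pooled into one set of units.
    """
    order = rng.permutation(len(X))
    parts = np.array_split(order, n_parts)
    return np.vstack([kmeans(X[idx], k_per_part, rng=rng) for idx in parts])


def unit_document_matrix(clips, units):
    """Count, per clip, how many frames fall nearest to each reference unit.

    Assumption: raw counts stand in for the unit-document frequency measure.
    Returns an array of shape (n_units, n_clips).
    """
    M = np.zeros((len(units), len(clips)))
    for c, frames in enumerate(clips):
        d2 = ((frames[:, None, :] - units[None, :, :]) ** 2).sum(axis=2)
        M[:, c] = np.bincount(d2.argmin(axis=1), minlength=len(units))
    return M


def latent_index(M, rank):
    """Reduced-rank representation of each clip via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (s[:rank, None] * Vt[:rank]).T  # shape (n_clips, rank)


# Toy usage: 6 "clips", each with 50 random 4-dimensional feature frames.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(50, 4)) for _ in range(6)]
X = np.vstack(clips)
units = split_kmeans(X, n_parts=3, k_per_part=4, rng=rng)
M = unit_document_matrix(clips, units)
index = latent_index(M, rank=2)
```

Under these assumptions, each k-means run handles only a third of the frames, which is where the saving in clustering cost would come from; each row of `index` is then a low-dimensional vector that could be passed to a classifier.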