A divide-and-conquer approach to latent perceptual indexing of audio for large web 2.0 applications

Authors:
Shiva Sundaram;Shrikanth Narayanan
Affiliations:
Deutsche Telekom Laboratories, Quality and Usability Lab, TU-Berlin, Berlin, Germany;Signal Analysis and Interpretation Lab, Dept. of Electrical Engineering-Systems, Univ. of Southern California, Los Angeles
Venue:
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo
Year:
2009

Abstract

In the recently proposed latent perceptual indexing of audio, a collection of clips is indexed using unit-document frequency measures, with a set of reference clusters serving as the units and the clips as the documents. The reference units are derived by applying an unsupervised clustering technique to the bag of feature vectors extracted from the whole audio library. Indexing is achieved through a reduced-rank approximation, computed by singular-value decomposition, of the unit-document co-occurrence matrix obtained for the given set of reference clusters and the collection of audio clips. In our initial investigation, the k-means algorithm was used to derive the reference units. In this paper, we attempt to reduce the computational load of the k-means algorithm and the singular-value decomposition by randomly splitting the training data into smaller parts instead of processing it as a whole. We present results of classification experiments on the BBC sound effects library; they indicate that this approach can significantly reduce computation time without a significant loss in classification performance.
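
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how results from the random parts are combined, so pooling the per-part centroids into one set of reference units is an assumption here, as are the raw-count weighting of the unit-document matrix, the synthetic data, and every function name (`kmeans`, `split_kmeans`, `unit_document_matrix`, `reduced_rank_index`). The sketch runs a single SVD on the full matrix, so it only illustrates the clustering half of the divide-and-conquer idea.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean) for each row of X."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def split_kmeans(X, n_parts, k_per_part, seed=0):
    """Assumed divide-and-conquer variant: cluster random parts, pool centroids."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), n_parts)
    return np.vstack([kmeans(X[idx], k_per_part, seed=seed + i)
                      for i, idx in enumerate(parts)])

def unit_document_matrix(clips, centroids):
    """Rows are reference units, columns are clips, entries are raw counts."""
    M = np.zeros((len(centroids), len(clips)))
    for c, frames in enumerate(clips):
        M[:, c] = np.bincount(assign(frames, centroids), minlength=len(centroids))
    return M

def reduced_rank_index(M, rank):
    """Rank-limited clip representation from the SVD of M; shape (n_clips, rank)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (s[:rank, None] * Vt[:rank]).T

# Synthetic stand-in for an audio library: 12 clips, 40 feature frames each.
rng = np.random.default_rng(1)
clips = [rng.normal(loc=c % 3, size=(40, 5)) for c in range(12)]
all_frames = np.vstack(clips)

units = split_kmeans(all_frames, n_parts=3, k_per_part=4)
M = unit_document_matrix(clips, units)
index = reduced_rank_index(M, rank=3)
```

Each k-means call in this sketch sees only a third of the frames. The resulting `index` rows could then be passed to any classifier, a step the abstract mentions but does not detail.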