This paper proposes a new approach and algorithm for semantic concept annotation based on an audio PLSA (probabilistic latent semantic analysis) model. The novelty of the approach lies in two aspects: audio vocabulary construction and the audio PLSA model. In audio vocabulary construction, we first segment an audio clip into a few homogeneous audio segments according to its content changes, which not only captures the changing character of the clip but also preserves the transition relations and temporal order of the audio features. An audio vocabulary is then constructed by RPCL (rival penalized competitive learning) clustering of the audio segments, so that each audio clip can be represented in bag-of-words form. In the audio PLSA model, PLSA is employed to discover the latent topics underlying the audio clips. Based on the discovered topics, concept classification is then carried out by a support vector machine (SVM) classifier. In addition, we combine the local features extracted by PLSA with global features of the audio clip to further improve annotation performance. The experiments are evaluated on 85 hours of audio data from TRECVID 2005 and show encouraging results for our approach.
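The vocabulary-construction step clusters audio segments with RPCL, whose distinguishing rule is that for each sample the winning unit is attracted toward the sample while the runner-up (the "rival") is pushed slightly away, driving superfluous units out of the data region. The paper gives no code, so the following is only a minimal illustrative sketch: it omits the conscience (winning-frequency) weighting of the full RPCL algorithm, and the learning rates, epoch count, and function name are all assumptions.

```python
import numpy as np

def rpcl(data, n_units, lr=0.05, delearn=0.002, n_epochs=20, seed=0):
    """Simplified Rival Penalized Competitive Learning.

    For each sample, the nearest unit (winner) moves toward the sample,
    while the second-nearest unit (rival) is pushed away with a much
    smaller de-learning rate. Returns the learned unit centers, which
    would serve as the audio "words" of the vocabulary.
    """
    rng = np.random.default_rng(seed)
    # Initialize units at randomly chosen data points.
    centers = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            dists = np.linalg.norm(centers - x, axis=1)
            winner, rival = np.argsort(dists)[:2]
            centers[winner] += lr * (x - centers[winner])        # attract winner
            centers[rival] -= delearn * (x - centers[rival])     # penalize rival
    return centers
```

In the paper's pipeline, each audio segment would then be mapped to its nearest center, and an audio clip becomes a histogram of center indices, i.e. a bag-of-words vector.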
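The topic-discovery step fits PLSA to the bag-of-words representation: each clip d is modeled as a mixture of latent topics z, with P(w|d) = Σ_z P(z|d)P(w|z), estimated by EM. As a rough sketch of the standard PLSA EM updates (not the authors' implementation; the function name, iteration count, and smoothing constant are assumptions):

```python
import numpy as np

def plsa(counts, n_topics, n_iters=50, seed=0):
    """Fit PLSA by EM on a clip-by-word count matrix.

    Returns p_w_z (n_words x n_topics) and p_z_d (n_topics x n_clips);
    the columns of p_z_d are the per-clip topic mixtures that the paper
    would feed to an SVM as local features.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_words, n_topics))
    p_w_z /= p_w_z.sum(axis=0)
    p_z_d = rng.random((n_topics, n_docs))
    p_z_d /= p_z_d.sum(axis=0)
    for _ in range(n_iters):
        # E-step: posterior P(z|d,w), shape (n_docs, n_words, n_topics).
        post = p_w_z[None, :, :] * p_z_d.T[:, None, :]
        post /= post.sum(axis=2, keepdims=True) + 1e-12
        # M-step: reweight by the observed counts n(d,w).
        weighted = counts[:, :, None] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=0, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=1).T
        p_z_d /= p_z_d.sum(axis=0, keepdims=True) + 1e-12
    return p_w_z, p_z_d
```

The resulting topic mixtures P(z|d) are low-dimensional clip descriptors; the paper concatenates such local features with global clip features before SVM classification.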