We consider the problem of automatically classifying videos into predefined categories by analyzing their audio content. Specifically, given a set of labeled videos (such as news, sitcoms, and sports), our objective is to assign a new video to one of these categories. To solve this problem, we propose a novel audio-feature-based video classification method that combines an unsupervised generative model, probabilistic Latent Semantic Analysis (pLSA), with a multi-class discriminative classifier. Because general audio signals usually have a complicated distribution in the feature space, k-means clustering is first used to group temporal signal segments with similar low-level features into natural clusters, which are adopted as "audio words". The audio stream of each video is then decomposed into a bag of "audio words". To classify the bags of "audio words" extracted from the videos, latent "topics" are discovered by pLSA, and a multi-class classifier is then trained on the "topic" distribution vector of each video. Our experiments yield encouraging classification results.
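The pipeline described above can be illustrated with a minimal sketch, written here in Python with NumPy. It is not the authors' implementation. The function names (`kmeans`, `bag_of_audio_words`, `plsa`), the random stand-in features, and all parameter values are assumptions made for illustration; the abstract does not specify the low-level features, the number of clusters or topics, or the classifier. The sketch stops at the per-video topic distributions P(z|d), which would then serve as input vectors to a multi-class classifier of one's choice.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; the returned centers act as 'audio words'."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every segment to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def bag_of_audio_words(segments, centers):
    """Histogram of nearest-center assignments for one video's segments."""
    dist = ((segments[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.bincount(dist.argmin(axis=1), minlength=len(centers))

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a (videos x words) count matrix.

    Returns P(z|d) with shape (videos, topics) and
    P(w|z) with shape (topics, words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]      # (d, z, w)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post               # (d, z, w)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Hypothetical data: 6 videos, each with 40 segments of 4-dim features.
demo_rng = np.random.default_rng(1)
demo_videos = [demo_rng.normal(size=(40, 4)) + 3.0 * (i % 2) for i in range(6)]
demo_centers = kmeans(np.vstack(demo_videos), k=5)
demo_counts = np.array([bag_of_audio_words(v, demo_centers) for v in demo_videos])
demo_topics, _ = plsa(demo_counts, n_topics=2)  # one topic vector per video
```

Each row of `demo_topics` is a low-dimensional description of one video. Classifying a video that was not part of the pLSA fit would additionally require estimating its P(z|d) while holding P(w|z) fixed (often called folding-in), which this sketch omits.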