Natural semantic sound clusters in an audio document, also referred to as audio elements, can be seen as analogous to words in a text document. From the obtained set of audio elements, the key audio elements, or audio "keywords", can be detected: these are the elements most prominent in characterizing the content of the audio data. As such, they can be of great use for automatic audio content analysis and discovery. Motivated by the limitations of existing methods for key audio element detection, we propose in this paper a novel unsupervised approach to audio element weighting using multiple audio documents, analogous to word weighting in text document analysis. In our approach, dominant feature vectors (DFVs) are first extracted from each audio element and used to measure the similarity between audio elements, based on which the occurrence probability of an audio element in different audio documents can be estimated. Then four factors (expected term frequency, expected inverse document frequency, expected term duration, and expected inverse document duration) are calculated and combined to give the importance weight of each audio element. Evaluation of the obtained audio "keywords" and their usability for auditory scene segmentation and audio document clustering, performed on 5 hours of diverse audio data, shows highly promising results.
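The four-factor weighting can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the function name `element_weights`, the input layout, the treatment of segments as independent when estimating whether an element occurs in a document, the logarithmic form of the two inverse factors, and the combination by simple product. It is a TF-IDF-style analogy, not the paper's actual definitions.

```python
import math

def element_weights(docs, eps=1e-9):
    """Return weights[d][k] for audio element k in document d.

    docs: list of documents; each document is a non-empty list of segments
    given as (duration_seconds, probs), where probs[k] is the ASSUMED
    probability that the segment is an occurrence of audio element k.
    """
    n_docs = len(docs)
    n_elems = len(docs[0][0][1])
    total_dur = sum(dur for doc in docs for dur, _ in doc)

    exp_count = [[0.0] * n_elems for _ in docs]  # expected occurrences
    exp_dur = [[0.0] * n_elems for _ in docs]    # expected seconds
    presence = [[0.0] * n_elems for _ in docs]   # P(element occurs in doc)
    for d, doc in enumerate(docs):
        absent = [1.0] * n_elems
        for dur, probs in doc:
            for k, p in enumerate(probs):
                exp_count[d][k] += p
                exp_dur[d][k] += p * dur
                absent[k] *= 1.0 - p  # assumes independent segments
        presence[d] = [1.0 - a for a in absent]

    weights = []
    for d, doc in enumerate(docs):
        doc_dur = sum(dur for dur, _ in doc)
        row = []
        for k in range(n_elems):
            etf = exp_count[d][k] / len(doc)   # expected term frequency
            etd = exp_dur[d][k] / doc_dur      # expected term duration
            exp_df = sum(presence[j][k] for j in range(n_docs))
            exp_dd = sum(exp_dur[j][k] for j in range(n_docs))
            # Assumed log-ratio forms for the two inverse factors.
            eidf = math.log(n_docs / max(exp_df, eps))
            eidd = math.log(total_dur / max(exp_dd, eps))
            row.append(etf * eidf * etd * eidd)  # assumed: plain product
        weights.append(row)
    return weights

# Toy data: element 0 occurs in both documents, element 1 only in the first.
docs = [
    [(2.0, [1.0, 0.0]), (2.0, [0.0, 1.0])],
    [(2.0, [1.0, 0.0]), (2.0, [1.0, 0.0])],
]
weights = element_weights(docs)
```

In this toy input, the element confined to one document receives a higher weight in that document than the element present everywhere, which mirrors how rare words outrank common ones in text weighting.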