Recent research in video analysis has shown a promising direction in which mid-level features (e.g., people, anchor, indoor) are abstracted from low-level features (e.g., color, texture, motion) and used for discriminative classification of semantic labels. However, in most systems such mid-level features are selected manually. In this paper, we propose an information-theoretic framework, visual cue cluster construction (VC3), to automatically discover adequate mid-level features. The problem is posed as mutual information maximization, through which optimal cue clusters are discovered that preserve the most information about the semantic labels. We extend the Information Bottleneck framework to high-dimensional continuous features and further propose a projection method that maps each video into probabilistic memberships over all the cue clusters. The main advantage of the proposed approach is that it removes both the dependence on manual selection of mid-level features and the large labor cost of annotating a training corpus for each mid-level feature detector. The proposed VC3 framework is general and effective, and may apply to other problems in semantic video analysis. When tested on news video story segmentation, the proposed approach achieves a promising performance gain over representations derived from conventional clustering techniques and even over manually selected mid-level features.
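To make the objective concrete, the following is a minimal sketch of the two quantities the abstract refers to: the mutual information between cluster assignments and semantic labels, and a probabilistic membership of a sample over clusters. It is not the VC3 algorithm itself. The function names, the toy labels, the scalar features, and the Gaussian-kernel soft assignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(clusters, labels):
    """Estimate I(C; Y) in bits from paired cluster ids and labels."""
    n = len(clusters)
    joint = Counter(zip(clusters, labels))  # counts of (cluster, label) pairs
    pc = Counter(clusters)                  # cluster marginal counts
    py = Counter(labels)                    # label marginal counts
    mi = 0.0
    for (c, y), count in joint.items():
        p_cy = count / n
        # p(c,y) / (p(c) p(y)) written with raw counts
        mi += p_cy * math.log2(count * n / (pc[c] * py[y]))
    return mi


def soft_membership(x, centroids, bandwidth=1.0):
    """Hypothetical projection: Gaussian-kernel membership of scalar x
    over cluster centroids, normalized to sum to 1."""
    weights = [math.exp(-((x - m) ** 2) / (2 * bandwidth ** 2))
               for m in centroids]
    total = sum(weights)
    return [w / total for w in weights]


# Toy data (assumed): story-boundary labels for six video segments.
labels = ["boundary", "boundary", "non", "non", "non", "boundary"]
informative = [0, 0, 1, 1, 1, 0]    # clusters aligned with the labels
uninformative = [0, 1, 0, 1, 0, 1]  # clusters nearly independent of labels

mi_good = mutual_information(informative, labels)
mi_poor = mutual_information(uninformative, labels)
membership = soft_membership(0.2, centroids=[0.0, 1.0, 3.0])
```

In this toy setting the clustering aligned with the labels preserves more label information than the unaligned one, which is the criterion a mutual-information-maximizing clustering would favor. The membership vector plays the role of a feature representation over clusters, under the stated kernel assumption.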