We study semantic event classification in the consumer domain by combining cross-domain and within-domain learning. An event is defined as a set of photos and/or videos that are taken within a common period of time and have similar visual appearance. Events are generated from unconstrained consumer photo and video collections by an automatic content management system, e.g., an automatic albuming system. Such consumer events have three characteristics: an event can contain both photos and videos; an event usually includes noisy or erroneous images that result from imperfect albuming; and event data taken by different users, even within the same semantic category, can have highly diverse visual content. To accommodate these characteristics, we develop a general two-step Event-Level Feature (ELF) learning framework that enhances classification by using external data sources through cross-domain learning and by using region-level representations. In the first step, an elementary-level feature is used to represent each image and video. In the second step, an ELF is constructed on top of the elementary-level feature to model each event as a single feature vector, so that semantic event classifiers can be built directly on the ELF. Various ELFs are generated from different types of elementary-level features by both cross-domain and within-domain learning: the cross-domain approaches use two sets of concept scores, at the image level and at the region level, that are learned from two external data sources; the within-domain approaches use low-level visual features at the image level and at the region level. The different types of ELFs complement each other and together improve classification. Experiments over a large real consumer data set confirm significant improvements, e.g., a gain of over 90% in mean average precision (MAP) compared to the previous semantic event classification method.
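The two-step structure described above can be illustrated with a minimal sketch. It assumes that each photo or video keyframe has already been reduced to an elementary-level feature (here, a short list of concept scores), and it builds the event-level vector by simple mean and max pooling. The pooling scheme, the function name `event_level_feature`, and the sample scores are assumptions made for illustration only; they are not the ELF construction used in the paper.

```python
from statistics import fmean


def event_level_feature(elementary_features):
    """Pool per-item elementary features into one fixed-length event vector.

    Assumption: mean and max pooling stand in for the ELF construction,
    which the abstract does not specify.
    """
    if not elementary_features:
        raise ValueError("an event needs at least one photo or video keyframe")
    dim = len(elementary_features[0])
    if any(len(f) != dim for f in elementary_features):
        raise ValueError("all elementary features must have the same length")

    # Step 2: summarize each feature dimension across the whole event.
    means = [fmean(f[d] for f in elementary_features) for d in range(dim)]
    maxes = [max(f[d] for f in elementary_features) for d in range(dim)]
    return means + maxes


# Step 1 (assumed already done): one concept-score vector per item.
# Hypothetical scores for three items in a single event.
event = [
    [0.9, 0.1, 0.0],
    [0.7, 0.3, 0.2],
    [0.1, 0.8, 0.4],  # possibly a noisy image left in by imperfect albuming
]

elf = event_level_feature(event)
print(len(elf))
```

Because every event maps to a vector of the same length regardless of how many photos or videos it contains, a standard classifier can be trained directly on these vectors, which is the property the framework relies on.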