Effective semantic classification of consumer events for automatic content management

Authors:
Wei Jiang; Alexander C. Loui
Affiliations:
Columbia University, New York, NY, USA; Eastman Kodak Company, Rochester, NY, USA
Venue:
WSM '09 Proceedings of the first SIGMM workshop on Social media
Year:
2009

Abstract

We study semantic event classification in the consumer domain by combining cross-domain and within-domain learning. An event is defined as a set of photos and/or videos that are taken within a common period of time and have similar visual appearance. Events are generated from unconstrained consumer photo and video collections by an automatic content management system, e.g., an automatic albuming system. Such consumer events have the following characteristics: an event can contain both photos and videos; it usually includes noisy or erroneous images resulting from imperfect albuming; and event data taken by different users, although from the same semantic category, can have highly diverse visual content. To accommodate these characteristics, we develop a general two-step Event-Level Feature (ELF) learning framework that enhances classification by enabling the use of external data sources through cross-domain learning and the use of region-level representations. Specifically, in the first step, an elementary-level feature is used to represent images and videos. In the second step, an ELF is constructed on top of the elementary-level feature to model each event as a feature vector. Semantic event classifiers can then be built directly on the ELF. Various ELFs are generated from different types of elementary-level features by using both cross-domain and within-domain learning: cross-domain approaches use two sets of concept scores, at the image and region levels, learned from two external data sources; within-domain approaches use low-level visual features at the image and region levels. The different types of ELFs complement each other for improved classification. Experiments over a large real consumer data set confirm significant improvements, e.g., a gain of over 90% in mean average precision (MAP) compared to the previous semantic event classification method.
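
The two-step structure described in the abstract can be illustrated with a minimal sketch. The abstract does not say how an ELF is constructed from elementary-level features or which classifier is used, so everything below is an assumption made for illustration: mean pooling stands in for ELF construction, a nearest-centroid rule stands in for the semantic event classifier, and the function names, feature values, and category labels ("wedding", "sports") are hypothetical.

```python
from statistics import fmean
from typing import Dict, List, Sequence

Vector = List[float]


def elementary_feature(item: Sequence[float]) -> Vector:
    """Step 1 (placeholder): represent one photo or video as a vector.

    Assumption: the item already arrives as a numeric descriptor, e.g.
    concept scores or low-level visual features.
    """
    return [float(x) for x in item]


def event_level_feature(items: Sequence[Sequence[float]]) -> Vector:
    """Step 2 (assumed mean pooling, not the paper's method): collapse an
    event with any number of items into one fixed-length vector."""
    if not items:
        raise ValueError("an event must contain at least one item")
    feats = [elementary_feature(it) for it in items]
    dim = len(feats[0])
    if any(len(f) != dim for f in feats):
        raise ValueError("all items must share the same feature dimension")
    return [fmean(f[d] for f in feats) for d in range(dim)]


def nearest_centroid(elf: Vector, centroids: Dict[str, Vector]) -> str:
    """Stand-in classifier: return the label of the closest class centroid."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(elf, centroids[label]))


if __name__ == "__main__":
    # Hypothetical event with three items, each a 2-D elementary feature.
    event = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]
    centroids = {"wedding": [0.8, 0.2], "sports": [0.1, 0.9]}
    elf = event_level_feature(event)
    print(nearest_centroid(elf, centroids))
```

The point of the sketch is only that an event of arbitrary size becomes a single vector, after which any standard vector classifier can be trained on it; averaging also dampens the influence of a few noisy items, which is one plausible reason to pool at the event level.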