A Multi-Pronged Approach to Improving Semantic Extraction of News Video

Authors:
A. G. Hauptmann;M. -Y. Chen;M. Christel;W. -H. Lin;J. Yang
Affiliations:
School of Computer Science, Carnegie Mellon University, Pittsburgh, USA;School of Computer Science, Carnegie Mellon University, Pittsburgh, USA;School of Computer Science, Carnegie Mellon University, Pittsburgh, USA;School of Computer Science, Carnegie Mellon University, Pittsburgh, USA;School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
Venue:
Journal of Signal Processing Systems
Year:
2010

Citing 22
Cited 0

Content-Based Image Retrieval at the End of the Early Years

IEEE Transactions on Pattern Analysis and Machine Intelligence
Does organisation by similarity assist image browsing?

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
End-User Searching Challenges Indexing Practices inthe Digital Newspaper Photo Archive

Information Retrieval
News video classification using SVM-based multimodal classifiers and combination strategies

Proceedings of the tenth ACM international conference on Multimedia
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
X-means: Extending K-means with Efficient Estimation of the Number of Clusters

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Understanding belief propagation and its generalizations

Exploring artificial intelligence in the new millennium
Automatic image annotation and retrieval using cross-media relevance models

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Matching words and pictures

The Journal of Machine Learning Research
Video Google: A Text Retrieval Approach to Object Matching in Videos

ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Discriminative Random Fields: A Discriminative Framework for Contextual Interaction in Classification

ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Distinctive Image Features from Scale-Invariant Keypoints

International Journal of Computer Vision
An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision

IEEE Transactions on Pattern Analysis and Machine Intelligence
Optimal multimodal fusion for multimedia data analysis

Proceedings of the 12th annual ACM international conference on Multimedia
Learning the semantics of multimedia queries and concepts from a small number of examples

Proceedings of the 13th annual ACM international conference on Multimedia
Learning rich semantics from news video archives by style analysis

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
The Semantic Pathfinder: Using an Authoring Metaphor for Generic Multimedia Indexing

IEEE Transactions on Pattern Analysis and Machine Intelligence
Evaluation campaigns and TRECVid

MIR '06 Proceedings of the 8th ACM international workshop on Multimedia information retrieval
How many high-level concepts will fill the semantic gap in news video retrieval?

Proceedings of the 6th ACM international conference on Image and video retrieval
Markov Random Field Modeling in Image Analysis

Markov Random Field Modeling in Image Analysis
Keyframe retrieval by keypoints: can point-to-point matching help?

CIVR'06 Proceedings of the 5th international conference on Image and Video Retrieval
Can High-Level Concepts Fill the Semantic Gap in Video Retrieval? A Case Study With Broadcast News

IEEE Transactions on Multimedia

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper we describe a multi-strategy approach to improving semantic extraction from news video. Experiments show the value of careful parameter tuning, exploiting multiple feature sets and multilingual linguistic resources, applying text retrieval approaches for image features, and establishing synergy between multiple concepts through undirected graphical models. We present a discriminative learning framework called Multi-concept Discriminative Random Field (MDRF) for building probabilistic models of video semantic concept detectors by incorporating related concepts as well as the low-level observations. The model exploits the power of discriminative graphical models to simultaneously capture the associations of concept with observed data and the interactions between related concepts. Compared with previous methods, this model not only captures the co-occurrence between concepts but also incorporates the raw data observations into a unified framework. We also describe an approximate parameter estimation algorithm and present results obtained from the TRECVID 2006 data. No single approach, however, provides a consistently better result for all concept detection tasks, which suggests that extracting video semantics should exploit multiple resources and techniques rather than naively relying on a single approach