Explicit performance metric optimization for fusion-based video retrieval

Authors:
Ilseo Kim;Sangmin Oh;Byungki Byun;A. G. Amitha Perera;Chin-Hui Lee
Affiliations:
Georgia Institute of Technology;Kitware Inc.;Microsoft;Kitware Inc.;Georgia Institute of Technology
Venue:
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Year:
2012

Abstract

We present a learning framework for fusion-based video retrieval systems that explicitly optimizes given performance metrics. Real-world computer vision systems serve sophisticated user needs, and domain-specific performance metrics are used to monitor their success. However, the conventional approach to learning in this setting is to blindly minimize standard error rates and hope that the targeted performance metrics improve, which is clearly suboptimal. In this work, we develop a novel scheme that directly optimizes such targeted performance metrics during learning. Our experimental results on two large consumer video archives are promising and showcase the benefits of the proposed approach.
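
The idea of optimizing a target metric directly, rather than an error rate, can be illustrated with a small sketch. The abstract does not say which metrics or which optimization scheme the authors use, so everything here is an assumption made for illustration: the metric (F1), the sigmoid smoothing that makes it differentiable, the linear late-fusion model, the synthetic two-modality scores, and all function names (`smooth_f1`, `fit_fusion`, `make_toy_scores`). It should not be read as the paper's method.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fused_probs(params, X):
    # Linear late fusion: weighted sum of per-modality scores plus a bias,
    # squashed to a soft decision in (0, 1). params = [w_1, ..., w_K, b].
    *w, b = params
    return [sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + b) for x in X]


def smooth_f1(params, X, y):
    # Differentiable F1 surrogate (assumed metric): hard decisions are
    # replaced by soft ones, so F1 = 2*TP / (#predicted pos + #actual pos).
    p = fused_probs(params, X)
    tp = sum(pi * yi for pi, yi in zip(p, y))
    return 2.0 * tp / (sum(p) + sum(y))


def smooth_f1_grad(params, X, y):
    # Analytic gradient of smooth_f1 with respect to params (chain rule).
    p = fused_probs(params, X)
    tp = sum(pi * yi for pi, yi in zip(p, y))
    denom = sum(p) + sum(y)
    grad = [0.0] * len(params)
    for x, pi, yi in zip(X, p, y):
        # dF/dp_i times dp_i/dz_i
        dz = (2.0 * yi / denom - 2.0 * tp / denom ** 2) * pi * (1.0 - pi)
        for k, xk in enumerate(x):
            grad[k] += dz * xk
        grad[-1] += dz
    return grad


def fit_fusion(X, y, steps=200, lr=1.0):
    # Gradient ascent with step halving: a step is accepted only if it
    # improves the surrogate, so the objective never decreases.
    params = [0.0] * (len(X[0]) + 1)
    best = smooth_f1(params, X, y)
    for _ in range(steps):
        g = smooth_f1_grad(params, X, y)
        step = lr
        while step > 1e-8:
            cand = [pk + step * gk for pk, gk in zip(params, g)]
            val = smooth_f1(cand, X, y)
            if val > best:
                params, best = cand, val
                break
            step *= 0.5
        else:
            break  # no improving step found; stop early
    return params


def make_toy_scores(n=400, pos_rate=0.1, seed=0):
    # Synthetic stand-in for per-modality detector scores (not real data):
    # the first modality is more informative than the second.
    rng = random.Random(seed)
    y = [1.0 if rng.random() < pos_rate else 0.0 for _ in range(n)]
    X = [[1.5 * yi + rng.gauss(0, 1), 0.5 * yi + rng.gauss(0, 1)] for yi in y]
    return X, y
```

Calling `fit_fusion(*make_toy_scores())` returns the fusion weights and bias that maximize the smoothed F1 on the toy data. Swapping `smooth_f1` and its gradient for a surrogate of another metric would leave the training loop unchanged, which is the appeal of metric-driven learning in this kind of setup.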