A human-centered multiple instance learning framework for semantic video retrieval

Authors:
Xin Chen;Chengcui Zhang;Shu-Ching Chen;Stuart Rubin
Affiliations:
Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL;Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL;School of Computing and Information Sciences, Florida International University, Miami, FL;SPAWAR Systems Center San Diego, Intelligence, Surveillance, and Reconnaissance Department, San Diego, CA
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
Year:
2009


Abstract

This paper proposes a human-centered interactive framework for automatically mining and retrieving semantic events in videos. After preprocessing, the object trajectories and event models are fed into the core components of the framework for learning and retrieval. Because trajectories are spatiotemporal in nature, the learning component is designed to analyze time series data. Human feedback on the retrieval results progressively guides the retrieval component. For user convenience, retrieval results are presented as video sequences rather than as the individual trajectories they contain. As a result, the feedback does not label the trajectories directly, although the training algorithm requires such labels. To solve this problem, a mapping between semantic video retrieval and multiple instance learning (MIL) is established. The effectiveness of the algorithm is demonstrated by experiments on real-life transportation surveillance videos.
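The mapping described in the abstract can be illustrated with a minimal sketch. It assumes the standard MIL setup, in which each video sequence is a bag, each trajectory inside it is an instance, and a bag is scored by its highest-scoring instance. The feature layout, the `instance_score` function, and the toy values are hypothetical and are not taken from the paper; the paper's actual learning algorithm is not shown here.

```python
from typing import Callable, Dict, List, Sequence

Trajectory = Sequence[float]  # hypothetical per-trajectory feature vector
Bag = List[Trajectory]        # one video sequence = one bag of trajectories


def bag_score(bag: Bag, instance_score: Callable[[Trajectory], float]) -> float:
    """Score a bag by its highest-scoring instance (standard MIL assumption)."""
    return max(instance_score(t) for t in bag)


def rank_sequences(bags: Dict[str, Bag],
                   instance_score: Callable[[Trajectory], float]) -> List[str]:
    """Return video-sequence names ordered from most to least relevant."""
    return sorted(bags,
                  key=lambda name: bag_score(bags[name], instance_score),
                  reverse=True)


def instance_score(trajectory: Trajectory) -> float:
    """Hypothetical scorer: a large deceleration looks like an incident."""
    return trajectory[1]


# Toy data, values made up: each trajectory is [mean speed, max deceleration].
bags = {
    "seq_a": [[10.0, 0.5], [12.0, 0.4]],
    "seq_b": [[11.0, 0.6], [9.0, 6.0]],  # one trajectory brakes hard
}

# User feedback labels whole sequences (bags), never individual trajectories.
feedback = {"seq_a": False, "seq_b": True}

ranking = rank_sequences(bags, instance_score)
print(ranking)
```

In this sketch the user only ever marks `seq_a` or `seq_b` as relevant or not. Which trajectory inside a relevant sequence caused the event stays unknown, which is the labeling gap that an MIL formulation is meant to handle.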