Framework for Performance Evaluation of Face, Text, and Vehicle Detection and Tracking in Video: Data, Metrics, and Protocol

Authors:
Rangachar Kasturi;Dmitry Goldgof;Padmanabhan Soundararajan;Vasant Manohar;John Garofolo;Rachel Bowers;Matthew Boonstra;Valentina Korzhova;Jing Zhang
Affiliations:
University of South Florida, Tampa;University of South Florida, Tampa;University of South Florida, Tampa;University of South Florida, Tampa;National Institute of Standards and Technology, Gaithersburg;National Institute of Standards and Technology, Gaithersburg;University of South Florida, Tampa;University of South Florida, Tampa;University of South Florida, Tampa
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2009

Citing 0
Cited 30

Concept-Based Video Retrieval
Foundations and Trends in Information Retrieval

Video Analytics in Urban Environments
AVSS '09 Proceedings of the 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance

Why meaningful automatic tagging of images is very hard
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo

A two-stage scheme for text detection in video images
Image and Vision Computing

Understanding transit scenes: a survey on human behavior-recognition algorithms
IEEE Transactions on Intelligent Transportation Systems

Video Surveillance Online Repository (ViSOR): an integrated framework
Multimedia Tools and Applications

Logic-based trajectory evaluation in videos
KI'10 Proceedings of the 33rd annual German conference on Advances in artificial intelligence

Globally optimal multi-target tracking on a hexagonal lattice
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I

Performance metrics for activity recognition
ACM Transactions on Intelligent Systems and Technology (TIST)

Tracking clathrin coated pits with a multiple hypothesis based method
MICCAI'10 Proceedings of the 13th international conference on Medical image computing and computer-assisted intervention: Part II

Maneuvering head motion tracking by coarse-to-fine particle filter
ICIAR'11 Proceedings of the 8th international conference on Image analysis and recognition - Volume Part I

A software for performance evaluation and comparison of people detection and tracking methods in video processing
Multimedia Tools and Applications

Moving object detection and tracking by using annealed background subtraction method in videos: Performance optimization
Expert Systems with Applications: An International Journal

Optimal multiclass classifier threshold estimation with particle swarm optimization for visual object recognition
ISVC'11 Proceedings of the 7th international conference on Advances in visual computing - Volume Part II

Online selection of the best k-feature subset for object tracking
Journal of Visual Communication and Image Representation

Face detection using particle swarm optimization and support vector machines
SETN'10 Proceedings of the 6th Hellenic conference on Artificial Intelligence: theories, models and applications

A large margin framework for single camera offline tracking with hybrid cues
Computer Vision and Image Understanding

A novel distribution-based feature for rapid object detection
Neurocomputing

Adaptive transformation for robust privacy protection in video surveillance
Advances in Multimedia

A cascade face recognition system using hybrid feature extraction
Digital Signal Processing

Radar-based road-traffic monitoring in urban environments
Digital Signal Processing

Multiple human tracking in high-density crowds
Image and Vision Computing

Monocular object detection using 3d geometric primitives
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I

GMCP-Tracker: global multi-object tracking using generalized minimum clique graphs
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II

Detection of independently moving objects in non-planar scenes via multi-frame monocular epipolar constraint
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part V

(MP)2T: multiple people multiple parts tracker
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI

Exploiting pedestrian interaction via global optimization and social behaviors
Proceedings of the 15th international conference on Theoretical Foundations of Computer Vision: outdoor and large-scale real-world scene analysis

Symmetry-driven accumulation of local features for human characterization and re-identification
Computer Vision and Image Understanding

Photo 4W: Mobile photo management on what, where, who and when
Neurocomputing

Online parameter tuning for object tracking algorithms
Image and Vision Computing

Quantified Score

Hi-index	0.15

Abstract

Common benchmark data sets, standardized performance metrics, and baseline algorithms have demonstrated considerable impact on research and development in a variety of application domains. These resources provide both consumers and developers of technology with a common framework to objectively compare the performance of different algorithms and algorithmic improvements. In this paper, we present such a framework for evaluating object detection and tracking in video: specifically for face, text, and vehicle objects. This framework includes the source video data, ground-truth annotations (along with guidelines for annotation), performance metrics, evaluation protocols, and tools including scoring software and baseline algorithms. For each detection and tracking task and supported domain, we developed a 50-clip training set and a 50-clip test set. Each data clip is approximately 2.5 minutes long and has been completely spatially/temporally annotated at the I-frame level. Each task/domain, therefore, has an associated annotated corpus of approximately 450,000 frames. The scope of such annotation is unprecedented and was designed to begin to support the necessary quantities of data for robust machine learning approaches, as well as a statistically significant comparison of the performance of algorithms. The goal of this work was to systematically address the challenges of object detection and tracking through a common evaluation framework that permits a meaningful objective comparison of techniques, provides the research community with sufficient data for the exploration of automatic modeling techniques, encourages the incorporation of objective evaluation into the development process, and contributes useful lasting resources of a scale and magnitude that will prove to be extremely useful to the computer vision research community for years to come.
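
The corpus size quoted in the abstract can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only: the function name and the frame rate of 30 frames per second are assumptions (the abstract does not state a frame rate), while the clip counts and clip length are taken from the abstract.

```python
def estimate_corpus_frames(train_clips=50, test_clips=50,
                           clip_minutes=2.5, fps=30.0):
    """Rough frame count for one task/domain corpus.

    train_clips, test_clips and clip_minutes come from the abstract;
    fps is an ASSUMED frame rate, not a figure given in the source.
    """
    total_seconds = (train_clips + test_clips) * clip_minutes * 60
    return round(total_seconds * fps)


if __name__ == "__main__":
    # Under the assumed 30 fps this matches the "approximately 450,000 frames"
    # figure quoted in the abstract.
    print(estimate_corpus_frames())
```

With an NTSC-style rate of 29.97 frames per second the estimate drops only slightly, which is consistent with the abstract's wording of "approximately".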