An efficient method for face retrieval from large video datasets

  • Authors:
  • Thao Ngoc Nguyen; Thanh Duc Ngo; Duy-Dinh Le; Shin'ichi Satoh; Bac Hoai Le; Duc Anh Duong

  • Affiliations:
  • University of Science, Ho Chi Minh City, Vietnam (Thao Ngoc Nguyen, Bac Hoai Le, Duc Anh Duong); The Graduate University for Advanced Studies, Chiyoda-ku, Tokyo, Japan (Thanh Duc Ngo); National Institute of Informatics, Chiyoda-ku, Tokyo, Japan (Duy-Dinh Le, Shin'ichi Satoh)

  • Venue:
  • Proceedings of the ACM International Conference on Image and Video Retrieval
  • Year:
  • 2010


Abstract

The human face is one of the most important objects in video: it provides rich information for spotting people of interest, such as government leaders in news video or the hero in a movie, and is a basis for interpreting what happens on screen. Detecting and recognizing the faces appearing in video are therefore essential tasks for many video indexing and retrieval applications. Robust face matching remains challenging due to large variations in pose, illumination, occlusion, hairstyle, and facial expression. In addition, when the dataset contains a huge number of faces, e.g. tens of millions, a scalable matching method is needed. To this end, we propose an efficient method for face retrieval in large video datasets. To make retrieval robust, the faces of the same person appearing within a shot are grouped into a single face track using a reliable tracking method. Retrieval is performed by computing the similarity between each face track in the database and the input face track. For each face track we select one representative face, and the similarity between two face tracks is the similarity between their representative faces. The representative face is the mean of a subset of faces selected from the original track. In this way, we achieve high retrieval accuracy while keeping the computational cost low. For the experiments, we extracted approximately 20 million faces from 370 hours of TRECVID video, a scale that previous work has not addressed. Results on a manually annotated subset of 457,320 faces show that the proposed method is both effective and scalable.
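
As a rough illustration of the representative-face matching described in the abstract, the Python/NumPy sketch below forms a track's representative face as the mean of a sampled subset of its face descriptors and ranks database tracks against a query track. The fixed-length descriptor vectors, the evenly spaced subset selection, and the cosine similarity are assumptions made for this example only; they stand in for the paper's actual feature extraction, subset-selection, and matching choices.

    import numpy as np

    def representative_face(track, subset_size=5):
        """Mean of a sampled subset of a track's face descriptors.

        `track` is an (n_faces, dim) array. Evenly spaced sampling is a
        placeholder for the paper's subset-selection step.
        """
        n = len(track)
        idx = np.linspace(0, n - 1, num=min(subset_size, n), dtype=int)
        return track[idx].mean(axis=0)

    def track_similarity(track_a, track_b):
        """Track-to-track similarity = similarity of representative faces
        (cosine similarity used here as an example metric)."""
        a = representative_face(track_a)
        b = representative_face(track_b)
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def retrieve(query_track, database_tracks, top_k=10):
        """Rank database face tracks by similarity to the query track."""
        scores = [track_similarity(query_track, t) for t in database_tracks]
        order = np.argsort(scores)[::-1][:top_k]
        return [(int(i), scores[i]) for i in order]

    # Toy usage with random descriptors standing in for real face features.
    rng = np.random.default_rng(0)
    query = rng.normal(size=(12, 128))
    database = [rng.normal(size=(rng.integers(5, 30), 128)) for _ in range(100)]
    print(retrieve(query, database, top_k=3))

Because each track is reduced to a single descriptor, matching a query against N database tracks costs N vector comparisons regardless of how many faces each track contains, which is what keeps retrieval cheap at the scale of millions of faces.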