MMM'12 Proceedings of the 18th international conference on Advances in Multimedia Modeling
The human face is one of the most important objects in video: it provides rich information for spotting people of interest, such as government leaders in news video or the hero of a movie, and is the basis for interpreting facts. Detecting and recognizing faces in video are therefore essential tasks in many video indexing and retrieval applications. Robust face matching remains a challenging problem because of large variations in pose, illumination, occlusion, hairstyle, and facial expression. In addition, when the dataset is huge, e.g. tens of millions of faces, the matching method must also be scalable. To this end, we propose an efficient method for face retrieval in large video datasets. To make retrieval robust, the faces of the same person within each shot are grouped into a single face track using a reliable tracking method. Retrieval is done by computing the similarity between the input face track and each face track in the database. Each face track is summarized by one representative face, and the similarity between two face tracks is the similarity between their representative faces. The representative face is the mean face of a subset selected from the original face track. In this way, we achieve high retrieval accuracy while keeping the computational cost low. For the experiments, we extracted approximately 20 million faces from 370 hours of TRECVID video, a scale that previous work has not addressed. Results on a subset of 457,320 manually annotated faces show that the proposed method is effective and scalable.
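The matching scheme described above (one mean face per track, track similarity reduced to a single face comparison) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the subset is selected, what face features are used, or which similarity measure is applied, so the evenly spaced sampling in `select_subset`, the plain feature vectors, the Euclidean distance, and all function names are assumptions made for illustration only.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple

Vector = List[float]  # one face, as a feature vector (assumed representation)


def select_subset(track: Sequence[Vector], k: int) -> List[Vector]:
    """Pick up to k faces at evenly spaced positions (assumed selection rule)."""
    if k <= 0:
        raise ValueError("k must be positive")
    n = len(track)
    if n == 0:
        raise ValueError("empty face track")
    if n <= k:
        return list(track)
    return [track[(i * n) // k] for i in range(k)]


def mean_face(faces: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(faces[0])
    return [sum(face[d] for face in faces) / len(faces) for d in range(dim)]


def representative_face(track: Sequence[Vector], k: int = 4) -> Vector:
    """Mean face of a subset selected from the track."""
    return mean_face(select_subset(track, k))


def face_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance (assumed measure; smaller means more similar)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(
    query: Sequence[Vector],
    database: Dict[str, Sequence[Vector]],
    k: int = 4,
) -> List[Tuple[str, float]]:
    """Rank database tracks by distance between representative faces."""
    q = representative_face(query, k)
    scored = [
        (track_id, face_distance(q, representative_face(track, k)))
        for track_id, track in database.items()
    ]
    return sorted(scored, key=lambda item: item[1])
```

Because each track collapses to a single vector, comparing two tracks costs one vector comparison regardless of track length, which is where the low computational cost comes from. In a real system the representative faces of the database tracks would be computed once and stored, rather than recomputed per query as in this sketch.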