Finding people frequently appearing in news

Authors:
Derya Ozkan;Pınar Duygulu
Affiliations:
Department of Computer Engineering, Bilkent University, Ankara, Turkey;Department of Computer Engineering, Bilkent University, Ankara, Turkey
Venue:
CIVR'06 Proceedings of the 5th international conference on Image and Video Retrieval
Year:
2006

Abstract

We propose a graph-based method to improve the performance of person queries in large news video collections. The method exploits the multi-modal structure of videos and integrates text and face information. Based on the observation that a person appears more frequently when his or her name is mentioned, we first use the speech transcript text to limit the search space for a query name. We then construct a similarity graph whose nodes correspond to all of the faces in the search space and whose edges represent the similarity between faces. Assuming that the face images of the queried person are more similar to each other than to other images, the problem becomes one of finding the densest component of the graph, which corresponds to the images of the queried person. The same graph algorithm is applied to detect and remove the faces of anchorpeople in an unsupervised way. The experiments are conducted on 229 news videos provided by NIST for TRECVID 2004. The results show that the proposed method outperforms text-only methods and provides cues for recognizing faces at large scale.
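
The densest-component step can be illustrated with a minimal sketch. It assumes a precomputed face-to-face similarity matrix and a fixed threshold that turns similarities into unweighted edges; both are hypothetical choices, since the abstract does not say how similarity is measured or thresholded. The sketch uses a standard greedy peeling heuristic (repeatedly drop the lowest-degree node and keep the densest intermediate subgraph), which may differ from the authors' exact procedure. The function name densest_component and its parameters are illustrative only.

```python
import numpy as np

def densest_component(similarity, threshold=0.5):
    """Greedy peeling sketch: return (node indices, density) of the densest subgraph found.

    similarity: symmetric (n, n) array of pairwise face similarities (assumed input).
    threshold:  hypothetical cutoff; pairs scoring above it become edges.
    Density is measured as number of edges divided by number of nodes.
    """
    sim = np.asarray(similarity, dtype=float)
    n = sim.shape[0]
    adj = sim > threshold            # boolean adjacency matrix
    np.fill_diagonal(adj, False)     # ignore self-similarity

    alive = np.ones(n, dtype=bool)   # nodes still in the graph
    degree = adj.sum(axis=1).astype(int)
    num_edges = int(degree.sum()) // 2

    best_density = -1.0
    best_nodes = np.array([], dtype=int)

    for remaining in range(n, 0, -1):
        # Score the current subgraph before peeling another node.
        density = num_edges / remaining
        if density > best_density:
            best_density = density
            best_nodes = np.flatnonzero(alive)

        # Remove the surviving node with the fewest surviving neighbours.
        candidates = np.flatnonzero(alive)
        v = candidates[np.argmin(degree[candidates])]
        alive[v] = False
        num_edges -= int(degree[v])
        degree[adj[v] & alive] -= 1
        degree[v] = 0

    return best_nodes.tolist(), best_density
```

In a pipeline like the one described, the returned indices would be the faces kept as matches for the query name, and the same routine could be run on a different set of faces to isolate a frequently recurring anchorperson. The threshold controls how aggressively weak similarities are discarded and would need tuning on real data.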