Name-it: naming and detecting faces in video by the integration of image and natural language processing

Authors:
Shin'ichi Satoh;Yuichi Nakamura;Takeo Kanade
Affiliations:
National Center for Science Information Systems, Bunkyo, Tokyo, Japan and School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;University of Tsukuba, Tsukuba City, Ibaraki, Japan and School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Venue:
IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Year:
1997

Abstract

We have been developing Name-It, a system that associates faces and names in news videos. The system's only knowledge source is the news videos themselves, which comprise image sequences and transcripts obtained from audio tracks or closed-caption text. Given a face, the system can infer its name and output a list of name candidates; given a name, it can locate the corresponding faces in the news videos. To accomplish this, the system extracts faces from the image sequences and names from the transcripts, either of which may correspond to key persons in news topics. The proposed system takes full advantage of advanced image and natural language processing. Image processing extracts face sequences, which provide rich information for face-name association, and selects the best frontal view of a face within each face sequence to improve the face identification that the system requires. Natural language processing, in turn, extracts names using lexical and grammatical analysis together with knowledge of the structure of news video topics. The success of our experiments demonstrates the benefits of advanced image and natural language processing methods and of their integration.
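
To make the two query directions in the abstract concrete (face to name candidates, and name to faces), here is a minimal illustrative sketch. It is not the authors' algorithm: the abstract does not specify a scoring rule, so the sketch assumes a simple temporal co-occurrence count between face sequences and name mentions. All identifiers (`FaceSequence`, `NameMention`, `cooccurrence_scores`, `name_candidates`, `faces_for_name`), the `window` parameter, and the sample data are hypothetical, and the face identity labels are assumed to come from a separate face identification step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceSequence:
    face_id: str   # hypothetical identity label produced by face identification
    start: float   # time the face sequence begins, in seconds
    end: float     # time the face sequence ends, in seconds


@dataclass(frozen=True)
class NameMention:
    name: str      # a name extracted from the transcript
    time: float    # time the name occurs in the transcript, in seconds


def cooccurrence_scores(faces, mentions, window=5.0):
    """Count, for each (face_id, name) pair, how many mentions of the name
    fall within `window` seconds of a sequence showing that face.
    This counting rule is an assumption made for illustration."""
    scores = defaultdict(float)
    for face in faces:
        for mention in mentions:
            if face.start - window <= mention.time <= face.end + window:
                scores[(face.face_id, mention.name)] += 1.0
    return dict(scores)


def name_candidates(face_id, scores, top_k=3):
    """Face-to-name query: rank names by score for one face."""
    ranked = sorted(
        ((score, name) for (fid, name), score in scores.items() if fid == face_id),
        key=lambda pair: (-pair[0], pair[1]),  # highest score first, then by name
    )
    return [name for _, name in ranked[:top_k]]


def faces_for_name(name, scores, top_k=3):
    """Name-to-face query: rank face identities by score for one name."""
    ranked = sorted(
        ((score, fid) for (fid, n), score in scores.items() if n == name),
        key=lambda pair: (-pair[0], pair[1]),  # highest score first, then by id
    )
    return [fid for _, fid in ranked[:top_k]]


# Made-up sample data, for illustration only.
sample_faces = [
    FaceSequence("face_A", 0.0, 10.0),
    FaceSequence("face_B", 20.0, 30.0),
    FaceSequence("face_A", 40.0, 50.0),
]
sample_mentions = [
    NameMention("Alice Smith", 3.0),
    NameMention("Bob Jones", 25.0),
    NameMention("Alice Smith", 44.0),
    NameMention("Bob Jones", 47.0),
]
sample_scores = cooccurrence_scores(sample_faces, sample_mentions)
```

With this sample data, `name_candidates("face_A", sample_scores)` ranks "Alice Smith" ahead of "Bob Jones", because two of her mentions overlap sequences of that face and only one of his does. A real system would need something more robust than raw counts, for example weighting by face similarity or by the role a name plays in its sentence, but those choices are beyond what the abstract states.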