CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
We discuss the properties of a collection of news photos and captions, collected from the Associated Press and Reuters. Captions have a vocabulary dominated by proper names. We have implemented various text clustering algorithms to organize these items by topic, as well as an iconic matcher that identifies articles that share a picture. We have found that the special structure of captions allows us to extract some names of people actually portrayed in the image quite reliably, using a simple syntactic analysis. We have been able to build a directory of face images of individuals from this collection.
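As a rough illustration of how a simple syntactic analysis of captions might pull out the names of people who are actually portrayed, the sketch below keeps only names that are immediately followed by a positional cue such as "(L)" or ", right". The abstract does not state the rule that was used, so the cue list, the regular expressions, and the function name `extract_portrayed_names` are all assumptions made for this example. The sample caption is invented.

```python
import re
from typing import List

# Assumption: a "name" is a run of two or more capitalized words.
NAME = r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+"

# Assumption: a person is portrayed if the name is followed by a positional
# cue, either "(L)", "(R)", "(C)" or ", left" / ", right" / ", center".
CUE = r"(?:\s*\((?:L|R|C)\)|,\s*(?:left|right|center)\b)"

PATTERN = re.compile(rf"\b({NAME}){CUE}")


def extract_portrayed_names(caption: str) -> List[str]:
    """Return names in the caption that are followed by a positional cue."""
    return [match.group(1) for match in PATTERN.finditer(caption)]


if __name__ == "__main__":
    # Invented caption, for illustration only.
    caption = (
        "Jane Doe (L) greets John Smith, right, "
        "as Mary Jones looks on in Springfield."
    )
    print(extract_portrayed_names(caption))
    # prints ['Jane Doe', 'John Smith']
```

Names that are mentioned without a cue ("Mary Jones" above) are deliberately skipped, which trades recall for precision. A real caption collection would likely need more cues and handling for titles, initials, and hyphenated names.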