Learning visual classifiers for object recognition from weakly labeled data requires determining correspondence between image regions and semantic object classes. Most approaches use co-occurrence of "nouns" and image features over large datasets to determine the correspondence, but many correspondence ambiguities remain. We further constrain the correspondence problem by exploiting additional language constructs to improve the learning process from weakly labeled data. We consider both "prepositions" and "comparative adjectives", which are used to express relationships between objects. If the models of such relationships can be determined, they help resolve correspondence ambiguities. However, learning models of these relationships itself requires solving the correspondence problem. We simultaneously learn the visual features defining "nouns" and the differential visual features defining such "binary relationships" using an EM-based approach.
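To make the correspondence problem concrete, the following is a minimal sketch of the co-occurrence baseline the abstract refers to, not the authors' relationship-aware method. It assumes, purely for illustration, that each image region has already been quantized into a discrete appearance token and that each image carries an unordered list of nouns. The function name `em_cooccurrence`, the token names, and the toy captions are all hypothetical. The EM loop is the standard translation-style update: the E-step softly assigns each region to the nouns in its caption, and the M-step re-estimates p(token | noun) from those soft counts.

```python
from collections import defaultdict


def em_cooccurrence(data, n_iters=50):
    """Toy EM for region/noun correspondence from co-occurrence alone.

    data: list of (region_tokens, nouns) pairs, one per image (hypothetical format).
    Returns t where t[noun][token] approximates p(token | noun).
    """
    tokens = sorted({r for regions, _ in data for r in regions})
    nouns = sorted({w for _, words in data for w in words})

    # Uniform initialization: no noun prefers any region token yet.
    t = {w: {r: 1.0 / len(tokens) for r in tokens} for w in nouns}

    for _ in range(n_iters):
        counts = {w: defaultdict(float) for w in nouns}

        # E-step: split each region among the nouns of its own caption,
        # in proportion to the current p(token | noun).
        for regions, words in data:
            for r in regions:
                z = sum(t[w][r] for w in words)
                for w in words:
                    counts[w][r] += t[w][r] / z

        # M-step: renormalize the soft counts per noun.
        for w in nouns:
            total = sum(counts[w].values())
            for r in tokens:
                t[w][r] = counts[w][r] / total
    return t


# Case 1: "sky" and "sea" always appear together, so co-occurrence
# gives no evidence about which noun goes with which region.
ambiguous = [(["blue_smooth", "blue_wavy"], ["sky", "sea"])] * 5
t_ambiguous = em_cooccurrence(ambiguous)
print(t_ambiguous["sky"]["blue_smooth"])  # stays at 0.5

# Case 2: one extra image breaks the symmetry because "sky" now
# co-occurs with "blue_smooth" in two different contexts.
resolved = [
    (["blue_smooth", "blue_wavy"], ["sky", "sea"]),
    (["blue_smooth", "green"], ["sky", "grass"]),
]
t_resolved = em_cooccurrence(resolved)
print(max(t_resolved["sea"], key=t_resolved["sea"].get))
```

In the first case the estimates never move away from uniform, which is the kind of residual ambiguity the abstract describes. A relationship such as "sky above sea" would supply an additional constraint on which region pairs are plausible, but modeling that constraint is beyond this sketch.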