Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers

Authors:
Abhinav Gupta;Larry S. Davis
Affiliations:
Department of Computer Science, University of Maryland, College Park;Department of Computer Science, University of Maryland, College Park
Venue:
ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Year:
2008

Abstract

Learning visual classifiers for object recognition from weakly labeled data requires determining correspondence between image regions and semantic object classes. Most approaches use co-occurrence of "nouns" and image features over large datasets to determine the correspondence, but many correspondence ambiguities remain. We further constrain the correspondence problem by exploiting additional language constructs to improve the learning process from weakly labeled data. We consider both "prepositions" and "comparative adjectives" which are used to express relationships between objects. If the models of such relationships can be determined, they help resolve correspondence ambiguities. However, learning models of these relationships requires solving the correspondence problem. We simultaneously learn the visual features defining "nouns" and the differential visual features defining such "binary-relationships" using an EM-based approach.
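
The joint estimation described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that each region has a one-dimensional appearance value and a height, that each caption names two nouns with an implied "above(first, second)" relationship, and that both noun appearance and the differential feature (height difference) are Gaussian with fixed variances. The names `IMAGES`, `run_em`, `sigma_app` and `sigma_rel` are hypothetical.

```python
import math
from itertools import permutations


def gauss(x, mean, sigma):
    """Unnormalised Gaussian likelihood; the constant cancels in the E-step."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))


# Hypothetical toy data: regions are (appearance, height) pairs; the caption
# lists two nouns and implies the relationship above(nouns[0], nouns[1]).
# Region order is arbitrary, so the correspondence is not given.
IMAGES = [
    {"regions": [(0.60, 0.3), (0.90, 0.8)], "nouns": ("sky", "sea")},
    {"regions": [(0.62, 0.5), (0.20, 0.1)], "nouns": ("sea", "sand")},
    {"regions": [(0.22, 0.2), (0.88, 0.9)], "nouns": ("sky", "sand")},
]


def run_em(images, iterations=20, sigma_app=0.2, sigma_rel=0.3):
    """Jointly estimate noun appearance means and the 'above' relationship mean."""
    all_app = [a for img in images for a, _ in img["regions"]]
    start = sum(all_app) / len(all_app)
    nouns = sorted({n for img in images for n in img["nouns"]})
    mean = {n: start for n in nouns}   # uninformative start for every noun
    mu_above = 0.0                     # uninformative start for the relationship

    for _ in range(iterations):
        app_sum = {n: 0.0 for n in nouns}
        app_wt = {n: 0.0 for n in nouns}
        rel_sum = rel_wt = 0.0
        for img in images:
            regions, caption = img["regions"], img["nouns"]
            # E-step: score every assignment of regions to caption nouns.
            scored = []
            for perm in permutations(range(len(regions)), len(caption)):
                weight = 1.0
                for noun, r in zip(caption, perm):
                    weight *= gauss(regions[r][0], mean[noun], sigma_app)
                # Differential feature: height of first noun minus second.
                diff = regions[perm[0]][1] - regions[perm[1]][1]
                weight *= gauss(diff, mu_above, sigma_rel)
                scored.append((perm, diff, weight))
            total = sum(w for _, _, w in scored)
            # Accumulate posterior-weighted statistics for the M-step.
            for perm, diff, w in scored:
                p = w / total
                for noun, r in zip(caption, perm):
                    app_sum[noun] += p * regions[r][0]
                    app_wt[noun] += p
                rel_sum += p * diff
                rel_wt += p
        # M-step: re-estimate noun and relationship parameters.
        mean = {n: app_sum[n] / app_wt[n] for n in nouns}
        mu_above = rel_sum / rel_wt
    return mean, mu_above


if __name__ == "__main__":
    noun_means, above_mean = run_em(IMAGES)
    print(noun_means, above_mean)
```

In this sketch the relationship term and the noun terms share one posterior over assignments, so evidence for either model sharpens the other across iterations, which is the interaction the abstract describes. Real region features, relationship vocabularies and priors would be far richer than this one-dimensional setup.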