In this paper, we propose a new approach to learning structured visual compound models from shape-based feature descriptions. We use caption text to drive the grouping of boundary fragments detected in an image. Within the learning framework, we transfer several techniques from computational linguistics to the visual domain and build on previous work in image annotation. A statistical translation model establishes links between caption words and image elements. Compounds are then built up iteratively using a mutual information measure. Relations between compound elements are extracted automatically and increase the discriminability of the visual models. We validate the approach with results on several synthetic and realistic datasets.
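To make the word-to-image linking step concrete, the following is a minimal sketch of how a statistical translation model could associate caption words with image elements. It assumes an IBM Model 1 style expectation-maximization procedure over discrete image tokens; the abstract does not state which translation model is used, so this choice, the function name `train_word_given_token`, the token labels, and the toy data are all illustrative assumptions rather than the authors' method.

```python
from collections import defaultdict


def train_word_given_token(pairs, iterations=10):
    """Estimate t[(word, token)] ~ p(word | token) with Model-1-style EM.

    pairs: list of (caption_words, image_tokens), both lists of strings.
    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = [(c, i) for c, i in pairs if c and i]  # skip empty examples
    words = {w for caption, _ in pairs for w in caption}
    tokens = {f for _, image in pairs for f in image}

    # Start from a uniform table: every word equally likely for every token.
    t = {(w, f): 1.0 / len(words) for w in words for f in tokens}

    for _ in range(iterations):
        count = defaultdict(float)  # expected (word, token) link counts
        total = defaultdict(float)  # expected link counts per token
        # E-step: softly align each caption word to the tokens in its image.
        for caption, image in pairs:
            for w in caption:
                z = sum(t[(w, f)] for f in image)
                for f in image:
                    c = t[(w, f)] / z
                    count[(w, f)] += c
                    total[f] += c
        # M-step: renormalize the expected counts per token.
        t = {(w, f): count[(w, f)] / total[f] for w in words for f in tokens}
    return t


# Toy captioned images with made-up token labels.
toy_pairs = [
    (["circle", "square"], ["f_round", "f_corner"]),
    (["circle"], ["f_round"]),
    (["square", "triangle"], ["f_corner", "f_point"]),
]
table = train_word_given_token(toy_pairs)
best_word = max(["circle", "square", "triangle"],
                key=lambda w: table[(w, "f_round")])
print(best_word)
```

In this toy setting, the unambiguous second example (one word, one token) is what lets EM resolve the ambiguous links in the other two. A real system would work with quantized boundary-fragment descriptors instead of hand-named tokens.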