Computer Vision and Image Understanding
Simultaneous image classification and annotation via biased random walk on tri-relational graph
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI
We are not equally negative: fine-grained labeling for multimedia event detection
Proceedings of the 21st ACM international conference on Multimedia
Hi-index | 0.00 |
Visual attributes expose human-defined semantics to object recognition models, but existing work largely restricts their influence to mid-level cues during classifier training. Rather than treat attributes as intermediate features, we consider how learning visual properties in concert with object categories can regularize the models for both. Given a low-level visual feature space together with attribute-and object-labeled image data, we learn a shared lower-dimensional representation by optimizing a joint loss function that favors common sparsity patterns across both types of prediction tasks. We adopt a recent kernelized formulation of convex multi-task feature learning, in which one alternates between learning the common features and learning task-specific classifier parameters on top of those features. In this way, our approach discovers any structure among the image descriptors that is relevant to both tasks, and allows the top-down semantics to restrict the hypothesis space of the ultimate object classifiers. We validate the approach on datasets of animals and outdoor scenes, and show significant improvements over traditional multi-class object classifiers and direct attribute prediction models.