In this paper, we address the problem of large-scale cross-scenario clothing retrieval with semantic-preserving visual phrases (SPVP). Because human parts are important cues for clothing detection and segmentation, we first detect human parts as the semantic context and refine their regions with sparse background reconstruction. The semantic parts are then encoded into a vocabulary tree under the bag-of-visual-words (BOW) framework, and the contextual constraints among visual words from different human parts are exploited through the SPVP. Moreover, the SPVP is integrated into the inverted index structure to accelerate retrieval. Experiments and comparisons on our clothing dataset indicate that the SPVP significantly enhances the discriminative power of local features, with only a slight increase in memory usage or runtime compared with the BOW model. The approach also outperforms both the state-of-the-art approach and two clothing search engines.
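
To illustrate how a part label can sharpen a plain BOW inverted index, the following is a minimal sketch under stated assumptions, not the SPVP formulation itself. It assumes each local feature has already been quantized to an integer visual word and tagged with the human part it falls in. The part names, the function names `build_index` and `search`, the flat dictionary standing in for the vocabulary tree, and the IDF-style weight are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def build_index(database):
    """Build an inverted index: visual word -> list of (image_id, part_label).

    `database` maps image_id -> list of (visual_word, part_label) pairs.
    The part label is an assumed stand-in for the semantic context.
    """
    index = defaultdict(list)
    for image_id, features in database.items():
        for word, part in features:
            index[word].append((image_id, part))
    return index


def search(index, query, n_images, use_parts=True):
    """Rank database images by IDF-weighted votes from the query features.

    With use_parts=True a vote counts only when the visual word AND the
    part label agree; with use_parts=False this is a plain BOW vote.
    """
    scores = defaultdict(float)
    for word, part in query:
        postings = index.get(word, [])
        if not postings:
            continue
        # Document frequency: number of distinct images containing the word.
        df = len({image_id for image_id, _ in postings})
        weight = math.log(1.0 + n_images / df)  # assumed weighting
        for image_id, db_part in postings:
            if use_parts and db_part != part:
                continue  # same word, different body part: rejected
            scores[image_id] += weight
    # Highest score first; ties broken by image id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: both images contain the same visual words, on different parts.
database = {
    "a": [(3, "torso"), (7, "left_arm"), (9, "torso")],
    "b": [(3, "left_leg"), (7, "right_leg"), (9, "left_leg")],
}
index = build_index(database)
query = [(3, "torso"), (7, "left_arm"), (9, "torso")]

bow_ranking = search(index, query, len(database), use_parts=False)
part_ranking = search(index, query, len(database), use_parts=True)
```

In this toy case the plain BOW vote cannot separate the two images, because they share every visual word, whereas the part-aware vote returns only image `a`. The extra cost is one part label per posting, which is in the spirit of the slight memory increase reported above.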