Product annotation in videos is of great importance for video browsing, search, and advertising. However, most existing automatic video annotation research focuses on high-level concepts, such as events, scenes, and object categories. This article presents a novel solution for annotating specific products in videos by mining information from the Web. It collects a set of high-quality training images for each product by jointly leveraging Amazon and the Google image search engine. A visual signature for each product is then built from the bag-of-visual-words representation of the training images, and a correlative sparsification approach removes noisy bins from these signatures. The signatures are finally matched against video frames to produce annotations. Experiments on more than 1,000 videos demonstrate the feasibility and effectiveness of the approach.
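To make the pipeline concrete, the following is a minimal sketch of the bag-of-visual-words signature step described above. It is an illustration only, not the authors' implementation: the visual vocabulary here is a random stand-in (in practice it would be learned by clustering local descriptors such as SIFT), and the sparsification is approximated by simply zeroing low-weight bins, whereas the paper uses a correlative sparsification approach.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 50-word visual vocabulary of 128-D centroids
# (in practice: k-means over SIFT descriptors from training images).
vocab = rng.random((50, 128))

def bovw_signature(descriptors, vocabulary):
    """Quantize local descriptors to their nearest visual word and
    return an L1-normalized histogram as the image's signature."""
    dists = np.linalg.norm(
        descriptors[:, None, :] - vocabulary[None, :, :], axis=2
    )
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

def sparsify(signature, keep=10):
    """Crude stand-in for signature sparsification: keep only the
    `keep` largest bins and renormalize."""
    out = np.zeros_like(signature)
    top = np.argsort(signature)[-keep:]
    out[top] = signature[top]
    return out / out.sum()

# Stand-in for local descriptors extracted from one training image.
descriptors = rng.random((200, 128))
sig = sparsify(bovw_signature(descriptors, vocab))
```

Annotation then reduces to comparing a frame's signature against each product's sparsified signature (e.g. by histogram intersection or cosine similarity) and assigning the best-matching product above a threshold.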