Bag-of-visual-words (BoW) has recently become a popular representation for describing video and image content. Most existing approaches, however, neglect inter-word relatedness and measure similarity by bin-to-bin comparison of visual words in histograms. In this paper, we explore the linguistic and ontological aspects of visual words for video analysis. Two approaches, soft-weighting and constraint-based earth mover's distance (CEMD), are proposed to model different aspects of visual word linguistics and proximity. In soft-weighting, visual words are weighted so that the linguistic meaning of words is taken into account in bin-to-bin histogram comparison. In CEMD, a cross-bin matching algorithm is formulated so that the ground distance measure considers the linguistic similarity of words. In particular, a BoW ontology that hierarchically specifies the hyponym relationships among words is constructed to assist the reasoning. We demonstrate soft-weighting and CEMD on two tasks: video semantic indexing and near-duplicate keyframe retrieval. Experimental results indicate that soft-weighting outperforms other popular weighting schemes, such as term frequency (TF) weighting, on a large-scale video database. In addition, CEMD compares favorably with cosine similarity in near-duplicate retrieval.
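The limitation of bin-to-bin comparison under hard TF weighting can be illustrated with a small sketch. The abstract does not give the soft-weighting formula, so everything below is an assumption made for illustration only: each descriptor votes for its `top_n` nearest visual words with weights that halve at each rank, and the toy vocabulary, the descriptors, and the function names `tf_histogram`, `soft_histogram`, and `cosine` are hypothetical.

```python
import math

def cosine(h1, h2):
    """Bin-to-bin cosine similarity between two histograms."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def tf_histogram(descriptors, vocabulary):
    """Hard assignment: each descriptor counts once for its nearest word."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda j: math.dist(d, vocabulary[j]))
        hist[nearest] += 1.0
    return hist

def soft_histogram(descriptors, vocabulary, top_n=2):
    """Soft assignment (assumed scheme): each descriptor votes for its
    top_n nearest words with weights 1, 1/2, 1/4, ... by rank."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        ranked = sorted(range(len(vocabulary)),
                        key=lambda j: math.dist(d, vocabulary[j]))
        for rank, j in enumerate(ranked[:top_n]):
            hist[j] += 0.5 ** rank
    return hist

# Toy vocabulary: words 0 and 1 are close together, word 2 is far away.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]

# Two nearly identical descriptors that fall on opposite sides of the
# boundary between word 0 and word 1.
image_a = [(0.4, 0.0)]
image_b = [(0.6, 0.0)]

tf_sim = cosine(tf_histogram(image_a, vocabulary),
                tf_histogram(image_b, vocabulary))
soft_sim = cosine(soft_histogram(image_a, vocabulary),
                  soft_histogram(image_b, vocabulary))

print(tf_sim)    # hard assignment puts the descriptors in different bins
print(soft_sim)  # soft assignment lets the related bins overlap
```

With hard assignment the two descriptors land in different bins, so bin-to-bin cosine similarity treats the images as unrelated; with the assumed soft assignment the neighbouring bins share weight and the similarity becomes clearly positive. A cross-bin measure such as CEMD addresses the same problem from the other direction, by letting the distance itself account for how related two words are.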