Visual content representation using semantically similar visual words

  • Authors and affiliations:
  • Kraisak Kesorn (Department of Computer Science and Information Technology, Naresuan University, Phitsanulok 65000, Thailand)
  • Sutasinee Chimlek (Department of Computer Engineering, Kasetsart University, 50 Phahon Yothin Rd., Chatuchak, Bangkok 10900, Thailand)
  • Stefan Poslad (School of Electronic Engineering and Computer Science, Queen Mary, University of London, Mile End Rd., London E1 4NS, United Kingdom)
  • Punpiti Piamsa-nga (Department of Computer Engineering, Kasetsart University, 50 Phahon Yothin Rd., Chatuchak, Bangkok 10900, Thailand)

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011



Abstract

Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, has been deployed in the 'bag-of-visual-words' (BVW) model as an effective method to represent visual content and to enhance its classification and retrieval. This paper makes three key contributions. First, a novel approach to visual-word construction is presented that takes the spatial position, angle, and scale of keypoints into account in order to preserve the semantic information of objects in visual content and to enhance the traditional bag-of-visual-words model. Second, a method is given to identify and eliminate similar keypoints, forming high-quality semantic visual words and strengthening the discrimination power for visual content classification. Third, an approach is specified to discover sets of semantically similar visual words and to form them into visual phrases that represent visual content more distinctively, narrowing the semantic gap.
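
For orientation, the sketch below shows the standard bag-of-visual-words pipeline the paper builds on, not the authors' method itself: SIFT keypoints are extracted, descriptors are clustered into a codebook with k-means, and each image is quantised into a word histogram. The comments mark where each keypoint's position, angle, and scale (the attributes the paper exploits) are available. Library choices (OpenCV, scikit-learn) and parameter values such as the vocabulary size are illustrative assumptions.

```python
# Minimal bag-of-visual-words (BVW) sketch; assumes OpenCV >= 4.4 and scikit-learn.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def extract_sift(image_paths):
    """Return per-image SIFT descriptors plus keypoint geometry."""
    sift = cv2.SIFT_create()
    per_image = []
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        keypoints, descriptors = sift.detectAndCompute(img, None)
        # Each keypoint carries the geometric attributes the paper builds on:
        # kp.pt (x, y position), kp.angle (orientation), kp.size (scale).
        geometry = [(kp.pt, kp.angle, kp.size) for kp in keypoints]
        per_image.append((descriptors, geometry))
    return per_image

def build_codebook(per_image, n_words=500):
    """Cluster all descriptors into a visual-word vocabulary (codebook)."""
    all_desc = np.vstack([d for d, _ in per_image if d is not None])
    return KMeans(n_clusters=n_words, n_init=10).fit(all_desc)

def bvw_histogram(descriptors, codebook):
    """Quantise one image's descriptors into a normalised word histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```

In this baseline, the geometry returned by `extract_sift` is discarded when histograms are built; the paper's contribution is to use exactly that spatial, angle, and scale information during visual-word construction and to group semantically similar words into visual phrases.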