With the advancement of imaging and information technologies, image retrieval has become a bottleneck. The key to efficient and effective image retrieval is a text-based approach, in which automatic image annotation is a critical task. An important yet under-studied issue is the basic unit of annotation, i.e., which part of an image is labeled. The habitual choice is to label the segments produced by a segmentation algorithm. However, segmentation often breaks an object into pieces, which both introduces noise into the annotation and increases the complexity of the model. We adopt an attention-driven image interpretation method to extract attentive objects from an over-segmented image and use these attentive objects for annotation. In doing so, the basic unit of annotation is upgraded from segments to attentive objects. Visual classifiers are trained and a concept association network (CAN) is constructed for object recognition. A CAN consists of a number of concept nodes, each a trained neural network (visual classifier) that recognizes a single object; the nodes are connected through correlation links to form a network. Given an image containing several unknown attentive objects, all nodes in the CAN generate their own responses, which propagate simultaneously to the other nodes through the network. For any combination of nodes under investigation, these loopy propagations can be characterized by a linear system, and the response of the combination is obtained by solving that system. The annotation problem is therefore converted into finding the node combination with the maximum response. Annotation experiments show that attentive objects yield better accuracy than segments, and that the concept association network improves annotation performance.
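The linear-system formulation sketched in the abstract can be illustrated as follows. This is a minimal, hypothetical sketch, not the authors' implementation: it assumes each node i has an initial classifier score s_i, that correlation links form a weight matrix W, and that the steady state of the loopy propagation r = s + W r exists (spectral radius of the restricted W below 1), so the response of a node subset is found by solving (I - W) r = s over that subset. The function names, the toy W and s values, and the fixed combination size k are all illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def can_response(W, s, active):
    """Total steady-state response of the node subset `active`.

    The loopy propagation among the active nodes, r = s + W r, is
    characterized by the linear system (I - W) r = s restricted to
    those nodes, which we solve directly.
    """
    active = list(active)
    Wsub = W[np.ix_(active, active)]   # correlations among active nodes only
    ssub = s[active]                   # classifier outputs for active nodes
    r = np.linalg.solve(np.eye(len(active)) - Wsub, ssub)
    return r.sum()

def best_combination(W, s, k):
    """Annotation as search: the size-k node combination with maximum response."""
    return max(combinations(range(len(s)), k),
               key=lambda c: can_response(W, s, c))

# Toy example: 4 concept nodes with hypothetical correlation links and scores.
W = np.array([[0.0, 0.3, 0.0, 0.0],
              [0.3, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 0.2],
              [0.0, 0.0, 0.2, 0.0]])
s = np.array([0.9, 0.2, 0.8, 0.1])    # hypothetical visual-classifier scores

print(best_combination(W, s, 2))      # → (0, 2)
```

For realistic vocabulary sizes the exhaustive enumeration over combinations would be replaced by a search heuristic; the point here is only that each candidate combination's response reduces to one small linear solve.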