Incorporating camera metadata for attended region detection and consumer photo classification

Authors:
Zhong Li;Hangzai Luo;Jianping Fan
Affiliations:
UNC-Charlotte, Charlotte, NC, USA;East China Normal University, Shanghai, China;UNC-Charlotte, Charlotte, NC, USA
Venue:
MM '09 Proceedings of the 17th ACM international conference on Multimedia
Year:
2009

Abstract

Photos taken by people differ significantly from pictures captured by a surveillance camera or a robot's vision sensor: a person may deliberately take a photo to express a feeling or to record a memorable scene. This creative capture process is accomplished by adjusting two factors: (1) the parameter settings of the camera; and (2) the relative position of the camera and the objects or scenes of interest. To enable automatic understanding and interpretation of photo semantics, it is important to take both factors into account. Unfortunately, most existing image understanding algorithms focus only on image content and ignore these two factors completely. In this paper, we develop a new algorithm that estimates what the photographer is interested in and what the core content of a photo is. The information gained (i.e., the attended regions and the photographer's attention) is then used to support more effective photo classification and retrieval. Our experiments on 70,000+ photos taken by 200+ different camera models have obtained very positive results.