Graph-based joint clustering of fixations and visual entities

Authors:
Yusuke Sugano;Yasuyuki Matsushita;Yoichi Sato
Affiliations:
The University of Tokyo, Japan;Microsoft Research Asia, China;The University of Tokyo, Japan
Venue:
ACM Transactions on Applied Perception (TAP)
Year:
2013

Abstract

We present a method that extracts groups of fixations and image regions for gaze analysis and image understanding. Because the attentional relationships between visual entities convey rich information, determining these relationships automatically yields a semantic representation of images. We show that jointly clustering human gaze and visual entities makes it possible to build meaningful and comprehensive metadata that offer an interpretation of how people view images. To achieve this, we developed a clustering method that uses a joint graph structure between fixation points and over-segmented image regions to enforce a cross-domain smoothness constraint. We show that the proposed clustering method performs better at relating attention to visual entities than standard clustering techniques.
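
The idea of a joint graph over fixations and image regions can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the edge weights or the optimization, so the Gaussian affinities, the parameters `sigma_pos` and `sigma_feat`, the function names, and the toy data below are all assumptions, and a plain spectral bipartition (sign of the second eigenvector of the normalized Laplacian) stands in for the actual clustering step. It only shows how nodes from both domains might share one graph so that cluster labels are smooth across fixation-region edges.

```python
import numpy as np

def joint_affinity(fix_xy, region_xy, region_feat, sigma_pos=80.0, sigma_feat=0.4):
    """Hypothetical joint affinity matrix: fixation nodes first, then region nodes."""
    n_f, n_r = len(fix_xy), len(region_xy)
    W = np.zeros((n_f + n_r, n_f + n_r))
    # Fixation-fixation edges: spatial proximity of gaze points (assumed weighting).
    d_ff = np.linalg.norm(fix_xy[:, None] - fix_xy[None], axis=-1)
    W[:n_f, :n_f] = np.exp(-(d_ff / sigma_pos) ** 2)
    # Region-region edges: appearance similarity of over-segmented regions (assumed).
    d_rr = np.linalg.norm(region_feat[:, None] - region_feat[None], axis=-1)
    W[n_f:, n_f:] = np.exp(-(d_rr / sigma_feat) ** 2)
    # Cross-domain edges: distance from each fixation to each region centroid (assumed).
    d_fr = np.linalg.norm(fix_xy[:, None] - region_xy[None], axis=-1)
    W_fr = np.exp(-(d_fr / sigma_pos) ** 2)
    W[:n_f, n_f:] = W_fr
    W[n_f:, :n_f] = W_fr.T
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def spectral_bipartition(W):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt    # map back to the generalized eigenvector
    return (fiedler > 0).astype(int)

# Toy data (made up): two groups of fixations and two pairs of regions.
fix_xy = np.array([[48, 52], [55, 47], [50, 60],
                   [198, 205], [210, 195], [202, 200]], dtype=float)
region_xy = np.array([[50, 50], [60, 55], [200, 200], [205, 210]], dtype=float)
region_feat = np.array([[0.10], [0.15], [0.90], [0.85]])  # e.g. mean intensity

labels = spectral_bipartition(joint_affinity(fix_xy, region_xy, region_feat))
fix_labels, region_labels = labels[:len(fix_xy)], labels[len(fix_xy):]
print("fixation clusters:", fix_labels)
print("region clusters:  ", region_labels)
```

In this toy setup each fixation ends up in the same cluster as the regions it lies on, which is the kind of cross-domain consistency the abstract describes. Which cluster receives label 0 or 1 is arbitrary, since the sign of an eigenvector is not fixed.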