This paper presents a real-time framework for computationally tracking the objects a user visually attends to while navigating interactive virtual environments. In addition to conventional bottom-up (stimulus-driven) features, the framework uses top-down (goal-directed) contexts to predict the human gaze. It first builds feature maps from preattentive features such as luminance, hue, depth, size, and motion. The feature maps are then integrated into a single saliency map using the center-surround difference operation. This pixel-level bottom-up saliency map is converted to an object-level saliency map using the item buffer. Finally, top-down contexts are inferred from the user's spatial and temporal behavior during interactive navigation and used to select the most plausibly attended object among the candidates in the object saliency map. The framework was implemented on the GPU and exhibited very fast performance (5.68 ms for a 256×256 saliency map), demonstrating its suitability for interactive virtual environments. A user experiment was also conducted to evaluate the prediction accuracy of the visual attention tracking framework against actual human gaze data. The accuracy achieved, particularly with the addition of top-down contextual information, is consistent with theories of human cognition on visually identifying single and multiple attentive targets. The framework can support perceptually based rendering without an expensive eye tracker, for example by providing depth-of-field effects and managing level-of-detail in virtual environments.
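The bottom-up stage described above (per-feature maps, center-surround differencing, fusion into one saliency map, then aggregation per object via an item buffer) can be sketched roughly as follows. This is a minimal CPU illustration, not the paper's GPU implementation: the Gaussian-pyramid scales, the normalization scheme, and the helper names (`center_surround_saliency`, `object_saliency`) are assumptions for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround_saliency(feature_map, center_sigma=1.0, surround_sigma=8.0):
    # Center-surround difference: response at a fine (center) scale
    # minus a coarse (surround) scale, taken as absolute contrast.
    center = gaussian_filter(feature_map, center_sigma)
    surround = gaussian_filter(feature_map, surround_sigma)
    return np.abs(center - surround)

def combine_feature_maps(feature_maps):
    # Normalize each conspicuity map to [0, 1] and average them
    # into a single pixel-level saliency map.
    saliency = np.zeros_like(feature_maps[0], dtype=float)
    for fm in feature_maps:
        cs = center_surround_saliency(fm)
        span = cs.max() - cs.min()
        if span > 0:
            cs = (cs - cs.min()) / span
        saliency += cs
    return saliency / len(feature_maps)

def object_saliency(pixel_saliency, item_buffer):
    # Item buffer holds one object id per pixel (0 = background);
    # object-level saliency is the mean pixel saliency over each object's pixels.
    ids = np.unique(item_buffer)
    return {int(i): float(pixel_saliency[item_buffer == i].mean())
            for i in ids if i != 0}

# Toy example: three hypothetical feature maps (e.g. luminance, depth, motion)
rng = np.random.default_rng(0)
maps = [rng.random((256, 256)) for _ in range(3)]
saliency = combine_feature_maps(maps)

# Toy item buffer: object 1 on the left half, object 2 on the right half
items = np.ones((256, 256), dtype=int)
items[:, 128:] = 2
per_object = object_saliency(saliency, items)
```

In the paper's pipeline, `per_object` corresponds to the candidate set from which the top-down context (the user's spatial and temporal navigation behavior) selects the single most plausibly attended object.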