Attention, understanding and abstraction are three key elements of visual communication that we often take for granted. Together, these interconnected elements constitute a Visual Digest Network. In this chapter, we investigate the conceptual design of Visual Digest Networks at three levels of visual abstraction: gaze, object and word. The goal is to minimize the media footprint of visual communication while preserving its essential semantic content. The Attentive Video Network detects the operator's gaze and adjusts the video resolution at the sensor side across the network; our results show significant improvements in network bandwidth utilization. The Object Video Network is designed for mobile video applications in which faces and cars are detected. Multi-resolution profiles are configured for the media according to the network footprint, and the video is sent across the network at multiple resolutions together with metadata, controlled by a bandwidth regulator. The results show that video can still be transmitted under low-bandwidth conditions. Finally, the Image-Word Search Network is designed for face reconstruction across the network. In this study, we assume that the hidden layer between facial features and referring expressive words contains `control points' that can be articulated mathematically, visually and verbally. This experiment is only a crude model of a semantic network; nevertheless, it demonstrates the potential of the two-way mapping.
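The gaze-driven resolution control of the Attentive Video Network can be illustrated with a minimal sketch: tiles near the operator's gaze are encoded at full resolution, while peripheral tiles are downscaled before transmission. All names and constants here (`Tile`, `FOVEA_RADIUS`, `PERIPHERY_SCALE`) are illustrative assumptions, not part of the original system.

```python
# Hypothetical sketch of gaze-driven resolution control; the tile grid,
# fovea radius and downscale factor are illustrative assumptions.
from dataclasses import dataclass
from math import hypot

FOVEA_RADIUS = 200      # pixels: region around the gaze kept at full resolution
PERIPHERY_SCALE = 0.25  # linear downscale factor outside the fovea

@dataclass
class Tile:
    x: int        # tile centre, pixels
    y: int
    size: int     # tile edge length, pixels

def choose_resolution(tile: Tile, gaze: tuple[int, int]) -> float:
    """Return the linear scale at which to encode this tile.

    Tiles near the operator's gaze are sent at full resolution (1.0);
    peripheral tiles are downscaled, reducing the bits sent per frame.
    """
    dist = hypot(tile.x - gaze[0], tile.y - gaze[1])
    return 1.0 if dist <= FOVEA_RADIUS else PERIPHERY_SCALE

def frame_cost(tiles: list[Tile], gaze: tuple[int, int]) -> float:
    """Relative pixel count of a frame after gaze-driven downscaling."""
    return sum((choose_resolution(t, gaze) * t.size) ** 2 for t in tiles)

# Example: a 4x4 grid of 256-pixel tiles, gaze resting on the top-left tile.
tiles = [Tile(128 + 256 * i, 128 + 256 * j, 256) for i in range(4) for j in range(4)]
full = sum(t.size ** 2 for t in tiles)
foveated = frame_cost(tiles, gaze=(128, 128))
print(f"bandwidth fraction: {foveated / full:.3f}")  # → bandwidth fraction: 0.121
```

In this toy configuration only one of sixteen tiles travels at full resolution, so the frame costs roughly 12% of the uncompressed pixel budget, which is the kind of bandwidth saving the abstract refers to.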
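The bandwidth regulator of the Object Video Network can be sketched as a selector over multi-resolution profiles: each detected object region (a face or a car) is available at several bitrates, and the regulator picks the richest profile that fits the current budget, falling back to metadata only when even the lowest profile does not fit. The profile names and bitrates below are assumptions for illustration.

```python
# Illustrative sketch of the bandwidth regulator; profile names and
# per-region bitrates are assumed, not measured values.
PROFILES = [("high", 512.0), ("medium", 128.0), ("low", 32.0)]  # kbit/s per region

def regulate(regions: int, budget_kbps: float) -> str:
    """Pick the best uniform profile for all detected regions within budget."""
    for name, rate in PROFILES:
        if regions * rate <= budget_kbps:
            return name
    return "metadata-only"  # below the lowest profile, send descriptors only

print(regulate(regions=3, budget_kbps=200.0))  # prints "low"
```

This fallback chain is what lets video (or at least its metadata) keep flowing under low-bandwidth conditions.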
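The two-way mapping through `control points' can be modeled crudely as follows: facial feature vectors and word-weight vectors are both treated as linear functions of a shared low-dimensional control vector, so either side can be recovered from the other by a least-squares solve through the hidden layer. The matrices and dimensions are arbitrary stand-ins, not the mapping used in the study.

```python
# Toy model of the two-way face/word mapping; the matrices A and B are
# random stand-ins for the learned control-point mappings.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 3))   # control points -> facial feature offsets
B = rng.normal(size=(5, 3))   # control points -> word weights (e.g. "slim")

def words_from_face(features: np.ndarray) -> np.ndarray:
    """Image -> word direction: recover the control vector, re-express it verbally."""
    c, *_ = np.linalg.lstsq(A, features, rcond=None)
    return B @ c

def face_from_words(weights: np.ndarray) -> np.ndarray:
    """Word -> image direction: recover the control vector, re-express it visually."""
    c, *_ = np.linalg.lstsq(B, weights, rcond=None)
    return A @ c
```

Because both mappings pass through the same control vector, a description produced by `words_from_face` can be fed back through `face_from_words` to reconstruct the face, which is the round trip the Image-Word Search Network exercises.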