Geo-contextual priors for attentive urban object recognition
ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
The presented work situates attention within the architecture of ambient intelligence, in particular for mobile vision tasks in multimodal interfaces. A major performance issue for these services is uncertainty in the visual information, which stems from the need to index into a huge collection of reference images. We propose a system implementation, the Attentive Machine Interface (AMI), that enables contextual processing of multi-sensor information in a probabilistic framework, for example by exploiting contextual information from geo-services to narrow the visual search space to a subset of relevant object hypotheses. We present a proof of concept with results from bottom-up information processing on experimental tracks and image captures in an urban scenario, extracting object hypotheses in the local context from both (i) mobile-image-based appearance and (ii) GPS-based positioning, and verify the performance in recognition accuracy ( 10%) using Bayesian decision fusion. Finally, we demonstrate that top-down information processing, in which geo-information primes the recognition method in feature space, yields even better results ( 13%) and more economical computing, verifying the advantage of multi-sensor attentive processing in multimodal interfaces.
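The Bayesian decision fusion described above can be illustrated with a minimal sketch: per-object appearance likelihoods from the image matcher are multiplied by GPS-derived location priors and renormalized, so that objects implausible at the current position drop out of the hypothesis set. The object labels, scores, and prior values below are illustrative assumptions, not data from the paper.

```python
def fuse(appearance_likelihoods, geo_priors):
    """Fuse appearance evidence P(image | object) with a GPS-based
    prior P(object | location) into a normalized posterior over objects.

    Objects missing from the geo prior get probability 0, i.e. the
    geo-context prunes them from the visual search space.
    """
    posterior = {obj: lik * geo_priors.get(obj, 0.0)
                 for obj, lik in appearance_likelihoods.items()}
    z = sum(posterior.values())
    if z == 0.0:
        return posterior  # no hypothesis consistent with both cues
    return {obj: p / z for obj, p in posterior.items()}


# Hypothetical example: appearance alone slightly favors "cafe",
# but the GPS prior says the user is standing near the church.
appearance = {"cafe": 0.5, "church": 0.3, "museum": 0.2}
priors = {"cafe": 0.1, "church": 0.8, "museum": 0.1}

posterior = fuse(appearance, priors)
best = max(posterior, key=posterior.get)
```

Here the fused decision flips to "church", showing how a weak appearance hypothesis can win once the location prior is taken into account.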