Toward 3D scene understanding via audio-description: Kinect-iPad fusion for the visually impaired

  • Authors:
  • Juan Diego Gomez, Sinan Mohammed, Guido Bologna, Thierry Pun

  • Affiliations:
  • University of Geneva, Geneva, Switzerland (all authors)

  • Venue:
  • Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility
  • Year:
  • 2011

Abstract

Microsoft's Kinect is a low-cost 3D motion sensor that provides color and depth information of indoor environments. In this demonstration, the functionality of this entertainment-oriented camera, combined with an iPad's tangible interface, is targeted to the benefit of the visually impaired. A computer-vision-based framework for real-time object localization and audio description is introduced. First, objects are extracted from the scene and recognized using feature descriptors and machine learning. Second, the recognized objects are labeled with instrument sounds, while their positions in 3D space are conveyed by virtual spatialized sound sources. As a result, the scene can be heard and explored by finger-triggering the sounds on the iPad, onto which a top view of the objects is mapped. This enables blindfolded users to build a mental occupancy grid of the environment. The approach presented here holds the promise of efficient assistance and could be adapted as an electronic travel aid for the visually impaired in the near future.
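The core mapping described in the abstract — projecting recognized objects' 3D positions onto a top-view grid and deriving spatial sound cues — can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the grid resolution, depth range, and the pan/gain formulas are assumptions chosen for clarity.

```python
# Hypothetical sketch of the sonified top-view mapping (all constants
# and function names are illustrative assumptions, not from the paper).
# Recognized objects with 3D positions (meters, camera frame) are
# projected onto a top-view occupancy grid; each object gets simple
# stereo cues: left/right pan from lateral x, loudness from depth z.

GRID_W, GRID_H = 8, 8          # top-view grid resolution (assumed)
X_RANGE = (-2.0, 2.0)          # lateral field of view in meters (assumed)
Z_RANGE = (0.5, 4.5)           # typical usable Kinect depth range in meters

def to_grid_cell(x, z):
    """Project a 3D point (x = lateral, z = depth) onto a top-view grid cell."""
    col = int((x - X_RANGE[0]) / (X_RANGE[1] - X_RANGE[0]) * GRID_W)
    row = int((z - Z_RANGE[0]) / (Z_RANGE[1] - Z_RANGE[0]) * GRID_H)
    return max(0, min(GRID_H - 1, row)), max(0, min(GRID_W - 1, col))

def stereo_cues(x, z):
    """Pan in [-1, 1] from lateral position; gain in (0, 1] decaying with depth."""
    pan = max(-1.0, min(1.0, x / X_RANGE[1]))
    gain = max(0.0, min(1.0, 1.0 / (1.0 + (z - Z_RANGE[0]))))
    return pan, gain

def build_scene_map(objects):
    """objects: iterable of (label, x, z) -> dict mapping grid cell
    to (label, pan, gain), i.e. a sonifiable occupancy grid."""
    scene = {}
    for label, x, z in objects:
        scene[to_grid_cell(x, z)] = (label,) + stereo_cues(x, z)
    return scene

# A nearby cup to the left, a chair farther away on the right:
scene = build_scene_map([("cup", -0.8, 1.2), ("chair", 1.5, 3.0)])
```

On a touch interface, tapping a cell of such a grid would look up the stored `(label, pan, gain)` triple and play the corresponding instrument sound with those stereo parameters, which is one plausible way to realize the finger-triggered exploration the abstract describes.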