Toward 3D scene understanding via audio-description: Kinect-iPad fusion for the visually impaired

  • Authors:
  • Juan Diego Gomez, Sinan Mohammed, Guido Bologna, Thierry Pun

  • Affiliations:
  • University of Geneva, Geneva, Switzerland (all authors)

  • Venue:
  • Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility
  • Year:
  • 2011

Abstract

Microsoft's Kinect is a low-cost 3D motion sensor that provides color and depth information of indoor environments. In this demonstration, the functionality of this entertainment-oriented camera, combined with an iPad's tangible interface, is targeted to the benefit of the visually impaired. A computer-vision-based framework for real-time object localization and audio description is introduced. First, objects are extracted from the scene and recognized using feature descriptors and machine learning. Second, the recognized objects are labeled with instrument sounds, while their positions in 3D space are conveyed by virtual spatialized sound sources. As a result, the scene can be heard and explored by finger-triggering the sounds on the iPad, onto which a top view of the objects is mapped. This enables blindfolded users to build a mental occupancy grid of the environment. The approach presented here holds the promise of efficient assistance and could be adapted as an electronic travel aid for the visually impaired in the near future.
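The core mapping described in the abstract — projecting recognized objects' 3D positions onto a top-view grid and deriving spatial sound cues — can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the grid resolution, depth range, and the pan/gain formulas are assumptions chosen for clarity.

```python
# Hypothetical sketch of the sonified top-view mapping (all constants
# and function names are illustrative assumptions, not from the paper).
# Recognized objects with 3D positions (meters, camera frame) are
# projected onto a top-view occupancy grid; each object gets simple
# stereo cues: left/right pan from lateral x, loudness from depth z.

GRID_W, GRID_H = 8, 8          # top-view grid resolution (assumed)
X_RANGE = (-2.0, 2.0)          # lateral field of view in meters (assumed)
Z_RANGE = (0.5, 4.5)           # typical usable Kinect depth range in meters

def to_grid_cell(x, z):
    """Project a 3D point (x = lateral, z = depth) onto a top-view grid cell."""
    col = int((x - X_RANGE[0]) / (X_RANGE[1] - X_RANGE[0]) * GRID_W)
    row = int((z - Z_RANGE[0]) / (Z_RANGE[1] - Z_RANGE[0]) * GRID_H)
    return max(0, min(GRID_H - 1, row)), max(0, min(GRID_W - 1, col))

def stereo_cues(x, z):
    """Pan in [-1, 1] from lateral position; gain in (0, 1] decaying with depth."""
    pan = max(-1.0, min(1.0, x / X_RANGE[1]))
    gain = max(0.0, min(1.0, 1.0 / (1.0 + (z - Z_RANGE[0]))))
    return pan, gain

def build_scene_map(objects):
    """objects: iterable of (label, x, z) -> dict mapping grid cell
    to (label, pan, gain), i.e. a sonifiable occupancy grid."""
    scene = {}
    for label, x, z in objects:
        scene[to_grid_cell(x, z)] = (label,) + stereo_cues(x, z)
    return scene

# A nearby cup to the left, a chair farther away on the right:
scene = build_scene_map([("cup", -0.8, 1.2), ("chair", 1.5, 3.0)])
```

On a touch interface, tapping a cell of such a grid would look up the stored `(label, pan, gain)` triple and play the corresponding instrument sound with those stereo parameters, which is one plausible way to realize the finger-triggered exploration the abstract describes.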