See ColOr: Seeing Colours with an Orchestra

  • Authors:
  • Benoît Deville; Guido Bologna; Michel Vinckenbosch; Thierry Pun

  • Affiliations:
  • Computer Vision and Multimedia Lab, University of Geneva, Carouge, Switzerland CH-1227; Laboratoire d'Informatique Industrielle, University of Applied Science, Geneva, Switzerland CH-1202; Laboratoire d'Informatique Industrielle, University of Applied Science, Geneva, Switzerland CH-1202; Computer Vision and Multimedia Lab, University of Geneva, Carouge, Switzerland CH-1227

  • Venue:
  • Human Machine Interaction
  • Year:
  • 2009

Abstract

The See ColOr interface transforms a small portion of a coloured video image into sound sources represented by spatialised musical instruments. The conversion of colours into sounds is achieved by quantising the HSL (Hue, Saturation and Luminosity) colour space. Our purpose is to enable visually impaired individuals to perceive their environment in real time. In this work we present the system's design principles and several experiments carried out with blindfolded participants. The goal of the first experiment was to identify the colours of the main features in static pictures in order to interpret the image scenes. Participants found that colours were helpful in narrowing down the possible interpretations of an image. Two further experiments were then performed with a head-mounted camera. The first pertains to object manipulation and is based on the pairing of coloured socks; the second concerns outdoor navigation, with the goal of following a coloured sinuous path painted on the ground. The socks experiment demonstrated that blindfolded individuals were able to accurately match pairs of coloured socks, and the same participants successfully followed a red serpentine path painted on the ground for more than 80 metres. Finally, we propose an original approach to a real-time alerting system based on the detection of visually salient parts in videos. The particularity of our approach lies in the use of a new feature map constructed from the depth gradient. From the computed feature maps we infer conspicuity maps that indicate areas appreciably different from their surroundings. A specific distance function is then described, which takes into account both the limitations of the stereoscopic camera and the user's choices. We also report how we automatically estimate the relative contribution of each conspicuity map, which enables unsupervised determination of the final saliency map indicating the visual salience of every point in the image. We demonstrate that this additional depth-based feature map allows the system to detect salient regions with good accuracy in most situations, even in the presence of noisy disparity maps.
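
The abstract does not specify the quantisation boundaries or the instrument assignments used by See ColOr, so the sketch below is only a minimal illustration of the general idea: hue is quantised into a small set of instrument timbres, with luminosity and saturation assumed to drive pitch and loudness. All thresholds, instrument names and parameter roles here are assumptions, not the authors' mapping.

```python
import colorsys

# Hypothetical hue-to-instrument table (upper hue bound in degrees, timbre);
# the actual See ColOr quantisation boundaries and instruments are not
# specified in this abstract.
HUE_INSTRUMENTS = [
    (30.0,  "oboe"),       # reds
    (90.0,  "trumpet"),    # yellows
    (150.0, "flute"),      # greens
    (210.0, "clarinet"),   # cyans
    (270.0, "piano"),      # blues
    (360.0, "violin"),     # magentas, wrapping back towards red
]

def sonify_pixel(r, g, b):
    """Map an RGB pixel (0-255 per channel) to an illustrative
    (instrument, MIDI pitch, MIDI velocity) triple."""
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    instrument = next(name for bound, name in HUE_INSTRUMENTS if hue_deg <= bound)
    pitch = 48 + int(l * 36)      # luminosity -> pitch (assumed role), roughly C3..C6
    velocity = int(s * 127)       # saturation -> loudness (assumed role)
    return instrument, pitch, velocity

print(sonify_pixel(200, 40, 40))  # -> ('oboe', 64, 84): a saturated red
```

Similarly, the combination of conspicuity maps (including the depth-gradient map) into a final saliency map could in principle be a weighted sum whose weights are estimated automatically from each map. The uniqueness-style weighting below is an assumption made for illustration; the authors' actual estimation method, feature extraction and depth-gradient computation are not detailed in this abstract.

```python
import numpy as np

def map_weight(cmap):
    """Assumed weighting heuristic: a conspicuity map whose global maximum
    stands out from its mean activation receives a larger weight (in the
    spirit of Itti-style normalisation); the authors' actual estimation
    method is not given in this abstract."""
    return float((cmap.max() - cmap.mean()) ** 2)

def saliency_map(conspicuity_maps):
    """Combine conspicuity maps (each HxW, scaled to [0, 1]) into one
    saliency map by a normalised weighted sum."""
    weights = np.array([map_weight(c) for c in conspicuity_maps])
    if weights.sum() == 0:
        weights = np.full(len(conspicuity_maps), 1.0)
    weights /= weights.sum()
    combined = sum(w * c for w, c in zip(weights, conspicuity_maps))
    peak = combined.max()
    return combined / peak if peak > 0 else combined

# Toy usage with random maps standing in for intensity, colour, orientation
# and depth-gradient conspicuity maps.
rng = np.random.default_rng(0)
maps = [rng.random((120, 160)) for _ in range(4)]
print(saliency_map(maps).shape)   # (120, 160)
```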