See ColOr: Seeing Colours with an Orchestra

  • Authors:
  • Benoît Deville; Guido Bologna; Michel Vinckenbosch; Thierry Pun

  • Affiliations:
  • Computer Vision and Multimedia Lab, University of Geneva, Carouge, Switzerland CH-1227; Laboratoire d'Informatique Industrielle, University of Applied Science, Geneva, Switzerland CH-1202; Laboratoire d'Informatique Industrielle, University of Applied Science, Geneva, Switzerland CH-1202; Computer Vision and Multimedia Lab, University of Geneva, Carouge, Switzerland CH-1227

  • Venue:
  • Human Machine Interaction
  • Year:
  • 2009

Abstract

The See ColOr interface transforms a small portion of a coloured video image into sound sources represented by spatialised musical instruments. The conversion of colours into sounds is achieved by quantising the HSL (Hue, Saturation and Luminosity) colour space. Our purpose is to enable visually impaired individuals to perceive their environment in real time. In this work we present the system's design principles and several experiments carried out with blindfolded participants. The goal of the first experiment was to identify the colours of the main features in static pictures in order to interpret the image scenes. Participants found that colours were helpful in narrowing down the possible interpretations of an image. Two further experiments were then performed with a head-mounted camera. The first pertains to object manipulation and is based on the pairing of coloured socks; the second concerns outdoor navigation, with the goal of following a coloured sinuous path painted on the ground. The socks experiment demonstrated that blindfolded individuals were able to accurately match pairs of coloured socks, and the same participants successfully followed a red serpentine path painted on the ground for more than 80 metres. Finally, we propose an original approach to a real-time alerting system based on the detection of visually salient parts in videos. The particularity of our approach lies in the use of a new feature map constructed from the depth gradient. From the computed feature maps we infer conspicuity maps that indicate areas appreciably different from their surroundings. A specific distance function is then described, which takes into account both the limitations of the stereoscopic camera and the user's choices. We also report how we automatically estimate the relative contribution of each conspicuity map, which enables unsupervised determination of the final saliency map indicating the visual salience of every point in the image. We demonstrate that this additional depth-based feature map allows the system to detect salient regions with good accuracy in most situations, even in the presence of noisy disparity maps.
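
The abstract does not specify the quantisation boundaries or the instrument assignments used by See ColOr, so the sketch below is only a minimal illustration of the general idea: hue is quantised into a small set of instrument timbres, with luminosity and saturation assumed to drive pitch and loudness. All thresholds, instrument names and parameter roles here are assumptions, not the authors' mapping.

```python
import colorsys

# Hypothetical hue-to-instrument table (upper hue bound in degrees, timbre);
# the actual See ColOr quantisation boundaries and instruments are not
# specified in this abstract.
HUE_INSTRUMENTS = [
    (30.0,  "oboe"),       # reds
    (90.0,  "trumpet"),    # yellows
    (150.0, "flute"),      # greens
    (210.0, "clarinet"),   # cyans
    (270.0, "piano"),      # blues
    (360.0, "violin"),     # magentas, wrapping back towards red
]

def sonify_pixel(r, g, b):
    """Map an RGB pixel (0-255 per channel) to an illustrative
    (instrument, MIDI pitch, MIDI velocity) triple."""
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    instrument = next(name for bound, name in HUE_INSTRUMENTS if hue_deg <= bound)
    pitch = 48 + int(l * 36)      # luminosity -> pitch (assumed role), roughly C3..C6
    velocity = int(s * 127)       # saturation -> loudness (assumed role)
    return instrument, pitch, velocity

print(sonify_pixel(200, 40, 40))  # -> ('oboe', 64, 84): a saturated red
```

Similarly, the combination of conspicuity maps (including the depth-gradient map) into a final saliency map could in principle be a weighted sum whose weights are estimated automatically from each map. The uniqueness-style weighting below is an assumption made for illustration; the authors' actual estimation method, feature extraction and depth-gradient computation are not detailed in this abstract.

```python
import numpy as np

def map_weight(cmap):
    """Assumed weighting heuristic: a conspicuity map whose global maximum
    stands out from its mean activation receives a larger weight (in the
    spirit of Itti-style normalisation); the authors' actual estimation
    method is not given in this abstract."""
    return float((cmap.max() - cmap.mean()) ** 2)

def saliency_map(conspicuity_maps):
    """Combine conspicuity maps (each HxW, scaled to [0, 1]) into one
    saliency map by a normalised weighted sum."""
    weights = np.array([map_weight(c) for c in conspicuity_maps])
    if weights.sum() == 0:
        weights = np.full(len(conspicuity_maps), 1.0)
    weights /= weights.sum()
    combined = sum(w * c for w, c in zip(weights, conspicuity_maps))
    peak = combined.max()
    return combined / peak if peak > 0 else combined

# Toy usage with random maps standing in for intensity, colour, orientation
# and depth-gradient conspicuity maps.
rng = np.random.default_rng(0)
maps = [rng.random((120, 160)) for _ in range(4)]
print(saliency_map(maps).shape)   # (120, 160)
```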