This paper presents a system that aims to give visually impaired persons direct perceptual access to images via an acoustic signal. The user actively explores the image on a touch screen and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: deciding which image information is most useful and relevant, and encoding as much of that information as possible in an audio signal. We address both problems and propose a general approach that combines low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man-made" versus "natural". We argue that this multi-level approach gives users direct access to what is where in the image, while still exploiting the potential of recent developments in Computer Vision and Machine Learning.
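To make the low-level part of this pipeline concrete, the following is a minimal sketch of how features at the touched position could be mapped to audio parameters. The window size, the gradient-based edge measure, and the brightness-to-pitch / edge-to-loudness mapping are illustrative assumptions, not the mapping actually used by the system described here.

```python
import numpy as np

def local_features(img, x, y, r=4):
    """Mean intensity and edge strength in a (2r+1)x(2r+1) window
    around the touch position (x, y) of a grayscale image.
    The window size r is an illustrative choice."""
    patch = img[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1].astype(float)
    mean = float(patch.mean())
    gy, gx = np.gradient(patch)                 # simple finite-difference gradients
    edge = float(np.hypot(gx, gy).mean())       # average gradient magnitude
    return mean, edge

def to_audio_params(mean, edge, f_lo=200.0, f_hi=2000.0):
    """Hypothetical sonification mapping: brightness controls pitch,
    edge strength controls loudness."""
    freq = f_lo + (f_hi - f_lo) * mean / 255.0  # dark -> low pitch, bright -> high
    amp = min(1.0, edge / 64.0)                 # stronger edges -> louder signal
    return freq, amp

# Synthetic test image: dark left half, bright right half.
img = np.zeros((32, 32), dtype=np.uint8)
img[:, 16:] = 255

print(to_audio_params(*local_features(img, 4, 16)))   # uniform dark region
print(to_audio_params(*local_features(img, 16, 16)))  # on the vertical edge
```

A real system would synthesize a continuous tone from these parameters as the finger moves, and overlay the mid- and high-level cues (object labels, "man-made" vs. "natural") as additional auditory channels such as speech or timbre changes.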