Towards computer-vision software tools to increase production and accessibility of video description for people with vision loss

  • Authors:
  • Langis Gagnon;Samuel Foucher;Maguelonne Heritier;Marc Lalonde;David Byrns;Claude Chapdelaine;James Turner;Suzanne Mathieu;Denis Laurendeau;Nath Tan Nguyen;Denis Ouellet

  • Affiliations:
  • Computer Research Institute of Montreal (CRIM), R&D Department, 550 Sherbrooke West, Suite 100, H3A 1B9, Montreal, QC, Canada;Computer Research Institute of Montreal (CRIM), R&D Department, 550 Sherbrooke West, Suite 100, H3A 1B9, Montreal, QC, Canada;Computer Research Institute of Montreal (CRIM), R&D Department, 550 Sherbrooke West, Suite 100, H3A 1B9, Montreal, QC, Canada;Computer Research Institute of Montreal (CRIM), R&D Department, 550 Sherbrooke West, Suite 100, H3A 1B9, Montreal, QC, Canada;Computer Research Institute of Montreal (CRIM), R&D Department, 550 Sherbrooke West, Suite 100, H3A 1B9, Montreal, QC, Canada;Computer Research Institute of Montreal (CRIM), R&D Department, 550 Sherbrooke West, Suite 100, H3A 1B9, Montreal, QC, Canada;Université de Montréal, École de bibliothéconomie et des sciences de l’information, H3C 3J7, Montreal, QC, Canada;Université de Montréal, École de bibliothéconomie et des sciences de l’information, H3C 3J7, Montreal, QC, Canada;Laval University, Department of Electrical and Computer Engineering, G1K 7P4, Quebec, QC, Canada;Laval University, Department of Electrical and Computer Engineering, G1K 7P4, Quebec, QC, Canada;Laval University, Department of Electrical and Computer Engineering, G1K 7P4, Quebec, QC, Canada

  • Venue:
  • Universal Access in the Information Society
  • Year:
  • 2009

Abstract

This paper presents the status of an R&D project targeting the development of computer-vision tools to assist humans in generating and rendering video description for people with vision loss. Three principal issues are discussed: (1) production practices, (2) needs of people with vision loss, and (3) current system design, core technologies and implementation. The paper provides the main conclusions of consultations with producers of video description regarding their practices and with end users regarding their needs, as well as an analysis of described productions that led to a proposed video description typology. The current status of a prototype software tool (audio-vision manager) is also presented. It uses several computer-vision technologies (shot transition detection, key-frame identification, key-face recognition, key-text spotting, visual motion, gait/gesture characterization, key-place identification, key-object spotting and image categorization) to automatically extract visual content, associate textual descriptions with it, and add them to the audio track with a synthetic voice. A proof of concept is also briefly described for a first adaptive video description player, which allows end users to select various levels of video description.
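
To make the first technology in the list above more concrete, here is a minimal sketch of shot transition detection. It is not the method used in the audio-vision manager, which the abstract does not specify. It uses a common textbook approach, comparing intensity histograms of consecutive frames and flagging a hard cut when they differ strongly. The names `histogram` and `detect_cuts`, the 16-bin histogram, the 0.5 threshold, and the representation of a frame as a flat list of grayscale values are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# Assumption: a frame is a flat sequence of grayscale pixel values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = len(frame) or 1  # avoid division by zero on an empty frame
    return [count / total for count in counts]


def detect_cuts(
    frames: Sequence[Frame], threshold: float = 0.5, bins: int = 16
) -> List[int]:
    """Return each index i where a hard cut is assumed between frames i-1 and i."""
    cuts: List[int] = []
    previous: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        current = histogram(frame, bins)
        if previous is not None:
            # L1 distance between two normalized histograms lies in [0, 2].
            distance = sum(abs(a - b) for a, b in zip(current, previous))
            if distance > threshold:
                cuts.append(index)
        previous = current
    return cuts


if __name__ == "__main__":
    dark = [10] * 64     # hypothetical 8x8 dark frame
    bright = [240] * 64  # hypothetical 8x8 bright frame
    print(detect_cuts([dark, dark, bright, bright]))
```

A fixed threshold on histogram distance only handles abrupt cuts. Gradual transitions such as fades and dissolves would need a different criterion, and the threshold would have to be tuned on real footage.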