Improving speech playback using time-compression and speech recognition

  • Authors:
  • Sunil Vemuri; Philip DeCamp; Walter Bender; Chris Schmandt

  • Affiliations:
  • MIT Media Lab, Cambridge, MA; MIT Media Lab, Cambridge, MA; MIT Media Lab, Cambridge, MA; MIT Media Lab, Cambridge, MA

  • Venue:
  • Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
  • Year:
  • 2004

Abstract

Despite the ready availability of digital recording technology and the continually decreasing cost of digital storage, browsing audio recordings remains a tedious task. This paper presents evidence in support of a system designed to assist with information comprehension and retrieval tasks over a large collection of recorded speech. Two techniques are employed to assist users with these tasks. First, a speech recognizer creates necessarily error-laden transcripts of the recorded speech. Second, audio playback is time-compressed using the SOLAFS technique. When the two techniques are used together, subjects are able to perform comprehension tasks with greater speed and accuracy.
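
To make the time-compression step more concrete, the following is a minimal sketch of overlap-add time compression with a fixed synthesis step and a small alignment search, which is the general idea behind SOLAFS-style methods. It is not the authors' implementation and not the exact SOLAFS algorithm: the function name `time_compress`, the parameters `window`, `overlap`, and `search`, their default values, the unnormalized cross-correlation score, and the linear cross-fade are all assumptions chosen for illustration.

```python
import math


def time_compress(samples, rate, window=400, overlap=100, search=100):
    """Shorten `samples` by roughly `rate` (2.0 = about half the duration).

    Illustrative sketch only: output segments are laid down at a fixed
    synthesis step, and each input segment is picked near its nominal
    position so that it lines up with the audio already written.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not 0 < overlap < window:
        raise ValueError("overlap must be between 0 and window")
    if len(samples) < window:
        return list(samples)

    hop_out = window - overlap        # fixed synthesis step (output side)
    hop_in = hop_out * rate           # nominal analysis step (input side)
    out = list(samples[:window])      # first segment is copied unchanged
    k = 1
    while True:
        nominal = int(round(k * hop_in))
        if nominal + search + window > len(samples):
            break                     # not enough input left for a full search
        tail = out[-overlap:]         # region the next segment will overlap
        # Choose the shift whose first `overlap` samples correlate best
        # with the tail of the output (unnormalized cross-correlation).
        best_shift, best_score = 0, float("-inf")
        for shift in range(search + 1):
            start = nominal + shift
            candidate = samples[start:start + overlap]
            score = sum(a * b for a, b in zip(tail, candidate))
            if score > best_score:
                best_shift, best_score = shift, score
        start = nominal + best_shift
        segment = samples[start:start + window]
        # Linear cross-fade from the old tail into the new segment.
        for i in range(overlap):
            w = i / overlap
            out[-overlap + i] = (1.0 - w) * tail[i] + w * segment[i]
        out.extend(segment[overlap:])
        k += 1
    return out


# Usage: one second of a 200 Hz tone at an assumed 8 kHz sampling rate.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
fast = time_compress(tone, rate=2.0)
print(len(tone), len(fast))  # 8000 4000
```

Searching for a well-aligned segment, rather than simply dropping every other block of samples, is what lets this family of methods shorten speech without shifting its pitch or introducing audible clicks at the segment boundaries.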