Searching in audio: the utility of transcripts, dichotic presentation, and time-compression

Authors:
Abhishek Ranjan;Ravin Balakrishnan;Mark Chignell
Affiliations:
University of Toronto;University of Toronto;University of Toronto
Venue:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Year:
2006


Abstract

Searching audio data can potentially be facilitated by using automatic speech recognition (ASR) technology to generate text transcripts that can then be queried easily. However, because current ASR technology cannot reliably produce fully accurate transcripts, additional techniques for fluid browsing and searching of the audio itself are required. We explore the impact of transcripts of varying quality, dichotic presentation, and time-compression on an audio search task. Results show that dichotic presentation and reasonably accurate transcripts can assist the search process, but suggest that time-compression and low-accuracy transcripts should be used with care.
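The abstract names three aids to audio search without describing how they were implemented. The sketch below only illustrates the general ideas and is not the authors' system. It assumes audio is a plain list of mono samples and that a transcript is a list of (word, start time) pairs. The function names, the half-and-half split for dichotic playback, and the interval-dropping form of time-compression are all assumptions made for this example.

```python
from typing import List, Tuple


def find_in_transcript(words: List[Tuple[str, float]], query: str) -> List[float]:
    """Return start times (seconds) of transcript words equal to the query.

    Matching is case-insensitive and exact; an ASR error in a word
    simply causes a miss, which is why transcript accuracy matters.
    """
    q = query.lower()
    return [start for word, start in words if word.lower() == q]


def dichotic_mix(samples: List[float]) -> List[Tuple[float, float]]:
    """Play two halves of one recording at once, one per ear.

    Assumed layout: first half in the left channel, second half in the
    right channel. Returns (left, right) sample pairs.
    """
    half = (len(samples) + 1) // 2
    left = samples[:half]
    right = samples[half:]
    # Pad the shorter channel with silence so both channels align.
    right = right + [0.0] * (len(left) - len(right))
    return list(zip(left, right))


def time_compress(samples: List[float], chunk: int, keep: int) -> List[float]:
    """Naive time-compression: keep the first `keep` samples of every
    `chunk`-sample interval and discard the rest.

    Real systems typically smooth the joins between kept intervals;
    this sketch does not.
    """
    if not 0 < keep <= chunk:
        raise ValueError("keep must be in the range 1..chunk")
    out: List[float] = []
    for start in range(0, len(samples), chunk):
        out.extend(samples[start:start + keep])
    return out
```

For example, `time_compress(samples, chunk=5, keep=3)` shortens a recording to roughly 60% of its original length, and `dichotic_mix` halves listening time by presenting both halves simultaneously.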