Term clouds as surrogates for user generated speech

Authors:
Manos Tsagkias;Martha Larson;Maarten de Rijke
Affiliations:
University of Amsterdam, Amsterdam, Netherlands;University of Amsterdam, Amsterdam, Netherlands;University of Amsterdam, Amsterdam, Netherlands
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Abstract

User-generated spoken audio remains a challenge for Automatic Speech Recognition (ASR) technology, and content-based audio surrogates derived from ASR transcripts must therefore be robust to recognition errors. An investigation of term clouds as surrogates for podcasts demonstrates that term clouds built from ASR transcripts closely approximate those derived from human-generated transcripts across a range of cloud sizes. A user study confirms that ASR clouds are viable surrogates for depicting the content of podcasts.
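
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state how terms were weighted or how clouds were compared, so everything below is an assumption made for illustration: terms are ranked by raw frequency after removing a tiny hypothetical stopword list, a cloud is the set of the top-k terms, and two clouds are compared by Jaccard overlap. The names `term_cloud`, `cloud_overlap`, and `overlap_by_size` are likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical minimal stopword list (assumption); a real system would use a fuller one.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "on", "for"}
)


def term_cloud(text, size):
    """Return the `size` most frequent non-stopword terms in `text` as a set.

    Raw frequency is an assumed weighting, not necessarily the paper's.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return {term for term, _ in counts.most_common(size)}


def cloud_overlap(cloud_a, cloud_b):
    """Jaccard overlap between two term clouds (1.0 means identical sets)."""
    if not cloud_a and not cloud_b:
        return 1.0
    return len(cloud_a & cloud_b) / len(cloud_a | cloud_b)


def overlap_by_size(reference_text, asr_text, sizes):
    """Compare reference-transcript and ASR-transcript clouds at several cloud sizes."""
    return {
        k: cloud_overlap(term_cloud(reference_text, k), term_cloud(asr_text, k))
        for k in sizes
    }
```

Calling `overlap_by_size(human_transcript, asr_transcript, [10, 25, 50])` would give one overlap score per cloud size. Under these assumptions, a score that stays high as the cloud grows would suggest that recognition errors mostly affect terms too infrequent to enter the cloud.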