Faceted search and browsing of audio content on spoken web

  • Authors:
  • Mamadou Diao;Sougata Mukherjea;Nitendra Rajput;Kundan Srivastava

  • Affiliations:
  • Georgia Institute of Technology, Atlanta, GA, USA;IBM Research, New Delhi, India;IBM Research, New Delhi, India;IBM Research, New Delhi, India

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Spoken Web is a web of VoiceSites that can be accessed by a phone. The content in a VoiceSite is audio. Therefore Spoken Web provides an alternate to the World Wide Web (WWW) in developing regions where low Internet penetration and low literacy are barriers to accessing the conventional WWW. Searching of audio content in Spoken Web through an audio query-result interface presents two key challenges: indexing of audio content is not accurate, and the presentation of results in audio is sequential, and therefore cumbersome. In this paper, we apply the concepts of faceted search and browsing to the SpokenWeb search problem. We use the concepts of facets to index the meta-data associated with the audio content. We provide a mechanism to rank the facets based on the search results. We develop an interactive query interface that enables easy browsing of search results through the top ranked facets. To our knowledge, this is the first system to use the concepts of facets in audio search, and the first solution that provides an audio search for the rural population. We present quantitative results to illustrate the accuracy and effectiveness of the faceted search and qualitative results to highlight the usability of the interactive browsing system. The experiments have been conducted on more than 4000 audio documents collected from a live SpokenWeb VoiceSite and evaluations were carried out with 40 farmers who are the target users of the VoiceSite.