Two-stream indexing for spoken web search

Authors:
Jitendra Ajmera;Anupam Joshi;Sougata Mukherjea;Nitendra Rajput;Shrey Sahay;Mayank Shrivastava;Kundan Srivastava
Affiliations:
IBM Research, New Delhi, India;University of Maryland, Baltimore County, Baltimore, MD, USA;IBM Research, New Delhi, India;IBM Research, New Delhi, India;IBM Research, New Delhi, India;Indian Institute of Technology, Kharagpur, India;IBM Research, New Delhi, India
Venue:
Proceedings of the 20th international conference companion on World wide web
Year:
2011

Abstract

This paper presents a two-stream approach to indexing audio content for Spoken Web search. The first stream indexes the meta-data associated with each audio document. This meta-data is usually very sparse but accurate, so it yields a high-precision, low-recall index. The second stream uses a novel language-independent speech recognition technique to generate text to be indexed. Owing to the multiple languages and the noise in user-generated content on the Spoken Web, the accuracy of such speech recognition is not high, so it yields a low-precision, high-recall index. The paper uses these two complementary streams to generate a combined index, with the aim of improving precision-recall performance in audio content search. The problem of audio content search is motivated by the real-world role of the Web in developing regions, where, owing to literacy and affordability constraints, people use the Spoken Web, a network of interconnected VoiceSites whose content is in audio. The experiments are based on more than 20,000 audio documents spanning seven live VoiceSites and four different languages. The results suggest a significant improvement over a meta-data-only or a speech-recognition-only system, supporting the two-stream approach. Audio content search is a growing problem area, and this paper aims to be a first step toward solving it at large scale, across languages, in a Web context.
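
The idea of combining a sparse but accurate meta-data index with a noisy but broad speech-recognition index can be illustrated with a minimal sketch. The abstract does not describe how the paper builds or merges its indexes, so everything here is an assumption made for illustration: the function names, the term-count scoring, and the linear weighting (`meta_weight`, `asr_weight`) are hypothetical and are not the authors' method.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: token -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index


def search(index, query):
    """Score each document by summed term counts over the query tokens."""
    scores = defaultdict(float)
    for token in query.lower().split():
        for doc_id, count in index.get(token, {}).items():
            scores[doc_id] += count
    return dict(scores)


def two_stream_search(meta_index, asr_index, query,
                      meta_weight=0.7, asr_weight=0.3):
    """Merge both streams with a weighted sum (weights are assumed values).

    Meta-data hits are weighted higher because that stream is described as
    high precision; the speech-recognition stream adds recall.
    """
    meta_scores = search(meta_index, query)
    asr_scores = search(asr_index, query)
    combined = {}
    for doc_id in set(meta_scores) | set(asr_scores):
        combined[doc_id] = (meta_weight * meta_scores.get(doc_id, 0.0)
                            + asr_weight * asr_scores.get(doc_id, 0.0))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))
```

In this sketch, a document matched by both streams ranks above one matched only by the noisy transcript, while documents with no meta-data match can still be retrieved through the transcript stream.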