Spoken document retrieval from call-center conversations

Authors:
Jonathan Mamou;David Carmel;Ron Hoory
Affiliations:
IBM Haifa Research Labs, Haifa, Israel;IBM Haifa Research Labs, Haifa, Israel;IBM Haifa Research Labs, Haifa, Israel
Venue:
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2006

Citing 5
Cited 15

Open-vocabulary speech indexing for voice and video mail retrieval

MULTIMEDIA '96 Proceedings of the fourth ACM international conference on Multimedia
Document expansion for speech retrieval

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Automatic query refinement using lexical affinities with maximal information gain

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Automatic analysis of call-center conversations

Proceedings of the 14th ACM international conference on Information and knowledge management
Position specific posterior lattices for indexing speech

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics

An approximate multi-word matching algorithm for robust document retrieval

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Vocabulary independent spoken term detection

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Indexing confusion networks for morph-based spoken document retrieval

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Natural language processing for information retrieval: the time is ripe (again)

Proceedings of the ACM first Ph.D. workshop in CIKM
Automatic call section segmentation for contact-center calls

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
A lattice-based approach to query-by-example spoken document retrieval

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Combining LVCSR and vocabulary-independent ranked utterance retrieval for robust speech search

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Statistical lattice-based spoken document retrieval

ACM Transactions on Information Systems (TOIS)
Performance analysis for lattice-based speech indexing approaches using words and subword units

IEEE Transactions on Audio, Speech, and Language Processing
Speech retrieval from unsegmented Finnish audio using statistical morpheme-like units for segmentation, recognition, and retrieval

ACM Transactions on Speech and Language Processing (TSLP)
Leveraging word confusion networks for named entity modeling and detection from conversational telephone speech

Speech Communication
Beyond shot retrieval: searching for broadcast news items using language models of concepts

ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Spoken Content Retrieval: A Survey of Techniques and Technologies

Foundations and Trends in Information Retrieval
Linking transcribed conversational speech

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
An approach for efficient open vocabulary spoken term detection

Speech Communication

Abstract

We are interested in retrieving information from conversational speech corpora, such as call-center data. Such data consists of spontaneous conversations recorded at low quality, which makes automatic speech recognition (ASR) highly difficult. For typical call-center data, even state-of-the-art large-vocabulary continuous speech recognition systems produce transcripts with a word error rate of 30% or higher. In addition to the output transcript, advanced systems provide word confusion networks (WCNs), a compact representation of word lattices that associates each word hypothesis with its posterior probability. Our work exploits the information provided by WCNs to improve retrieval performance. In this paper, we show that mean average precision (MAP) improves when WCNs are used instead of the raw word transcripts. Finally, we analyze the effect of increasing ASR word error rate on search effectiveness and show that MAP remains reasonable even under extremely high error rates.
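
To make the two central notions of the abstract concrete, the following is a minimal sketch in Python, not the authors' indexing or ranking method. It assumes a WCN can be modeled as a list of slots, each mapping a word hypothesis to its posterior probability; the toy data and the function names `one_best`, `expected_counts`, and `average_precision` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Assumption: a WCN is a list of slots; each slot maps a word hypothesis
# to its posterior probability. The values below are made-up toy data.
wcn = [
    {"reset": 0.6, "recent": 0.4},
    {"my": 0.9, "by": 0.1},
    {"password": 0.55, "passport": 0.45},
]

def one_best(wcn):
    """Return the 1-best transcript: the top hypothesis of every slot."""
    return [max(slot, key=slot.get) for slot in wcn]

def expected_counts(wcn):
    """Sum posteriors per word across slots (a soft term frequency)."""
    counts = defaultdict(float)
    for slot in wcn:
        for word, posterior in slot.items():
            counts[word] += posterior
    return dict(counts)

def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list; MAP is its mean over queries."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0
```

In this toy example, "passport" does not appear in the 1-best transcript, yet it remains searchable through the WCN with a weight equal to its posterior. This illustrates the general way in which alternative hypotheses can recover terms that the raw transcript misses.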