We are interested in retrieving information from conversational speech corpora, such as call-center data. Such data consists of spontaneous conversations recorded at low quality, which makes automatic speech recognition (ASR) highly difficult. For typical call-center data, even state-of-the-art large-vocabulary continuous speech recognition (LVCSR) systems produce transcripts with a word error rate (WER) of 30% or higher. In addition to the output transcript, advanced systems provide word confusion networks (WCNs), a compact representation of word lattices that associates each word hypothesis with its posterior probability. Our work exploits the information provided by WCNs to improve retrieval performance. We show that retrieval over WCNs yields higher mean average precision (MAP) than retrieval over the raw word transcripts. Finally, we analyze the effect of increasing ASR word error rate on search effectiveness and show that MAP remains reasonable even at extremely high error rates.
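To make the idea concrete, the following is a minimal sketch of how posterior probabilities from a WCN might feed an inverted index, and how average precision could then be compared against a transcript-only index. It is not the paper's implementation: the toy calls, the slot posteriors, the expected-count scoring, and all function names (`expected_counts`, `build_index`, `search`, `one_best`, `average_precision`) are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each call is a WCN, modeled as a list of slots.
# Each slot maps competing word hypotheses to their posterior probabilities.
docs = {
    "call1": [{"cancel": 0.6, "council": 0.4}, {"my": 1.0},
              {"account": 0.7, "amount": 0.3}],
    "call2": [{"check": 0.9, "cheque": 0.1}, {"my": 1.0},
              {"amount": 0.55, "account": 0.45}],
    "call3": [{"reset": 0.8, "recent": 0.2}, {"password": 1.0}],
}

def expected_counts(wcn):
    """Sum each word's posterior over all slots (its expected count)."""
    counts = defaultdict(float)
    for slot in wcn:
        for word, posterior in slot.items():
            counts[word] += posterior
    return dict(counts)

def one_best(wcn):
    """Reduce a WCN to its top hypothesis per slot, i.e. a plain transcript."""
    return [{max(slot, key=slot.get): 1.0} for slot in wcn]

def build_index(collection):
    """Inverted index: word -> {doc_id: expected count}."""
    index = defaultdict(dict)
    for doc_id, wcn in collection.items():
        for word, count in expected_counts(wcn).items():
            index[word][doc_id] = count
    return index

def search(index, query):
    """Rank documents by the summed expected counts of the query terms."""
    scores = defaultdict(float)
    for term in query:
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def average_precision(ranking, relevant):
    """Average of precision values at the ranks of the relevant documents."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

wcn_index = build_index(docs)
best_index = build_index({d: one_best(w) for d, w in docs.items()})

relevant = {"call1", "call2"}  # assumed ground truth for the query "account"
wcn_ranking = search(wcn_index, ["account"])
best_ranking = search(best_index, ["account"])
print(wcn_ranking, average_precision(wcn_ranking, relevant))
print(best_ranking, average_precision(best_ranking, relevant))
```

In this toy collection the top hypothesis for the last slot of `call2` is "amount", so a transcript-only index misses that call for the query "account", while the WCN index still retrieves it with a lower weight. MAP is the mean of `average_precision` over a set of queries; a single query is used here only to keep the sketch short.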