Recent research efforts on spoken document retrieval have tried to overcome the low quality of 1-best automatic speech recognition transcripts, especially in the case of conversational speech, by using statistics derived from speech lattices containing multiple transcription hypotheses as output by a speech recognizer. We present a method for lattice-based spoken document retrieval based on a statistical n-gram modeling approach to information retrieval. In this statistical lattice-based retrieval (SLBR) method, a smoothed statistical model is estimated for each document from the expected counts of words given the information in a lattice, and the relevance of each document to a query is measured as a probability under such a model. We investigate the efficacy of our method under various parameter settings of the speech recognition and lattice processing engines, using the Fisher English Corpus of conversational telephone speech. Experimental results show that our method consistently achieves better retrieval performance than using only the 1-best transcripts in statistical retrieval, outperforms a recently proposed lattice-based vector space retrieval method, and also compares favorably with a lattice-based retrieval method based on the Okapi BM25 model.
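
The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions the abstract does not state: a unigram model (the paper works with n-gram models in general), Jelinek-Mercer interpolation with a collection-wide background model as the smoothing method, and expected word counts that have already been extracted from each lattice and are supplied as plain dictionaries. The function names, the `lam` weight, and the `floor` probability are hypothetical.

```python
import math


def collection_model(doc_counts):
    """Background unigram model pooled over all documents' expected counts."""
    total = {}
    for counts in doc_counts.values():
        for word, c in counts.items():
            total[word] = total.get(word, 0.0) + c
    mass = sum(total.values())
    return {word: c / mass for word, c in total.items()}


def query_log_likelihood(query, counts, background, lam=0.5, floor=1e-9):
    """Log probability of the query under a smoothed document model.

    counts: expected word counts for one document (assumed to come from a
            lattice; may be fractional).
    lam:    interpolation weight on the document model (assumed value).
    floor:  probability given to words unseen in the whole collection
            (an assumption made to keep the logarithm finite).
    """
    doc_mass = sum(counts.values())
    score = 0.0
    for word in query:
        p_doc = counts.get(word, 0.0) / doc_mass if doc_mass > 0 else 0.0
        p_bg = background.get(word, floor)
        score += math.log(lam * p_doc + (1.0 - lam) * p_bg)
    return score


def rank(query, doc_counts, lam=0.5):
    """Return document ids sorted by decreasing query log-likelihood."""
    background = collection_model(doc_counts)
    scores = {
        doc_id: query_log_likelihood(query, counts, background, lam)
        for doc_id, counts in doc_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy expected counts (invented for illustration, not taken from the paper).
toy_counts = {
    "d1": {"budget": 0.9, "meeting": 0.4},
    "d2": {"weather": 1.2, "meeting": 0.1},
}
ranking = rank(["budget", "meeting"], toy_counts)
```

Because expected counts are fractional, a word that appears only in a low-probability lattice path still contributes a small amount of evidence to the document model, which is the intuition behind using lattices rather than 1-best transcripts.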