Recent work on spoken document retrieval (SDR) has made use of speech lattices. Lattices contain alternative transcription hypotheses beyond the 1-best transcript, and this information can improve retrieval accuracy by compensating for recognition errors in the 1-best transcription. In this paper, we use lattices for the query-by-example spoken document retrieval task: retrieving documents from a speech corpus where the queries are themselves complete spoken documents (query exemplars). We extend a previously proposed method for SDR with short queries to the query-by-example task. Specifically, we use a retrieval method based on statistical modeling: we compute expected word counts from document and query lattices, estimate statistical models from these counts, and compute relevance scores as divergences between these models. Experimental results on a speech corpus of conversational English show that using lattice statistics for both documents and query exemplars yields better retrieval accuracy than using 1-best transcripts for the documents, the queries, or both. In addition, we investigate the effect of stop word removal, which further improves retrieval accuracy. To our knowledge, our work is the first to use a lattice-based approach to query-by-example spoken document retrieval.
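
The scoring pipeline described above (expected counts, then statistical models, then divergence-based relevance) can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the expected word counts have already been extracted from the lattices and are given as plain dictionaries of fractional counts. The choice of unigram models, Jelinek-Mercer smoothing with weight `lam`, the direction of the KL divergence (query model against document model), and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
import math

def drop_stopwords(counts, stopwords):
    """Remove stop words from a dict of expected word counts."""
    return {w: c for w, c in counts.items() if w not in stopwords}

def collection_model(all_counts):
    """Background unigram model pooled over every count dict given."""
    totals = {}
    for counts in all_counts:
        for w, c in counts.items():
            totals[w] = totals.get(w, 0.0) + c
    mass = sum(totals.values())
    if mass == 0:
        return {}
    return {w: c / mass for w, c in totals.items()}

def smoothed_model(counts, background, lam=0.5):
    """Unigram model over the background vocabulary.

    Jelinek-Mercer interpolation is an assumed smoothing choice;
    it keeps every vocabulary word at nonzero probability.
    """
    total = sum(counts.values())
    model = {}
    for w, p_bg in background.items():
        p_ml = counts.get(w, 0.0) / total if total > 0 else 0.0
        model[w] = (1.0 - lam) * p_ml + lam * p_bg
    return model

def kl_divergence(p, q):
    """KL(p || q) for two models defined over the same vocabulary."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0.0)

def rank(query_counts, docs, stopwords=frozenset(), lam=0.5):
    """Rank documents by negative KL divergence from the query model."""
    query_counts = drop_stopwords(query_counts, stopwords)
    docs = {d: drop_stopwords(c, stopwords) for d, c in docs.items()}
    background = collection_model([query_counts, *docs.values()])
    q_model = smoothed_model(query_counts, background, lam)
    scores = {
        d: -kl_divergence(q_model, smoothed_model(c, background, lam))
        for d, c in docs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy expected counts (made up for illustration, not data from the paper).
# Fractional values stand in for posterior-weighted counts from lattices.
query = {"flight": 1.6, "ticket": 0.9, "the": 2.0}
documents = {
    "d1": {"flight": 2.2, "ticket": 1.1, "the": 3.0},
    "d2": {"weather": 1.8, "rain": 1.4, "the": 2.5},
}
ranking = rank(query, documents, stopwords={"the"})
```

Replacing the fractional counts with integer counts from 1-best transcripts would give the 1-best baseline under the same scoring function, which is the comparison the abstract describes.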