Automatic text processing
Phonetic confusion matrix based spoken document retrieval
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Probabilistic models of information retrieval based on measuring the divergence from randomness
ACM Transactions on Information Systems (TOIS)
Term Weighting Approaches in Automatic Text Retrieval
Term Weighting Approaches in Automatic Text Retrieval
High-performance, open-domain question answering from large text collections
High-performance, open-domain question answering from large text collections
Algorithms for language reconstruction
Algorithms for language reconstruction
A new algorithm for the alignment of phonetic sequences
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Investigating cross-language speech retrieval for a spontaneous conversational speech collection
NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Using various indexing schemes and multiple translations in the CL-SR task at CLEF 2005
CLEF'05 Proceedings of the 6th international conference on Cross-Language Evalution Forum: accessing Multilingual Information Repositories
CLEF-2005 CL-SR at maryland: document and query expansion using side collections and thesauri
CLEF'05 Proceedings of the 6th international conference on Cross-Language Evalution Forum: accessing Multilingual Information Repositories
Overview of the CLEF-2006 cross-language speech retrieval track
CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Dublin City University at CLEF 2006: cross-language speech retrieval (CL-SR) experiments
CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Robust question answering for speech transcripts: UPC experience in QAst 2009
CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
Combining word and phonetic-code representations for spoken document retrieval
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II
Sibyl, a factoid question-answering system for spoken documents
ACM Transactions on Information Systems (TOIS)
Hi-index | 0.00 |
This paper presents a new approach to spoken document information retrieval for spontaneous speech corpora. The classical approach to this problem is the use of an automatic speech recognizer (ASR) combined with standard information retrieval techniques. However, ASRs tend to produce transcripts of spontaneous speech with significant word error rate, which is a drawback for standard retrieval techniques. To overcome such a limitation, our method is based on an approximated sequence alignment algorithm to search "sounds like" sequences. Our approach does not depend on extra information from the ASR and outperforms up to 7 points the precision of state-of-the-art techniques in our experiments.