The traditional approach to spoken document retrieval (SDR) combines an automatic speech recognizer (ASR) with a word-based information retrieval method. This approach has shown only limited accuracy, partly because ASR systems tend to produce transcriptions of spontaneous speech with a significant word error rate. To overcome this limitation, we propose a method that combines word and phonetic-code representations. The idea behind this combination is to reduce the impact of transcription errors on the processing of some (presumably complex) queries by representing words with similar pronunciations by the same phonetic code. Experimental results on the CLEF-CLSR-2007 corpus are encouraging: the proposed hybrid method improved the mean average precision and the number of retrieved relevant documents over the traditional word-based approach by 3% and 7%, respectively.
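The combination described above can be illustrated with a minimal sketch. The abstract does not name the phonetic code or the way the two representations are merged, so everything here is an assumption made for illustration: Soundex stands in for the phonetic code, and the hypothetical `hybrid_score` function mixes word overlap and code overlap linearly with a hypothetical `weight` parameter.

```python
# Illustrative sketch only: Soundex and the linear mix are assumptions,
# not the method evaluated in the abstract.

# Standard Soundex consonant groups mapped to digits.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code is emitted again
    return (code + "000")[:4]


def phonetic_terms(words):
    """Map words to Soundex codes, dropping tokens that contain no letters."""
    return [code for code in (soundex(w) for w in words) if code]


def overlap(query_terms, doc_terms):
    """Fraction of distinct query terms that also occur in the document."""
    query_set = set(query_terms)
    if not query_set:
        return 0.0
    return len(query_set & set(doc_terms)) / len(query_set)


def hybrid_score(query, document, weight=0.5):
    """Linear mix of word overlap and phonetic-code overlap (hypothetical)."""
    q_words = query.lower().split()
    d_words = document.lower().split()
    word_score = overlap(q_words, d_words)
    code_score = overlap(phonetic_terms(q_words), phonetic_terms(d_words))
    return (1 - weight) * word_score + weight * code_score


# A transcript in which the recognizer wrote "smyth" instead of "smith".
print(soundex("smith"), soundex("smyth"))
print(hybrid_score("smith interview", "interview with mister smyth"))
```

In this toy case the word-only overlap misses the misrecognized name, while the phonetic codes of "smith" and "smyth" coincide, so the mixed score is higher than the word-only score. A real system would apply the same idea inside a ranked retrieval model rather than a simple overlap measure.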