Experiments for the cross language speech retrieval task at CLEF 2006

Authors:
Muath Alzghool;Diana Inkpen
Affiliations:
School of Information Technology and Engineering, University of Ottawa;School of Information Technology and Engineering, University of Ottawa
Venue:
CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Year:
2006

Citing 9
Cited 1

Term-weighting approaches in automatic text retrieval

Information Processing and Management: an International Journal
An information-theoretic approach to automatic query expansion

ACM Transactions on Information Systems (TOIS)
Probabilistic models of information retrieval based on measuring the divergence from randomness

ACM Transactions on Information Systems (TOIS)
Building an information retrieval test collection for spontaneous conversational speech

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
The design, implementation, and use of the Ngram statistics package

CICLing'03 Proceedings of the 4th international conference on Computational linguistics and intelligent text processing
Using various indexing schemes and multiple translations in the CL-SR task at CLEF 2005

CLEF'05 Proceedings of the 6th international conference on Cross-Language Evaluation Forum: accessing Multilingual Information Repositories
Terrier information retrieval platform

ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Overview of the CLEF-2006 cross-language speech retrieval track

CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Dublin City University at CLEF 2006: cross-language speech retrieval (CL-SR) experiments

CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval

Sibyl, a factoid question-answering system for spoken documents

ACM Transactions on Information Systems (TOIS)

Abstract

This paper describes the University of Ottawa group's second participation in the Cross-Language Speech Retrieval (CL-SR) task, at CLEF 2006. We report the results of the submitted runs for the English collection and, very briefly, for the Czech collection, followed by many additional experiments. We used two information retrieval systems, SMART and Terrier, with several query expansion techniques, including a new method based on log-likelihood scores for collocations. Query expansion did not help much on this collection. We also tested different automatic speech recognition (ASR) transcripts and combinations of them; retrieval results did not improve, probably because the recognition errors affected the words that matter most for retrieval. In the cross-language experiments, queries were translated automatically by combining the output of several online machine translation tools. High-quality automatic translations (for French) led to results comparable with monolingual English, while performance decreased for the other languages. Indexing the manual summaries and keywords gave the best retrieval results.
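
The abstract mentions query expansion based on log-likelihood scores for collocations but gives no details of the method. As a rough illustration only, the sketch below computes the standard log-likelihood ratio (G²) for a word pair from a 2x2 contingency table of bigram counts, then ranks candidate expansion terms by that score. The function names, the count layout, and the ranking step are assumptions made for this example; they are not taken from the paper.

```python
import math


def log_likelihood(n11, n1p, np1, npp):
    """Log-likelihood ratio (G^2) for a word pair (w1, w2).

    n11: number of bigrams "w1 w2"
    n1p: number of bigrams with w1 in first position
    np1: number of bigrams with w2 in second position
    npp: total number of bigrams in the corpus
    """
    # Observed 2x2 contingency table.
    observed = [
        [n11, n1p - n11],
        [np1 - n11, npp - n1p - np1 + n11],
    ]
    row_totals = [n1p, npp - n1p]
    col_totals = [np1, npp - np1]

    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            # Expected count under independence of w1 and w2.
            e = row_totals[i] * col_totals[j] / npp
            if o > 0:  # a zero cell contributes nothing to the sum
                total += o * math.log(o / e)
    return 2.0 * total


def rank_collocates(candidates, npp, top_k=3):
    """Hypothetical helper: rank candidate expansion terms for one query term.

    candidates maps a candidate word to its (n11, n1p, np1) counts.
    """
    scored = [
        (word, log_likelihood(n11, n1p, np1, npp))
        for word, (n11, n1p, np1) in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_k]]
```

A score near zero means the two words co-occur about as often as chance predicts; larger scores indicate a stronger association, so the top-ranked collocates would be the natural candidates to append to a query.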