Benefit of proper language processing for Czech speech retrieval in the CL-SR task at CLEF 2006

Authors:
Pavel Ircing;Luděk Müller
Affiliations:
University of West Bohemia, Faculty of Applied Sciences, Dept. of Cybernetics, Plzeň, Czech Republic;University of West Bohemia, Faculty of Applied Sciences, Dept. of Cybernetics, Plzeň, Czech Republic
Venue:
CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Year:
2006

Citing 5
Cited 2

Stemming algorithms: a case study for detailed evaluation

Journal of the American Society for Information Science - Special issue: evaluation of information retrieval systems
Tagging inflective languages: prediction of morphological categories for a rich, structured tagset

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
One-sided measures for evaluating ranked retrieval effectiveness with spontaneous conversational speech

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Using various indexing schemes and multiple translations in the CL-SR task at CLEF 2005

CLEF'05 Proceedings of the 6th international conference on Cross-Language Evaluation Forum: accessing Multilingual Information Repositories
Overview of the CLEF-2006 cross-language speech retrieval track

CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval

Comparison of different lemmatization approaches through the means of information retrieval performance

TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue
Fast phonetic/lexical searching in the archives of the Czech holocaust testimonies: advancing towards the MALACH project visions

TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue

Abstract

The paper describes the system built by the team from the University of West Bohemia for participation in the CLEF 2006 CL-SR track. We decided to concentrate only on monolingual search in the Czech test collection and to investigate the effect of proper language processing on retrieval performance. We employed a Czech morphological analyser and tagger for that purpose. For the actual search system, we used the classical tf.idf approach with blind relevance feedback as implemented in the Lemur toolkit. The results indicate that suitable linguistic preprocessing is indeed crucial for Czech IR performance.
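
The pipeline named in the abstract (lemmatization, tf.idf scoring, blind relevance feedback) can be illustrated with a minimal sketch. This is not the Lemur implementation and not the authors' system: the `LEMMAS` table is a hypothetical toy stand-in for a real Czech morphological analyser and tagger, the idf formula is one common variant, and the function names and the feedback parameters (`top_docs`, `top_terms`, `beta`) are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy lemma table standing in for a real Czech morphological
# analyser and tagger; the entries are illustrative only.
LEMMAS = {"války": "válka", "válce": "válka", "rodiny": "rodina"}

# Toy collection; a real collection would hold ASR transcripts of speech.
DOCS = ["rodiny války", "válce tábor", "hudba koncert", "tábor osvobození"]


def lemmatize(text):
    """Lowercase, split on whitespace, and map each token to its lemma if known."""
    return [LEMMAS.get(tok, tok) for tok in text.lower().split()]


def build_index(docs):
    """Return per-document lemma counts and an idf value for every lemma."""
    counts = [Counter(lemmatize(d)) for d in docs]
    df = Counter(term for c in counts for term in c)
    # Assumed idf variant: log(1 + N / df), always positive.
    idf = {t: math.log(1 + len(docs) / n) for t, n in df.items()}
    return counts, idf


def score(query_weights, counts, idf):
    """Score each document as the sum of query weight * tf * idf over shared terms."""
    return [
        sum(w * c[t] * idf.get(t, 0.0) for t, w in query_weights.items())
        for c in counts
    ]


def search_with_feedback(query, docs, top_docs=2, top_terms=3, beta=0.5):
    """Rank docs by tf.idf, then re-rank with a query expanded from the top hits."""
    counts, idf = build_index(docs)
    weights = Counter(lemmatize(query))
    first = score(weights, counts, idf)
    ranked = sorted(range(len(docs)), key=lambda i: first[i], reverse=True)

    # Blind feedback: treat the top-ranked documents as relevant without
    # any user judgment and pool their tf.idf term weights.
    pooled = Counter()
    for i in ranked[:top_docs]:
        for term, tf in counts[i].items():
            pooled[term] += tf * idf[term]

    # Add the strongest pooled terms to the query, damped by beta.
    expanded = dict(weights)
    for term, w in pooled.most_common(top_terms):
        expanded[term] = expanded.get(term, 0.0) + beta * w / top_docs

    second = score(expanded, counts, idf)
    return sorted(range(len(docs)), key=lambda i: second[i], reverse=True)


if __name__ == "__main__":
    print(search_with_feedback("válka", DOCS))
```

In this toy collection the query form "válka" does not occur literally in any document, so matching on surface forms alone would retrieve nothing; after lemmatization the inflected forms "války" and "válce" both map to it. The feedback step then adds "tábor" to the query, which lets a document that never mentions the query term receive a non-zero score.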