Domain-specific CLIR of English, German and Russian using fusion and subject metadata for query expansion

  • Authors:
  • Vivien Petras; Fredric Gey; Ray R. Larson

  • Affiliations:
  • School of Information Management and Systems, University of California, Berkeley, CA; UC Data Archive & Technical Assistance (UC DATA), University of California, Berkeley, CA; School of Information Management and Systems, University of California, Berkeley, CA

  • Venue:
  • CLEF'05 Proceedings of the 6th international conference on Cross-Language Evaluation Forum: accessing Multilingual Information Repositories
  • Year:
  • 2005

Abstract

This paper describes the combined submissions of the Berkeley group for the domain-specific track at CLEF 2005. The first technique tested is data fusion: multiple probabilistic searches against different XML components are fused, using both Logistic Regression (LR) algorithms and a version of the Okapi BM-25 algorithm. We also combine multiple translations of queries in cross-language searching. The second technique analyzed is query enhancement with domain-specific metadata (thesaurus terms). We describe our technique of Entry Vocabulary Modules, which associates query words with thesaurus terms, and suggest its use for monolingual as well as bilingual retrieval. Different weighting and merging schemes for adding keywords to queries, as well as translation techniques, are described.
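
The abstract does not say how the result lists from the different searches are merged. As a rough illustration of the general idea of data fusion only, the sketch below uses a generic weighted sum of min-max normalized scores (a CombSUM-style scheme), which may differ from what the paper does. The function names, the `weights` parameter, and all scores are hypothetical and chosen for the example.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: give every document the same normalized score.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(runs, weights=None):
    """Weighted sum of normalized scores across runs (CombSUM-style).

    runs:    list of {doc_id: score} mappings, one per search.
    weights: optional per-run weights (hypothetical parameter).
    Returns a list of (doc_id, fused_score) pairs, best first.
    """
    if weights is None:
        weights = [1.0] * len(runs)
    fused = defaultdict(float)
    for run, w in zip(runs, weights):
        for doc, s in min_max_normalize(run).items():
            fused[doc] += w * s
    # Sort by fused score descending, then by doc id so ties are deterministic.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up scores for illustration: one run per (algorithm, XML component).
lr_title = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
bm25_abstract = {"d2": 12.0, "d3": 7.0, "d4": 2.0}
ranking = fuse([lr_title, bm25_abstract])
```

Normalizing before summing matters here because logistic regression scores and BM25 scores are on different scales. Without it, the run with the larger raw scores would dominate the fused ranking.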