The Effects of Conjunction, Facet Structure, and Dictionary Combinations in Concept-Based Cross-Language Retrieval

  • Authors:
  • Ari Pirkola; Heikki Keskustalo; Kalervo Järvelin

  • Affiliations:
  • Department of Information Studies, University of Tampere, 33101 Tampere, Finland. liarpi@uta.fi; Department of Information Studies, University of Tampere, Tampere, Finland 33101. ccheke@uta.fi; Department of Information Studies, University of Tampere, Tampere, Finland 33101. likaja@uta.fi

  • Venue:
  • Information Retrieval
  • Year:
  • 1999

Abstract

The paper studies concept-based cross-language information retrieval (CLIR). The document collection was a subset of the TREC collection. The test requests were formed from TREC's health-related topics. As translation dictionaries the study used a general dictionary and a domain-specific (= medical) dictionary. The effects of translation method, conjunction, and facet order on the effectiveness of concept-based cross-language queries were studied, and concept-based structuring of cross-language queries was compared to mechanical structuring based on the output of dictionaries. The performance of translated Finnish queries against English documents was compared to the performance of original English queries against the English documents, and the performance of different CLIR query types was compared with one another. No major difference was found between concept-based and mechanical structuring. The best translation method was a simultaneous look-up in the medical dictionary and the general dictionary, in which case cross-language queries performed as well as the original English queries. The results showed that especially at high exhaustivity (the number of mutually restrictive concepts in a request) levels cross-language queries perform well in relation to monolingual queries. This suggests that conjunction disambiguates cross-language queries. An extensive study was made of the relative importance of the concepts of requests. On the basis of the classification data of request concepts it was shown how the order of facets in a query affects cross-language as well as monolingual queries.
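
The idea of a simultaneous look-up combined with facet conjunction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary entries, the function names `translate` and `build_query`, and the generic `AND`/`OR` query syntax are all assumptions made for illustration, since the abstract does not state the retrieval system or its query language. Each facet (concept) collects the translations found in either dictionary as alternatives, and the facets are then joined by conjunction.

```python
from typing import Dict, List

# Toy Finnish-to-English dictionaries, invented for illustration only.
# They are NOT the dictionaries used in the study.
MEDICAL_DICT: Dict[str, List[str]] = {
    "syöpä": ["cancer", "carcinoma"],
    "hoito": ["therapy", "treatment"],
}
GENERAL_DICT: Dict[str, List[str]] = {
    "syöpä": ["cancer"],
    "hoito": ["care", "treatment", "nursing"],
}


def translate(term: str, dictionaries: List[Dict[str, List[str]]]) -> List[str]:
    """Look the term up in every dictionary and merge the results.

    Duplicates are dropped and first-seen order is kept. A term found in
    no dictionary is passed through unchanged (an assumption of this sketch).
    """
    merged: List[str] = []
    for dictionary in dictionaries:
        for translation in dictionary.get(term, []):
            if translation not in merged:
                merged.append(translation)
    return merged or [term]


def build_query(facets: List[List[str]],
                dictionaries: List[Dict[str, List[str]]]) -> str:
    """Build a facet-structured query string.

    Each facet is a list of source-language terms for one concept.
    Translations within a facet are alternatives (OR); facets are
    mutually restrictive (AND).
    """
    groups: List[str] = []
    for facet in facets:
        alternatives: List[str] = []
        for term in facet:
            for translation in translate(term, dictionaries):
                if translation not in alternatives:
                    alternatives.append(translation)
        groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)


if __name__ == "__main__":
    # Two facets: "cancer" and "treatment", expressed in Finnish.
    print(build_query([["syöpä"], ["hoito"]], [MEDICAL_DICT, GENERAL_DICT]))
```

In this toy setting the general dictionary alone would add loosely related senses such as "care" and "nursing" to the second facet. Requiring the first facet to match as well narrows the result set, which is one way to picture how conjunction could disambiguate a translated query.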