Effective Arabic-English cross-language information retrieval via machine-readable dictionaries and machine translation

  • Authors:
  • Mohammed Aljlayl;Ophir Frieder

  • Affiliations:
  • Illinois Institute of Technology, Chicago, IL;Illinois Institute of Technology, Chicago, IL

  • Venue:
  • Proceedings of the tenth international conference on Information and knowledge management
  • Year:
  • 2001

Abstract

In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRDs) and Machine Translation (MT) are important resources for query translation in CLIR. We investigate the use of MT and MRDs for Arabic-English CLIR. The key problem is the translation ambiguity associated with these resources. We present three methods of query translation using a bilingual dictionary for Arabic-English CLIR. First, we present the Every-Match (EM) method. This method yields ambiguous translations because many extraneous terms are added to the original query. To disambiguate the query translation, we present the First-Match (FM) method, which takes the first match in the dictionary as the candidate term. Finally, we present the Two-Phase (TP) method. We show that, with the Two-Phase method, good retrieval effectiveness can be achieved for Arabic-English CLIR without complex resources. We also empirically evaluate the effectiveness of the MT-based method using short, medium, and long queries from TREC, and we investigate the effect of query length on the quality of MT-based CLIR.
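
As a rough illustration of the two dictionary-based methods the abstract describes in detail, the following Python sketch contrasts Every-Match and First-Match query translation. It is not the authors' implementation. The dictionary entries (`w1`, `w2`, and their translations), the function names, and the choice to drop terms missing from the dictionary are all assumptions made for the example. The Two-Phase method is not sketched because the abstract does not describe its steps.

```python
from typing import Dict, List

# Hypothetical toy bilingual dictionary: each source-language term maps to
# its target-language translations, in the order the dictionary lists them.
# "w1" and "w2" are placeholders, not real dictionary entries.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "w1": ["bank", "shore", "bench"],
    "w2": ["oil", "petroleum"],
}


def every_match(query_terms: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Every-Match (EM) sketch: keep every translation of every query term."""
    translated: List[str] = []
    for term in query_terms:
        # Assumption: terms missing from the dictionary are dropped.
        translated.extend(dictionary.get(term, []))
    return translated


def first_match(query_terms: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """First-Match (FM) sketch: keep only the first listed translation per term."""
    translated: List[str] = []
    for term in query_terms:
        candidates = dictionary.get(term, [])
        if candidates:
            translated.append(candidates[0])
    return translated


if __name__ == "__main__":
    query = ["w1", "w2"]
    print("EM:", every_match(query, TOY_DICTIONARY))
    print("FM:", first_match(query, TOY_DICTIONARY))
```

In this toy setting the EM query grows to five terms while the FM query keeps one term per source word, which mirrors the abstract's point that EM adds many extraneous terms to the original query.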