On bidirectional English-Arabic search

  • Authors:
  • M. Aljlayl;O. Frieder;D. Grossman

  • Affiliations:
  • Information Retrieval Laboratory, Illinois Institute of Technology, Stewart Building, Room 2370, 10 West 31st St., Chicago, IL (all authors)

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2002

Abstract

In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRD) and Machine Translation (MT) systems are important resources for query translation in CLIR. We investigate the use of MT systems and MRD for Arabic-English and English-Arabic CLIR. The key problem is the translation ambiguity associated with these resources. We present three methods of query translation using a bilingual dictionary for Arabic-English CLIR. First, we present the Every-Match (EM) method, which yields ambiguous translations because many extraneous terms are added to the original query. To disambiguate query translation, we present the First-Match (FM) method, which considers the first match in the dictionary as the candidate term. Finally, we present the Two-Phase (TP) method. We show that good retrieval effectiveness can be achieved without complex resources using the Two-Phase method for Arabic-English CLIR. We also empirically evaluate the effectiveness of the Arabic-English MT approach using short, medium, and long queries of TREC7 and TREC9 topics and collections, and investigate the effect of query length on the quality of MT-based CLIR. English-Arabic CLIR is evaluated via MRD and English-Arabic MT. Post-translation query expansion is used to de-emphasize the extraneous terms introduced by the MRD and MT for English-Arabic CLIR.
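
As an illustration of the difference between the Every-Match and First-Match methods described in the abstract, the following is a minimal sketch, not the authors' implementation. The dictionary contents, the function names `every_match` and `first_match`, and the choice to pass untranslatable terms through unchanged are all assumptions made for the example. The Two-Phase method is not sketched because the abstract does not describe how it works.

```python
from typing import Dict, List

# Hypothetical toy bilingual dictionary with placeholder tokens.
# Each source term maps to an ordered list of candidate translations.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "src_a": ["tgt_a1", "tgt_a2", "tgt_a3"],
    "src_b": ["tgt_b1", "tgt_b2"],
}


def every_match(query_terms: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Every-Match: keep every dictionary translation of each query term."""
    translated: List[str] = []
    for term in query_terms:
        # Assumption: terms missing from the dictionary are kept as-is.
        translated.extend(dictionary.get(term, [term]))
    return translated


def first_match(query_terms: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """First-Match: keep only the first dictionary translation of each term."""
    translated: List[str] = []
    for term in query_terms:
        candidates = dictionary.get(term)
        # Assumption: terms missing from the dictionary are kept as-is.
        translated.append(candidates[0] if candidates else term)
    return translated


query = ["src_a", "src_b", "src_x"]
em_query = every_match(query, TOY_DICTIONARY)
fm_query = first_match(query, TOY_DICTIONARY)
```

In this toy setup the Every-Match query grows to six terms while the First-Match query stays at three, which mirrors the abstract's point that Every-Match adds many extraneous terms to the original query.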