Enhanced Query Expansion in English-Arabic CLIR

  • Authors:
  • Abdelghani Bellaachia; Ghita Amor-Tijani

  • Venue:
  • DEXA '08 Proceedings of the 2008 19th International Conference on Database and Expert Systems Application
  • Year:
  • 2008

Abstract

Arabic has a particularly large vocabulary, rich in words with synonymous shades of meaning. Modern Standard Arabic, which is used in formal writing, is the ancient Arabic language augmented with loanwords derived from foreign languages. Different synonyms and loanwords tend to be used in different writings, and the Arabic composition style varies across the Arab countries (Abdelali, 2004). As a result, relevant documents can be overlooked when the query terms are synonyms of, or merely related to, the terms used in the document collection, which can degrade the performance of a Cross-Lingual Information Retrieval (CLIR) system. Query Expansion (QE) using the document collection is the usual approach for enriching translated queries with context-related terms. In this study, QE is explored for an English-Arabic CLIR system in which English queries are used to search Arabic documents. A thesaurus-based disambiguation approach is applied to further improve the effectiveness of that technique. Experimental results show that QE enhanced by disambiguation improves retrieval effectiveness.