Combination of information retrieval methods with LESK algorithm for Arabic word sense disambiguation

  • Authors:
  • Anis Zouaghi;Laroussi Merhbene;Mounir Zrigui

  • Affiliations:
  • UTIC Laboratory (unit of Monastir), ISI of Médenine, Médenine, Tunisia;UTIC Laboratory (unit of Monastir), ISI of Médenine, Médenine, Tunisia;UTIC Laboratory (unit of Monastir), ISI of Médenine, Médenine, Tunisia

  • Venue:
  • Artificial Intelligence Review
  • Year:
  • 2012

Abstract

In this paper, we propose combining the Harman, Croft, and Okapi measures with the Lesk algorithm to build a system for Arabic word sense disambiguation that combines unsupervised and knowledge-based methods. The system is intended to resolve lexical semantic ambiguity in Arabic. The information retrieval measures estimate the most relevant sense of the ambiguous word: they return a semantic coherence score for the context that is semantically closest to the original sentence containing the ambiguous word. The Lesk algorithm then selects the appropriate sense from those proposed by the information retrieval measures. This selection is based on a comparison between the glosses of the word to be disambiguated and its different contexts of use extracted from a corpus. Our experimental study shows that using the Lesk algorithm with the Harman, Croft, and Okapi measures yields an accuracy rate of 73%.
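The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the common textbook form of Okapi BM25 (with the usual defaults `k1=1.2`, `b=0.75`) as a stand-in for the paper's Okapi measure, omits the Harman and Croft measures, uses a simplified Lesk overlap count, and works on toy English data for readability. The function names `bm25_score`, `closest_context`, `lesk_select`, and `disambiguate`, and all sample data, are hypothetical.

```python
import math
from collections import Counter


def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """Textbook Okapi BM25 score of one tokenized doc against a tokenized query.

    Assumption: the paper's Okapi measure may differ in its details.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        if tf[term] == 0:
            continue
        # Number of contexts that contain the term.
        df = sum(1 for d in docs if term in d)
        # Non-negative IDF variant.
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def closest_context(sentence, contexts):
    """Stage 1: pick the corpus context that scores highest against the sentence."""
    return max(contexts, key=lambda c: bm25_score(sentence, c, contexts))


def lesk_select(glosses, context):
    """Stage 2 (simplified Lesk): pick the sense whose gloss shares most words with the context."""
    context_words = set(context)
    return max(glosses, key=lambda sense: len(set(glosses[sense]) & context_words))


def disambiguate(sentence, contexts, glosses):
    """Chain the two stages: retrieve the closest context, then compare glosses to it."""
    return lesk_select(glosses, closest_context(sentence, contexts))


# Toy data (hypothetical): the ambiguous word is "bank".
sentence = "he sat on the bank of the river".split()
contexts = [
    "the bank approved the loan and the money transfer".split(),
    "fishermen walked along the river bank near the water".split(),
]
glosses = {
    "financial": "institution that accepts deposits and lends money".split(),
    "riverside": "sloping land beside a river or other water".split(),
}

print(disambiguate(sentence, contexts, glosses))  # prints "riverside"
```

In this toy run the second context scores highest because it shares the rarer term "river" with the sentence, and the "riverside" gloss overlaps that context on "river" and "water". A realistic system for Arabic would also need tokenization, stop-word removal, and stemming before scoring, none of which is shown here.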