Combination of information retrieval methods with LESK algorithm for Arabic word sense disambiguation

Authors:
Anis Zouaghi; Laroussi Merhbene; Mounir Zrigui
Affiliations:
UTIC Laboratory (unit of Monastir), ISI of Médenine, Médenine, Tunisia (all three authors)
Venue:
Artificial Intelligence Review
Year:
2012

Abstract

In this paper, we propose combining the Harman, Croft, and Okapi measures with the Lesk algorithm to build a system for Arabic word sense disambiguation that combines unsupervised and knowledge-based methods. The system is intended to resolve lexical semantic ambiguity in Arabic. The information retrieval measures estimate the most relevant sense of the ambiguous word by returning a semantic coherence score for the context that is semantically closest to the original sentence containing the ambiguous word. The Lesk algorithm then selects the appropriate sense from those proposed by the information retrieval measures. This selection is based on a comparison between the glosses of the word to be disambiguated and its contexts of use extracted from a corpus. Our experimental study shows that using the Lesk algorithm with the Harman, Croft, and Okapi measures yields an accuracy of 73%.
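The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is one possible reading of the pipeline, not the authors' implementation: the abstract does not give the scoring formulas, so the sketch assumes the widely used Okapi BM25 form (with the common defaults `k1=1.2`, `b=0.75`) for the retrieval step, a simplified gloss-overlap version of Lesk for the selection step, and omits the Harman and Croft measures entirely. The function names, the English toy glosses, and the toy corpus are all hypothetical and chosen only for readability.

```python
import math
from collections import Counter


def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of `doc` for `query` (all arguments are token lists).

    Assumption: the standard BM25 form with a smoothed, non-negative IDF;
    the paper's exact Okapi variant is not stated in the abstract.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def lesk_overlap(gloss, context):
    """Simplified Lesk: count of distinct tokens shared by gloss and context."""
    return len(set(gloss) & set(context))


def disambiguate(sentence, senses, corpus):
    """Return the sense whose gloss best overlaps the closest corpus context."""
    # Step 1 (retrieval): find the context of use closest to the sentence.
    best_context = max(corpus, key=lambda ctx: bm25_score(sentence, ctx, corpus))
    # Step 2 (Lesk): compare each sense gloss with that context.
    return max(senses, key=lambda s: lesk_overlap(senses[s], best_context))


# Hypothetical toy data (English, for readability only).
senses = {
    "river_bank": ["sloping", "land", "beside", "a", "river"],
    "financial_bank": ["institution", "that", "accepts", "deposits",
                       "and", "lends", "money"],
}
corpus = [
    ["the", "fisherman", "sat", "on", "the", "bank", "of", "the", "river"],
    ["she", "opened", "an", "account", "at", "the", "bank", "to", "save", "money"],
    ["the", "children", "played", "football", "in", "the", "park"],
]

sentence = ["he", "deposited", "money", "at", "the", "bank"]
chosen_sense = disambiguate(sentence, senses, corpus)
```

On this toy input the second corpus context scores highest for the sentence, and only the financial gloss shares a token with it, so `chosen_sense` is `"financial_bank"`. A real system for Arabic would additionally need tokenization, stemming, and stop-word handling, none of which are shown here.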