Word Sense Disambiguation: Algorithms and Applications (Text, Speech and Language Technology)
SemEval-2007 task 10: English lexical substitution task
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
SemEval-2010 task 2: Cross-lingual lexical substitution
SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation
SemEval-2010 task 3: Cross-lingual word sense disambiguation
SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation
Word Sense Disambiguation (WSD) is considered one of the most important problems in Natural Language Processing [1]. It is claimed that WSD is essential for applications that require language comprehension modules, such as search engines, machine translation systems, automatic answering machines, Second Life agents, etc. Moreover, given the huge amount of information on the Internet and the fact that this information is continuously growing in different languages, we are encouraged to deal with cross-lingual scenarios, where WSD systems are also needed. Lexical Substitution (LS), on the other hand, refers to the process of finding a substitute word for a source word in a given sentence. The LS task needs to be approached by first disambiguating the source word; therefore, the two tasks (WSD and LS) are related. In this paper, we present a naïve approach to tackle the problems of cross-lingual WSD and cross-lingual lexical substitution. We use a bilingual statistical dictionary, computed with Giza++ from the EUROPARL parallel corpus, to estimate the probability that a source word is translated as a target word (which is assumed to be the correct sense of the source word, expressed in a different language). Two versions of the probabilistic model are tested: unweighted and weighted. The results were compared with those of an international competition, showing good performance.
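
The abstract does not spell out the probabilistic model, so the following is only a minimal sketch of how a dictionary-based scorer of this kind could look, not the authors' implementation. It assumes a table of translation probabilities P(target | source) and scores each candidate translation of the ambiguous word by summing the probabilities that the words of the sentence translate to it. The table values, the function names, and the distance-based weighting used for the "weighted" variant are all hypothetical choices made for illustration; in the described setup the table would be produced by Giza++ from EUROPARL.

```python
# Toy translation table P(target | source). The numbers are invented for
# illustration; a real table would be estimated from a parallel corpus.
TOY_TABLE = {
    "bank": {"banco": 0.6, "orilla": 0.4},
    "river": {"río": 0.7, "orilla": 0.3},
    "money": {"dinero": 0.9, "banco": 0.1},
}


def score_candidates(sentence, position, table, weighted=False):
    """Score every candidate translation of sentence[position].

    Each word of the sentence (including the ambiguous word itself)
    contributes the probability that it translates to the candidate.
    With weighted=True, a hypothetical distance weight 1 / (1 + distance)
    makes words closer to the ambiguous word count more.
    """
    candidates = table.get(sentence[position], {})
    scores = {}
    for candidate in candidates:
        total = 0.0
        for i, word in enumerate(sentence):
            prob = table.get(word, {}).get(candidate, 0.0)
            weight = 1.0 / (1 + abs(i - position)) if weighted else 1.0
            total += weight * prob
        scores[candidate] = total
    return scores


def best_translation(sentence, position, table, weighted=False):
    """Return the highest-scoring candidate, or None if there is none."""
    scores = score_candidates(sentence, position, table, weighted)
    return max(scores, key=scores.get) if scores else None
```

Under these assumptions, calling `best_translation(["the", "river", "bank"], 2, TOY_TABLE)` selects the candidate that the context word "river" also supports, while passing `weighted=True` reduces the influence of context words in proportion to their distance from the ambiguous word.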