Automatic word sense disambiguation and construction identification based on corpus multilevel annotation

Authors:
Olga Lyashevskaya; Olga Mitrofanova; Maria Grachkova; Sergey Romanov; Anastasia Shimorina; Alexandra Shurygina
Affiliations:
NRU Higher School of Economics, Moscow; St. Petersburg State University, St. Petersburg, Russia; St. Petersburg State University, St. Petersburg, Russia; St. Petersburg State University, St. Petersburg, Russia; St. Petersburg State University, St. Petersburg, Russia; St. Petersburg State University, St. Petersburg, Russia
Venue:
TSD'11 Proceedings of the 14th International Conference on Text, Speech and Dialogue
Year:
2011

Abstract

The project reported in this paper aims to extract linguistic information automatically from contexts in the Russian National Corpus (RNC) and to use it in building a comprehensive lexicographic resource, the Index of Russian lexical constructions. The proposed approach relies on automatic context classification for word sense disambiguation (WSD) and construction identification (CxI). Context processing draws on the following types of contextual information represented in the RNC multilevel annotation: lexical (lemma) tags (lex), morphological (grammatical) tags (gr), semantic (taxonomy) tags (sem), and combinations of these tag types. Experiments on WSD and CxI are performed on representative RNC context samples for nouns. In each series of experiments we analyze (1) the context markers that signal the meaning of a target word and (2) the constructions that combine context markers with target words.
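
To make the idea of classifying contexts by annotation layer more concrete, here is a minimal sketch in Python. It is not the authors' procedure: the token format, the function names (features, train, classify), the overlap-based scoring, and all toy data (transliterated lemmas, tag values, sense labels) are assumptions made up for illustration. It only shows how lex, gr, and sem tags, alone or combined, could be turned into features for choosing a sense.

```python
from collections import Counter


def tok(lex, gr, sem):
    """Build one annotated token (hypothetical format, not the RNC schema)."""
    return {"lex": lex, "gr": gr, "sem": sem}


def features(context, layers=("lex",)):
    """Turn an annotated context into a bag of features for the chosen layers.

    One layer gives features such as 'sem=liquid'; several layers are
    combined per token, e.g. 'lex=voda|gr=S'.
    """
    bag = Counter()
    for token in context:
        key = "|".join(f"{layer}={token[layer]}" for layer in layers)
        bag[key] += 1
    return bag


def train(samples, layers):
    """Sum feature counts per sense over labelled contexts."""
    profiles = {}
    for sense, context in samples:
        profiles.setdefault(sense, Counter()).update(features(context, layers))
    return profiles


def classify(context, profiles, layers):
    """Pick the sense whose profile shares the most features with the context.

    Ties are broken alphabetically by sense label (an arbitrary choice).
    """
    bag = features(context, layers)

    def overlap(sense):
        profile = profiles[sense]
        return sum(min(count, profile[feat]) for feat, count in bag.items())

    return max(sorted(profiles), key=overlap)


# Invented toy data: two senses of one target noun, one context each.
TRAIN = [
    ("key_tool", [tok("otkryt", "V", "move"), tok("dver", "S", "construction")]),
    ("key_spring", [tok("pit", "V", "physiol"), tok("voda", "S", "liquid")]),
]
TEST = [tok("zakryt", "V", "move"), tok("zamok", "S", "construction")]

sem_profiles = train(TRAIN, ("sem",))
print(classify(TEST, sem_profiles, ("sem",)))
```

In this toy setup the test context shares no lemmas with the training contexts, so lex features alone cannot separate the senses, while the sem layer can. That is the kind of contrast between tag types and their combinations that the experiments described above set out to compare, though real results depend on the corpus samples and the classifier actually used.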