In this article we illustrate and evaluate an approach to creating high-quality linguistically annotated resources by exploiting aligned parallel corpora. The approach rests on the assumption that if a text in one language has been annotated and its translation has not, the annotations can be transferred from the source text to the target text using word alignment as a bridge. The transfer approach has been tested and extensively applied in the creation of the MultiSemCor corpus, an English/Italian parallel corpus built on the English SemCor corpus. In MultiSemCor the texts are aligned at the word level and annotated with word senses from a shared sense inventory. A number of experiments have been carried out to evaluate the different steps of the methodology, and the results suggest that the transfer approach is a promising solution to the resource bottleneck. First, it leads to the creation of a parallel corpus, which is a crucial resource in its own right. Second, it allows existing (mostly English) annotated resources to be exploited to bootstrap the creation of annotated corpora in new (resource-poor) languages with greatly reduced human effort.
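The core transfer step can be illustrated with a minimal sketch. It is not the procedure used to build MultiSemCor, only a toy version of the idea under stated assumptions: the word alignment is already given as (source index, target index) pairs, each source token carries at most one sense label, and the first label wins when two source tokens align to the same target token. The function name `project_annotations`, the example sentences, the alignment, and the sense labels are all hypothetical placeholders.

```python
from typing import List, Optional, Tuple


def project_annotations(
    source_tags: List[Optional[str]],
    alignment: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy each source token's tag to the target token it is aligned with.

    Assumption for this sketch: alignment pairs are (source_index, target_index),
    and an unannotated token is represented by None.
    """
    target_tags: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        tag = source_tags[src_idx]
        # Transfer only real annotations; on conflict keep the first tag seen.
        if tag is not None and target_tags[tgt_idx] is None:
            target_tags[tgt_idx] = tag
    return target_tags


# Toy data: the sense labels are made-up placeholders, not real inventory keys.
english = ["the", "bank", "approved", "the", "loan"]
italian = ["la", "banca", "ha", "approvato", "il", "prestito"]
english_senses = [None, "bank%financial", "approve%sanction", None, "loan%money"]

# Hypothetical word alignment; the auxiliary "ha" (index 2) is left unaligned.
alignment = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 5)]

italian_senses = project_annotations(english_senses, alignment, len(italian))
for word, sense in zip(italian, italian_senses):
    print(word, sense)
```

Target tokens with no aligned annotated source token, such as the auxiliary in this example, simply stay unannotated. A real pipeline would also have to deal with alignment errors and with translations that do not preserve the source sense, which is why the individual steps need to be evaluated separately.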