In this paper, we present several ways to improve a basic machine learning approach to identifying relations between biological named entities as annotated in the Genia corpus. The main difficulty in learning from the Genia event-annotated corpus is the small number of examples available for each relation type. We compare three ways to address this data sparseness problem: using the corpus as the initial seed of a bootstrapping procedure, generalizing classes of relations via the Genia ontology, and generalizing classes via clustering.
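
As a rough illustration of the second strategy, generalizing relation classes through an ontology, the sketch below maps fine-grained labels to their parents so that sparse classes pool their examples. The hierarchy in `PARENT`, the label names, and the `generalize` helper are hypothetical and chosen only for illustration; they are not taken from the Genia ontology or from the method described here.

```python
from collections import Counter

# Hypothetical child -> parent map (an assumption, NOT the real Genia ontology).
PARENT = {
    "positive_regulation": "regulation",
    "negative_regulation": "regulation",
    "regulation": "event",
    "binding": "event",
}


def generalize(label, levels=1):
    """Walk up `levels` steps in the toy hierarchy, stopping at the root."""
    for _ in range(levels):
        if label not in PARENT:
            break
        label = PARENT[label]
    return label


# Toy relation labels, invented for this example.
labels = [
    "positive_regulation",
    "negative_regulation",
    "binding",
    "positive_regulation",
]

fine = Counter(labels)                           # examples per fine-grained class
coarse = Counter(generalize(l) for l in labels)  # examples per generalized class

print(fine)
print(coarse)
```

In this toy setting the two regulation subtypes merge into a single class with more training examples, at the cost of a less specific prediction; how far up the hierarchy to generalize is a trade-off that would have to be tuned empirically.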