Learning to Learn Biological Relations from a Small Training Set

Authors:
Laura Alonso i Alemany; Santiago Bruno
Affiliations:
NLP Group, Facultad de Matemática, Astronomía y Física (FaMAF), UNC, Córdoba, Argentina (both authors)
Venue:
CICLing '09 Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2009

Abstract

In this paper we present several ways to improve a basic machine learning approach to identifying relations between biological named entities as annotated in the Genia corpus. The main difficulty in learning from the Genia event-annotated corpus is the small number of examples available for each relation type. We compare three ways of addressing this data sparseness problem: using the corpus as the initial seed of a bootstrapping procedure, generalizing classes of relations via the Genia ontology, and generalizing classes via clustering.
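
As a rough illustration of the second strategy (generalizing relation classes through an ontology), the sketch below backs each rare relation label off to its nearest ancestor that has enough training examples. It is not the authors' implementation: the hierarchy in `PARENT` is a hypothetical toy example rather than the actual Genia ontology, and the `min_count` threshold and the function names are assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); NOT the actual Genia ontology.
PARENT = {
    "Positive_regulation": "Regulation",
    "Negative_regulation": "Regulation",
    "Regulation": "Event",
    "Binding": "Event",
}


def ancestors(label):
    """Return the label followed by its ancestors, most specific first."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def generalize(labels, min_count):
    """Map each label to its most specific ancestor-or-self with enough examples."""
    # Count examples per class, crediting every ancestor of each label.
    support = Counter()
    for label in labels:
        support.update(ancestors(label))

    mapping = {}
    for label in set(labels):
        chain = ancestors(label)
        # Fall back to the top of the chain if no class meets the threshold.
        mapping[label] = next(
            (c for c in chain if support[c] >= min_count), chain[-1]
        )
    return [mapping[label] for label in labels]


labels = [
    "Positive_regulation",
    "Negative_regulation",
    "Negative_regulation",
    "Binding",
    "Binding",
    "Binding",
]
print(generalize(labels, min_count=3))
```

With a threshold of 3, the two sparse regulation subtypes are merged into their shared parent, while `Binding` already has enough examples and is left unchanged. The trade-off is the usual one: coarser classes give more examples per class but make less specific predictions.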