Constructing Biological Knowledge Bases by Extracting Information from Text Sources
Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
Classifying semantic relations in bioscience texts
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Journal of Biomedical Informatics
The DDI corpus: An annotated corpus with pharmacological substances and drug-drug interactions
Journal of Biomedical Informatics
Hi-index | 0.00 |
Corpora with specific entities and relationships annotated are essential to train and evaluate text-mining systems that are developed to extract specific structured information from a large corpus. In this paper we describe an approach where a named-entity recognition system produces a first annotation and annotators revise this annotation using a web-based interface. The agreement figures achieved show that the inter-annotator agreement is much better than the agreement with the system provided annotations. The corpus has been annotated for drugs, disorders, genes and their inter-relationships. For each of the drug-disorder, drug-target, and target-disorder relations three experts have annotated a set of 100 abstracts. These annotated relationships will be used to train and evaluate text-mining software to capture these relationships in texts.