Semantics-aware open information extraction in the biomedical domain

Authors:
Victoria Nebot; Rafael Berlanga
Affiliations:
Lenguajes y Sistemas Informáticos, Castellón, Spain; Lenguajes y Sistemas Informáticos, Castellón, Spain
Venue:
Proceedings of the 4th International Workshop on Semantic Web Applications and Tools for the Life Sciences
Year:
2011

Abstract

The growing volume of biomedical scientific literature published on the Web demands new tools and methods to automatically process it and extract relevant information. Traditional information extraction has focused on recognizing well-defined entities such as genes or proteins, which forms the basis for extracting relations between the recognized entities. Most work has focused on harvesting domain-specific, pre-specified relations, which usually requires manual labor and heavy machinery. The intrinsic features and scale of the Web demand new approaches that can cope with the diversity of documents, where the number of relations is unbounded and not known in advance. This paper presents a scalable method for extracting biomedical relations from text. The method is not geared to any specific sub-domain (e.g. protein-protein interactions or drug-drug interactions) and does not require any manual input or deep processing. Moreover, the method uses the extracted relations to infer a set of abstract semantic relations and their signature types, which constitute a valuable source of knowledge when constructing formal knowledge bases. We enable seamless integration of the extracted relations with the available biomedical resources through semantic annotation. The proposed approach has been successfully applied to the CALBC corpus (almost a million text documents), with UMLS used as the knowledge resource for semantic annotation.