Linking Life Sciences Data Using Graph-Based Mapping

Authors:
Jan Taubert; Matthew Hindle; Artem Lysenko; Jochen Weile; Jacob Köhler; Christopher J. Rawlings
Affiliations:
Department of Biomathematics and Bioinformatics, Rothamsted Research, Harpenden, UK; Department of Biomathematics and Bioinformatics, Rothamsted Research, Harpenden, UK; Department of Biomathematics and Bioinformatics, Rothamsted Research, Harpenden, UK; School of Computing Science, Newcastle University, Newcastle upon Tyne, UK; Protein Research Group, University of Tromsø, Norway; Department of Biomathematics and Bioinformatics, Rothamsted Research, Harpenden, UK
Venue:
DILS '09 Proceedings of the 6th International Workshop on Data Integration in the Life Sciences
Year:
2009

Abstract

There are over 1100 different databases available containing primary and derived data of interest to research biologists. It is inevitable that many of these databases contain overlapping, related or conflicting information. Data integration methods are being developed to address these issues by providing a consolidated view over multiple databases. However, a key challenge for data integration is the identification of links between closely related entries in different life sciences databases when there is no direct information that provides a reliable cross-reference. Here we describe and evaluate three data integration methods to address this challenge in the context of a graph-based data integration framework (the ONDEX system). A key result presented in this paper is a quantitative evaluation of their performance in two different situations: the integration and analysis of different metabolic pathways resources and the mapping of equivalent elements between the Gene Ontology and a nomenclature describing enzyme function.