Linking Biological Databases Semantically for Knowledge Discovery

Authors:
Sudha Ram;Kunpeng Zhang;Wei Wei
Affiliations:
Department of MIS Eller College of Management, University of Arizona, Tucson, AZ 85721;Department of MIS Eller College of Management, University of Arizona, Tucson, AZ 85721;Department of MIS Eller College of Management, University of Arizona, Tucson, AZ 85721
Venue:
ER '08 Proceedings of the ER 2008 Workshops (CMLSA, ECDM, FP-UML, M2AS, RIGiM, SeCoGIS, WISM) on Advances in Conceptual Modeling: Challenges and Opportunities
Year:
2008

Abstract

Many important life sciences questions concern the relationships and interactions between biological functions or processes and biological entities such as genes. Answering them requires accessing and retrieving data from diverse biological and genomic data sources. More importantly, sophisticated knowledge discovery processes involve traversing large numbers of inherent links among these sources. Currently, the links are either implemented as hyperlinks whose meanings and labels are not stated explicitly, or hidden in seemingly simple text. Consequently, biologists spend many hours identifying potentially useful links and following each lead manually, which is time-consuming and error-prone. Our research aims to construct semantic relationships among all biological entities. We have designed a semantic model to categorize and formally define the links. By incorporating ontologies such as the Gene Ontology or the Sequence Ontology, we propose techniques to analyze the links embedded within and among data records, to explicitly label their semantics, and to facilitate link traversal, querying, and data sharing. Users may then ask complicated, ad hoc questions and even design their own workflows to support their knowledge discovery processes. In addition, we have performed an empirical analysis to demonstrate that our method not only improves the efficiency of querying multiple databases but also yields more useful information.
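
To make the idea of explicitly labeled links concrete, the following is a minimal sketch and not the authors' model or implementation. It assumes links can be stored as (source, label, target) triples and that a query follows only links whose labels the user selects. The record identifiers, the label names (encodes, participates_in, cited_in), and the helper functions build_index and traverse are all hypothetical and do not come from the paper.

```python
from collections import defaultdict, deque

# Hypothetical labeled links: (source record, link label, target record).
# Identifiers and labels are invented for illustration only.
LINKS = [
    ("gene:G1", "encodes", "protein:P1"),
    ("protein:P1", "participates_in", "process:PR1"),
    ("gene:G1", "cited_in", "publication:PUB1"),
    ("gene:G2", "encodes", "protein:P2"),
]


def build_index(links):
    """Group outgoing (label, target) pairs by source record."""
    index = defaultdict(list)
    for source, label, target in links:
        index[source].append((label, target))
    return index


def traverse(index, start, allowed_labels):
    """Breadth-first traversal that follows only links with an allowed label.

    Returns one path per reachable record; each path is a list of
    (label, target) steps starting from `start`.
    """
    seen = {start}
    queue = deque([(start, [])])
    results = []
    while queue:
        node, path = queue.popleft()
        for label, target in index.get(node, []):
            if label not in allowed_labels or target in seen:
                continue
            seen.add(target)
            new_path = path + [(label, target)]
            results.append(new_path)
            queue.append((target, new_path))
    return results


if __name__ == "__main__":
    index = build_index(LINKS)
    for path in traverse(index, "gene:G1", {"encodes", "participates_in"}):
        print(" -> ".join(f"[{label}] {target}" for label, target in path))
```

Because every link carries a label, a question such as "which processes involve the products of this gene" becomes a filter on labels rather than a manual walk through unlabeled hyperlinks. In this sketch the cited_in link is skipped simply because its label was not requested.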