A methodology for semantic integration of metadata in bioinformatics data sources

  • Authors:
  • Lei Li; Roop G. Singh; Guangzhi Zheng; Art Vandenberg; Vijay Vaishnavi; Sham Navathe

  • Affiliations:
  • Georgia State University, Atlanta, GA; Georgia State University, Atlanta, GA; Georgia State University, Atlanta, GA; Georgia State University, Atlanta, GA; Georgia State University, Atlanta, GA; Georgia Institute of Technology, Atlanta, GA

  • Venue:
  • Proceedings of the 43rd annual Southeast regional conference - Volume 1
  • Year:
  • 2005

Abstract

Semantic heterogeneity is becoming increasingly prominent in bioinformatics domains, which deal with constantly expanding, dynamic, and often very large datasets from various distributed sources. Metadata is the key component for effective information integration. Traditional approaches to reconciling semantic heterogeneity rely on standards or mediation-based methods. These approaches have had limited success in addressing the general semantic heterogeneity problem, and by themselves they are unlikely to succeed in bioinformatics domains, where one faces the additional complexity of keeping pace with the speed at which data and semantic heterogeneity are being generated. This paper presents a methodology for reconciling semantic heterogeneity of metadata in bioinformatics data sources. The approach is based on the proposition that by globally monitoring, clustering, and visualizing bioinformatics metadata across disparately created data sources, patterns of practice can be identified. Identifying these patterns can facilitate semantic reconciliation of metadata in current data and mitigate semantic heterogeneity in future data by promoting sharing and reuse of existing metadata. To instantiate the methodology, a research architecture, MicroSEEDS, is presented, and its implementation and envisioned uses are discussed.
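
The clustering step named in the abstract can be illustrated with a minimal sketch. This is not the MicroSEEDS implementation, which the abstract does not describe in detail; it assumes, purely for illustration, that metadata element names are compared by the Jaccard similarity of their word tokens and grouped by single-link clustering. The element names, the `threshold` value, and all function names below are hypothetical.

```python
import re
from itertools import combinations


def tokens(name):
    """Split a metadata element name into a set of lowercase word tokens."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return frozenset(t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t)


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_names(names, threshold=0.5):
    """Group names with single-link clustering over a similarity threshold.

    Assumption: token overlap stands in for semantic similarity. A real
    system would likely use richer evidence (definitions, types, usage).
    """
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if jaccard(tokens(a), tokens(b)) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical element names, as if collected from several data sources.
names = ["gene_id", "GeneID", "geneIdentifier",
         "organism_name", "OrganismName", "seq_length"]
clusters = cluster_names(names)
for group in clusters:
    print(group)
```

In this sketch, names that differ only in casing or delimiter style fall into the same cluster, while `geneIdentifier` stays separate because `identifier` and `id` are different tokens. That gap is the kind of case where the monitoring and visualization steps, or a synonym table, would be needed to surface the pattern to a human curator.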