Integrating Scientific Data through External, Concept-Based Annotations

  • Authors:
  • Michael Gertz;Kai-Uwe Sattler

  • Venue:
  • Proceedings of the VLDB 2002 Workshop EEXTT and CAiSE 2002 Workshop DTWeb on Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web-Revised Papers
  • Year:
  • 2003

Abstract

In several scientific application domains, such as the computational sciences, transparent and integrated access to distributed and heterogeneous data sources is key to leveraging the knowledge and findings of researchers. Standard database integration approaches, however, are either not applicable or insufficient because of a lack of local and global schema structures. In these application domains, data integration often occurs manually: researchers collect data and categorize them using "semantic indexing", in the simplest case through local bookmarking, which leaves them without appropriate data query, sharing, and management mechanisms.

In this paper, we present a data integration technique suitable for such application domains. This technique is based on the notion of controlled data annotations, resembling the idea of associating semantically rich metadata with diverse types of data, including images and text-based documents. Using concept-like structures defined by scientists, data annotations allow scientists to link such Web-accessible data at different levels of granularity to concepts. Annotated data describing instances of such concepts then provide for sophisticated query schemes that researchers can employ to query the distributed data in an integrated and transparent fashion. We present our data annotation framework in the context of the neurosciences, where researchers employ concepts and annotations to integrate and query diverse types of data managed and distributed among individual research groups.
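
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' framework. It assumes that a concept is a named node in a simple hierarchy, that an annotation links a URL (optionally narrowed to a fragment for finer granularity) to a concept, and that a query by concept also returns annotations on more specific concepts. All class names, fields, and URLs (`Concept`, `Annotation`, `AnnotationStore`, the `example.org` addresses) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Concept:
    """A scientist-defined concept; `parent` points to a more general concept."""
    name: str
    parent: Optional["Concept"] = None

    def is_a(self, other: "Concept") -> bool:
        # Walk up the hierarchy until `other` is found or the root is passed.
        node: Optional[Concept] = self
        while node is not None:
            if node == other:
                return True
            node = node.parent
        return False


@dataclass(frozen=True)
class Annotation:
    """External annotation: the data stays at its source, only the link is stored."""
    url: str                 # Web-accessible data item (image, text document, ...)
    fragment: Optional[str]  # None = whole item; otherwise a region or passage
    concept: Concept
    author: str


class AnnotationStore:
    """Hypothetical in-memory store that groups can share and query."""

    def __init__(self) -> None:
        self._annotations: List[Annotation] = []

    def add(self, annotation: Annotation) -> None:
        self._annotations.append(annotation)

    def query(self, concept: Concept) -> List[Annotation]:
        # Return annotations on the concept itself or on any specialization of it.
        return [a for a in self._annotations if a.concept.is_a(concept)]


# Two research groups annotate their own data with shared concepts.
neuron = Concept("Neuron")
purkinje = Concept("PurkinjeCell", parent=neuron)

store = AnnotationStore()
store.add(Annotation("http://lab-a.example.org/img/slice42.png",
                     "region=10,20,64,64", purkinje, "lab-a"))
store.add(Annotation("http://lab-b.example.org/notes/protocol.txt",
                     None, neuron, "lab-b"))

all_neuron_hits = store.query(neuron)      # data from both groups
purkinje_hits = store.query(purkinje)      # only the image region
```

Keeping the annotations separate from the data is what lets the sources remain unmodified and schema-free: the query runs over the annotation store, and each hit points back to the original item at its own location.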