A model for matching and integrating heterogeneous relational biomedical databases schemas

  • Authors:
  • Yaser Karasneh;Hamidah Ibrahim;Mohamed Othman;Razali Yaakob

  • Affiliations:
  • Universiti Putra Malaysia, Selangor D.E., Malaysia;Universiti Putra Malaysia, Selangor D.E., Malaysia;Universiti Putra Malaysia, Selangor D.E., Malaysia;Universiti Putra Malaysia, Selangor D.E., Malaysia

  • Venue:
  • IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
  • Year:
  • 2009

Abstract

Database integration aims to provide a uniform and consistent view, called a global schema, over a set of autonomous and heterogeneous data sources, so that data residing in different sources can be accessed as if they were in a single schema. The integration of data sources can be performed in two steps: a matching step and a data transformation step. Schema matching, the focus of this paper, is a fundamental operation in the manipulation of schema information: it takes two schemas and identifies the elements that correspond semantically to each other. Manually specifying schema matches is a tedious, time-consuming, error-prone, and therefore expensive process, and the problem is growing with the rapidly increasing number of data sources to integrate. As systems become able to handle more complex databases and applications, such as biomedical databases, their schemas become larger, further increasing the number of matches to be performed. Several solutions to the schema matching problem have been proposed. However, these solutions are still limited because (i) they do not exploit most of the available information related to the schemas, (ii) they rely strictly on the assumption that the schemas to be matched are from the same application domain, and (iii) they match schemas either by comparing the strings of the elements' names or by checking whether those names are synonyms. This paper addresses these limitations by proposing a model for matching heterogeneous relational biomedical database schemas that further improves the results of the integration.
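
To make limitation (iii) concrete, the following is a minimal sketch of a purely name-based matcher of the kind the abstract criticizes. It is not the model proposed in the paper. The column names, the synonym table, the helper names `name_similarity` and `match_columns`, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table; a real matcher would consult a thesaurus
# or a domain ontology instead of a hard-coded set.
SYNONYMS = {frozenset({"patient", "subject"}), frozenset({"gender", "sex"})}


def name_similarity(a: str, b: str) -> float:
    """Score two element names using only their strings and synonyms."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return 1.0
    # Fall back to a character-level similarity ratio in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def match_columns(source, target, threshold=0.8):
    """Pair each source column with its best-scoring target column.

    Pairs scoring below the (assumed) threshold are dropped.
    """
    matches = []
    for s in source:
        best = max(target, key=lambda t: name_similarity(s, t))
        score = name_similarity(s, best)
        if score >= threshold:
            matches.append((s, best, round(score, 2)))
    return matches


if __name__ == "__main__":
    source = ["patient_id", "sex", "dob"]
    target = ["subject_id", "gender", "birth_date"]
    print(match_columns(source, target))
```

With these inputs only the `sex`/`gender` pair is reported. `patient_id` and `subject_id`, and `dob` and `birth_date`, stay unmatched even though they plausibly describe the same data, because the matcher sees nothing beyond the names themselves. Data types, keys, and other schema information are ignored, which is the gap described in limitation (i).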