MDSM: Microarray database schema matching using the Hungarian method

  • Authors:
  • Yi-Ping Phoebe Chen; Supawan Promparmote; Frederic Maire

  • Affiliations:
  • Faculty of Science and Technology, School of Information Technology, Deakin University, 221 Burwood Highway, Melbourne, Vic. 3125, Australia and Australia Research Council, Centre in Bioinformatic ...; Faculty of Science and Technology, School of Information Technology, Deakin University, 221 Burwood Highway, Melbourne, Vic. 3125, Australia; Centre for Information Technology Innovation, Faculty of Information Technology, School of Software Engineering and Data Communications, Queensland University of Technology, Australia

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2006

Abstract

Current microarray databases use different terminologies and structures, which limits the sharing of data and the collation of results between laboratories. Consequently, an effective integrated microarray data model is required. An important step in developing such an integrated database is schema matching. In this paper, we propose an effective schema matching approach, called MDSM, to syntactically and semantically map attributes of different microarray schemas. The contribution of this work will later be used to create microarray global schemas. Since microarray data is complex, we use a microarray ontology to improve the accuracy of the similarity measure between attributes. The similarity relations can be represented as weighted bipartite graphs. We determine the best schema matching by computing the optimal matching in a bipartite graph using the Hungarian optimisation method. Experimental results show that our schema matching approach is effective and flexible enough to use with different kinds of database models, such as database schemas, XML schemas, and web site maps. Finally, a case study on an existing public microarray schema is carried out using the proposed method.
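
To illustrate the matching step described in the abstract, the following is a minimal Python sketch of the Hungarian method applied to a weighted bipartite graph of attribute similarities. It is not the MDSM implementation: the attribute names, the similarity scores, and the function names `hungarian_min` and `best_matching` are all hypothetical, and the scores are supplied by hand rather than computed from syntactic or ontology-based measures as in the paper.

```python
def hungarian_min(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Uses the O(n^2 * m) potentials formulation of the Hungarian method.
    Returns a dict mapping each row index to its assigned column index.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to the root.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return {p[j] - 1: j - 1 for j in range(1, m + 1) if p[j] != 0}


def best_matching(sim):
    """Maximum-total-similarity matching for a similarity matrix."""
    rows, cols = len(sim), len(sim[0])
    if rows > cols:
        # The solver needs rows <= cols, so transpose and invert the result.
        transposed = [list(col) for col in zip(*sim)]
        return {r: c for c, r in best_matching(transposed).items()}
    # Maximising similarity is the same as minimising negated similarity.
    return hungarian_min([[-s for s in row] for row in sim])


# Hypothetical attributes from two schemas and hand-picked similarity scores.
schema_a = ["gene_name", "sample_id", "intensity"]
schema_b = ["signal", "gene_symbol", "specimen"]
similarity = [
    [0.1, 0.9, 0.2],
    [0.2, 0.3, 0.8],
    [0.7, 0.1, 0.3],
]

matching = best_matching(similarity)
for a_idx, b_idx in sorted(matching.items()):
    print(schema_a[a_idx], "->", schema_b[b_idx])
```

Negating the similarities turns the maximum-weight matching into the minimum-cost assignment that the Hungarian method solves. Unlike a greedy choice of the highest-scoring pair at each step, the result maximises the total similarity over all matched attribute pairs.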