Constructing virtual documents for ontology matching using MapReduce

  • Authors:
  • Hang Zhang;Wei Hu;Yuzhong Qu

  • Affiliations:
  • State Key Laboratory for Novel Software Technology, Nanjing University, China;State Key Laboratory for Novel Software Technology, Nanjing University, China;State Key Laboratory for Novel Software Technology, Nanjing University, China

  • Venue:
  • JIST'11 Proceedings of the 2011 joint international conference on The Semantic Web
  • Year:
  • 2011

Abstract

Ontology matching is a crucial task for data integration and management on the Semantic Web. Current ontology matching techniques can, to some extent, solve many problems arising from the heterogeneity of ontologies. However, when matching large ontologies, most ontology matchers take too long to run and place heavy demands on the running environment. In this paper, we propose a 3-stage MapReduce-based approach called V-Doc+ for matching large ontologies. It builds on the MapReduce framework and the virtual document technique, and it significantly reduces run time while maintaining good precision and recall. First, we establish four MapReduce processes to construct a virtual document for each entity (class, property or instance): a simple process for the descriptions of entities, an iterative process for the descriptions of blank nodes, and two processes for exchanging descriptions with neighbors. Then, we use a word-weight-based partition method to calculate similarities between entities in the corresponding reducers. We report results from two experiments, one on an OAEI dataset and one on a dataset from the biology domain, and assess the performance of V-Doc+ by comparing it with existing ontology matchers. Additionally, we show how run time decreases as the size of the cluster increases.
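
The abstract does not give the details of the word-weight-based partition step, so the following is only a minimal in-memory sketch of the general idea, not the V-Doc+ implementation. It assumes that each virtual document is a bag of weighted words, that an entity is routed to the partitions of its `top_k` heaviest words, and that a reducer compares entities from the two ontologies with cosine similarity. The names `map_phase`, `shuffle`, `reduce_phase`, `top_k`, `threshold` and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_phase(docs, top_k):
    """Emit (word, entity_key) for each entity's top_k heaviest words (assumed rule)."""
    for key, vec in docs.items():
        # Sort by descending weight, breaking ties alphabetically for determinism.
        heaviest = sorted(vec, key=lambda t: (-vec[t], t))[:top_k]
        for word in heaviest:
            yield word, key


def shuffle(pairs):
    """Group mapper output by key, as a MapReduce framework would do."""
    groups = defaultdict(list)
    for word, key in pairs:
        groups[word].append(key)
    return groups


def reduce_phase(groups, docs, threshold):
    """Within each word's partition, compare entities across the two ontologies."""
    matches = {}
    for members in groups.values():
        left = [m for m in members if m[0] == "O1"]
        right = [m for m in members if m[0] == "O2"]
        for a, b in product(left, right):
            pair = (a[1], b[1])
            if pair not in matches:  # a pair may share several partitions
                sim = cosine(docs[a], docs[b])
                if sim >= threshold:
                    matches[pair] = sim
    return matches


# Toy virtual documents keyed by (ontology, entity); the weights are made up.
docs = {
    ("O1", "Author"): Counter({"author": 2, "writer": 1, "person": 1}),
    ("O1", "Paper"): Counter({"paper": 2, "article": 1}),
    ("O2", "Writer"): Counter({"writer": 2, "author": 1, "person": 1}),
    ("O2", "Article"): Counter({"article": 2, "paper": 1}),
}

matches = reduce_phase(shuffle(map_phase(docs, top_k=2)), docs, threshold=0.5)
for pair, sim in sorted(matches.items()):
    print(pair, round(sim, 3))
```

In this sketch, only entities that share at least one heavy word are ever compared, which is what keeps the number of pairwise comparisons well below the full cross product of the two ontologies. Raising `top_k` trades more comparisons for a lower chance of missing a match.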