Towards query optimization for the data web: disk-based algorithms: trace equivalence and bisimilarity

  • Authors:
  • Ala' Hawash;Anton Deik;Bilal Farraj;Mustafa Jarrar

  • Affiliations:
  • Birzeit University, Palestine;Birzeit University, Palestine;Birzeit University, Palestine;Birzeit University, Palestine

  • Venue:
  • Proceedings of the 1st International Conference on Intelligent Semantic Web-Services and Applications
  • Year:
  • 2010

Abstract

Companies, communities, research labs, and even governments are competing to publish structured data on the web in forms such as RDF and XML. Many datasets are now being published and linked together, including Wikipedia, Yago, DBLP, IEEE, IBM, Flickr, and US and UK government data. Most of these datasets are published in RDF, which is a graph-based data model. However, querying RDF graphs is a major problem that has attracted the attention of the research community. Among the many approaches proposed to improve the performance of queries over data graphs, several summarize RDF graphs for query optimization: instead of querying a dataset, queries are executed over a summary of the dataset. Two well-known algorithms are used to summarize a dataset, namely Trace Equivalence and Bisimilarity. However, these algorithms are memory based and thus suffer from scalability problems because of the limits imposed by available memory. In this paper, we propose disk-based versions of these memory-based algorithms and adapt them to RDF data. We evaluate the proposed algorithms on relatively large datasets, using different memory sizes, to show that they are indeed disk based.
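
To make the idea of a bisimilarity-based summary concrete, the following is a minimal in-memory sketch. It is not the disk-based algorithm proposed in the paper; it is a naive partition refinement over outgoing edges, written only for illustration. The function names `bisimulation_blocks` and `summarize`, the sample triples, and the choice to treat literals as plain nodes with no outgoing edges are all assumptions made for this example.

```python
from collections import defaultdict


def bisimulation_blocks(triples):
    """Map each node to a block id; nodes in the same block are bisimilar
    with respect to their outgoing labelled edges (illustrative only)."""
    nodes = set()
    out = defaultdict(list)
    for s, p, o in triples:
        nodes.update((s, o))
        out[s].append((p, o))

    block = {n: 0 for n in nodes}  # start with every node in one block
    num_blocks = 1
    while True:
        # A node's signature is its current block plus the set of
        # (predicate, block of target) pairs over its outgoing edges.
        sig = {
            n: (block[n], frozenset((p, block[o]) for p, o in out[n]))
            for n in nodes
        }
        ids = {}
        new_block = {n: ids.setdefault(sig[n], len(ids)) for n in sorted(nodes)}
        if len(ids) == num_blocks:  # no block was split: partition is stable
            return new_block
        block, num_blocks = new_block, len(ids)


def summarize(triples):
    """Build the summary graph: one edge per (block, predicate, block)."""
    block = bisimulation_blocks(triples)
    return {(block[s], p, block[o]) for s, p, o in triples}


# Hypothetical sample data, not taken from the paper.
triples = [
    ("a1", "author", "p1"),
    ("a2", "author", "p2"),
    ("a3", "author", "p3"),
    ("p1", "year", "2010"),
    ("p2", "year", "2010"),
    ("p3", "title", "T"),
]
blocks = bisimulation_blocks(triples)
summary = summarize(triples)
```

In this sample, `a1` and `a2` end up in the same block because both point via `author` to nodes that only have a `year` edge, while `a3` is separated because its target has a `title` edge instead. A query can then be evaluated against the smaller `summary` edge set first. Everything here is held in memory, which is exactly the limitation the paper's disk-based versions aim to remove.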