Marvin: Distributed reasoning over large-scale Semantic Web data

  • Authors:
  • Eyal Oren; Spyros Kotoulas; George Anadiotis; Ronny Siebes; Annette ten Teije; Frank van Harmelen

  • Affiliations:
  • Vrije Universiteit Amsterdam, de Boelelaan 1081a, Amsterdam, Netherlands; Vrije Universiteit Amsterdam, de Boelelaan 1081a, Amsterdam, Netherlands; IMC Technologies, Athens, Greece; Vrije Universiteit Amsterdam, de Boelelaan 1081a, Amsterdam, Netherlands; Vrije Universiteit Amsterdam, de Boelelaan 1081a, Amsterdam, Netherlands; Vrije Universiteit Amsterdam, de Boelelaan 1081a, Amsterdam, Netherlands

  • Venue:
  • Web Semantics: Science, Services and Agents on the World Wide Web
  • Year:
  • 2009

Abstract

Many Semantic Web problems are difficult to solve through common divide-and-conquer strategies, since they are hard to partition. We present Marvin, a parallel and distributed platform for processing large amounts of RDF data, on a network of loosely coupled peers. We present our divide-conquer-swap strategy and show that this model converges towards completeness. Within this strategy, we address the problem of making distributed reasoning scalable and load-balanced. We present SpeedDate, a routing strategy that combines data clustering with random exchanges. The random exchanges ensure load balancing, while the data clustering attempts to maximise efficiency. SpeedDate is compared against random and deterministic (DHT-like) approaches, on performance and load-balancing. We simulate parameters such as system size, data distribution, churn rate, and network topology. The results indicate that SpeedDate is near-optimally balanced, performs in the same order of magnitude as a DHT-like approach, and has an average throughput per node that scales with √i for i items in the system. We evaluate our overall Marvin system for performance, scalability, load balancing and efficiency.
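
The divide-conquer-swap idea named in the abstract can be illustrated with a toy, single-process simulation. This is only a sketch under stated assumptions, not Marvin's implementation: the function names `local_closure` and `divide_conquer_swap` are hypothetical, the only inference rule is transitivity over pairs (standing in for something like subclass reasoning), and the swap step is a purely random exchange rather than SpeedDate's combination of clustering and random exchanges.

```python
import random


def local_closure(pairs):
    """Apply the rule (a, b), (b, c) -> (a, c) until nothing new appears."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


def divide_conquer_swap(pairs, n_peers, rounds, seed=0):
    """Toy simulation (hypothetical helper, not the authors' code)."""
    rng = random.Random(seed)

    # Divide: scatter the input over the peers at random.
    peers = [set() for _ in range(n_peers)]
    for p in pairs:
        peers[rng.randrange(n_peers)].add(p)

    derived = set(pairs)
    for _ in range(rounds):
        # Conquer: each peer reasons only over the data it currently holds.
        peers = [local_closure(held) for held in peers]
        for held in peers:
            derived |= held

        # Swap: every item moves to a randomly chosen peer, so facts that
        # could not be joined in this round may meet in a later one.
        reshuffled = [set() for _ in range(n_peers)]
        for held in peers:
            for p in sorted(held):  # sorted for a reproducible order
                reshuffled[rng.randrange(n_peers)].add(p)
        peers = reshuffled
    return derived


if __name__ == "__main__":
    chain = [(i, i + 1) for i in range(5)]  # 0 -> 1 -> ... -> 5
    full = local_closure(chain)
    result = divide_conquer_swap(chain, n_peers=3, rounds=50)
    print(len(result), "of", len(full), "facts derived")
```

Every fact a peer derives is a valid consequence of the input, so the result is always a subset of the full closure; repeated random swaps give separated facts further chances to meet on the same peer, which is the intuition behind the abstract's claim of convergence towards completeness.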