Rapid benchmarking for semantic web knowledge base systems

  • Authors:
  • Sui-Yu Wang;Yuanbo Guo;Abir Qasem;Jeff Heflin

  • Affiliations:
  • Computer Science and Engineering Department, Lehigh University, Bethlehem, PA;Computer Science and Engineering Department, Lehigh University, Bethlehem, PA;Computer Science and Engineering Department, Lehigh University, Bethlehem, PA;Computer Science and Engineering Department, Lehigh University, Bethlehem, PA

  • Venue:
  • ISWC'05 Proceedings of the 4th international conference on The Semantic Web
  • Year:
  • 2005

Abstract

We present a method for rapidly developing benchmarks for Semantic Web knowledge base systems. At its core is a synthetic data generation approach for OWL that is scalable and models real-world data. The data-generation algorithm learns from real domain documents and generates benchmark data based on the extracted properties that are relevant for benchmarking. We believe this is important because the relative performance of systems varies with the structure of the ontology and data used. However, because the Semantic Web is so new, sufficient data for benchmarking is rarely available. Our approach helps overcome this shortage of real-world data and allows us to develop benchmarks for a variety of domains and applications in a very time-efficient manner. Based on our method, we have created a new Lehigh BibTeX Benchmark and conducted an experiment on four Semantic Web knowledge base systems. We have verified our hypothesis about the need for representative data by comparing the experimental results with those of our previous Lehigh University Benchmark. The differences between the two experiments demonstrate the influence of ontology and data on the capability and performance of the systems, and thus the need for a benchmark that is representative of the systems' intended application.
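
As a rough illustration of the learn-then-generate idea described in the abstract, the Python sketch below estimates from a few sample records how many values each property tends to carry, then samples synthetic records from those distributions. It is not the authors' algorithm: the dict-of-lists record representation, the BibTeX-like property names, and the functions `learn_profile` and `generate` are all hypothetical, and the properties the paper actually extracts are not stated in this abstract.

```python
import random
from collections import Counter

def learn_profile(instances):
    """For each property, count how often an instance carries 0, 1, 2, ...
    values (0 means the property is absent from that instance)."""
    props = {p for inst in instances for p in inst}
    profile = {p: Counter() for p in props}
    for inst in instances:
        for p in props:
            profile[p][len(inst.get(p, []))] += 1
    return profile

def generate(profile, n, seed=0):
    """Produce n synthetic instances whose per-property value counts are
    sampled from the learned empirical distributions."""
    rng = random.Random(seed)  # seeded so a benchmark run is repeatable
    out = []
    for i in range(n):
        inst = {}
        for prop in sorted(profile):  # fixed order keeps sampling deterministic
            counts = profile[prop]
            sizes = sorted(counts)
            weights = [counts[s] for s in sizes]
            k = rng.choices(sizes, weights=weights)[0]
            if k:
                # Placeholder values; a real generator would emit OWL individuals.
                inst[prop] = [f"{prop}_{i}_{j}" for j in range(k)]
        out.append(inst)
    return out

# Hypothetical "real" sample: three bibliographic records.
sample = [
    {"author": ["a1", "a2"], "title": ["t1"], "year": ["2004"]},
    {"author": ["a3"], "title": ["t2"]},
    {"author": ["a4", "a5"], "title": ["t3"], "year": ["2005"]},
]

profile = learn_profile(sample)
synthetic = generate(profile, 100)
```

Because the generator only needs the learned profile, the output size `n` can be scaled independently of the amount of real data, which is the property the abstract relies on when real-world data is scarce.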