Wheat and chaff - practically feasible interactive ontology revision

  • Authors:
  • Nadeschda Nikitina, Birte Glimm, Sebastian Rudolph

  • Affiliations:
  • Institute AIFB, Karlsruhe Institute of Technology, DE; Ulm University, Institute of Artificial Intelligence, DE; Institute AIFB, Karlsruhe Institute of Technology, DE

  • Venue:
  • ISWC'11: Proceedings of the 10th International Conference on The Semantic Web - Volume Part I
  • Year:
  • 2011

Abstract

When ontological knowledge is acquired automatically, quality control is essential. We consider the tightest possible approach - an exhaustive manual inspection of the acquired data. By using automated reasoning, we partially automate the process: after each expert decision, axioms that are entailed by the already approved statements are automatically approved, whereas axioms that would lead to an inconsistency are declined. Adequate axiom ranking strategies are essential in this setting to minimize the number of expert decisions. In this paper, we present a generalization of the previously proposed ranking techniques which works well for arbitrary validity ratios - the proportion of valid statements within a dataset - whereas the previously described ranking functions were either tailored towards validity ratios of exactly 100% and 0% or optimized for the worst case. The validity ratio - generally not known a priori - is continuously estimated over the course of the inspection process. We further employ partitioning techniques to significantly reduce the computational effort. We provide an implementation supporting all these optimizations as well as featuring a user front-end for successive axiom evaluation, thereby making our proposed strategy applicable to practical scenarios. This is witnessed by our evaluation, which shows that the novel parameterized ranking function almost achieves the maximum possible automation and that the computation time needed for each reasoning-based, automatic decision is reduced to less than one second on average for our test dataset of over 25,000 statements.
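
The revision loop described above can be illustrated with a small sketch. The Python code below is only an approximation of the workflow, not the authors' implementation: `entails`, `ask_expert`, and `rank` are hypothetical callbacks standing in for a DL reasoner, the expert front-end, and a ranking function parameterized by the current validity-ratio estimate, and the partitioning optimization is omitted.

```python
from typing import Callable, Set, Tuple

def revise(
    candidates: Set[str],
    entails: Callable[[Set[str], str], bool],   # does a set of axioms entail an axiom? (hypothetical reasoner call)
    ask_expert: Callable[[str], bool],          # expert's approve/decline decision (hypothetical front-end call)
    rank: Callable[[str, float], float],        # ranking score given the estimated validity ratio (hypothetical)
) -> Tuple[Set[str], Set[str]]:
    approved: Set[str] = set()
    declined: Set[str] = set()
    validity_estimate = 0.5            # running estimate of the validity ratio
    expert_yes = expert_no = 0

    while candidates:
        # Present the highest-ranked axiom, i.e. the one expected to trigger
        # the most automatic decisions under the current validity estimate.
        axiom = max(candidates, key=lambda a: rank(a, validity_estimate))
        candidates.remove(axiom)

        if ask_expert(axiom):
            approved.add(axiom)
            expert_yes += 1
            # Auto-approve every remaining axiom entailed by the approved ones.
            for a in {a for a in candidates if entails(approved, a)}:
                candidates.remove(a)
                approved.add(a)
        else:
            declined.add(axiom)
            expert_no += 1
            # Auto-decline every remaining axiom that, together with the
            # approved ones, would entail an already declined axiom.
            for a in {a for a in candidates
                      if any(entails(approved | {a}, d) for d in declined)}:
                candidates.remove(a)
                declined.add(a)

        # Re-estimate the validity ratio from the expert's decisions so far.
        validity_estimate = expert_yes / (expert_yes + expert_no)

    return approved, declined
```

In this sketch, the fewer axioms the expert has to judge before the candidate set is exhausted, the higher the degree of automation; the ranking function's job is to pick axioms whose evaluation propagates to as many automatic approvals or declines as possible.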