Scalable cleanup of information extraction data using ontologies

  • Authors:
  • Julian Dolby;James Fan;Achille Fokoue;Aditya Kalyanpur;Aaron Kershenbaum;Li Ma;William Murdock;Kavitha Srinivas;Christopher Welty

  • Affiliations:
  • IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY;IBM China Research Lab, Beijing, China;IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY

  • Venue:
  • ISWC'07/ASWC'07: Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference
  • Year:
  • 2007

Abstract

The approach of using ontology reasoning to cleanse the output of information extraction tools was first articulated in SemantiClean. A limiting factor in applying this approach has been that ontology reasoning to find inconsistencies does not scale to the size of data produced by information extraction tools. In this paper, we describe techniques to scale inconsistency detection, and illustrate the use of our techniques to produce a consistent subset of a knowledge base with several thousand inconsistencies.
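
To make the idea of inconsistency detection concrete, here is a minimal sketch. It is not the paper's technique: it assumes a toy knowledge base of (individual, class) type assertions and a single kind of axiom, class disjointness. All names and data (`DISJOINT`, `find_inconsistencies`, `consistent_subset`, the sample assertions) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical disjointness axioms: no individual may belong to both classes.
DISJOINT = {frozenset({"Person", "Organization"})}


def find_inconsistencies(assertions, disjoint=DISJOINT):
    """Return (individual, class_a, class_b) for each disjointness violation."""
    types = defaultdict(set)
    for individual, cls in assertions:
        types[individual].add(cls)

    conflicts = []
    for individual, classes in types.items():
        ordered = sorted(classes)
        # Check every unordered pair of classes asserted for this individual.
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                if frozenset({a, b}) in disjoint:
                    conflicts.append((individual, a, b))
    return sorted(conflicts)


def consistent_subset(assertions, disjoint=DISJOINT):
    """Drop every assertion involved in a conflict (a deliberately naive policy)."""
    bad = set()
    for individual, a, b in find_inconsistencies(assertions, disjoint):
        bad.add((individual, a))
        bad.add((individual, b))
    return [x for x in assertions if x not in bad]


# Hypothetical output of an information extraction tool.
extracted = [
    ("IBM", "Organization"),
    ("IBM", "Person"),
    ("Beijing", "Location"),
]

print(find_inconsistencies(extracted))
print(consistent_subset(extracted))
```

Because this sketch groups assertions by individual, each check touches only one individual's types. A real ontology reasoner must handle far richer axioms, which is where the scaling problem described in the abstract arises.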