A framework for the automatic extraction of rules from online text

  • Authors:
  • Saeed Hassanpour; Martin J. O'Connor; Amar K. Das

  • Affiliations:
  • Stanford Center for Biomedical Informatics Research, Stanford, CA; Stanford Center for Biomedical Informatics Research, Stanford, CA; Stanford Center for Biomedical Informatics Research, Stanford, CA

  • Venue:
  • RuleML'2011 Proceedings of the 5th international conference on Rule-based reasoning, programming, and applications
  • Year:
  • 2011

Abstract

The majority of knowledge on the Web is encoded in unstructured text and is not linked to formalized knowledge, such as ontologies and rules. A potential solution to this problem is to acquire this knowledge through natural language processing and text mining methods. Prior work has focused on automatically extracting RDF- or OWL-based ontologies from text; however, the type of knowledge acquired is generally restricted to simple term hierarchies. This paper presents a general-purpose framework for acquiring more complex relationships from text and then encoding this knowledge as rules. Our approach starts with existing domain knowledge in the form of OWL ontologies and Semantic Web Rule Language (SWRL) rules and applies natural language processing and text matching techniques to deduce classes and properties. It then captures deductive knowledge in the form of new rules. We have evaluated our framework by applying it to web-based text on car rental requirements. We show that our approach can automatically and accurately generate rules for the requirements of car rental companies that are not in the knowledge base. Our framework thus rapidly acquires complex knowledge from free text sources. We are expanding it to handle richer domains, such as medical science.
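
The general idea of matching free text against known ontology terms and emitting a rule can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the vocabulary tables, the `find_term` and `extract_rule` helpers, the class and property names (`Driver`, `hasAge`, `EligibleDriver`), and the example sentence are all hypothetical. Only the SWRL comparison built-ins `swrlb:greaterThanOrEqual` and `swrlb:lessThanOrEqual` come from the SWRL specification.

```python
import re

# Hypothetical vocabulary standing in for an existing OWL ontology:
# surface terms mapped to class and property names (all names assumed).
CLASS_TERMS = {"driver": "Driver", "renter": "Driver"}
PROPERTY_TERMS = {"years old": "hasAge", "years of age": "hasAge"}
COMPARATORS = {
    "at least": "swrlb:greaterThanOrEqual",
    "at most": "swrlb:lessThanOrEqual",
}


def find_term(text, terms):
    """Return the mapped name for the first listed term found in text, or None."""
    for term, name in terms.items():
        if term in text:
            return name
    return None


def extract_rule(sentence):
    """Build a SWRL-style rule string from one requirement sentence, or None."""
    text = sentence.lower()
    cls = find_term(text, CLASS_TERMS)
    prop = find_term(text, PROPERTY_TERMS)
    builtin = find_term(text, COMPARATORS)
    number = re.search(r"\d+", text)
    # Give up unless every part of the rule pattern was matched.
    if not (cls and prop and builtin and number):
        return None
    return (
        f"{cls}(?x) ^ {prop}(?x, ?v) ^ "
        f"{builtin}(?v, {number.group()}) -> Eligible{cls}(?x)"
    )


print(extract_rule("Drivers must be at least 25 years old."))
```

Plain substring matching is used here only to keep the sketch short; a real pipeline would need proper linguistic analysis to handle paraphrase, negation, and multi-clause requirements.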