Extracting knowledge from XML document repository: a semantic Web-based approach

  • Authors:
  • Henry M. Kim;Arijit Sengupta

  • Affiliations:
  • Schulich School of Business, York University, Toronto, Canada M3J 1P3;Raj Soin College of Business, Wright State University, Dayton, USA

  • Venue:
  • Information Technology and Management
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

XML plays an important role as the standard language for representing structured data for the traditional Web, and hence many Web-based knowledge management repositories store data and documents in XML. If semantics about the data are formally represented in an ontology, then it is possible to extract knowledge: This is done as ontology definitions and axioms are applied to XML data to automatically infer knowledge that is not explicitly represented in the repository. Ontologies also play a central role in realizing the burgeoning vision of the semantic Web, wherein data will be more sharable because their semantics will be represented in Web-accessible ontologies. In this paper, we demonstrate how an ontology can be used to extract knowledge from an exemplar XML repository of Shakespeare's plays. We then implement an architecture for this ontology using de facto languages of the semantic Web including OWL and RuleML, thus preparing the ontology for use in data sharing. It has been predicted that the early adopters of the semantic Web will develop ontologies that leverage XML, provide intra-organizational value such as knowledge extraction capabilities that are irrespective of the semantic Web, and have the potential for inter-organizational data sharing over the semantic Web. The contribution of our proof-of-concept application, KROX, is that it serves as a blueprint for other ontology developers who believe that the growth of the semantic Web will unfold in this manner.