Ontology extraction and integration from semi-structured data

  • Authors:
  • Shaobo Wang; Yi Zeng; Ning Zhong

  • Affiliations:
  • International WIC Institute, Beijing University of Technology, P.R. China; International WIC Institute, Beijing University of Technology, P.R. China; International WIC Institute, Beijing University of Technology, P.R. China and Department of Life Science and Informatics, Maebashi Institute of Technology, Japan

  • Venue:
  • AMT'11 Proceedings of the 7th international conference on Active media technology
  • Year:
  • 2011

Abstract

Domain ontologies are usually built manually by domain experts. They are accurate and professional with respect to domain-dependent concepts, instances, and the relations among them. Nevertheless, creating and maintaining ontologies requires too much manual work, especially when the ontology grows to a large scale. Semi-structured data usually contain some semantic relations among concepts and instances, and many domain ontologies exist implicitly in these types of data sources. In this paper, we investigate automatic hierarchical domain ontology generation from semi-structured data, more specifically, from HTML and XML documents. The main process of our work includes domain term extraction, pruning, union, and hierarchical structure representation. We illustrate our study using Artificial Intelligence related conference data represented in HTML and XML documents.
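
The four stages named in the abstract (term extraction, pruning, union, and hierarchical representation) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe. It assumes that terms appear as a hypothetical `name` attribute on nested `<topic>` elements, that pruning means dropping a hand-picked set of non-domain terms, and that union is a plain set union of parent/child pairs. The sample documents and all function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample documents; element and attribute names are assumptions.
DOC_A = """
<topics>
  <topic name="Artificial Intelligence">
    <topic name="Machine Learning">
      <topic name="Neural Networks"/>
    </topic>
    <topic name="Misc"/>
  </topic>
</topics>
"""

DOC_B = """
<topics>
  <topic name="Artificial Intelligence">
    <topic name="Knowledge Representation">
      <topic name="Ontologies"/>
    </topic>
    <topic name="Machine Learning">
      <topic name="Reinforcement Learning"/>
    </topic>
  </topic>
</topics>
"""


def extract_edges(xml_text):
    """Extraction: collect (parent term, child term) pairs from element nesting."""
    root = ET.fromstring(xml_text)
    edges = set()

    def walk(elem, parent_term):
        term = elem.get("name")
        if term is not None and parent_term is not None:
            edges.add((parent_term, term))
        # Elements without a name pass their parent's term through unchanged.
        for child in elem:
            walk(child, term if term is not None else parent_term)

    walk(root, None)
    return edges


def prune(edges, stop_terms):
    """Pruning: drop every pair that mentions a term judged to be off-domain."""
    return {(p, c) for p, c in edges if p not in stop_terms and c not in stop_terms}


def build_hierarchy(edges):
    """Map each parent term to the set of its child terms."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    return children


def render(children):
    """Hierarchical representation: indented lines, assuming no cycles."""
    all_children = {c for cs in children.values() for c in cs}
    roots = sorted(set(children) - all_children)
    lines = []

    def visit(term, depth):
        lines.append("  " * depth + term)
        for child in sorted(children.get(term, ())):
            visit(child, depth + 1)

    for root in roots:
        visit(root, 0)
    return lines


if __name__ == "__main__":
    # Union: merge the pairs extracted from both sources, then prune.
    merged = prune(extract_edges(DOC_A) | extract_edges(DOC_B), {"Misc"})
    print("\n".join(render(build_hierarchy(merged))))
```

Representing the hierarchy as a set of parent/child pairs keeps the union step trivial: terms shared by both sources, such as "Machine Learning" here, merge automatically and gain the children found in either document.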