Mining Ontological Knowledge from Domain-Specific Text Documents

  • Authors:
  • Xing Jiang;Ah-Hwee Tan

  • Affiliations:
  • Nanyang Technological University;Nanyang Technological University

  • Venue:
  • ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
  • Year:
  • 2005

Quantified Score

Hi-index 0.02

Visualization

Abstract

Traditional text mining systems employ shallow parsing techniques and focus on concept extraction and taxonomic relation extraction. This paper presents a novel system called CRCTOL for mining rich semantic knowledge in the form of ontology from domain-specific text documents. By using a full text parsing technique and incorporating both statistical and lexico-syntactic methods, the knowledge extracted by our system is more concise and contains a richer semantics compared with alternative systems. We conduct a case study wherein CRCTOL extracts ontological knowledge, specifically key concepts and semantic relations, from a terrorism domain text collection. Quantitative evaluation, by comparing with a state-of-the-art ontology learning system known as Text-To-Onto, has shown that CRCTOL produces much better precision and recall for both concept and relation extraction, especially from sentences with complex structures.