Extracting significant words from corpora for ontology extraction

  • Authors:
  • Dileep Damle; Victoria Uren

  • Affiliations:
  • The Open University, Milton Keynes, UK; The Open University, Milton Keynes, UK

  • Venue:
  • Proceedings of the 3rd international conference on Knowledge capture
  • Year:
  • 2005

Abstract

We present a new method for extracting terms from a domain-relevant corpus using natural language processing, for the purpose of semi-automatic ontology learning. The literature shows that topical words occur in bursts. We find that the ranking of extracted terms is insensitive to the choice of population model, but that calculating frequencies relative to the burst size, rather than to the document length in words, yields significantly different results.
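
The contrast between the two normalizations can be illustrated with a minimal sketch. The abstract does not define a burst or how its size is measured, so everything below is an assumption made for illustration and not the authors' method: a burst is taken to be a run of occurrences of a term in which consecutive occurrences are at most `max_gap` tokens apart, and the burst size is the number of tokens spanned from the first to the last occurrence in that run. The function names and the `max_gap` parameter are hypothetical.

```python
def doc_relative_frequency(tokens, term):
    """Occurrences of `term` divided by the document length in words."""
    if not tokens:
        return 0.0
    return tokens.count(term) / len(tokens)


def find_bursts(tokens, term, max_gap=5):
    """Group the positions of `term` into bursts.

    Assumed definition (not from the paper): two consecutive occurrences
    belong to the same burst when they are at most `max_gap` tokens apart.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == term]
    bursts = []
    for pos in positions:
        if bursts and pos - bursts[-1][-1] <= max_gap:
            bursts[-1].append(pos)   # close enough: extend the current burst
        else:
            bursts.append([pos])     # too far away: start a new burst
    return bursts


def burst_relative_frequency(tokens, term, max_gap=5):
    """Occurrences of `term` divided by the total span of its bursts.

    The span of a burst is the token count from its first to its last
    occurrence, inclusive (again an assumed definition of burst size).
    """
    bursts = find_bursts(tokens, term, max_gap)
    if not bursts:
        return 0.0
    occurrences = sum(len(b) for b in bursts)
    span = sum(b[-1] - b[0] + 1 for b in bursts)
    return occurrences / span


if __name__ == "__main__":
    # A toy document: "gene" is clustered at the start, followed by filler.
    tokens = "the gene is a gene and the gene".split() + ["x"] * 12
    print(doc_relative_frequency(tokens, "gene"))
    print(burst_relative_frequency(tokens, "gene"))
```

Under these assumptions, a term that is clustered in a short stretch of a long document scores low when normalized by document length but high when normalized by burst size, which is one way the two rankings could diverge.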