Semi-automatic construction of topic ontologies

  • Authors: Blaž Fortuna, Dunja Mladenič, Marko Grobelnik
  • Affiliation (all authors): Jožef Stefan Institute, Ljubljana, Slovenia
  • Venue: EWMF'05/KDO'05 Proceedings of the 2005 joint international conference on Semantics, Web and Mining
  • Year: 2005

Abstract

In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for semi-automatic topic ontology construction. The OntoGen system supports the user during the construction process by suggesting topics and analyzing them in real time. It suggests names for the topics in two alternative ways, both based on extracting keywords from the set of documents inside the topic. The first set, of descriptive keywords, is extracted from document centroid vectors, while the second set, of distinctive keywords, is extracted from the SVM classification model that separates documents in the topic from the neighboring documents.
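
The two keyword-extraction ideas can be illustrated with a minimal sketch. This is not the OntoGen implementation: the function names (`tfidf_vectors`, `descriptive_keywords`, `distinctive_keywords`), the smoothed IDF formula, the toy documents, and the simple subgradient-descent training of a linear SVM are all assumptions made for illustration. Descriptive keywords are taken as the highest-weighted terms of the topic centroid, and distinctive keywords as the terms with the largest positive weights in a linear SVM trained to separate topic documents from neighboring ones.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        # Smoothed IDF (an assumption) so terms found in every document keep some weight.
        vec = {t: c * math.log(1 + n / df[t]) for t, c in Counter(doc).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def top_terms(weights, k):
    """Top-k terms with positive weight; ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, weight in ranked if weight > 0][:k]


def descriptive_keywords(topic, k=2):
    """Highest-weighted terms of the topic centroid (mean document vector)."""
    centroid = Counter()
    for vec in topic:
        for term, weight in vec.items():
            centroid[term] += weight / len(topic)
    return top_terms(centroid, k)


def distinctive_keywords(topic, neighbours, k=2, epochs=200, lam=0.01):
    """Highest-weighted terms of a linear SVM separating topic from neighbours."""
    data = [(v, 1.0) for v in topic] + [(v, -1.0) for v in neighbours]
    w = Counter()
    for step in range(1, epochs + 1):
        eta = 1.0 / (lam * step)  # decaying step size
        grad = Counter()
        for vec, y in data:
            margin = y * sum(w[t] * x for t, x in vec.items())
            if margin < 1.0:  # hinge loss is active for this document
                for t, x in vec.items():
                    grad[t] += y * x / len(data)
        # Regularized hinge-loss subgradient step on every term seen so far.
        for t in set(w) | set(grad):
            w[t] = (1 - eta * lam) * w[t] + eta * grad[t]
    return top_terms(w, k)


# Hypothetical toy corpus: two topic documents and two neighboring documents.
docs = [
    "svm svm kernel learning learning".split(),
    "svm svm margin learning learning".split(),
    "cluster cluster kmeans learning learning".split(),
    "cluster cluster centroid learning learning".split(),
]
vectors = tfidf_vectors(docs)
topic, neighbours = vectors[:2], vectors[2:]
print("descriptive:", descriptive_keywords(topic))
print("distinctive:", distinctive_keywords(topic, neighbours))
```

In this toy corpus the term "learning" is frequent inside the topic, so it ranks among the descriptive keywords, but it is equally frequent in the neighboring documents, so the SVM gives it no positive weight and it does not appear among the distinctive keywords.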