Commonsense-based topic modeling

  • Authors:
  • Dheeraj Rajagopal; Daniel Olsher; Erik Cambria; Kenneth Kwok

  • Affiliations:
  • NUS Temasek Laboratories, Singapore;NUS Temasek Laboratories, Singapore;NUS Temasek Laboratories, Singapore;NUS Temasek Laboratories, Singapore

  • Venue:
  • Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining
  • Year:
  • 2013

Abstract

Topic modeling is a technique for discovering the abstract 'topics' that occur in a collection of documents, and is useful for tasks such as automatic text categorization and opinion mining. This paper presents a commonsense-knowledge-based algorithm for document topic modeling. In contrast to probabilistic models, the proposed approach requires no training of any kind and does not depend on word co-occurrence or particular word distributions, making the algorithm effective on texts of any length and composition. 'Semantic atoms' are used to generate feature vectors for document concepts. These feature vectors are then clustered using group average agglomerative clustering, providing much improved performance over existing algorithms.
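
The clustering step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the average-linkage form of group average clustering, in which the similarity of two clusters is the mean cosine similarity over all cross-cluster pairs. The concept names, the toy feature values, and the function names (`cosine`, `group_average_similarity`, `agglomerative_cluster`) are hypothetical; how 'semantic atoms' produce the real feature vectors is not described in this record.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def group_average_similarity(cluster_a, cluster_b, vectors):
    """Mean pairwise cosine similarity between members of two clusters (assumed linkage)."""
    total = sum(cosine(vectors[i], vectors[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_cluster(vectors, num_clusters):
    """Start with one cluster per vector and repeatedly merge the most similar pair."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > num_clusters:
        # Pick the pair of clusters with the highest group-average similarity.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: group_average_similarity(
                clusters[pair[0]], clusters[pair[1]], vectors
            ),
        )
        # a < b always holds, so deleting index b does not shift index a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical concepts and toy feature vectors, for illustration only.
concepts = ["buy_car", "drive_car", "eat_pizza", "cook_pasta"]
features = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.8],
    [0.1, 0.0, 0.9, 1.0],
]

topics = agglomerative_cluster(features, num_clusters=2)
for members in topics:
    print([concepts[i] for i in members])
```

Because the procedure only compares feature vectors, it needs no training corpus, which is consistent with the training-free property claimed in the abstract. The stopping rule used here (a fixed number of clusters) is an assumption made for the sketch.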