On the use of consensus clustering for incremental learning of topic hierarchies

  • Authors: Ricardo M. Marcacini; Eduardo R. Hruschka; Solange O. Rezende
  • Affiliation (all authors): Mathematical and Computer Sciences Institute - ICMC, University of São Paulo - USP, São Carlos, SP, Brazil
  • Venue: SBIA'12 Proceedings of the 21st Brazilian conference on Advances in Artificial Intelligence
  • Year: 2012

Abstract

Incremental learning of topic hierarchies is useful for organizing and managing growing text collections, since the hierarchy summarizes the knowledge implicit in the textual data. However, currently available methods are limited in how they handle the incremental learning phase. In particular, when the initial topic hierarchy does not model the data well, new documents are inserted into inappropriate topics, and this error propagates into future hierarchy updates, degrading the quality of the knowledge extraction process. We introduce a method that uses consensus clustering to obtain more robust initial topic hierarchies. Experimental results on several text collections show that, compared with a traditional method, our method significantly reduces the degradation of the topic hierarchies during incremental learning.
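
The abstract does not say which consensus function the authors use. As a rough illustration of the general idea only, the sketch below combines several flat partitions of the same documents into a co-association matrix (the fraction of partitions in which two documents share a cluster) and then builds a hierarchy from it with average-linkage agglomerative clustering. The function names `co_association` and `consensus_hierarchy`, the toy partitions, and the choice of average linkage are all assumptions made for this example, not details taken from the paper.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions that place documents i and j in the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    return [[1.0 if i == j else counts[i][j] / m for j in range(n)]
            for i in range(n)]


def consensus_hierarchy(partitions):
    """Average-linkage agglomerative clustering on (1 - co-association).

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of document indices.
    Hypothetical helper: one possible consensus scheme, not the paper's."""
    co = co_association(partitions)
    clusters = [(i,) for i in range(len(co))]
    merges = []

    def dist(a, b):
        # Mean disagreement over all cross-cluster document pairs.
        return sum(1.0 - co[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Three toy partitions of four documents (assumed data for illustration).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
for a, b, d in consensus_hierarchy(partitions):
    print(a, b, round(d, 3))
```

In this toy run, documents 2 and 3 merge first because every partition groups them together, which is the sense in which a consensus can be less sensitive to one poor partition than any single clustering run.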