Mining evolutionary multi-branch trees from text streams

  • Authors:
  • Xiting Wang; Shixia Liu; Yangqiu Song; Baining Guo

  • Affiliations:
  • Tsinghua University, Beijing, China; Microsoft Research Asia, Beijing, China; Hong Kong University of Science and Technology, Hong Kong, Hong Kong; Microsoft Research Asia, Beijing, China

  • Venue:
  • Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2013

Abstract

Understanding topic hierarchies in text streams and how they evolve over time is important in many applications. In this paper, we propose an evolutionary multi-branch tree clustering method for streaming text data. We build evolutionary trees in a Bayesian online filtering framework. Tree construction is formulated as an online posterior estimation problem that considers both the likelihood of the current tree and the conditional prior given the previous tree. We also introduce a constraint model to compute the conditional prior of a tree in the multi-branch setting. Experiments on real-world news data demonstrate that our algorithm incorporates historical tree information better, and is more efficient and effective, than the traditional evolutionary hierarchical clustering algorithm.
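
The posterior formulation described in the abstract can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the record above: T_t denotes the tree at time step t, T_{t-1} the previous tree, and D_t the documents arriving at time t. The second line, which selects the tree by maximizing the log-posterior, is one plausible reading of "online posterior estimation" and not a statement of the authors' exact objective.

```latex
% Assumed notation (illustrative only):
%   T_t     : multi-branch tree at time step t
%   T_{t-1} : tree from the previous time step
%   D_t     : documents arriving at time step t
\[
  p(T_t \mid D_t, T_{t-1}) \;\propto\;
  \underbrace{p(D_t \mid T_t)}_{\text{likelihood of the current tree}}
  \;\cdot\;
  \underbrace{p(T_t \mid T_{t-1})}_{\text{conditional prior given the previous tree}}
\]
% One possible way to use the posterior: pick the maximizing tree.
\[
  \hat{T}_t \;=\; \arg\max_{T_t}
  \bigl[\, \log p(D_t \mid T_t) + \log p(T_t \mid T_{t-1}) \,\bigr]
\]
```

Under this reading, the likelihood term rewards trees that fit the current documents, while the conditional prior, which the abstract says is computed by a constraint model, rewards trees that stay consistent with the previous one.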