Summarizing web forum threads based on a latent topic propagation process

  • Authors:
  • Zhaochun Ren;Jun Ma;Shuaiqiang Wang;Yang Liu

  • Affiliations:
  • Shandong University, Jinan, China;Shandong University, Jinan, China;Shandong University of Finance, Jinan, China;Shandong University, Jinan, China

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

With an increasingly amount of information in web forums, quick comprehension of threads in web forums has become a challenging research problem. To handle this issue, this paper investigates the task of Web Forum Thread Summarization (WFTS), aiming to give a brief statement of each thread that involving multiple dynamic topics. When applied to the task of WFTS, traditional summarization methods are cramped by topic dependencies, topic drifting and text sparseness. Consequently, we explore an unsupervised topic propagation model in this paper, the Post Propagation Model (PPM), to burst through these problems by simultaneously modeling the semantics and the reply relationship existing in each thread. Each post in PPM is considered as a mixture of topics, and a product of Dirichlet distributions in previous posts is employed to model each topic dependencies during the asynchronous discussion. Based on this model, the task of WFTS is accomplished by extracting most significant sentences in a thread. The experimental results on two different forum data sets show that WFTS based on the PPM outperforms several state-of-the-art summarization methods in terms of ROUGE metrics.