Document-Based HITS Model for Multi-document Summarization

  • Authors:
  • Xiaojun Wan

  • Affiliations:
  • Institute of Computer Science and Technology, Peking University, Beijing, China 100871

  • Venue:
  • PRICAI '08 Proceedings of the 10th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
  • Year:
  • 2008

Abstract

The PageRank model has been successfully applied to multi-document summarization by exploiting the link relationships between sentences in the document set, under the assumption that all sentences are indistinguishable from one another. However, the documents in a set are usually not equally important, and sentences in an important document are deemed more salient than those in a trivial document. This paper proposes the document-based HITS model (DocHITS), which fully leverages document-level information by treating documents as hubs and sentences as authorities. Experimental results on the DUC2001 and DUC2002 datasets demonstrate the effectiveness of the proposed model.
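
The hub/authority idea can be illustrated with a minimal sketch of the generic HITS iteration on a bipartite document-sentence graph. This is not the paper's exact formulation: the abstract does not say how document-sentence links are weighted, so the `weights` matrix below (for example, a similarity between each document and each sentence) is an assumption, and the function name `hits_doc_sentence` and the toy values are hypothetical.

```python
import math


def hits_doc_sentence(weights, iterations=50):
    """Generic HITS on a bipartite graph (illustrative sketch only).

    weights[d][s] is an assumed non-negative link weight between
    document d (hub) and sentence s (authority).
    Returns (hub_scores, authority_scores), each L2-normalized.
    """
    n_docs = len(weights)
    n_sents = len(weights[0])
    hub = [1.0] * n_docs
    auth = [1.0] * n_sents

    for _ in range(iterations):
        # Authority update: a sentence is salient if important documents link to it.
        auth = [sum(weights[d][s] * hub[d] for d in range(n_docs))
                for s in range(n_sents)]
        norm = math.sqrt(sum(a * a for a in auth)) or 1.0
        auth = [a / norm for a in auth]

        # Hub update: a document is important if it links to salient sentences.
        hub = [sum(weights[d][s] * auth[s] for s in range(n_sents))
               for d in range(n_docs)]
        norm = math.sqrt(sum(h * h for h in hub)) or 1.0
        hub = [h / norm for h in hub]

    return hub, auth


# Toy example with made-up weights: 2 documents, 3 sentences.
toy_weights = [
    [1.0, 0.5, 0.0],
    [0.2, 0.1, 0.1],
]
hub_scores, auth_scores = hits_doc_sentence(toy_weights)

# Rank sentences by authority score, highest first.
ranking = sorted(range(len(auth_scores)), key=lambda s: auth_scores[s], reverse=True)
```

In a summarizer built along these lines, the highest-ranked sentences would be candidates for the summary; redundancy removal and the paper's actual link weighting are outside the scope of this sketch.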