Intertopic information mining for query-based summarization

  • Authors:
  • You Ouyang;Wenjie Li;Sujian Li;Qin Lu

  • Affiliations:
  • Department of Computing, The Hong Kong Polytechnic University, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Hong Kong;Key Laboratory of Computational Linguistics (Peking University), Ministry of Education, China;Department of Computing, The Hong Kong Polytechnic University, Hong Kong

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this article, the authors address the problem of sentence ranking in summarization. Although most existing summarization approaches are concerned with the information embodied in a particular topic (including a set of documents and an associated query) for sentence ranking, they propose a novel ranking approach that incorporates intertopic information mining. Intertopic information, in contrast to intratopic information, is able to reveal pairwise topic relationships and thus can be considered as the bridge across different topics. In this article, the intertopic information is used for transferring word importance learned from known topics to unknown topics under a learning-based summarization framework. To mine this information, the authors model the topic relationship by clustering all the words in both known and unknown topics according to various kinds of word conceptual labels, which indicate the roles of the words in the topic. Based on the mined relationships, we develop a probabilistic model using manually generated summaries provided for known topics to predict ranking scores for sentences in unknown topics. A series of experiments have been conducted on the Document Understanding Conference (DUC) 2006 data set. The evaluation results show that intertopic information is indeed effective for sentence ranking and the resultant summarization system performs comparably well to the best-performing DUC participating systems on the same data set. © 2010 Wiley Periodicals, Inc.