A new approach to unsupervised text summarization

  • Authors:
  • Tadashi Nomoto; Yuji Matsumoto

  • Affiliations:
  • National Institute of Japanese Literature, Tokyo, Japan; Nara Institute of Science and Technology, Nara, Japan

  • Venue:
  • Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2001

Abstract

The paper presents a novel approach to unsupervised text summarization. The novelty lies in exploiting the diversity of concepts in text for summarization, which has not received much attention in the summarization literature. The diversity-based approach here is a principled generalization of the Maximal Marginal Relevance criterion of Carbonell and Goldstein (1998). We propose, in addition, an information-centric approach to evaluation, where the quality of summaries is judged not in terms of how well they match human-created summaries but in terms of how well they represent their source documents in IR tasks such as document retrieval and text categorization. To assess the effectiveness of our approach under the proposed evaluation scheme, we examine how a system with the diversity functionality performs against one without it, using the BMIR-J2 corpus, a test data set developed by a Japanese research consortium. The results demonstrate a clear superiority of the diversity-based approach over the non-diversity-based approach.
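
For readers unfamiliar with the Maximal Marginal Relevance (MMR) criterion that the abstract refers to, the following is a minimal sketch of the original criterion only, not of the diversity-based generalization proposed in the paper. MMR greedily picks the candidate that maximizes lam * relevance-to-query minus (1 - lam) * maximum similarity to anything already selected. The function names (`cosine`, `mmr_select`), the whitespace tokenization, the bag-of-words cosine similarity, and the default `lam` value are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed similarity measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, query, k, lam=0.7):
    """Greedy MMR selection; returns indices of up to k chosen sentences.

    lam = 1.0 ranks purely by relevance; smaller values penalize
    redundancy with already selected sentences more heavily.
    """
    # Hypothetical preprocessing: lowercase and split on whitespace.
    vecs = [Counter(s.lower().split()) for s in sentences]
    qvec = Counter(query.lower().split())

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(vecs[i], qvec)
            # Redundancy is the highest similarity to any selected sentence.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    docs = ["apple banana", "apple banana", "cherry"]
    print(mmr_select(docs, "apple banana cherry", k=2, lam=0.5))
```

With `lam=0.5`, the duplicate second sentence is penalized for redundancy and the third sentence is chosen instead; with `lam=1.0`, the selection reduces to plain relevance ranking and the duplicate is kept.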