Text Summarization by Sentence Extraction Using Unsupervised Learning

  • Authors:
  • René Arnulfo García-Hernández; Romyna Montiel; Yulia Ledeneva; Eréndira Rendón; Alexander Gelbukh; Rafael Cruz

  • Affiliations:
  • Pattern Recognition Laboratory, Toluca Institute of Technology, Mexico, Autonomous University of the State of Mexico, Mexico, Center for Computing Research, National Polytechnic Institute, Mexico, ...

  • Venue:
  • MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008

Abstract

The main problem in generating an extractive automatic text summary is detecting the most relevant information in the source document. Although some approaches claim to be domain- and language-independent, they rely on knowledge that is highly domain- or language-dependent, such as key phrases or gold-standard samples for machine-learning approaches. In this work, we propose a language- and domain-independent automatic text summarization approach based on sentence extraction with an unsupervised learning algorithm. Our hypothesis is that an unsupervised algorithm can help cluster similar ideas (sentences). The summary is then composed by selecting the most representative sentence from each cluster. Several experiments on the standard DUC-2002 collection show that the proposed method obtains more favorable results than other approaches.
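
As an illustration only, the pipeline the abstract describes (cluster similar sentences, then take the most representative sentence from each cluster) might be sketched as follows. The abstract does not name the clustering algorithm, the sentence representation, or the similarity measure, so the k-means-style loop, the bag-of-words vectors, the cosine similarity, the deterministic seeding, and every function name below are assumptions made for this sketch, not the authors' implementation.

```python
import math
import re
from collections import Counter


def vectorize(sentence):
    """Bag-of-words term-frequency vector (an assumed representation)."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Mean term vector of a cluster."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: weight / len(vectors) for term, weight in total.items()}


def summarize(sentences, k, iterations=20):
    """Cluster sentences into at most k groups; return one sentence per group."""
    vectors = [vectorize(s) for s in sentences]
    k = min(k, len(vectors))
    if k == 0:
        return []
    # Assumption: evenly spaced sentences seed the clusters (deterministic).
    centroids = [dict(vectors[i * len(vectors) // k]) for i in range(k)]
    assignment = None
    for _ in range(iterations):
        # Assign each sentence to its most similar centroid.
        new_assignment = [
            max(range(k), key=lambda c: cosine(vec, centroids[c]))
            for vec in vectors
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = centroid(members)
    # Per cluster, keep the sentence closest to the centroid.
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.append(
                max(members, key=lambda i: cosine(vectors[i], centroids[c]))
            )
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    document = [
        "The cat sat on the mat.",
        "A cat sat on a mat.",
        "Stocks fell on Monday.",
        "Stocks fell sharply on Monday morning.",
    ]
    for sentence in summarize(document, k=2):
        print(sentence)
```

In this sketch the number of clusters `k` doubles as the summary length in sentences, and "most representative" is read as "closest to the cluster centroid"; both are design choices of the example rather than details stated in the abstract.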