EM clustering algorithm for automatic text summarization

  • Authors:
  • Yulia Ledeneva;René García Hernández;Romyna Montiel Soto;Rafael Cruz Reyes;Alexander Gelbukh

  • Affiliations:
  • Unidad Académica Profesional Tianguistenco, Universidad Autónoma del Estado de México, Estado de México;Unidad Académica Profesional Tianguistenco, Universidad Autónoma del Estado de México, Estado de México;Laboratorio de Reconocimiento de Patrones, Instituto Tecnológico de Toluca, Metepec, México;Laboratorio de Reconocimiento de Patrones, Instituto Tecnológico de Toluca, Metepec, México;Centro de Investigación en Computación, Instituto Politécnico Nacional, México

  • Venue:
  • MICAI'11 Proceedings of the 10th Mexican international conference on Advances in Artificial Intelligence - Volume Part I
  • Year:
  • 2011

Abstract

Automatic text summarization has emerged as a technique for accessing only useful information. To assess the quality of the automatic summaries produced by a system, DUC 2002 (Document Understanding Conference) developed a standard set of human summaries, called the gold collection, for 567 single news documents. At this conference, only five systems outperformed the baseline heuristic in the single-document extractive summarization task. So far, some approaches have obtained good results by combining different strategies with language-dependent knowledge. In this paper, we present a competitive method based on an EM clustering algorithm that improves the quality of automatic summaries while using practically no language-dependent knowledge. A comparison of this method with three text models is also presented.
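
The abstract does not describe the method in detail, so the following is only a minimal sketch of how EM clustering could drive extractive summarization. It is not the authors' implementation. It assumes a plain bag-of-words sentence representation, a mixture of spherical Gaussians with a fixed shared variance, and a selection rule that keeps the sentence nearest to each cluster mean. All function names (`vectorize`, `em_cluster`, `summarize`) and parameters are hypothetical.

```python
import math
import re


def vectorize(sentences):
    """Turn each sentence into a term-frequency vector over a shared vocabulary."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        v = [0.0] * len(vocab)
        for w in toks:
            v[index[w]] += 1.0
        vectors.append(v)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def em_cluster(vectors, k, iterations=20, variance=1.0):
    """EM for a mixture of k spherical Gaussians (assumed fixed variance).

    Returns (responsibilities, means). Requires at least one vector.
    """
    n, dim = len(vectors), len(vectors[0])
    # Deterministic initialization: evenly spaced sentences act as initial means.
    means = [list(vectors[(j * n) // k]) for j in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each cluster for each sentence,
        # computed in log space and normalized with the log-sum-exp trick.
        resp = []
        for x in vectors:
            logs = [math.log(weights[j]) - sq_dist(x, means[j]) / (2.0 * variance)
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate means and mixing weights from the soft assignments.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            if mass < 1e-12:
                continue  # effectively empty cluster: keep its previous parameters
            means[j] = [sum(resp[i][j] * vectors[i][d] for i in range(n)) / mass
                        for d in range(dim)]
            weights[j] = mass / n
    return resp, means


def summarize(sentences, k=2):
    """Pick the sentence nearest to each cluster mean, in document order."""
    k = min(k, len(sentences))
    vectors = vectorize(sentences)
    _, means = em_cluster(vectors, k)
    chosen = set()
    for j in range(k):
        best = min(range(len(sentences)),
                   key=lambda i: sq_dist(vectors[i], means[j]))
        chosen.add(best)
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, k=3)` on a list of sentence strings returns at most three of them in their original order. Because the representation relies only on word counts, the sketch uses no language-specific resources, in line with the abstract's stated goal. A real system would still need sentence splitting and an evaluation against reference summaries.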