EM clustering algorithm for automatic text summarization

Authors:
Yulia Ledeneva;René García Hernández;Romyna Montiel Soto;Rafael Cruz Reyes;Alexander Gelbukh
Affiliations:
Unidad Académica Profesional Tianguistenco, Universidad Autónoma del Estado de México, Estado de México;Unidad Académica Profesional Tianguistenco, Universidad Autónoma del Estado de México, Estado de México;Laboratorio de Reconocimiento de Patrones, Instituto Tecnológico de Toluca, Metepec, México;Laboratorio de Reconocimiento de Patrones, Instituto Tecnológico de Toluca, Metepec, México;Centro de Investigación en Computación, Instituto Politécnico Nacional, México
Venue:
MICAI'11 Proceedings of the 10th Mexican international conference on Advances in Artificial Intelligence - Volume Part I
Year:
2011


Abstract

Automatic text summarization has emerged as a technique for accessing only useful information. To assess the quality of the automatic summaries produced by a system, the Document Understanding Conference (DUC) 2002 developed a standard set of human summaries, called the gold collection, for 567 single news documents. At that conference, only five systems outperformed the baseline heuristic in the single-document extractive summarization task. So far, some approaches have obtained good results by combining different strategies with language-dependent knowledge. In this paper, we present a competitive method based on an EM clustering algorithm that improves the quality of automatic summaries while using practically no language-dependent knowledge. A comparison of this method with three text models is also presented.
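The abstract does not give the features, the number of clusters, or the sentence selection rule, so the following is only a minimal sketch of how EM clustering can drive extractive summarization, not the authors' method. It assumes bag-of-words sentence vectors, a mixture of spherical Gaussians with a fixed shared variance, and a rule that picks the sentence closest to each cluster mean. All function names (`vectorize`, `em_cluster`, `summarize`) and parameters are hypothetical, and the input is assumed to be a non-empty list of sentences.

```python
import math
import re


def tokenize(sentence):
    # Lowercase alphabetic tokens only; no stemming or stop-word list,
    # so no language-specific resources are used (an assumption of this sketch).
    return re.findall(r"[a-z]+", sentence.lower())


def vectorize(sentences):
    # Bag-of-words term-frequency vectors over the document's own vocabulary.
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for s in sentences:
        v = [0.0] * len(vocab)
        for w in tokenize(s):
            v[index[w]] += 1.0
        vectors.append(v)
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def em_cluster(vectors, k, iterations=20, variance=1.0):
    # EM for a mixture of k spherical Gaussians with a fixed, shared variance.
    n, dim = len(vectors), len(vectors[0])
    # Deterministic initialization: evenly spaced sentences as initial means.
    means = [list(vectors[(j * n) // k]) for j in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each cluster for each sentence,
        # computed in log space for numerical stability.
        resp = []
        for v in vectors:
            logs = [math.log(weights[j]) - sq_dist(v, means[j]) / (2.0 * variance)
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(value - top) for value in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate mixing weights and means from the posteriors.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            weights[j] = max(mass, 1e-9) / n  # floor keeps log() defined
            if mass > 1e-9:
                means[j] = [sum(r[j] * v[d] for r, v in zip(resp, vectors)) / mass
                            for d in range(dim)]
    return means, resp


def summarize(sentences, k=2):
    # Pick the sentence nearest to each cluster mean; keep document order.
    vectors = vectorize(sentences)
    means, _ = em_cluster(vectors, k)
    chosen = set()
    for mean in means:
        nearest = min(range(len(sentences)),
                      key=lambda i: sq_dist(vectors[i], mean))
        chosen.add(nearest)
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, k=3)` on a list of sentence strings returns at most three of them in their original order. Fewer sentences can come back when two cluster means share the same nearest sentence.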