Using word sequences for text summarization

Authors:
Esaú Villatoro-Tello;Luis Villaseñor-Pineda;Manuel Montes-y-Gómez
Affiliations:
Language Technologies Group, Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico;Language Technologies Group, Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico;Language Technologies Group, Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico
Venue:
TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
Year:
2006

Citing 6

A trainable document summarizer
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval

Machine learning in automated text categorization
ACM Computing Surveys (CSUR)

Automatic Text Summarization Using a Machine Learning Approach
SBIA '02 Proceedings of the 16th Brazilian Symposium on Artificial Intelligence: Advances in Artificial Intelligence

Text Summarization by Sentence Segment Extraction Using Machine Learning Algorithms
PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications

Automatic evaluation of summaries using N-gram co-occurrence statistics
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

Using N-Grams to understand the nature of summaries
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers

Cited 3

Effect of Preprocessing on Extractive Summarization with Maximal Frequent Sequences
MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence

Text Summarization by Sentence Extraction Using Unsupervised Learning
MICAI '08 Proceedings of the 7th Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence

Terms derived from frequent sequences for extractive text summarization
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing

Abstract

Traditional approaches to extractive summarization score or classify sentences based on features such as position in the text, word frequency and cue phrases. These features tend to produce satisfactory summaries, but have the inconvenience of being domain dependent. In this paper, we propose to tackle this problem by representing the sentences by word sequences (n-grams), a representation widely used in text categorization. The experiments demonstrated that this simple representation not only diminishes the domain and language dependency but also enhances the summarization performance.
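
As an illustration of what representing sentences by word sequences might look like, the following Python sketch turns each sentence into a binary vector over word n-grams, which could then be passed to any sentence classifier. The abstract does not specify the tokenization, the n-gram lengths, the feature weighting or the classifier, so the whitespace tokenizer, the `n_max` and `min_count` parameters, and all function names here are assumptions made for the example, not the authors' implementation.

```python
from collections import Counter


def word_ngrams(sentence, n_max=3):
    """Return all word n-grams of length 1..n_max (assumed range)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokens = sentence.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def build_vocabulary(sentences, n_max=3, min_count=2):
    """Keep n-grams that occur in at least min_count sentences (assumed filter)."""
    counts = Counter()
    for s in sentences:
        # Count each n-gram once per sentence (sentence frequency).
        counts.update(set(word_ngrams(s, n_max)))
    return sorted(g for g, c in counts.items() if c >= min_count)


def to_feature_vector(sentence, vocabulary, n_max=3):
    """Binary presence/absence vector of vocabulary n-grams in the sentence."""
    present = set(word_ngrams(sentence, n_max))
    return [1 if g in present else 0 for g in vocabulary]


if __name__ == "__main__":
    training_sentences = ["the cat sat", "the cat ran"]
    vocab = build_vocabulary(training_sentences, n_max=2, min_count=2)
    print(vocab)
    print(to_feature_vector("a cat", vocab, n_max=2))
```

Because the features are plain word sequences taken from the text itself, a sketch like this needs no position heuristics or hand-built cue-phrase lists, which is the kind of domain and language independence the abstract refers to.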