Using word sequences for text summarization

  • Authors:
  • Esaú Villatoro-Tello;Luis Villaseñor-Pineda;Manuel Montes-y-Gómez

  • Affiliations:
  • Language Technologies Group, Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico;Language Technologies Group, Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico;Language Technologies Group, Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico

  • Venue:
  • TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
  • Year:
  • 2006

Abstract

Traditional approaches to extractive summarization score or classify sentences based on features such as position in the text, word frequency, and cue phrases. These features tend to produce satisfactory summaries, but have the drawback of being domain dependent. In this paper, we propose to tackle this problem by representing the sentences as word sequences (n-grams), a representation widely used in text categorization. The experiments demonstrated that this simple representation not only diminishes the domain and language dependency but also enhances the summarization performance.
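
To make the idea of representing a sentence by word sequences concrete, the following is a minimal Python sketch of n-gram extraction. It is an illustration only, not the authors' implementation: the function names `word_ngrams` and `ngram_features`, the lowercasing, the regular-expression tokenizer, and the choice of raw counts as feature values are all assumptions, since the abstract does not specify them.

```python
import re
from collections import Counter


def word_ngrams(sentence, n):
    """Return the word n-grams of a sentence as tuples, in reading order.

    Tokenization (lowercase, runs of word characters) is an assumption
    made for this sketch.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(sentence, max_n=2):
    """Count every word sequence of length 1..max_n in the sentence.

    The resulting counts could serve as the feature vector that a
    sentence classifier (summary vs. non-summary) is trained on.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(sentence, n))
    return counts


features = ngram_features("This simple representation enhances the summarization performance")
```

Because the features are derived only from the words themselves, nothing in this representation depends on a hand-built list of cue phrases or on the document layout, which is the property the abstract credits for the reduced domain and language dependency.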