Title Generation Using a Training Corpus

  • Authors: Rong Jin; Alexander G. Hauptmann
  • Venue: CICLing '01: Proceedings of the Second International Conference on Computational Linguistics and Intelligent Text Processing
  • Year: 2001

Abstract

This paper discusses fundamental issues involved in word selection for title generation. We review several methods for title generation, namely extractive summarization and two versions of a Naïve Bayesian approach, and compare their performance using an F1 metric. In addition, we introduce a novel approach to title generation using the k-nearest neighbor (KNN) algorithm. Both the KNN method and a limited-vocabulary Naïve Bayesian method outperform the other evaluated methods, achieving an F1 score of around 20%. Since KNN produces complete and legible titles, we conclude that KNN is a very promising method for title generation, provided good content overlap exists between the training corpus and the test documents.
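As a rough illustration of the KNN idea and the F1 comparison mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes k = 1, whitespace tokenization, raw term-frequency vectors with cosine similarity, and an F1 computed over the sets of unique title words; none of these details are stated in the abstract. The function names (`knn_title`, `title_f1`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop-word removal.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_title(document, corpus):
    # corpus: list of (training_document, human_title) pairs.
    # With k = 1 (an assumption), reuse the title of the most similar
    # training document, which is why the output is a complete, legible title.
    query = Counter(tokenize(document))
    best_doc, best_title = max(
        corpus, key=lambda pair: cosine(query, Counter(tokenize(pair[0])))
    )
    return best_title


def title_f1(generated, reference):
    # F1 over unique words shared by the generated and reference titles
    # (assumed definition; word order is ignored).
    gen = set(tokenize(generated))
    ref = set(tokenize(reference))
    overlap = len(gen & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy training corpus, for illustration only.
corpus = [
    ("the stock market fell as investors sold shares", "Stock Market Falls"),
    ("the team won the championship game last night", "Team Wins Championship"),
]
generated = knn_title("investors sold shares and the market dropped", corpus)
score = title_f1(generated, "Market Drops as Investors Sell")
```

The sketch also makes the abstract's caveat concrete: because the title is copied from a training document, its quality depends entirely on how well the training corpus overlaps in content with the test document.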