An example-based mapping method for text categorization and retrieval
ACM Transactions on Information Systems (TOIS)
A trainable document summarizer
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Automatic text structuring and summarization
Information Processing and Management: an International Journal - Special issue: methods and tools for the automatic construction of hypertext
Summarizing text documents: sentence selection and evaluation metrics
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Selecting text spans for document summaries: heuristics and metrics
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Information Retrieval
The SMART Retrieval System—Experiments in Automatic Document Processing
The SMART Retrieval System—Experiments in Automatic Document Processing
How can catchy titles be generated without loss of informativeness?
Expert Systems with Applications: An International Journal
Hi-index | 0.00 |
This paper discusses fundamental issues involved in word selection for title generation. We review several methods for title generation, namely extractive summarization and two versions of a Naïve Bayesian, and compare the performance of those methods using an F1 metric. In addition, we introduce a novel approach to title generation using the k-nearest neighbor (KNN) algorithm. Both the KNN method and a limited-vocabulary Naïve Bayesian method outperform the other evaluated methods with an F1 score of around 20%. Since KNN produces complete and legible titles, we conclude that KNN is a very promising method for title generation, provided good content overlap exists between the training corpus and the test documents.