In this paper, we present a novel approach that uses topic models based on Latent Dirichlet Allocation (LDA) to generate single-document summaries. Our approach differs from other LDA-based approaches in that we identify the summary topics that best describe a given document and extract sentences only from those paragraphs of the document that are highly correlated with the summary topics. This ensures that our summaries highlight the crux of the document without paying any attention to the grammar or structure of the document. Finally, we evaluate our summaries on the DUC 2002 single-document summarization corpus using ROUGE measures. Our summaries achieved higher ROUGE values and better semantic similarity with the source documents than the DUC summaries.
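For readers unfamiliar with the ROUGE measures used in the evaluation, the following is a minimal sketch of ROUGE-N recall: the fraction of the reference summary's n-grams that also appear in the candidate summary, with counts clipped. It is an illustration only and not the evaluation setup used in this paper. The function names `ngrams` and `rouge_n_recall`, the lowercase whitespace tokenization, and the example sentences are all assumptions made for the sketch; the standard ROUGE toolkit offers further options (such as stemming and stop-word removal) that are not modelled here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by the reference n-gram count.

    Tokenization is plain lowercase whitespace splitting (an assumption of
    this sketch, not a property of the official toolkit).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical example sentences, not taken from DUC 2002.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))  # 5 of 6 reference unigrams matched
print(rouge_n_recall(candidate, reference, n=2))  # 3 of 5 reference bigrams matched
```

A higher value means the candidate summary covers more of the reference summary's n-grams; comparing two summarizers on the same references with the same settings is what makes the scores meaningful.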