Foundations of statistical natural language processing
The automated acquisition of topic signatures for text summarization
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Topic-focused multi-document summarization using an approximate oracle score
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Automatic text summarization of newswire: lessons learned from the document understanding conference
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
FastSum: fast and accurate query-based multi-document summarization
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Towards automatic generation of gene summary
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
An exploration of document impact on graph-based multi-document summarization
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
An extractive supervised two-stage method for sentence compression
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Learning web query patterns for imitating Wikipedia articles
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
GEMS: generative modeling for evaluation of summaries
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Exploiting relevance, coverage, and novelty for query-focused multi-document summarization
Knowledge-Based Systems
The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in query-focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio (LLR). We demonstrate that the advantages of LLR stem from its known distributional properties, which allow the identification of a set of words that as a whole defines the aboutness of the input. We also find that LLR is better suited to query-focused summarization because, unlike raw frequency, it readily integrates the information need expressed by the user.
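To make the comparison concrete, below is a minimal sketch of Dunning's log-likelihood ratio as used for topic-signature extraction in the style of Lin and Hovy. The function names, the toy corpora, and the directional over-representation check are illustrative; the 10.83 cutoff is the standard chi-square critical value (p < 0.001, 1 degree of freedom) commonly used for signature words.

```python
import math
from collections import Counter

def llr(k1, n1, k2, n2):
    """Dunning's log-likelihood ratio for one word.

    k1/n1: word count and total token count in the input (foreground);
    k2/n2: the same for a large background corpus.
    """
    p = (k1 + k2) / (n1 + n2)   # pooled rate under the null hypothesis
    p1, p2 = k1 / n1, k2 / n2   # separate rates under the alternative

    def ll(k, n, x):
        # Binomial log-likelihood k*ln(x) + (n-k)*ln(1-x),
        # treating 0*ln(0) as 0 to avoid math domain errors.
        s = 0.0
        if k > 0:
            s += k * math.log(x)
        if n - k > 0:
            s += (n - k) * math.log(1 - x)
        return s

    return 2 * (ll(k1, n1, p1) + ll(k2, n2, p2)
                - ll(k1, n1, p) - ll(k2, n2, p))

def topic_signature(fg_tokens, bg_tokens, threshold=10.83):
    """Words over-represented in the input whose LLR exceeds the cutoff."""
    fg, bg = Counter(fg_tokens), Counter(bg_tokens)
    n1, n2 = sum(fg.values()), sum(bg.values())
    return {w for w, k1 in fg.items()
            if k1 / n1 > bg.get(w, 0) / n2            # over-represented only
            and llr(k1, n1, bg.get(w, 0), n2) > threshold}
```

Because LLR is thresholded against a known distribution, it yields a discrete set of signature words describing the input's aboutness, whereas raw frequency only ranks words without a principled cutoff.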