Scoring the sentences of a document against human-written abstractive summaries is an important step in extractive multi-document summarization. In this paper, we formulate extractive summarization as a two-step learning problem: building a generative model for pattern discovery and a regression model for inference. We first calculate scores for sentences in document clusters based on their latent characteristics using a hierarchical topic model. Then, using these scores, we train a regression model based on the lexical and structural characteristics of the sentences, and use the model to score sentences of new documents to form a summary. Our system advances the current state of the art, improving ROUGE scores by ~7%. In manual quality evaluations, the generated summaries are also less redundant and more coherent.
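
The two-step pipeline described above can be illustrated with a minimal sketch. It is not the system from the paper. The hierarchical topic model is replaced by a simple stand-in (word overlap with the human summary), the three features are hypothetical examples of lexical and structural characteristics, and the regression is a plain least-squares fit by gradient descent, since the abstract does not name a specific regression model. All function names are invented for illustration.

```python
from collections import Counter


def target_score(sentence, summary):
    """Step 1 stand-in (assumption): fraction of sentence words that also
    occur in the human summary. The paper uses a hierarchical topic model."""
    words = sentence.lower().split()
    summary_words = set(summary.lower().split())
    if not words:
        return 0.0
    return sum(w in summary_words for w in words) / len(words)


def features(sentence, position, doc_counts):
    """Hypothetical features, all in [0, 1]: bias, relative position,
    capped length, and mean relative term frequency."""
    words = sentence.lower().split()
    total = sum(doc_counts.values()) or 1
    mean_tf = 0.0
    if words:
        mean_tf = sum(doc_counts[w] for w in words) / (len(words) * total)
    return [1.0, position, min(len(words), 40) / 40.0, mean_tf]


def fit_linear(X, y, lr=0.1, epochs=2000):
    """Step 2: least-squares linear regression by batch gradient descent."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def document_features(sentences):
    """Feature vectors for every sentence of one document."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    last = max(len(sentences) - 1, 1)
    return [features(s, i / last, counts) for i, s in enumerate(sentences)]


def train(documents_with_summaries):
    """Build (features, target score) pairs and fit the regression."""
    X, y = [], []
    for sentences, summary in documents_with_summaries:
        X.extend(document_features(sentences))
        y.extend(target_score(s, summary) for s in sentences)
    return fit_linear(X, y)


def summarize(sentences, w, k=1):
    """Score sentences of a new document and keep the top k in source order."""
    scored = [(predict(w, x), i) for i, x in enumerate(document_features(sentences))]
    top = sorted(scored, reverse=True)[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda t: t[1])]


# Toy usage with made-up data.
training = [
    (["Storm hits the coast.", "Officials urge caution.", "A dog barked."],
     "storm hits coast officials urge caution"),
]
weights = train(training)
summary = summarize(["Flood hits the town.", "Birds sang.", "Mayor urges calm."], weights, k=2)
```

Separating the two steps this way means the expensive scoring step only needs the human summaries at training time, while new documents are scored from their own sentence features alone.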