A preference learning approach to sentence ordering for multi-document summarization

Authors:
Danushka Bollegala;Naoaki Okazaki;Mitsuru Ishizuka
Affiliations:
Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan;Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan;Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
Venue:
Information Sciences: an International Journal
Year:
2012



Abstract

Ordering information is a difficult but important task for applications that generate natural-language text, such as multi-document summarization, question answering, and concept-to-text generation. In multi-document summarization, information is selected from a set of source documents, so the optimal ordering of the selected pieces of information to create a coherent summary is not obvious. Improper ordering of information in a summary can both confuse the reader and degrade the readability of the summary. It is therefore vital to order the information properly in multi-document summarization. We model the problem of sentence ordering in multi-document summarization as one of learning the optimal combination of preference experts that determine the ordering between two given sentences. To capture the preference of one sentence over another, we define five preference experts: chronology, probabilistic, topical-closeness, precedence, and succession. We use summaries ordered by human annotators as training data to learn the optimal combination of the different preference experts. Finally, the learnt combination is applied to order sentences extracted in a multi-document summarization system. The proposed sentence ordering algorithm considers pairwise comparisons between sentences to determine a total ordering using a greedy search algorithm, thereby avoiding the combinatorial time complexity typically associated with total ordering tasks. This enables us to efficiently order sentences in longer summaries, making the proposed approach usable in real-world text summarization systems. We evaluate the sentence orderings produced by the proposed method and numerous other baselines using both semi-automatic evaluation measures and a subjective evaluation.
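
The idea of turning pairwise preferences into a total ordering with a greedy search can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the combination rule or the exact search procedure, so the sketch assumes a weighted linear combination of experts and a standard greedy "potential"-based ordering. The names `combined_preference`, `greedy_order`, `chronology`, and `no_opinion`, as well as the toy timestamps and weights, are all hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence

# Assumption: an expert maps a pair (u, v) to a score in [0, 1], where
# 1.0 means "u should precede v", 0.0 the opposite, and 0.5 no preference.
Expert = Callable[[Hashable, Hashable], float]


def combined_preference(experts: Sequence[Expert], weights: Sequence[float]) -> Expert:
    """Assumed combination rule: weighted linear sum of the expert scores."""
    def pref(u: Hashable, v: Hashable) -> float:
        return sum(w * e(u, v) for e, w in zip(experts, weights))
    return pref


def greedy_order(items: Sequence[Hashable], pref: Expert) -> List[Hashable]:
    """Greedily build a total order from pairwise preferences.

    Items are assumed to be distinct and hashable (e.g. sentence ids).
    """
    remaining = list(items)
    # potential[v]: how strongly v is preferred before the other remaining
    # items, minus how strongly they are preferred before v.
    potential: Dict[Hashable, float] = {
        v: sum(pref(v, u) - pref(u, v) for u in remaining if u != v)
        for v in remaining
    }
    ordered: List[Hashable] = []
    while remaining:
        best = max(remaining, key=lambda v: potential[v])
        remaining.remove(best)
        ordered.append(best)
        # Remove the placed item's contribution from the remaining potentials.
        for v in remaining:
            potential[v] += pref(best, v) - pref(v, best)
    return ordered


# Toy data (hypothetical): publication timestamps for three sentence ids.
timestamps = {"a": 3, "b": 1, "c": 2}


def chronology(u: Hashable, v: Hashable) -> float:
    """Toy chronology expert: prefer the sentence with the earlier timestamp."""
    if timestamps[u] == timestamps[v]:
        return 0.5
    return 1.0 if timestamps[u] < timestamps[v] else 0.0


def no_opinion(u: Hashable, v: Hashable) -> float:
    """Toy expert that never expresses a preference."""
    return 0.5


pref = combined_preference([chronology, no_opinion], [0.7, 0.3])
ordered = greedy_order(["a", "b", "c"], pref)
print(ordered)
```

In this sketch the initial potentials need one pass over all sentence pairs and each greedy step updates the remaining potentials in linear time, so the number of preference evaluations grows quadratically with the number of sentences rather than with the number of possible permutations.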