Exploring clustering for multi-document Arabic summarisation

Authors:
Mahmoud El-Haj; Udo Kruschwitz; Chris Fox
Affiliations:
Computer Science and Electronic Engineering, University of Essex, United Kingdom (all authors)
Venue:
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Year:
2011

Abstract

In this paper we explore clustering for multi-document Arabic summarisation. For our evaluation we use an Arabic version of the DUC-2002 dataset that we previously generated using Google Translate. We explore how sentence-level clustering can be applied both to multi-document summarisation and to redundancy elimination within that process. We vary several parameter settings, including the cluster size and the selection model applied in the extractive summarisation process. The automatically generated summaries are evaluated using the ROUGE metric as well as precision and recall, and the results are compared with those of the top five systems in the DUC-2002 multi-document summarisation task.
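
The general idea in the abstract (cluster sentences, then extract one representative per cluster so that near-duplicate sentences are not selected twice) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, the similarity measure, or the selection model, so the k-means loop, the term-frequency vectors, the cosine similarity, the whitespace tokenisation, and all function names below are assumptions made for illustration. The `rouge_1` function is a simplified unigram-overlap stand-in for ROUGE-1 precision and recall, not the official ROUGE toolkit.

```python
import math
from collections import Counter


def tf_vector(sentence):
    """Term-frequency vector; whitespace tokenisation is an assumption."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict-like)."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: c / len(vectors) for t, c in total.items()}


def cluster_sentences(sentences, k, iterations=10):
    """Plain k-means over sentences using cosine similarity.

    Seeding with the first k sentences is a simplification chosen
    to keep the sketch deterministic.
    """
    vectors = [tf_vector(s) for s in sentences]
    centroids = [dict(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each sentence to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute centroids; an emptied cluster keeps its old centroid.
        for c in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return vectors, centroids, assignment


def summarise(sentences, k):
    """Select, per cluster, the sentence closest to the cluster centroid."""
    vectors, centroids, assignment = cluster_sentences(sentences, k)
    chosen = []
    for c in range(len(centroids)):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            best = max(members, key=lambda i: cosine(vectors[i], centroids[c]))
            chosen.append(best)
    return [sentences[i] for i in sorted(chosen)]  # keep source order


def rouge_1(candidate, reference):
    """Simplified unigram-overlap (precision, recall) pair."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy English sentences standing in for sentences from several documents.
    docs = [
        "the storm hit the coast on monday",
        "officials opened shelters for residents",
        "the storm struck the coast monday",
        "shelters were opened for residents by officials",
    ]
    summary = summarise(docs, k=2)
    print(summary)
    print(rouge_1(" ".join(summary), "a storm hit the coast and shelters opened"))
```

Selecting a single sentence per cluster is what removes redundancy in this sketch: two sentences that report the same event fall into the same cluster, and only one of them reaches the summary. Real Arabic text would need proper tokenisation and normalisation (and probably stemming) in place of the whitespace split used here.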