Using Wikipedia anchor text and weighted clustering coefficient to enhance the traditional multi-document summarization

Authors:
Niraj Kumar;Kannan Srinathan;Vasudeva Varma
Affiliations:
IIIT-Hyderabad, Hyderabad, India;IIIT-Hyderabad, Hyderabad, India;IIIT-Hyderabad, Hyderabad, India
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
Year:
2012

Abstract

As in the traditional approach, we treat summarization as the selection of top-ranked sentences from ranked sentence clusters. To this end, we rank the sentence clusters by word importance, computed by running the PageRank algorithm on a reverse directed word graph of the sentences. Next, to rank the sentences within each cluster, we introduce the use of a weighted clustering coefficient, which is calculated from the PageRank scores of the words. Finally, a major issue is the presence of many noisy entries in the text, which degrades the performance of most text mining algorithms. To address this problem, we introduce a phrase mapping scheme based on Wikipedia anchor text. Our experimental results on the DUC-2002 and DUC-2004 datasets show that our system performs better than unsupervised systems and better than or comparable to novel supervised systems in this area.
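
The two scoring steps named in the abstract (PageRank over a reverse directed word graph, then a weighted clustering coefficient built from those PageRank scores) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: edges are taken to point from each word to the word preceding it, the weight of an edge is taken to be the mean PageRank score of its two endpoints, and the coefficient follows a Barrat-style weighted definition on the undirected version of the graph. The function names are hypothetical and this is not the authors' implementation.

```python
from itertools import combinations


def build_reverse_word_graph(sentences):
    """Assumed construction: an edge from each word to the word before it."""
    graph = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for word in words:
            graph.setdefault(word, set())
        for prev, nxt in zip(words, words[1:]):
            if prev != nxt:  # skip self-loops
                graph[nxt].add(prev)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in graph[v]:
                new_rank[target] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank


def weighted_clustering(graph, rank):
    """Barrat-style weighted clustering coefficient per word (assumed form)."""
    # Ignore edge direction when looking for triangles.
    nbrs = {v: set() for v in graph}
    for v, targets in graph.items():
        for t in targets:
            nbrs[v].add(t)
            nbrs[t].add(v)

    def weight(u, v):
        # Assumption: edge weight is the mean PageRank score of its endpoints.
        return (rank[u] + rank[v]) / 2.0

    coeff = {}
    for v, ns in nbrs.items():
        k = len(ns)
        if k < 2:
            coeff[v] = 0.0  # no neighbor pair, so no triangle is possible
            continue
        strength = sum(weight(v, u) for u in ns)
        closed = sum(
            weight(v, a) + weight(v, b)
            for a, b in combinations(sorted(ns), 2)
            if b in nbrs[a]
        )
        coeff[v] = closed / (strength * (k - 1))
    return coeff


# Toy input, not taken from DUC data.
sentences = ["wikipedia anchor text", "text wikipedia links"]
graph = build_reverse_word_graph(sentences)
rank = pagerank(graph)
coeff = weighted_clustering(graph, rank)
for word in sorted(coeff, key=coeff.get, reverse=True):
    print(word, round(rank[word], 3), round(coeff[word], 3))
```

In this toy graph, words whose neighbors are all linked to each other receive a coefficient of 1, while a word with a single neighbor receives 0. How the per-word scores are aggregated into sentence and cluster ranks is not specified in the abstract and is left out of the sketch.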