Cross-document summarization by concept classification

Authors:
Hilda Hardy;Nobuyuki Shimizu;Tomek Strzalkowski;Liu Ting;Xinyang Zhang;G. Bowden Wise
Affiliations:
NLIP Laboratory, University at Albany, Albany, NY;NLIP Laboratory, University at Albany, Albany, NY;NLIP Laboratory, University at Albany, Albany, NY;NLIP Laboratory, University at Albany, Albany, NY;NLIP Laboratory, University at Albany, Albany, NY;GE Global Research Center, Niskayuna, NY
Venue:
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2002

Abstract

In this paper we describe XDoX, a cross-document summarizer designed specifically for large document sets (50 to 500 documents or more). Such sets are typically obtained from routing or filtering systems run against a continuous stream of data, such as a newswire. XDoX works by identifying the most salient themes within the set, at a level of granularity regulated by the user, and composing an extraction summary that reflects these main themes. In the current version, XDoX is not optimized to produce a summary from a few unrelated documents; such summaries are best obtained simply by concatenating summaries of the individual documents. We show examples of summaries obtained in our tests as well as from our participation in the first Document Understanding Conference (DUC).
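
The general idea of theme-based extraction can be illustrated with a minimal sketch. This is not the XDoX algorithm, whose details the abstract does not give. It assumes a simple single-pass clustering of passages by word-overlap cosine similarity, with a `threshold` parameter standing in for the user-regulated granularity. All names here (`group_into_themes`, `summarize`, `STOPWORDS`) and the stopword list are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "in", "at", "of", "after", "and", "to"}


def bag_of_words(text):
    """Lowercased word counts for one passage, with stopwords dropped."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_into_themes(passages, threshold):
    """Single-pass clustering: each passage joins its most similar theme,
    or starts a new theme if no similarity reaches the threshold.

    A higher threshold yields more, finer-grained themes.
    """
    themes = []  # each theme: {"centroid": Counter, "members": [index, ...]}
    for index, passage in enumerate(passages):
        vector = bag_of_words(passage)
        scored = [(cosine(vector, t["centroid"]), t) for t in themes]
        best_sim, best = max(scored, key=lambda pair: pair[0], default=(0.0, None))
        if best is None or best_sim < threshold:
            themes.append({"centroid": Counter(vector), "members": [index]})
        else:
            best["centroid"].update(vector)  # centroid is the summed word counts
            best["members"].append(index)
    return themes


def summarize(passages, threshold=0.3, max_themes=2):
    """Return one representative passage from each of the largest themes,
    in their original order."""
    themes = group_into_themes(passages, threshold)
    themes.sort(key=lambda t: len(t["members"]), reverse=True)
    picked = []
    for theme in themes[:max_themes]:
        # Representative = member most similar to the theme centroid.
        rep = max(
            theme["members"],
            key=lambda i: cosine(bag_of_words(passages[i]), theme["centroid"]),
        )
        picked.append(rep)
    return [passages[i] for i in sorted(picked)]
```

Calling `summarize(passages, threshold=0.3, max_themes=2)` on a list of passage strings returns at most two passages, one for each of the two largest themes. Raising `threshold` splits the set into more themes, and lowering it merges them, which is one simple way to model a granularity setting.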