Content coverage maximization on word networks for hierarchical topic summarization

Authors:
Chi Wang;Xiao Yu;Yanen Li;Chengxiang Zhai;Jiawei Han
Affiliations:
University of Illinois at Urbana-Champaign, Champaign, USA;University of Illinois at Urbana-Champaign, Champaign, USA;University of Illinois at Urbana-Champaign, Champaign, USA;University of Illinois at Urbana-Champaign, Champaign, USA;University of Illinois at Urbana-Champaign, Champaign, USA
Venue:
Proceedings of the 22nd ACM international conference on information & knowledge management
Year:
2013

Abstract

This paper studies text summarization by extracting hierarchical topics from a given collection of documents. We propose a new approach to text modeling via network analysis. We convert documents into a word influence network and find the words that summarize the major topics with an efficient influence maximization algorithm. In addition, the influence capability of the topic words on other words in the network reveals the relations among the topic words. We then cluster the words and build hierarchies for the topics. Experiments on large collections of Web documents show that a simple method based on influence analysis is effective compared with existing generative topic modeling and random-walk-based ranking.
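
The pipeline outlined in the abstract (word network, influence-maximizing topic words, clustering) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the edge weight (the share of a word's documents that also contain the influencing word), the one-hop coverage objective, the plain greedy selection, and the function names `build_word_network`, `coverage`, `greedy_topic_words` and `assign_to_topics`.

```python
from collections import Counter
from itertools import combinations


def build_word_network(docs):
    """Return {u: {v: w}}; w is the share of v's documents that also contain u.

    This weighting is an illustrative assumption, not the paper's definition.
    """
    doc_freq = Counter()
    cooc = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        doc_freq.update(words)
        for a, b in combinations(words, 2):
            cooc[(a, b)] += 1
    network = {w: {} for w in doc_freq}
    for (a, b), n in cooc.items():
        network[a][b] = n / doc_freq[b]
        network[b][a] = n / doc_freq[a]
    return network


def coverage(network, seeds):
    """Expected number of words reached by the seeds (one hop, independent edges)."""
    total = 0.0
    for v in network:
        if v in seeds:
            total += 1.0
            continue
        miss = 1.0  # probability that no seed influences v
        for u in seeds:
            miss *= 1.0 - network[u].get(v, 0.0)
        total += 1.0 - miss
    return total


def greedy_topic_words(network, k):
    """Pick k words, each time adding the word that yields the largest coverage."""
    seeds = []
    for _ in range(min(k, len(network))):
        candidates = [w for w in sorted(network) if w not in seeds]
        best = max(candidates, key=lambda w: coverage(network, seeds + [w]))
        seeds.append(best)
    return seeds


def assign_to_topics(network, seeds):
    """Attach every other word to the seed that influences it most strongly."""
    clusters = {s: [s] for s in seeds}
    for v in sorted(network):
        if v in clusters:
            continue
        best = max(seeds, key=lambda s: network[s].get(v, 0.0))
        if network[best].get(v, 0.0) > 0.0:
            clusters[best].append(v)
    return clusters


# Toy corpus (hypothetical data): two loosely separated themes.
docs = [
    "topic model inference",
    "topic model sampling",
    "social network influence",
    "social network diffusion",
]
network = build_word_network(docs)
seeds = greedy_topic_words(network, 2)
clusters = assign_to_topics(network, seeds)
print(seeds)
print(clusters)
```

On the toy corpus the greedy step selects one word from each theme, because a second word from an already covered theme adds no coverage. Applying the same selection recursively inside each cluster would be one way to obtain a hierarchy, though how the paper builds its hierarchies is not specified in the abstract.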