Document understanding techniques such as document clustering and multidocument summarization have received much attention recently. Current document clustering methods usually represent the given collection of documents as a document-term matrix and then cluster on that representation. Although many of these methods group documents effectively, it remains hard for people to grasp what the documents are about, because the resulting clusters come with no satisfactory interpretation. A straightforward solution is to cluster the documents first and then summarize each cluster with a summarization method. However, most current summarization methods rely solely on the sentence-term matrix and ignore the context dependence of the sentences. As a result, the generated summaries lack guidance from the document clusters. In this article, we propose a new language model that clusters and summarizes documents simultaneously by making use of both the document-term and sentence-term matrices. By exploiting the mutual influence of document clustering and summarization, our method yields (1) a better document clustering method with a more meaningful interpretation and (2) an effective document summarization method guided by document clustering. Experimental results on various document datasets show the effectiveness of the proposed method and the high interpretability of the generated summaries.
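To make the two representations concrete, the following is a minimal sketch of how a document-term matrix and a sentence-term matrix might be built over a shared vocabulary. It is an illustration only, not the model proposed in the article: the tokenizer, the sentence splitter, the helper names (`tokenize`, `split_sentences`, `build_matrices`), the raw-count weighting, and the toy documents are all assumptions introduced here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(doc):
    """Split on '.', '!' and '?' (a hypothetical, simplistic splitter)."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]


def build_matrices(docs):
    """Return (vocab, document-term rows, sentence-term rows, owner).

    owner[k] is the index of the document that sentence k came from,
    which is the link a joint model would need between the two matrices.
    """
    sentences, owner = [], []
    for i, doc in enumerate(docs):
        for sentence in split_sentences(doc):
            sentences.append(sentence)
            owner.append(i)

    # One shared vocabulary so both matrices have the same columns.
    vocab = sorted({term for doc in docs for term in tokenize(doc)})
    index = {term: j for j, term in enumerate(vocab)}

    def count_rows(texts):
        # Raw term counts; a real system might use tf-idf instead (assumption).
        rows = []
        for text in texts:
            row = [0] * len(vocab)
            for term, n in Counter(tokenize(text)).items():
                row[index[term]] = n
            rows.append(row)
        return rows

    return vocab, count_rows(docs), count_rows(sentences), owner


# Toy documents, invented purely for illustration.
docs = [
    "Clustering groups documents. Clusters need labels.",
    "Summaries pick sentences. Sentences describe documents.",
]
vocab, doc_term, sent_term, owner = build_matrices(docs)
```

With raw counts and this splitter, each document row is the sum of the rows of its own sentences, which is one simple way to see how the two matrices are tied together.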