Because sentences are short and carry limited content, we treat words as independent text objects rather than as features of sentences in sentence clustering. We develop two co-clustering frameworks, integrated clustering and interactive clustering, that cluster sentences and words simultaneously. Since real-world datasets always contain noise, we incorporate noise detection and removal to improve the clustering of both sentences and words. We also explore a semi-supervised approach that incorporates query information (and sentence information from earlier document sets) into theme-based summarization. In thorough experiments on the DUC 2005-2007 and TAC 2008-2009 datasets, the two noise-detecting co-clustering approaches perform comparably to the top three systems. The results also show that the interactive algorithm with noise detection is more effective than the integrated algorithm with noise detection.
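To make the idea of clustering sentences and words simultaneously concrete, here is a minimal toy sketch in Python. It is not the integrated or interactive framework evaluated in this work, whose details the abstract does not give. It assumes a small sentence-by-word count matrix, a deliberately imperfect initial word labeling, and a simple alternating rule: each sentence joins the word cluster holding most of its word mass, then each word joins the sentence cluster where it occurs most. The function names `co_cluster` and `flag_noise`, the purity-based noise test, and the `min_purity` threshold are all hypothetical choices made for illustration.

```python
def co_cluster(matrix, word_labels, k, n_iter=10):
    """Alternate sentence and word assignments on a sentence-by-word count matrix."""
    n_sent, n_word = len(matrix), len(matrix[0])
    sent_labels = [0] * n_sent
    for _ in range(n_iter):
        # Each sentence joins the word cluster that holds most of its word mass.
        new_sent = []
        for i in range(n_sent):
            mass = [0.0] * k
            for j in range(n_word):
                mass[word_labels[j]] += matrix[i][j]
            new_sent.append(mass.index(max(mass)))  # ties go to the lowest index
        # Each word joins the sentence cluster in which it occurs most often.
        new_word = []
        for j in range(n_word):
            mass = [0.0] * k
            for i in range(n_sent):
                mass[new_sent[i]] += matrix[i][j]
            new_word.append(mass.index(max(mass)))
        # Stop once neither side changes any more.
        if new_sent == sent_labels and new_word == word_labels:
            break
        sent_labels, word_labels = new_sent, new_word
    return sent_labels, word_labels


def flag_noise(matrix, sent_labels, word_labels, min_purity=0.6):
    """Flag sentences whose word mass lies mostly outside their own co-cluster.

    The purity test and its threshold are illustrative assumptions only.
    """
    noisy = []
    for i, row in enumerate(matrix):
        total = sum(row)
        inside = sum(c for j, c in enumerate(row) if word_labels[j] == sent_labels[i])
        if total == 0 or inside / total < min_purity:
            noisy.append(i)
    return noisy


# Toy data: rows are sentences, columns are the words
# cluster, noise, outlier, summary, sentence, query.
matrix = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [1, 0, 1, 1, 1, 0],  # mixes both themes
]
initial_word_labels = [0, 0, 0, 1, 1, 0]  # "query" starts in the wrong cluster

sent_labels, word_labels = co_cluster(matrix, initial_word_labels, k=2)
noisy = flag_noise(matrix, sent_labels, word_labels)
print(sent_labels, word_labels, noisy)
```

On this toy input the alternation moves the word "query" into the second cluster, and the last sentence, which splits its words evenly between the two themes, is the only one flagged as noise. A real system would need a principled similarity measure and noise criterion in place of these simple counts.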