Document clustering is an important tool for text analysis and is used in many applications. We propose incorporating prior knowledge of cluster membership into document cluster analysis and develop a novel semi-supervised document clustering model. The method models a set of documents as a weighted graph in which each document is a vertex and each edge connecting a pair of vertices is weighted by the similarity of the two corresponding documents. The prior knowledge identifies pairs of documents that are known to belong to the same cluster and is transformed into a set of constraints. The clustering task is then accomplished by finding the best cuts of the graph under these constraints. We apply the model to the Normalized Cut method to demonstrate the idea. Our experimental evaluations show that the proposed model yields remarkable performance improvements with very limited training samples, and hence is a very effective semi-supervised classification tool.
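The pipeline in the abstract (similarity graph, same-cluster pairs turned into constraints, a constrained graph cut) might be sketched as follows. This is only an illustrative sketch under several assumptions that are not stated in the abstract: cosine similarity as the edge weight, a split into exactly two clusters, and constraints enforced as a soft penalty added to the relaxed Normalized Cut objective. The function name `constrained_ncut_bipartition` and the penalty weight `beta` are hypothetical, and the paper's actual constraint formulation may differ.

```python
import numpy as np

def constrained_ncut_bipartition(X, must_link, beta=10.0):
    """Split documents into two clusters with a penalized normalized cut.

    X: (n_docs, n_terms) nonnegative term-weight matrix.
    must_link: list of (i, j) index pairs known to share a cluster.
    beta: hypothetical penalty weight for separating a must-link pair.
    """
    # Edge weights: cosine similarity between document vectors (an assumption).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    W = Xn @ Xn.T
    np.fill_diagonal(W, 0.0)

    n = W.shape[0]
    d = W.sum(axis=1)                       # vertex degrees
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.diag(d) - W                      # unnormalized graph Laplacian

    # Each must-link pair (i, j) contributes (y_i - y_j)^2 to the objective.
    U = np.zeros((len(must_link), n))
    for r, (i, j) in enumerate(must_link):
        U[r, i], U[r, j] = 1.0, -1.0
    M = L + beta * (U.T @ U)

    # Relaxed problem: minimize (y^T M y) / (y^T D y), i.e. M y = lambda D y.
    # Solve the equivalent symmetric problem and map back with y = D^{-1/2} z.
    S = d_inv_sqrt[:, None] * M * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]          # skip the trivial constant solution

    # The sign of y gives the two-way partition; label order is arbitrary.
    return (y > 0).astype(int)
```

Calling `constrained_ncut_bipartition(X, [(0, 2)])` returns one 0/1 label per document. Because the penalty term vanishes on the constant vector, the trivial solution is preserved and the second eigenvector plays the same role as in the unconstrained Normalized Cut. A larger `beta` makes it costlier to separate a must-link pair, but a soft penalty of this kind does not guarantee that every pair ends up together.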